Zurich's Digital Archives Are Full of Duplicate Images — and Officials Want Them Gone
City archivists, ETH Zurich researchers and cantonal administrators are calling for a systematic reckoning with redundant image data bloating public records.

Zurich's public institutions are sitting on millions of duplicate digital images — and a growing coalition of archivists, IT administrators and academic researchers says the problem has moved from housekeeping nuisance to genuine governance concern. The City Archive on Alfred-Escher-Strasse and the Cantonal Archive on Winterthurerstrasse are among the institutions under pressure to act before the redundancy problem compounds further.
The issue surfaced prominently this spring when a working group convened by the Stadtrat began auditing digital asset management across city departments. What they found, according to records filed with the city administration in April 2026, was a pattern of the same photographs, scans and planning documents appearing in three or more separate storage environments simultaneously — sometimes under different file names, sometimes not. Nobody invented this problem. It grew organically from two decades of department-by-department digitisation without a unified protocol.
The timing is not coincidental. Switzerland's revised Federal Act on Data Protection came fully into force in September 2023, placing stricter obligations on public bodies to know precisely what personal data they hold and where. Duplicate images — particularly those showing identifiable individuals from planning disputes, social welfare cases or historical urban documentation — create compliance headaches that administrators cannot simply defer. A single image stored in four locations means four potential breach vectors, four deletion obligations and four audit trails to maintain.
ETH Zurich's Institute for Information Security and Dependability has been examining the technical dimensions of the problem. Researchers there have pointed to what the field calls perceptual hashing — a method of fingerprinting image content rather than file metadata — as the most reliable automated approach to detecting true duplicates versus near-duplicates created by resizing or recompression. The institute's work, presented at a Zurich data governance symposium at the Volkshaus in February 2026, estimated that mid-sized European municipal archives routinely carry between 18 and 35 percent image redundancy by file count. Applying that range to Zurich's known holdings of roughly 4.2 million digitised image assets would imply somewhere between 750,000 and 1.5 million redundant files.
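To illustrate the idea of fingerprinting content rather than file metadata, here is a minimal sketch of one simple perceptual hash, the "average hash". The article does not say which algorithm the ETH researchers favour, so this choice is an assumption, as are the function names and the synthetic test images. The sketch works on plain grayscale pixel grids and requires image dimensions that are multiples of the grid size.

```python
def average_hash(pixels, size=8):
    """Return a size*size-bit fingerprint of a grayscale image.

    pixels: list of rows, each a list of 0-255 values. In this simplified
    sketch the width and height must be multiples of `size`.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // size, width // size

    # Shrink the image to size x size cells by averaging each block.
    cells = []
    for by in range(size):
        for bx in range(size):
            block = [pixels[y][x]
                     for y in range(by * block_h, (by + 1) * block_h)
                     for x in range(bx * block_w, (bx + 1) * block_w)]
            cells.append(sum(block) / len(block))

    # One bit per cell: 1 if the cell is brighter than the overall mean.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic images for illustration only (not archive data).
original = [[x * 16 for x in range(16)] for _ in range(16)]    # horizontal gradient
upscaled = [[original[y // 2][x // 2] for x in range(32)]      # same picture at 2x size
            for y in range(32)]
unrelated = [[y * 16 for _ in range(16)] for y in range(16)]   # vertical gradient

print(hamming_distance(average_hash(original), average_hash(upscaled)))   # 0
print(hamming_distance(average_hash(original), average_hash(unrelated)))  # 32
```

The resized copy has different bytes and dimensions but the same fingerprint, while the unrelated image differs in half of its 64 bits. A production tool would decode real image files and would likely use a more robust hash; this only shows the principle.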
Canton Zürich's Digital Services unit, headquartered near Walcheturm, has flagged the storage cost dimension. Commercial cloud storage rates for public bodies in Switzerland currently run at roughly CHF 0.023 per gigabyte per month under standard cantonal procurement contracts. That is modest per unit, but multiplied across redundant high-resolution scans, the aggregate bill is meaningful. More pressing, officials say, is staff time: manual deduplication of an archive the size of Zurich's could absorb thousands of working hours without automated tooling.
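The arithmetic behind these figures can be sketched as follows. The holdings total, the redundancy range and the storage rate come from the figures reported above; the average file size is a purely hypothetical placeholder, since no such figure has been published, and the result scales directly with whatever value is substituted.

```python
TOTAL_IMAGES = 4_200_000          # reported digitised image holdings
REDUNDANCY_PERCENT = (18, 35)     # reported redundancy range, by file count
CHF_PER_GB_MONTH = 0.023          # reported cantonal storage rate

# HYPOTHETICAL: average size of one high-resolution scan, in megabytes.
# Not a reported figure; replace with a measured value.
ASSUMED_AVG_FILE_MB = 50


def redundant_files(total, percent):
    """Number of redundant files at a given whole-number percentage."""
    return total * percent // 100


def monthly_cost_chf(files, avg_file_mb=ASSUMED_AVG_FILE_MB,
                     rate=CHF_PER_GB_MONTH):
    """Monthly storage cost of `files` files, using 1 GB = 1000 MB."""
    gigabytes = files * avg_file_mb / 1000
    return gigabytes * rate


low_files, high_files = [redundant_files(TOTAL_IMAGES, p)
                         for p in REDUNDANCY_PERCENT]
print(f"{low_files:,} to {high_files:,} redundant files")
print(f"CHF {monthly_cost_chf(low_files):,.2f} to "
      f"{monthly_cost_chf(high_files):,.2f} per month "
      f"(assuming {ASSUMED_AVG_FILE_MB} MB per file)")
```

The file counts reproduce the range of roughly 750,000 to 1.5 million quoted by the researchers. The cost line is only as good as the assumed file size and ignores backup copies, egress charges and the staff time that officials describe as the larger expense.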
The recommendations coming from multiple quarters converge on a few practical steps. The Stadtrat working group's April report calls for a canton-wide image registry — a central catalogue that assigns a unique persistent identifier to every image on first ingest, preventing duplicates from entering the system in the first place. That is the preventive layer. For the existing backlog, the group endorses a phased automated scan using perceptual hashing tools, followed by human review of any match flagged below a 98 percent confidence threshold.
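The two layers described in the report can be sketched in a few lines. This is an illustration under stated assumptions, not the working group's design: the class and function names are hypothetical, the registry keys on a SHA-256 digest of the file bytes (which catches only byte-identical copies), and matches at or above the threshold are assumed to be handled without human review, which the report implies but does not spell out.

```python
import hashlib
import uuid


class ImageRegistry:
    """Hypothetical central catalogue: one persistent ID per distinct file."""

    def __init__(self):
        self._id_by_digest = {}

    def ingest(self, data: bytes):
        """Return (identifier, is_new) for the given file contents."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._id_by_digest:
            # Already catalogued: hand back the existing identifier.
            return self._id_by_digest[digest], False
        identifier = str(uuid.uuid4())
        self._id_by_digest[digest] = identifier
        return identifier, True


REVIEW_THRESHOLD = 0.98  # the 98 percent confidence threshold from the report


def triage(matches):
    """Split (image_a, image_b, confidence) matches into two lists.

    Matches below the threshold go to human review; the rest are treated
    as confirmed duplicates (an assumption in this sketch).
    """
    confirmed, needs_review = [], []
    for match in matches:
        if match[2] < REVIEW_THRESHOLD:
            needs_review.append(match)
        else:
            confirmed.append(match)
    return confirmed, needs_review


registry = ImageRegistry()
first_id, first_new = registry.ingest(b"scan of planning file")
second_id, second_new = registry.ingest(b"scan of planning file")

confirmed, needs_review = triage([
    ("a.tif", "b.tif", 0.995),
    ("c.tif", "d.jpg", 0.91),
])
```

Here the second ingest of identical bytes returns the first identifier with `is_new` set to `False`, and only the 0.91 match lands in the review queue. Near-duplicates created by resizing or recompression would pass straight through a byte-level digest, which is why the backlog scan relies on perceptual hashing instead.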
Officials at the City Archive have separately proposed that Zurich adopt the same metadata standard already used by the Swiss Federal Archives in Bern — the ISAD(G) framework extended with digital provenance fields — to make deduplication audits machine-readable and repeatable. Without a common standard, each department effectively speaks a different language when describing what an image is and where it came from.
The practical path forward, according to the April working group report, involves a pilot project launching in the third quarter of 2026 within the Urban Planning and Construction department, which holds some of the largest photographic backlogs. If the pilot meets its targets — reducing verified redundancy in that department's holdings by at least 40 percent within six months — the methodology would roll out across remaining city departments through 2027. Institutions watching this process closely include Zentralbibliothek Zürich on Zähringerplatz, which manages its own parallel digitisation programme and faces identical questions about image governance. The outcome of the pilot will likely shape how every major Zurich public institution handles its digital image collections for the next decade.
Published by The Daily Zurich