Zurich's public institutions are sitting on a growing problem. Duplicate images — identical or near-identical digital files stored across multiple databases — are consuming server capacity, distorting search results and, in some cases, undermining the integrity of official records. The issue has moved quietly up the agenda in recent months, with archivists, urban planners and information scientists now openly debating how to fix it.
The timing matters. Zurich's cantonal administration has been expanding its digital infrastructure since at least 2023, when the city's Stadtarchiv on Neumarkt formally shifted to a cloud-hybrid document management system. That transition brought efficiency gains — but it also exposed longstanding habits of file duplication that had been invisible in older, siloed storage environments. When multiple departments upload the same aerial survey photograph, the same building permit scan or the same public event image to different repositories, the problems compound quickly.
What the Specialists Are Saying
Information scientists at ETH Zurich, whose Department of Computer Science sits on Rämistrasse, have been researching automated deduplication algorithms for several years. Their work, which has fed into broader European data governance discussions, suggests that large institutional image libraries can have redundancy rates of 15 to 30 percent, depending on how aggressively files are shared across internal units. That figure comes from academic modelling rather than a Zurich-specific audit, but it gives a sense of the scale administrators may be dealing with.
At the Zentralbibliothek Zürich on Zähringerplatz, digital preservation staff have acknowledged — without providing precise internal data — that managing duplicate visual assets across its digitisation projects requires dedicated staff time. The library holds one of the largest historical image collections in the German-speaking world, and its ongoing digitisation work means new files enter the system continuously. When automated ingestion pipelines lack robust deduplication checks, identical scans can register as separate records.
Urban planners at the Stadtentwicklung Zürich office have a more practical concern. Planning decisions increasingly rely on georeferenced imagery — drone surveys of the Limmat corridor, thermal maps of Schwamendingen's housing stock, before-and-after documentation of construction zones. If duplicate files carry conflicting metadata timestamps, a planner pulling imagery for a Wohnungsnot-related development review could be working from an outdated version without knowing it. Housing pressures in districts like Altstetten and Oerlikon mean those reviews are happening fast, with little margin for data errors.
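The timestamp risk described above can be illustrated with a short sketch. It assumes each repository can export a content hash and a capture timestamp per image; the `ImageRecord` type, its field names and the `conflicting_timestamps` helper are hypothetical and do not describe any system the city actually runs.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ImageRecord:
    # Hypothetical export format: one row per stored copy of an image.
    repository: str
    content_hash: str
    captured_at: datetime


def conflicting_timestamps(records):
    """Return {content_hash: [records]} for identical files whose
    capture timestamps disagree across copies."""
    by_hash = defaultdict(list)
    for record in records:
        by_hash[record.content_hash].append(record)
    return {
        content_hash: copies
        for content_hash, copies in by_hash.items()
        if len({copy.captured_at for copy in copies}) > 1
    }


records = [
    ImageRecord("planning", "abc", datetime(2024, 3, 1)),
    ImageRecord("archive", "abc", datetime(2023, 11, 5)),
    ImageRecord("archive", "def", datetime(2024, 1, 9)),
]
flagged = conflicting_timestamps(records)
```

A check like this only surfaces the disagreement. Deciding which timestamp is correct would still fall to whoever is responsible for the record.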
No Single Fix, But Pressure Is Building
Technology vendors operating in the Swiss market have been pitching deduplication solutions to public-sector clients for several years. Hash-based matching, in which each image file is assigned a cryptographic fingerprint computed from its contents so that byte-identical uploads produce the same value and are flagged automatically, is now standard in many commercial content management platforms. Exact hashing on its own does not catch near-identical copies such as resized or re-compressed versions. Migrating legacy institutional archives to systems that enforce these checks retroactively is also expensive and time-consuming, and Zurich's public procurement rules under the Beschaffungsrecht require competitive tender processes that can stretch beyond twelve months.
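For readers unfamiliar with the technique, the following is a minimal sketch of hash-based matching at upload time. It assumes SHA-256 as the fingerprint and an in-memory index; the `UploadRegistry` class and its method names are illustrative only and are not taken from any vendor's product.

```python
import hashlib


class UploadRegistry:
    """Illustrative in-memory index of content fingerprints."""

    def __init__(self):
        # Maps SHA-256 hex digest -> ID of the first record with that content.
        self._first_seen = {}

    def register(self, record_id: str, data: bytes):
        """Return the ID of an existing identical record, or None if new."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._first_seen.get(digest)
        if existing is not None:
            return existing  # byte-identical content was uploaded before
        self._first_seen[digest] = record_id
        return None


registry = UploadRegistry()
first = registry.register("survey-001", b"aerial survey bytes")
second = registry.register("survey-002", b"aerial survey bytes")
third = registry.register("permit-001", b"building permit bytes")
```

Here `second` comes back as `"survey-001"`, because the bytes match the earlier upload, while `first` and `third` are `None`. A production system would need persistent storage and a policy for what happens once a duplicate is flagged.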
The Swiss Federal Archives in Bern issued internal guidance on digital asset management standards in 2024, and cantonal bodies are expected to align with those standards over a phased timeline. For Zurich's Stadtarchiv and the Zentralbibliothek, that alignment process is ongoing. Neither institution has published a public completion date.
For organisations managing their own image libraries — whether a Kreis 4 community association documenting neighbourhood events or a pharmaceutical research group near the Technopark on Technoparkstrasse — specialists recommend three immediate steps: conduct a baseline audit using freely available open-source deduplication tools, establish a single point of upload responsibility for visual assets, and apply consistent naming conventions before any migration project begins. Fixing the metadata problem after the fact costs significantly more than preventing it at ingestion.
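A baseline audit of the kind recommended above can be approximated with a few lines of Python's standard library. This is a sketch under stated assumptions: it detects only byte-identical files, the `IMAGE_SUFFIXES` set and the `audit` function are hypothetical names, and the redundancy rate is defined here as surplus copies divided by total image files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of extensions to treat as images; adjust to the library at hand.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large scans fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def audit(root: Path):
    """Return (duplicate groups, redundancy rate) for images under root."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_digest(path)].append(path)
    total = sum(len(paths) for paths in groups.values())
    surplus = total - len(groups)  # copies beyond the first of each image
    rate = surplus / total if total else 0.0
    duplicates = {d: p for d, p in groups.items() if len(p) > 1}
    return duplicates, rate
```

Calling `audit(Path("/path/to/library"))` on a copy of the image folder returns the groups of identical files and a single rate that can be tracked over time. Near-identical images, such as resized exports, would need a perceptual comparison that this sketch does not attempt.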
The broader conversation about data quality in Zurich's public institutions is not going away. As the city continues building out its Smart City programme — with sensor networks, mobility data and environmental monitoring all generating image-adjacent datasets — the duplicate problem will only grow if governance frameworks do not keep pace with the technology.