Thousands of duplicate images are sitting inside Zurich's public digital archives, consuming server space, misleading researchers and slowing down catalogue searches that residents and institutions rely on daily. Archivists, data specialists and city officials are now openly debating how to clean them up — and who should pay for it.
The problem is not new, but it has grown sharper. The rapid digitisation drive that followed the Covid-19 closures of 2020 and 2021 moved enormous volumes of photographs, maps and documents into online repositories without consistent deduplication protocols. Libraries and archives that were digitising independently, often under separate city or cantonal budgets, ingested overlapping material without coordinating metadata standards. The result is a patchwork of collections in which the same 1960s photograph of the Langstrasse, for example, might appear under three different filenames across two separate databases.
At the Stadtarchiv Zürich on Alfred-Escher-Strasse, staff have been flagging the issue internally since at least 2023. The archive holds records stretching back to the medieval period, and its digital portal now contains hundreds of thousands of items. According to archive professionals familiar with large municipal collections, duplication rates in rapidly digitised public archives commonly run between eight and fifteen percent of total holdings. Applied to Zurich's collections, that range would represent a significant administrative and storage burden. The Stadtarchiv has not published its own figures.
What the Experts Are Saying
Digital preservation specialists at ETH Zürich, whose library on Rämistrasse manages one of Switzerland's largest academic image repositories, have pointed to the lack of a shared cantonal deduplication standard as the core structural flaw. An agreed hashing protocol would give every institution the same technical method for computing a fingerprint from each image file's contents, so that identical copies produce identical fingerprints and can be flagged automatically. Without one, institutions are left running manual checks or proprietary software that does not communicate across systems.
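To make the fingerprinting idea concrete, here is a minimal sketch in Python of how exact-copy detection could work. It is an illustration only, not the method used by any of the institutions named here: the function names `file_fingerprint` and `find_duplicates` are hypothetical, and the choice of SHA-256 is an assumption, since no cantonal standard has been agreed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_fingerprint(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks.

    SHA-256 is an assumed choice for illustration; any hash that all
    institutions agree on would serve the same purpose.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large scans do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` whose contents are byte-for-byte identical."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_fingerprint(path)].append(path)
    # Only fingerprints shared by more than one file indicate duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

A check of this kind only catches files that are identical down to the last byte, whatever their filenames. A photograph that was scanned twice, or saved again at a different compression level, produces a different fingerprint and would need other techniques, such as perceptual image comparison, to detect.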
Professionals in the field argue that the financial case for action is straightforward. Cloud and on-premise storage costs have risen steadily across Swiss public institutions, and every redundant file compounds the problem. The Swiss Federal Archives in Bern moved toward a more systematic deduplication framework for federal-level holdings in 2024, offering a model that cantonal bodies in Zurich could, in principle, adapt. But cantonal adoption requires budget sign-off, and that conversation has not formally begun.
Representatives from the Zentralbibliothek Zürich on Zähringerplatz, which manages a substantial digitised photograph and map collection spanning several centuries, have participated in working groups on interoperability with other Swiss institutions. The library's collections overlap in certain periods with those of the Stadtarchiv, and the duplication runs in both directions. Librarians who work with these collections have described the manual reconciliation process as time-intensive, particularly when image metadata — captions, dates, rights information — differs between copies.
The Path Forward
City councillors on Zurich's culture and digital infrastructure committee are expected to discuss the matter in the autumn 2026 session, though no formal motion has yet been tabled. Advocates within the archival community want any solution to include an open-source deduplication tool that smaller district-level collections, such as the Quartierarchive operating in Aussersihl and Wiedikon, could also access without separate licensing costs.
The practical stakes extend beyond tidy databases. Journalists, genealogists, urban planners and academic researchers all use Zurich's digital archives as primary sources. When the same image appears under conflicting dates or with different rights designations, it introduces errors that can propagate through published work. One image of the Limmat riverbank, catalogued in multiple entries with differing captions, caused citation confusion in at least one ETH urban history project, according to people familiar with the research. The university has not commented publicly on the specific case.
For now, the advice from digital preservation professionals is consistent: institutions should prioritise establishing shared metadata standards before the next major digitisation contract is awarded, and any new scanning project should include a deduplication audit as a contractual deliverable. The cost of building that requirement into future contracts is, by any measure, lower than cleaning up the backlog that has already accumulated on Alfred-Escher-Strasse and Rämistrasse.