Zurich's major public institutions are staring down a backlog of duplicate digital image files running into the hundreds of thousands, and the decisions made in the next twelve months will determine whether the city's cultural and administrative archives remain credible research tools or slide into expensive disorder. The pressure is sharpest at the Stadtarchiv Zürich on Neumarkt and at the Zentralbibliothek on Zähringerplatz, both of which have been digitising paper records at scale since the early 2010s and have accumulated layers of redundant scans in the process.
The timing is not accidental. A federal push to harmonise cantonal digital record-keeping under the Swiss e-Government strategy, which set 2025 as a benchmark year for interoperability standards, has left institutions scrambling to reconcile what they already hold with what the Confederation now requires. Duplicate image replacement — identifying a canonical version of a scanned document or photograph, retiring the redundant copies, and updating every internal link that pointed to the old file — sounds bureaucratic. The consequences of getting it wrong are not. A broken link in a legal land register image or a misidentified photograph in a public health record can trigger administrative disputes that cost far more to untangle than a proactive clean-up ever would.
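The three steps named above (pick a canonical file, retire the redundant copies, repoint every link) can be sketched in a few lines. This is an illustrative sketch only, not any institution's actual workflow: the identifiers, the `build_redirects` and `relink` helpers, and the rule of picking the lexicographically smallest ID as canonical are all assumptions made for the example.

```python
def build_redirects(duplicate_groups, choose_canonical):
    """Map every retired file ID to the canonical ID of its group.

    duplicate_groups: iterable of lists of file IDs judged to be duplicates.
    choose_canonical: hypothetical policy function that picks one ID per group.
    """
    redirects = {}
    for group in duplicate_groups:
        canonical = choose_canonical(group)
        for file_id in group:
            if file_id != canonical:
                redirects[file_id] = canonical
    return redirects


def relink(links, redirects):
    """Repoint record -> file links and keep an audit trail of every change.

    links: dict mapping a record ID to the file ID it currently points at.
    Returns (updated_links, audit), where audit lists (record, old, new).
    """
    updated, audit = {}, []
    for record_id, file_id in links.items():
        target = redirects.get(file_id, file_id)
        if target != file_id:
            audit.append((record_id, file_id, target))
        updated[record_id] = target
    return updated, audit


# Example with made-up identifiers; `min` is a placeholder canonical rule.
groups = [["img_002", "img_001"]]
links = {"rec_A": "img_002", "rec_B": "img_003"}
redirects = build_redirects(groups, choose_canonical=min)
updated, audit = relink(links, redirects)
```

The audit list is the point of the sketch: every repointed link is recorded with its old and new target, so a mistaken replacement can be traced and reversed.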
What the Back-Catalogue Problem Actually Looks Like
The Zentralbibliothek alone holds digitised collections stretching back to projects begun under a 2011 Swiss National Science Foundation grant. Staff there have estimated internally, though no figure has been published officially, that post-migration audits regularly flag duplicate rates of between eight and fifteen percent in large batch-scan projects. By the standards of similar European library digitisation programmes, that range is conservative. The problem compounds when institutions share image metadata through Swisscollections, the national aggregation portal run by the Swiss Library Service Platform, because a single duplicate propagates to every institution that harvests the feed.
ETH Zürich's library, headquartered on Rämistrasse, has been piloting automated deduplication software since early 2025 as part of its ongoing Research Data Management initiative. The approach uses perceptual hashing, a technique that identifies visually identical or near-identical images even when file names, sizes or metadata differ, and flags candidates for human review rather than deleting them automatically. That last safeguard matters enormously: an automated system that retires the wrong version of a historical map or a laboratory photograph can destroy irreplaceable evidence. The ETH pilot is closely watched by other Zurich institutions precisely because it combines speed with an audit trail.
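For readers curious how perceptual hashing works in principle, the sketch below implements one of its simplest variants, the average hash. It is not the ETH pilot's software, whose internals are not described here. It assumes each image has already been decoded into a grid of grayscale values at least 8 by 8 pixels in size, and the function names and the distance threshold of 5 are illustrative choices.

```python
from itertools import combinations


def average_hash(pixels, size=8):
    """Return a size*size-bit average hash of a grayscale image.

    pixels: 2D list of grayscale values (0-255), at least size x size.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to size x size by averaging rectangular blocks.
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Count the bits in which two hashes differ."""
    return bin(a ^ b).count("1")


def flag_candidates(images, max_distance=5):
    """List image pairs whose hashes are close, for human review only."""
    hashes = {name: average_hash(px) for name, px in images.items()}
    flagged = []
    for a, b in combinations(sorted(hashes), 2):
        distance = hamming(hashes[a], hashes[b])
        if distance <= max_distance:
            flagged.append((a, b, distance))
    return flagged
```

Because the hash depends only on coarse brightness patterns, a rescan that is slightly brighter or saved under a different name yields the same or a nearby hash. The function returns candidates and deletes nothing, mirroring the human-review safeguard described above.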
The Decisions That Cannot Wait
Three choices are converging toward a deadline. First, the Stadtarchiv must decide by the end of the third quarter of 2026 whether to adopt a shared deduplication platform or run its own in-house process — a choice with significant budget implications given that the archive's digitisation unit operates on a cantonal allocation that, according to the Canton of Zurich's 2025 financial report, covers roughly CHF 1.2 million annually across all digital preservation activities. Second, the institutions need to agree on which copy of a duplicate image becomes the canonical record, since the Stadtarchiv, the Zentralbibliothek and ETH each apply different metadata schemas and none automatically yields to another. Third, and most politically sensitive, is the question of public access during the transition: if a researcher in Zürich-Wiedikon or a journalist working from the Pressehaus on Werdstrasse calls up a document that is mid-replacement, what do they see?
The Swiss Federal Archives in Bern have indicated they will publish updated guidelines on duplicate handling before the end of 2026, which gives Zurich institutions a narrow window to establish local practice before federal norms arrive and potentially override it. Institutions that have already documented their deduplication methodology will be better placed to argue for an approach that fits their specific collections. Those that have not will likely find the federal template applied to them by default.
For researchers, archivists and anyone who depends on Zurich's public image databases, the practical advice is straightforward: download and locally preserve any high-stakes file you are actively using now, note the exact URL and record identifier, and check back after any announced system migration. The clean-up is coming. The only real question is whether the city's institutions will lead it or be led by it.