Administrators at Zurich's Stadtarchiv are facing a decision that has been quietly building since at least 2023: what to do with the tens of thousands of duplicate digitised images estimated to be spread across the city's overlapping institutional collections. The problem spans holdings at the Stadtarchiv on Neumarkt, the Zentralbibliothek on Zähringerplatz, and digitisation pipelines run in partnership with ETH Zürich's e-rara and e-manuscripta platforms. Multiple scans of the same photograph, map, or document now sit in different databases, consuming server capacity, confusing researchers, and complicating the city's ambitions for a unified open-access cultural portal.
The timing matters. Zurich's cantonal culture department has been pushing a digital-first public access agenda since the passage of the revised cantonal information law, which set a 2027 deadline for expanded digital access to public records. Institutions that cannot demonstrate clean, deduplicated archives risk losing preferential funding under the next cantonal culture grant cycle, with applications due in spring 2027. Getting the data in order is no longer a back-office IT question — it is a governance and financial one.
Where the Bottlenecks Are
The core difficulty is as much organisational as technical: no single authority owns the problem. The Zentralbibliothek operates under a separate board from the Stadtarchiv, and ETH Zürich's library directorate answers to the federal government in Bern rather than to Zurich city hall. Each institution digitised material independently, often using different metadata standards and file-naming conventions. A 1920s photograph of the Lindenhügel might appear under three catalogue IDs across three systems, with slightly different cropping, resolution, or rights annotations on each version.
Photographic duplicates are only part of it. The Baugeschichtliches Archiv, which holds the city's architectural photography collection and is housed inside the Stadthaus on Stadthausquai, contributed its own digitisation batch to the joint portal project in 2022. That batch overlapped substantially with material the Zentralbibliothek had already scanned. Staff at both institutions are aware of the overlap but, without a shared deduplication protocol, neither has moved decisively to resolve it.
Industry practice offers a reference point. The Europeana foundation, which aggregates digitised cultural content from institutions across thirty-plus European countries, published guidance in 2024 recommending that member institutions adopt perceptual hashing as the baseline deduplication standard. The technique derives a compact fingerprint from what an image looks like rather than from its bytes, so copies of the same picture saved in different file formats or at different resolutions produce matching or near-matching fingerprints. Institutions that have implemented it report reducing redundant image records by between 15 and 40 percent, depending on the size and age of the collection. Zurich's institutions have not yet adopted a common standard.
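To make the idea of a perceptual fingerprint concrete, here is a minimal Python sketch of one common variant, a difference hash. It is an illustration only, not the method the Europeana guidance prescribes and not anything the Zurich institutions are known to run. The function names `shrink`, `dhash` and `hamming` are hypothetical, and the input is assumed to be an image already decoded into rows of grayscale values.

```python
from typing import List

Gray = List[List[float]]  # rows of grayscale pixel values (assumed already decoded)


def shrink(pixels: Gray, width: int, height: int) -> Gray:
    """Downscale by averaging the source pixels that fall into each target cell."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for ty in range(height):
        y0 = ty * src_h // height
        y1 = max(y0 + 1, (ty + 1) * src_h // height)
        row = []
        for tx in range(width):
            x0 = tx * src_w // width
            x1 = max(x0 + 1, (tx + 1) * src_w // width)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels: Gray, size: int = 8) -> int:
    """Difference hash: one bit per horizontally adjacent pair in a tiny thumbnail."""
    small = shrink(pixels, size + 1, size)  # size+1 columns give `size` comparisons per row
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Two scans of the same photograph at different resolutions or brightness levels should yield fingerprints that differ in few or no bits, while unrelated images should land far apart. The distance below which two records count as duplicates is a policy choice each collection would have to tune, and heavy cropping can defeat a simple hash like this one.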
The Decisions That Cannot Wait
Three choices will define how this plays out. First, someone has to take institutional lead. The most practical candidate is the Stadtarchiv, which already hosts the joint data infrastructure agreement signed in January 2024 by the city, the Zentralbibliothek, and the Baugeschichtliches Archiv. But that agreement explicitly deferred the deduplication question to a later working group, which has met twice and produced no binding protocol.
Second, the institutions must agree on a canonical version rule — which scan survives when duplicates are merged, and who holds the authoritative rights record. This matters especially for images with commercial licensing potential, where conflicting rights annotations across databases create legal exposure.
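What a canonical version rule might look like in practice can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `ScanRecord` fields, the `choose_canonical` function, the catalogue IDs, and above all the rule itself (highest pixel count wins, ties broken by catalogue ID). The institutions have agreed on no such rule.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ScanRecord:
    # Hypothetical fields; real catalogue schemas will differ.
    catalogue_id: str
    institution: str
    width: int
    height: int
    rights: str


def choose_canonical(duplicates: List[ScanRecord]) -> Tuple[ScanRecord, bool]:
    """Pick the surviving record and report whether rights annotations conflict.

    Assumed rule: the scan with the most pixels survives; ties go to the
    lowest catalogue ID so the result is deterministic.
    """
    if not duplicates:
        raise ValueError("need at least one record")
    canonical = min(
        duplicates,
        key=lambda r: (-(r.width * r.height), r.catalogue_id),
    )
    # More than one distinct rights statement means a human must decide.
    rights_conflict = len({r.rights for r in duplicates}) > 1
    return canonical, rights_conflict


if __name__ == "__main__":
    group = [
        ScanRecord("A-1", "Stadtarchiv", 2000, 1500, "CC BY"),
        ScanRecord("B-7", "Zentralbibliothek", 4000, 3000, "CC BY"),
        ScanRecord("C-3", "Baugeschichtliches Archiv", 4000, 3000, "In copyright"),
    ]
    winner, conflict = choose_canonical(group)
    print(winner.catalogue_id, conflict)
```

The sketch deliberately separates the two questions the institutions face. Which file survives can be automated, but a rights conflict is only flagged, because deciding who holds the authoritative rights record is a legal judgement rather than a technical one.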
Third, there is money. A full deduplication project across all three collections would require dedicated technical staff for at least twelve months, plus software licensing. Informal estimates circulating in cantonal culture circles put the cost somewhere in the range of CHF 400,000 to CHF 600,000. That is significant but not extraordinary: roughly 0.03 to 0.05 percent of the city's annual CHF 1.2 billion culture and education budget.
The next formal opportunity to move is a planned joint board meeting between the Stadtarchiv and Zentralbibliothek scheduled for September 2026. If that meeting produces a mandate, a deduplication working group with real authority could be operational by early 2027, just in time to satisfy the cantonal deadline. If it produces another deferral, the institutions will be filing their grant applications with archives that still cannot tell a researcher — or an auditor — how many unique images they actually hold.