Zurich's municipal digital archive is carrying a problem that has compounded quietly for years: duplicate images — identical or near-identical photograph files stored multiple times across different servers — now account for a significant share of the storage burden at institutions including the Stadtarchiv Zürich on Alfred-Escher-Strasse and the Zentralbibliothek Zürich near the Predigerplatz. The immediate question is no longer whether to act, but how, and who pays.
The issue has sharpened because the city's broader digitisation push, accelerated under the Smart City Zürich programme, has created a flood of newly scanned material without a unified deduplication standard to filter it. Every department — from Hochbau to Stadtplanung — has uploaded visual assets independently, producing a sprawling patchwork of redundant files that complicate both retrieval and long-term preservation. Storage costs are not abstract: commercial cloud storage for institutions in Switzerland commonly runs above CHF 0.02 per gigabyte per month for compliant, Swiss-hosted solutions, and municipal archives are dealing with holdings now measured in hundreds of terabytes.
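To give a sense of scale, the figures above can be turned into a rough annual estimate. The sketch below is illustrative only: it uses the CHF 0.02 floor price quoted above and decimal units (1 TB = 1,000 GB). The duplicate share is a hypothetical parameter, since no measured figure has been published.

```python
# Rough storage-cost estimate. Only the price floor comes from the article;
# the holdings size and duplicate share passed in are illustrative assumptions.
PRICE_CHF_PER_GB_MONTH = 0.02  # quoted lower bound for Swiss-hosted storage


def annual_cost_chf(terabytes: float,
                    price_per_gb_month: float = PRICE_CHF_PER_GB_MONTH) -> float:
    """Yearly storage cost in CHF for a holding of the given size."""
    gigabytes = terabytes * 1000  # decimal terabytes
    return gigabytes * price_per_gb_month * 12


def duplicate_cost_chf(terabytes: float, duplicate_share: float) -> float:
    """Yearly cost attributable to duplicates (duplicate_share is hypothetical)."""
    if not 0.0 <= duplicate_share <= 1.0:
        raise ValueError("duplicate_share must be between 0 and 1")
    return annual_cost_chf(terabytes) * duplicate_share
```

At the floor price, every 100 TB costs about CHF 24,000 a year to store. If a fifth of that were duplicates (an assumed share), roughly CHF 4,800 of it would be spent on redundant copies. Since the quoted price is a lower bound, real costs would be higher.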
The Technical Fork in the Road
City archivists and IT procurement officers are weighing two broad approaches. The first is automated deduplication: software that identifies copies by pixel-level comparison or hash matching and flags them for deletion or consolidation. Several European municipal archives, including those in Vienna and Hamburg, have piloted this route with mixed results. Automated tools can misclassify near-duplicates as redundant even when they carry distinct metadata, such as different digitisation dates or provenance records, and deleting them destroys archival value.
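The hash-matching half of that approach can be sketched in a few lines. This is a minimal illustration, not a description of any tool the archives have piloted: the function names are hypothetical, and it only catches byte-identical files. Near-duplicates such as rescans or re-encoded copies would need a different technique, for example perceptual hashing, which is not shown here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans are never loaded into memory whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content hash; return groups with 2+ members.

    Nothing is deleted: the result is only a list of candidates to review.
    """
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

The sketch compares file contents only. Two byte-identical files whose catalogue records differ in digitisation date or provenance would still be grouped together, which is exactly the risk archivists point to.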
The second approach is a manual-review hybrid, where algorithms surface candidates and a trained archivist makes the final call. This is slower and more expensive in the short term but is strongly favoured by the Swiss Association of Archivists, whose guidelines stress that no automated process should have unilateral deletion authority over public records. The Stadtarchiv Zürich operates under cantonal records law, which requires documented retention decisions — meaning any deletion, even of a duplicate, must be logged and justifiable.
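In software terms, the hybrid model means the algorithm may only propose and a named person must decide, with every decision written to a log. The following is a minimal sketch under that assumption. The record fields and function name are hypothetical and do not reflect the Stadtarchiv's actual schema or the precise wording of cantonal records law.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RetentionDecision:
    candidate: str   # file the algorithm proposed as a duplicate
    kept_copy: str   # copy it would be consolidated into
    action: str      # "delete" or "keep", set by the reviewer's call
    reviewer: str    # archivist who made the decision
    reason: str      # written justification
    decided_at: str  # ISO 8601 timestamp in UTC


def record_decision(candidate: str, kept_copy: str, approve: bool,
                    reviewer: str, reason: str, log: list[str]) -> RetentionDecision:
    """Append a human decision to the log; refuse anonymous or unexplained ones."""
    if not reviewer.strip() or not reason.strip():
        raise ValueError("a named reviewer and a written reason are required")
    decision = RetentionDecision(
        candidate=candidate,
        kept_copy=kept_copy,
        action="delete" if approve else "keep",
        reviewer=reviewer,
        reason=reason,
        decided_at=datetime.now(timezone.utc).isoformat(),
    )
    log.append(json.dumps(asdict(decision)))  # one JSON record per decision
    return decision
```

The log here is an in-memory list for brevity; a real system would presumably write to append-only storage. The point of the design is that no deletion can be recorded without both a reviewer and a justification.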
ETH Zürich's Data Archive Services group, based on the Hönggerberg campus, has been developing machine-learning tools for exactly this kind of large-scale image triage. A collaboration with the city archive, formally proposed in late 2025, has yet to receive a confirmed budget line in the 2026 municipal accounts. That decision is expected before the Gemeinderat's September budget session.
Who Decides — and by When
The governance question is as thorny as the technical one. Under Zurich's direct-democracy structure, any significant reallocation of cultural-heritage funds above a certain threshold could theoretically trigger a referendum, particularly if civil society groups argue that archival decisions should involve public input. The city's Kulturförderung budget for 2026 sits at CHF 44 million across all categories. How much of that can be redirected toward digital infrastructure without a formal vote is a question the Stadtrat has not yet answered publicly.
Practically, institutions have a narrow window. Migrating and cleaning archival data while systems are still relatively modular is far cheaper than attempting the same work after a full platform consolidation. The Zentralbibliothek is mid-way through a five-year digitisation contract that runs to December 2027, meaning any deduplication standard adopted now would need to be retrofitted into an existing workflow rather than built in from scratch.
The decisions ahead come down to three: which software standard the city adopts, who holds deletion authority under cantonal law, and whether the ETH partnership receives its funding in September. If the Gemeinderat approves a dedicated line item, implementation could begin by early 2027. Deferral is a real possibility, given competing housing and infrastructure demands in a city where average rental costs have risen sharply in districts like Altstetten and Oerlikon. If the budget session does defer the question, the duplicate problem will continue to grow, and the cost of solving it will grow with it.