Zurich's Duplicate Image Problem: What Happens Next and the Key Decisions Ahead
City authorities, archivists and tech procurers face a pivotal fork in the road as redundant digital images pile up across public databases and cultural institutions.

Zurich's public institutions are sitting on a growing backlog of duplicate digital images: redundant files scattered across municipal servers, cultural archives and university repositories. The window for orderly, cost-effective resolution is narrowing fast. The issue came into sharper relief this spring when Stadt Zürich's digital governance unit completed an internal audit of assets held across multiple city departments, finding that a substantial share of catalogued image files were exact or near-exact copies consuming storage budget without adding archival value.
The timing matters. Zurich is midway through a multi-year push to consolidate its digital infrastructure under the Digitale Verwaltung Zürich programme, which coordinates technology strategy across city agencies. Decisions made in the next six to twelve months about how to handle duplicate content will shape the structure of those consolidated systems for a decade or more. Get it wrong, and redundant data will be baked into the new architecture; resolve it now, and the city saves both storage costs and the administrative overhead of managing parallel catalogues.
The problem is not confined to a single department. The Stadtarchiv Zürich on Neumarkt, which holds centuries of civic records, has been digitising paper collections at an accelerating pace since 2021. The ETH Zürich Library on Rämistrasse, one of Switzerland's largest academic libraries, maintains image collections tied to research projects that often duplicate holdings already lodged with the Zentralbibliothek Zürich on Zähringerplatz. Each institution applies its own metadata standards, so automated deduplication tools that rely on consistent file tagging struggle to match images that are identical in content but catalogued differently.
Cloud storage is not cheap. Enterprise-grade archival storage in Switzerland typically runs between CHF 0.02 and CHF 0.05 per gigabyte per month depending on redundancy tier and provider contract terms. For institutions holding tens of terabytes of image data, duplicate files can translate into annual costs of thousands of francs, reaching five figures at the upper end of those ranges, while serving no preservation purpose. The Digitale Verwaltung Zürich programme has a stated objective of reducing unnecessary data redundancy by the end of 2027, but stakeholders familiar with the process say the harder challenge is governance, not technology: who decides which copy is canonical, and who absorbs the cost of remediation?
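A back-of-the-envelope check of that price range can be sketched in a few lines of Python. The per-gigabyte rates are the ones quoted above; the duplicate volumes are hypothetical, chosen only for illustration, and the function name is an assumption of this sketch.

```python
def annual_duplicate_cost_chf(duplicate_tb: float, chf_per_gb_month: float) -> float:
    """Annual storage cost in CHF for a given volume of duplicate data."""
    gigabytes = duplicate_tb * 1000  # decimal terabytes; a provider may bill differently
    return gigabytes * chf_per_gb_month * 12


# Price range quoted in the article; the duplicate volumes are hypothetical.
LOW_RATE, HIGH_RATE = 0.02, 0.05

for duplicate_tb in (10, 20, 40):
    low = annual_duplicate_cost_chf(duplicate_tb, LOW_RATE)
    high = annual_duplicate_cost_chf(duplicate_tb, HIGH_RATE)
    print(f"{duplicate_tb} TB of duplicates: CHF {low:,.0f} to CHF {high:,.0f} per year")
```

At these rates, the annual bill for duplicates alone passes CHF 10,000 at roughly 17 TB of redundant data on the most expensive tier, and at about 42 TB on the cheapest.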
The Swiss Federal Archives in Bern adopted a federated deduplication approach in 2023, establishing shared checksums across participating cantonal archives so that identical files could be flagged without requiring central control over local collections. Zurich's institutions have been watching that model closely. A comparable arrangement at the cantonal level here would require formal agreement among the Stadtarchiv, the Zentralbibliothek and the two universities — a process that involves cantonal data protection officers and, in the case of ETH Zürich, coordination with the federal government in Bern given the institution's federal status.
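The idea of shared checksums can be illustrated with a minimal Python sketch. It is not the Federal Archives' actual system: the institution keys, record identifiers and function names are hypothetical, and short byte strings stand in for image files. Each institution computes SHA-256 digests locally and publishes only those digests, so identical files can be flagged without any collection leaving local control.

```python
import hashlib
from collections import defaultdict


def sha256_hex(data: bytes) -> str:
    """Checksum computed locally; only this digest would leave the institution."""
    return hashlib.sha256(data).hexdigest()


def build_shared_index(published):
    """Map each digest to the (institution, local_id) pairs that reported it."""
    index = defaultdict(list)
    for institution, entries in published.items():
        for local_id, digest in entries:
            index[digest].append((institution, local_id))
    return index


def cross_institution_duplicates(index):
    """Keep only digests reported by more than one institution."""
    return {
        digest: holders
        for digest, holders in index.items()
        if len({institution for institution, _ in holders}) > 1
    }


# Hypothetical holdings: byte strings stand in for image files.
scan = b"stand-in bytes for one scanned photograph"
published = {
    "stadtarchiv": [("SAZ-0001", sha256_hex(scan))],
    "zentralbibliothek": [
        ("ZB-77", sha256_hex(scan)),
        ("ZB-78", sha256_hex(b"another image")),
    ],
}

duplicates = cross_institution_duplicates(build_shared_index(published))
for digest, holders in duplicates.items():
    print(digest[:12], holders)
```

A sketch like this only flags byte-identical files. Deciding which of the flagged copies is canonical remains a governance question that no index settles on its own.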
Three choices are pressing. First, city and cantonal authorities need to agree on a deduplication standard before the Digitale Verwaltung Zürich programme's next procurement cycle, expected in early 2027, locks in storage architecture. Second, institutions must decide whether to pursue automated hash-matching — fast but prone to missing near-duplicates that differ by a few pixels — or AI-assisted similarity detection, which is more accurate but more expensive and raises questions under Switzerland's Federal Act on Data Protection about how image content is analysed. Third, someone has to own the canonical record: without clear institutional responsibility, deduplicated files tend to re-proliferate as departments upload fresh copies from their own local drives.
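The trade-off in the second decision can be shown with a small Python sketch. It compares an exact SHA-256 digest with an average hash, a simple perceptual hash used here as a stand-in for similarity detection; it is not the AI-assisted approach under discussion, and the 8x8 greyscale grid and function names are assumptions made for illustration.

```python
import hashlib


def exact_digest(pixels):
    """SHA-256 over the raw pixel values: any changed value changes the digest."""
    flat = bytes(value for row in pixels for value in row)
    return hashlib.sha256(flat).hexdigest()


def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(bits_a, bits_b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Hypothetical 8x8 greyscale thumbnail: dark left half, bright right half.
original = [[40] * 4 + [200] * 4 for _ in range(8)]

# Near-duplicate: the same picture with two pixel values nudged slightly.
near_copy = [row[:] for row in original]
near_copy[0][0] = 43
near_copy[7][7] = 197

print(exact_digest(original) == exact_digest(near_copy))  # False: exact match misses it
print(hamming_distance(average_hash(original), average_hash(near_copy)))  # 0: hashes agree
```

A real pipeline would first decode and downscale each file with an imaging library, and would need a distance threshold tuned against false matches; those steps are left out of this sketch.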
For residents and researchers who use Zurich's public image collections (the Stadtarchiv's online portal logged more than 140,000 search sessions in 2024), the practical stakes are straightforward: better-organised archives mean faster, more reliable searches and fewer dead links to files that were moved or deleted without coordination. For the institutions themselves, the stakes are financial and reputational. Zurich's claim to be a leader in smart, efficient urban governance depends partly on whether it can keep its own digital house in order. The decisions ahead are unglamorous, technical and largely invisible to the public. That is precisely why they tend to get deferred, and precisely why deferral, here, carries a real cost.
About this article
Published by The Daily Zurich