Duplicate images have quietly become one of the most stubborn headaches inside Zurich's public digital infrastructure. Cantonal archivists, data scientists at ETH Zurich, and city planning officials are pushing for a coordinated response after internal audits revealed that redundant image files are inflating storage costs, slowing database queries, and — in at least one documented case involving urban planning records — creating legal ambiguity over which version of a document is authoritative.
The issue gained sharper attention in mid-June 2026 when the city's Stadtarchiv Zürich, housed near Neumarkt in the Altstadt, circulated a working paper among cantonal departments flagging the scope of the problem. The paper, reviewed by The Daily Zurich, gave no specific figures but described the situation as requiring structural rather than piecemeal remediation. It cites staff shortages and legacy migration projects, many dating to digitisation drives between 2015 and 2020, as root causes.
What the Experts Are Saying
Researchers at ETH Zurich's Chair of Information Science have been studying perceptual hashing and near-duplicate detection algorithms for several years. Their work, which feeds into broader European data-governance discussions, is now being cited internally by cantonal IT staff as a practical template. The chair's published research finds that in large heterogeneous image collections, duplicate rates of between 8 and 15 percent are common when files migrate across incompatible systems, a range that officials say maps uncomfortably well onto Zurich's own experience.
Staff at the Zentralbibliothek Zürich on Zähringerplatz, which holds digitised historical map and photograph collections spanning several centuries, have separately raised the issue in correspondence with the cantonal Department of Education. The library's digital preservation team has warned that automated deduplication tools require careful calibration when handling culturally significant material: an image deleted as a duplicate may in fact be a distinct variant with its own provenance value. That distinction, between a true technical duplicate and a near-identical document that carries independent historical weight, is at the heart of current expert disagreement.
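For readers curious how that distinction looks in practice, the following is a minimal sketch, not the method used by ETH Zurich or the Zentralbibliothek. It assumes a cryptographic digest (SHA-256) for byte-identical files and a very simple "average hash" as a stand-in for perceptual hashing. The function names, the 8x8 pixel grids and the `max_distance` threshold of 5 are all hypothetical choices made for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Cryptographic digest: equal only when the content is byte-identical."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """Simple perceptual hash: one bit per pixel, set if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def classify(pixels_a: list[list[int]], pixels_b: list[list[int]],
             max_distance: int = 5) -> str:
    # Assumption: raw pixel bytes stand in for the stored file content.
    bytes_a = bytes(p for row in pixels_a for p in row)
    bytes_b = bytes(p for row in pixels_b for p in row)
    if sha256_hex(bytes_a) == sha256_hex(bytes_b):
        return "exact duplicate"
    if hamming(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance:
        # Visually similar but not identical: may be a variant with its own provenance.
        return "near duplicate (needs review)"
    return "distinct"


# Hypothetical 8x8 grayscale grids (values 0-255).
scan = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
rescan = [[min(255, p + 3) for p in row] for row in scan]   # slightly brighter copy
negative = [[252 - p for p in row] for row in scan]         # inverted image

print(classify(scan, scan))
print(classify(scan, rescan))
print(classify(scan, negative))
```

The middle case is the one archivists worry about: the slightly brighter rescan hashes the same perceptually but is a different file, so a sketch like this routes it to review rather than deleting it.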
The Swiss Federal Archives in Bern have dealt with analogous issues at the national level. A guidance document the Archives published in March 2025 recommended that all federal bodies adopt a two-stage deduplication protocol: automated flagging followed by human review for any file older than 30 years. Zurich's cantonal administration has not yet formally adopted that standard, though a spokesperson for the Stadtarchiv confirmed to this newspaper that discussions are ongoing.
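A two-stage protocol of this kind can be sketched roughly as follows. This is an illustration under stated assumptions, not the Federal Archives' specification: the `FileRecord` fields, the use of a content digest for flagging, and the rule that a whole duplicate group goes to human review if any member is older than 30 years are all hypothetical. The guidance as reported does not say what happens to younger flagged files, so the sketch only labels them as automatically flagged.

```python
from collections import defaultdict
from dataclasses import dataclass

REVIEW_AGE_YEARS = 30  # threshold taken from the reported guidance


@dataclass(frozen=True)
class FileRecord:
    identifier: str    # hypothetical repository identifier
    digest: str        # hypothetical content digest, e.g. SHA-256
    year_created: int  # year of the original document


def flag_duplicates(records: list[FileRecord]) -> list[list[FileRecord]]:
    """Stage 1: automated flagging. Group records that share a digest."""
    by_digest: dict[str, list[FileRecord]] = defaultdict(list)
    for record in records:
        by_digest[record.digest].append(record)
    return [group for group in by_digest.values() if len(group) > 1]


def route(groups: list[list[FileRecord]], current_year: int):
    """Stage 2: send groups containing any file older than the threshold to a person."""
    human_review, auto_flagged = [], []
    for group in groups:
        ids = [r.identifier for r in group]
        if any(current_year - r.year_created > REVIEW_AGE_YEARS for r in group):
            human_review.append(ids)
        else:
            auto_flagged.append(ids)
    return human_review, auto_flagged


records = [
    FileRecord("A1", "aa", 1950),
    FileRecord("A2", "aa", 2018),
    FileRecord("B1", "bb", 2016),
    FileRecord("B2", "bb", 2019),
    FileRecord("C1", "cc", 1990),  # no duplicate, so never flagged
]

human_review, auto_flagged = route(flag_duplicates(records), current_year=2026)
print("human review:", human_review)
print("auto flagged:", auto_flagged)
```

Nothing is deleted at either stage here; the sketch only sorts flagged files into queues, which is the part of the protocol the reported guidance actually describes.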
Costs, Timelines, and What Comes Next
Storage is not cheap. Enterprise-grade archival cloud storage currently runs at roughly CHF 0.02 to CHF 0.04 per gigabyte per month for Swiss-hosted solutions, according to publicly available pricing from Swiss data centre operators. For institutions holding tens of thousands of high-resolution TIFF files, the standard for archival-quality scans, even a 10 percent redundancy rate can translate into hundreds or even thousands of francs wasted annually, depending on file sizes, with costs compounding as collections grow.
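The arithmetic behind that estimate is simple enough to check. The sketch below uses the quoted price range and the 10 percent redundancy rate; the collection size (50,000 files) and average file size (0.5 GB per TIFF) are hypothetical figures chosen for illustration, not numbers reported by any Zurich institution.

```python
def annual_redundancy_cost_chf(file_count: int, avg_file_gb: float,
                               redundancy_rate: float,
                               chf_per_gb_month: float) -> float:
    """Yearly cost of storing redundant copies at a flat per-gigabyte monthly price."""
    redundant_gb = file_count * avg_file_gb * redundancy_rate
    return redundant_gb * chf_per_gb_month * 12


# Hypothetical collection: 50,000 TIFF scans averaging 0.5 GB each.
low = annual_redundancy_cost_chf(50_000, 0.5, 0.10, 0.02)
high = annual_redundancy_cost_chf(50_000, 0.5, 0.10, 0.04)
print(f"CHF {low:,.0f} to CHF {high:,.0f} per year")
```

Under these assumptions the redundant copies cost roughly CHF 600 to CHF 1,200 a year. The result scales linearly with file count and file size, so larger scans or bigger collections push the figure higher.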
The city's IT services unit, the Departement der Industriellen Betriebe, has been allocated a line item in the 2026 municipal budget for digital infrastructure review, though the specific sum was not broken out in the published accounts. Decisions on deduplication tooling are expected before the end of the third quarter of this year.
For Zurich residents and researchers who rely on public digital collections — whether tracing property history in Hürlimann Areal redevelopment records or accessing photographs from the Langstrasse district's transformation over the past two decades — the practical advice from archivists is straightforward: always note the specific file identifier and repository URL when citing a digital document, since the clean-up process may alter file locations. The Stadtarchiv has said it will publish a migration notice at least 30 days before any major deduplication exercise begins. That window, experts say, is the minimum responsible standard — not a guarantee of continuity.