Zurich's public institutions are sitting on digital image archives bloated by duplicate files, and the people tasked with managing those systems are increasingly vocal about the operational damage the duplicates cause. IT directors, archivists and data governance specialists across the city say the problem has quietly compounded for years, and that the reckoning, driven in part by rising storage costs and new federal data retention guidelines effective January 2026, is now unavoidable.
The issue matters now for a specific reason. Switzerland's Federal Archives issued updated digital preservation standards at the start of this year requiring cantonal and municipal bodies to audit their holdings by the end of 2026. For a city the size of Zurich — which administers everything from planning records in Stadthaus Zurich on Stadthausquai to cultural holdings at the Stadtarchiv on Neumarkt — that means confronting backlogs that some departments have let accumulate since the early 2010s.
What the Specialists Are Saying
Experts in digital asset management describe duplicate images not as a trivial nuisance but as a compound liability. Each redundant file consumes server space, inflates backup cycles, slows retrieval systems and introduces version-control errors when images are edited or licensed. At ETH Zurich, whose library system on Rämistrasse manages one of the largest scientific image repositories in the German-speaking world, data governance teams have publicly discussed the need for automated deduplication tools as part of the institution's broader open-access strategy. The university's library holds digitised collections running to tens of millions of files across multiple departments.
The Swiss Memory Institutions network, which coordinates digital preservation efforts among archives, libraries and museums across the confederation, has flagged deduplication as a priority area in its 2025–2028 work programme. Practitioners within that network describe a common pattern: images are uploaded multiple times through different workflows — once by a photographer, again by a communications officer, again during a web migration — and no single system catches the overlap. Over a decade, a municipal archive can accumulate three or four copies of the same photograph with slightly different filenames, metadata and compression settings, making automated matching harder.
Zurich's Stadtarchiv, which holds records going back centuries and has been systematically digitising physical holdings since the early 2000s, is among the institutions that will face scrutiny under the new federal standards. Digital preservation work of this scale typically costs Swiss institutions between CHF 80 and CHF 200 per gigabyte when full migration, quality-checking and metadata enrichment are factored in, according to published benchmarks from the Swiss National Library. A collection with significant duplication inflates both the apparent volume and the real cost of any migration project.
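To make the cost point concrete, the arithmetic can be sketched in a few lines of Python. The per-gigabyte range is the one quoted above; the collection size and duplicate share are purely hypothetical figures chosen for illustration, and the sketch assumes cost scales linearly with volume, which is a simplification.

```python
from typing import Tuple

# Benchmark range quoted in the article, in CHF per gigabyte.
LOW_RATE_CHF = 80.0
HIGH_RATE_CHF = 200.0


def migration_cost_range(volume_gb: float) -> Tuple[float, float]:
    """Return (low, high) migration cost in CHF, assuming linear scaling."""
    return volume_gb * LOW_RATE_CHF, volume_gb * HIGH_RATE_CHF


# Hypothetical collection: NOT a figure from any Zurich institution.
apparent_gb = 500.0
duplicate_share = 0.25  # hypothetical: a quarter of the volume is redundant

unique_gb = apparent_gb * (1 - duplicate_share)
cost_before = migration_cost_range(apparent_gb)
cost_after = migration_cost_range(unique_gb)
avoidable = (cost_before[0] - cost_after[0], cost_before[1] - cost_after[1])

print(f"Unique volume: {unique_gb:.0f} GB")
print(f"Avoidable cost: CHF {avoidable[0]:,.0f} to CHF {avoidable[1]:,.0f}")
```

Under these assumed numbers, removing the duplicates before migration would take a quarter off the bill at either end of the range.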
Tools, Timelines and What Comes Next
Several Zurich-based technology firms have moved to address the market gap. Companies operating out of the Technopark Zurich on Technoparkstrasse, which houses dozens of software and data-services startups, have in recent months pitched deduplication and image-fingerprinting solutions directly to public-sector clients. The pitch is straightforward: perceptual hashing algorithms can identify visually identical or near-identical images even when file names, formats and metadata differ — flagging them for human review before deletion or consolidation.
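How such fingerprinting works can be illustrated with a minimal sketch of "average hash", one of the simplest perceptual hashing variants. This is not a description of any vendor's product: the function names, the distance threshold and the tiny 4x4 grids are assumptions made for illustration, and the sketch presumes each image has already been loaded, converted to grayscale and downscaled to a small grid, a step that would normally need an imaging library.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values (0-255), already downscaled


def average_hash(pixels: Grid) -> int:
    """Build a bit fingerprint: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_candidate_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Flag a pair for human review; nothing is deleted here.

    max_distance is a hypothetical threshold that would need tuning.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Illustrative data: a dark-left, bright-right image.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# The same picture after recompression: every pixel shifted slightly.
recompressed = [[value + 3 for value in row] for row in original]
# A different picture: bright-left, dark-right.
different = [list(reversed(row)) for row in original]

print(is_candidate_duplicate(original, recompressed))  # flagged for review
print(is_candidate_duplicate(original, different))     # not flagged
```

Because the fingerprint depends on relative brightness rather than on bytes, a recompressed or renamed copy lands on the same hash, while a byte-level checksum would treat it as a new file.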
The practical advice from digital archivists is consistent: institutions should not attempt bulk deletion without a review layer. Automated tools can misidentify near-duplicates — two photographs of the same building taken seconds apart, for instance — as redundant when both may have archival value. The recommended workflow involves flagging, human spot-checking of a sample, and then staged removal with a defined retention window before permanent deletion.
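The three stages of that workflow can be sketched as follows. Everything here is a hypothetical illustration rather than any institution's actual procedure: the function names, the sample fraction, the error tolerance and the 90-day retention window are all assumed values.

```python
import math
import random
from datetime import date, timedelta
from typing import Dict, List


def pick_review_sample(flagged: List[str], fraction: float, seed: int = 0) -> List[str]:
    """Choose a random sample of flagged files for human spot-checking."""
    if not flagged:
        return []
    size = max(1, math.ceil(len(flagged) * fraction))
    return random.Random(seed).sample(flagged, size)


def batch_passes(verdicts: List[bool], max_error_rate: float) -> bool:
    """True if reviewers rejected no more than the tolerated share.

    Each verdict is True when a reviewer confirms the file is redundant.
    """
    if not verdicts:
        return False  # no review, no removal
    errors = verdicts.count(False)
    return errors / len(verdicts) <= max_error_rate


def schedule_removal(flagged: List[str], today: date, retention_days: int) -> Dict[str, date]:
    """Stage files for deletion after a retention window; nothing is deleted yet."""
    delete_on = today + timedelta(days=retention_days)
    return {path: delete_on for path in flagged}


def due_for_deletion(schedule: Dict[str, date], today: date) -> List[str]:
    """List files whose retention window has expired."""
    return [path for path, when in schedule.items() if today >= when]


# Hypothetical batch of flagged files.
flagged = ["img_001_copy.jpg", "img_002_web.jpg", "img_003_v2.jpg", "img_004_old.jpg"]
sample = pick_review_sample(flagged, fraction=0.5)
verdicts = [True for _ in sample]  # stand-in for real reviewer decisions

schedule: Dict[str, date] = {}
if batch_passes(verdicts, max_error_rate=0.0):
    schedule = schedule_removal(flagged, date(2026, 1, 1), retention_days=90)
```

Keeping the schedule separate from the deletion step means a mistaken flag can still be reversed at any point during the retention window.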
For Zurich's public bodies, the federal deadline of December 31, 2026 is the forcing function. Institutions that cannot demonstrate a credible audit plan risk complications with federal co-financing of future digitisation projects. For a city that has invested heavily in smart-city infrastructure and positions digital governance as a civic strength, image archives bloated with duplicates are, as archivists working in this space have put it plainly, a solvable problem that simply requires the will to start.