Zurich's municipal and institutional image archives contain tens of thousands of duplicate photographs — some files stored three or four times across different servers — and the effort to replace or consolidate them is now a formal line item in several public budgets. The problem did not appear overnight.
The roots stretch back to the early 2010s, when the city and its flagship institutions launched aggressive digitisation drives without agreeing on a single technical standard. The Stadtarchiv Zürich, the cantonal library on Zähringerplatz, and dozens of smaller municipal departments each built their own systems. Files were exported, re-uploaded, renamed, and duplicated every time a project changed hands or a contractor delivered a new batch of scans. By the mid-2020s, storage overhead had ballooned well beyond original projections.
How the Fragments Piled Up
The pattern was consistent across institutions. A photograph commissioned for a planning document in Kreis 5 would be saved to a departmental server, then emailed to a communications office, then uploaded again to a public-facing portal, each copy carrying slightly different metadata or file resolution. Multiply that by years of projects and thousands of images, and the result is an archive that functions more like a series of overlapping piles than a coherent library.
ETH Zurich, which manages one of the largest scientific image repositories in Switzerland, began auditing its own holdings in 2023 after storage costs on its central data infrastructure crossed a threshold that triggered an internal review. The university has not published final figures from that audit, but the exercise prompted other Zurich institutions to look harder at their own systems. The cantonal administration followed with a working group that reported to the Stadtrat in early 2025.
The housing crisis — Wohnungsnot — provides an unlikely parallel. Just as apartments in Zurich sat technically occupied but practically underused while the waitlist for affordable units grew, digital storage was being consumed by redundant data while institutions paid licensing fees for capacity they would not have needed had the original filing discipline been tighter. Stadtwerke Zürich, the city utility, estimated in its 2024 annual report that infrastructure inefficiencies of various kinds cost the municipality measurable sums annually, though image duplication was not broken out as a separate category.
The Push Toward Systematic Replacement
The practical consequences of duplicate images go beyond storage costs. When the same photograph exists in multiple versions, rights clearances become complicated. A picture of Langstrasse taken for a 2018 tourism campaign might sit in five different folders with five different usage tags — one marked cleared for commercial use, another marked internal only. When a designer pulls the wrong copy, the institution risks a licensing dispute. Several Swiss municipalities have faced exactly that scenario in the past three years, according to guidance published by the Swiss Federal Archives in Bern.
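The rights problem described above can be checked mechanically once copies of the same photograph are grouped. The sketch below is illustrative only: the record layout (image identifier, path, usage tag), the tag names and the function name are assumptions, not anything the city or the Swiss Federal Archives has published. It reports every image whose copies carry more than one usage tag.

```python
from collections import defaultdict


def conflicting_usage_tags(records):
    """Return {image_id: sorted tags} for images whose copies disagree.

    `records` is an iterable of (image_id, path, usage_tag) tuples, where
    image_id is any key shared by copies of the same photograph
    (hypothetical layout, chosen for illustration).
    """
    tags_by_image = defaultdict(set)
    for image_id, _path, usage_tag in records:
        tags_by_image[image_id].add(usage_tag)
    # Only images with two or more distinct tags are a licensing risk.
    return {
        image_id: sorted(tags)
        for image_id, tags in tags_by_image.items()
        if len(tags) > 1
    }


# Hypothetical example: two copies of one photograph with different tags.
records = [
    ("img-001", "/tourism/2018/langstrasse.jpg", "commercial"),
    ("img-001", "/comms/archive/langstrasse_copy.jpg", "internal-only"),
    ("img-002", "/planning/kreis5/site.jpg", "internal-only"),
]
print(conflicting_usage_tags(records))
```

An image flagged this way would still need a human to decide which tag is correct; the code only narrows down where to look.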
Zurich's current approach, outlined in communications from the Stadtarchiv and picked up by trade press in the archival sector, involves running deduplication software against the largest shared drives, flagging files that share a pixel fingerprint regardless of filename, and then routing confirmed duplicates through a replacement workflow before deletion. The city set a target to complete the first pass on its core municipal collections by the end of 2026.
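As a rough illustration of fingerprinting by pixel content rather than filename, the sketch below hashes an image's dimensions and raw pixel bytes and groups files that share a hash. It is a minimal sketch, not a description of the software the city actually uses: the function names and the in-memory image layout are assumptions, and a real tool would first decode each file with an image library. An exact hash of this kind only catches pixel-identical copies; copies saved at a different resolution would need a perceptual hash instead.

```python
import hashlib
from collections import defaultdict


def pixel_fingerprint(width, height, pixels):
    """Hash dimensions plus raw pixel bytes; filename and metadata are ignored."""
    digest = hashlib.sha256()
    digest.update(f"{width}x{height}|".encode("ascii"))
    digest.update(bytes(pixels))
    return digest.hexdigest()


def find_duplicates(images):
    """Group filenames whose decoded pixels are identical.

    `images` maps filename -> (width, height, pixel bytes). This layout is
    hypothetical and stands in for whatever an image decoder would return.
    """
    groups = defaultdict(list)
    for name, (width, height, pixels) in images.items():
        groups[pixel_fingerprint(width, height, pixels)].append(name)
    # Keep only fingerprints shared by two or more files.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical 2x2 greyscale images: two identical, one different.
images = {
    "langstrasse_2018.tif": (2, 2, bytes([10, 20, 30, 40])),
    "IMG_0042_final.tif": (2, 2, bytes([10, 20, 30, 40])),
    "kreis5_site.tif": (2, 2, bytes([10, 20, 30, 41])),
}
print(find_duplicates(images))
```

Flagged groups are candidates only; in the workflow described above, duplicates are confirmed before anything is deleted.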
For institutions on Rämistrasse and Heimplatz — the cultural corridor running from the Kunsthaus to the university district — the timetable is slightly longer. Several venues there rely on legacy content management systems that do not interact cleanly with modern deduplication tools, meaning manual review remains unavoidable for a significant portion of the backlog.
The practical advice from archivists and digital asset managers working in this space is straightforward: institutions that have not yet audited their image holdings should start with the highest-traffic folders (communications and press assets), since those generate the most duplicates and carry the highest licensing risk. A naming convention enforced at the point of upload, rather than applied retrospectively, cuts the duplication rate sharply. That lesson, learned expensively in Zurich over the past decade, is now informing how the next generation of digitisation contracts is being written.
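Enforcing a naming convention at upload can be as simple as rejecting any filename that does not match an agreed pattern. The sketch below assumes a made-up convention of department code, capture date and a hyphenated description (for example `komm_20180614_langstrasse-abend.jpg`); the pattern, the allowed extensions and the function name are hypothetical and do not reflect any convention Zurich institutions have adopted.

```python
import re
from datetime import datetime

# Hypothetical convention: <dept>_<YYYYMMDD>_<hyphenated-slug>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<dept>[a-z]{2,10})_"
    r"(?P<date>\d{8})_"
    r"(?P<slug>[a-z0-9]+(?:-[a-z0-9]+)*)"
    r"\.(?P<ext>tif|tiff|jpg|png)$"
)


def is_valid_upload_name(filename):
    """Return True if the filename follows the assumed convention."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return False
    try:
        # Reject well-formed but impossible dates such as 30 February.
        datetime.strptime(match.group("date"), "%Y%m%d")
    except ValueError:
        return False
    return True


for name in ["komm_20180614_langstrasse-abend.jpg", "Langstrasse final (2).JPG"]:
    print(name, is_valid_upload_name(name))
```

Running the check before a file is accepted, rather than renaming files afterwards, is what makes the convention an upload-time control.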