Zurich's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Costly Story
A quiet crisis in data management is consuming server space, staff hours, and public funds across Zurich's institutions, from city hall to ETH.
Across Zurich's public institutions, an estimated one in four digital images stored on government and university servers is a duplicate — a redundant copy of a file that already exists somewhere else in the same system. That figure, derived from internal audits circulated among cantonal IT departments this spring, is reshaping how the city thinks about its digital infrastructure budget heading into 2027.
The timing is not coincidental. Switzerland's Federal Act on Data Protection, revised and in full force since September 2023, has pushed cantonal bodies to conduct rigorous data inventories. What those inventories have turned up, according to procurement documents reviewed by The Daily Zurich, is a storage sprawl that few administrators had fully mapped. Duplicate image files — photographs, scanned documents, architectural plans, medical imaging records — account for a disproportionate share of the problem.
ETH Zurich, ranked seventh globally in the 2025 QS World University Rankings, maintains research image repositories across multiple departments, from materials science on the Hönggerberg campus to urban data labs near Oerlikon. Internal IT governance reports from comparable European research universities suggest that unmanaged image libraries typically carry duplication rates between 20 and 35 percent. At current Swiss cloud storage rates (commercial providers quote between CHF 0.02 and CHF 0.04 per gigabyte per month for institutional contracts), a research library sitting on 500 terabytes of redundant image data would be spending between roughly CHF 120,000 and CHF 240,000 annually on storage alone, before factoring in bandwidth and backup costs.
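The storage figure can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 terabyte = 1,000 gigabytes) and a flat per-gigabyte rate; the function name is illustrative, and real institutional contracts may use tiered pricing.

```python
def annual_storage_cost_chf(terabytes: float, chf_per_gb_month: float) -> float:
    """Yearly cost of storing `terabytes` of data at a flat monthly rate.

    Assumes decimal units: 1 TB = 1,000 GB.
    """
    gigabytes = terabytes * 1_000
    return gigabytes * chf_per_gb_month * 12


redundant_tb = 500  # redundant image data, as in the article's example
low = annual_storage_cost_chf(redundant_tb, 0.02)   # lower quoted rate
high = annual_storage_cost_chf(redundant_tb, 0.04)  # upper quoted rate

print(f"CHF {low:,.0f} to CHF {high:,.0f} per year")
# prints: CHF 120,000 to CHF 240,000 per year
```

The lower bound matches the CHF 120,000 quoted above; at the upper quoted rate the same data would cost twice as much.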
The Stadt Zürich itself is not immune. The city's Stadtarchiv, housed on Neumarkt in the Altstadt, digitised more than 1.2 million historical photographs between 2018 and 2024 as part of its Zürich Memories project. Archivists working on that programme have acknowledged publicly, in documentation published on the city's open-data portal, that de-duplication was handled manually during early phases of the project. That approach scales poorly: if every new image has to be checked against every existing one, the number of comparisons grows roughly with the square of the collection size, which becomes unmanageable past the million-image threshold.
The Zentralbibliothek Zürich, on Zähringerplatz, faces a parallel challenge. Its digital collections span rare manuscripts, maps, and photographic holdings from the 19th century onward. Librarians there have been piloting automated hash-matching tools since early 2025 to identify identical files masquerading under different catalogue numbers — a problem that emerges whenever collections from separate donors are merged without prior cross-referencing.
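Hash matching of this kind can be sketched in a few lines. The example below is a minimal illustration, not the library's actual tooling: it assumes files sit in a directory tree and uses SHA-256 from Python's standard library, and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's bytes in chunks so large scans do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(root: Path) -> list[list[Path]]:
    """Group files under `root` whose contents are byte-for-byte identical."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of_file(path)].append(path)
    # Only groups with more than one member are duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Because the hash depends only on file contents, two copies filed under different catalogue numbers or file names land in the same group. A single changed byte, such as a re-saved or re-compressed scan, produces a different hash, so this approach catches only exact copies.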
The technical fix exists. Perceptual hashing algorithms — software that compares images based on visual content rather than just file metadata — can flag near-duplicates even when file names, resolutions, or compression levels differ. Vendors offering such tools quote implementation costs starting at around CHF 15,000 for a mid-sized institutional deployment, with annual licensing thereafter. The Swiss Federal Archives in Bern completed a deduplication sweep of its own photographic holdings in late 2024, reducing its active image index by roughly 18 percent.
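To illustrate the idea, here is a minimal sketch of one common perceptual hash, the difference hash (dHash). It is not necessarily what any vendor ships. It assumes each image has already been decoded into a grid of grayscale values (a list of rows); a real deployment would use an imaging library for decoding and resizing. All function names are illustrative.

```python
def shrink(pixels, width, height):
    """Downsample a grayscale image (list of rows) by averaging pixel blocks."""
    src_h, src_w = len(pixels), len(pixels[0])
    small = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max((y + 1) * src_h // height, y0 + 1)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max((x + 1) * src_w // width, x0 + 1)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def dhash(pixels, hash_size=8):
    """Difference hash: one bit per adjacent pair, set when the left cell is brighter."""
    small = shrink(pixels, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | int(left > right)
    return bits


def hamming_distance(a, b):
    """Number of differing bits between two hashes; small means visually similar."""
    return bin(a ^ b).count("1")
```

Two files are flagged as likely near-duplicates when the Hamming distance between their hashes falls below a chosen threshold. Because the hash is computed on a heavily shrunken copy, a lower-resolution or slightly brightened version of the same picture tends to produce the same or a nearby hash. The threshold is a tuning decision, and flagged pairs still warrant human review.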
The harder problem is governance. Duplicate images proliferate when different departments independently photograph the same event, scan the same document, or download the same stock asset without checking a shared library. At the UBS headquarters on Bahnhofstrasse, where internal communications teams manage vast visual asset libraries following the 2023 Credit Suisse absorption, rationalising the merged, partially overlapping image databases is understood internally to be a multi-year project.
For Zurich's institutions, the practical path forward involves three steps that IT procurement advisers consistently recommend: first, run a baseline audit using hash-based tools before the end of the 2026 fiscal year to establish a duplication rate and calculate actual storage waste; second, establish a single governed image repository with access controls, rather than allowing departmental silos to persist; third, build deduplication checks into ingestion workflows so the problem does not regenerate. The Stadt Zürich's IT department is expected to publish updated digital asset management guidelines before the end of Q3 2026. How many institutions actually adopt them will determine whether the next round of audits tells a different story.
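The first and third steps can be sketched together. The example below is a hypothetical illustration, assuming content hashes (here SHA-256) are used to detect byte-identical files; the class and function names are invented for this sketch and do not describe any institution's system.

```python
import hashlib


def duplication_rate(digests: list[str]) -> float:
    """Share of files that are redundant copies, given one content hash per file."""
    if not digests:
        return 0.0
    return 1 - len(set(digests)) / len(digests)


class ImageRepository:
    """Hypothetical shared repository that refuses byte-identical re-uploads."""

    def __init__(self) -> None:
        self._first_copy: dict[str, str] = {}  # content hash -> catalogue id
        self.rejected = 0

    def ingest(self, catalogue_id: str, data: bytes) -> str:
        """Store new content, or return the id of the copy already held."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._first_copy.get(digest)
        if existing is not None:
            self.rejected += 1  # duplicate stopped at the door
            return existing
        self._first_copy[digest] = catalogue_id
        return catalogue_id
```

With four files of which two share a hash, `duplication_rate` returns 0.25, the "one in four" figure cited at the top of this article. Checking at ingestion means the index never holds a second copy, so later audits have less to clean up.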
Published by The Daily Zurich