Zurich's public and academic institutions collectively store an estimated 40 to 60 percent of their digital image archives as exact or near-exact duplicates, according to data-management researchers at ETH Zurich's Institute for Information Security, which has been studying redundancy in Swiss institutional storage systems since 2023. That figure, unremarkable in isolation, translates into something harder to ignore when you put a price on it: server infrastructure costs for redundant image data across Swiss federal institutions alone run into the tens of millions of francs annually.
The issue matters now because several converging pressures have pushed it from a technical nuisance to a genuine fiscal and governance concern. The Swiss Confederation's federal archive digitisation program, which accelerated after 2021, has flooded municipal and cantonal repositories with newly scanned material. Zurich's own Stadtarchiv on Neumarkt and the Zentralbibliothek Zürich on Zähringerplatz have both expanded their digital holdings substantially in the past three years. When scanning projects proceed without rigorous deduplication protocols, the same historical photograph can end up stored under four or five different filenames, in multiple resolution variants, across separate departmental servers.
What the Storage Bills Actually Show
Storage is cheap, until it isn't. A single high-resolution archival scan of a 19th-century print runs between 80 and 120 megabytes in TIFF format. Multiply that by the Zentralbibliothek's publicly stated figure of over 3.5 million digitised items, apply a conservative 35 percent duplication rate, and the redundant data load comes to at least 98 terabytes in raw terms, even at the low end of the file-size range. Cloud and on-premises hybrid storage at institutional rates in Switzerland typically costs between CHF 8 and CHF 15 per terabyte per month, depending on contract tier and redundancy requirements. The arithmetic is uncomfortable.
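The estimate can be checked in a few lines. The sketch below uses only the figures quoted in this section; the constant and function names are illustrative, and it assumes decimal units (1 terabyte = 1,000,000 megabytes) and exactly 3.5 million items, so it is a rough check rather than an institutional costing model.

```python
# Figures quoted in the article; names and decimal units are assumptions.
ITEMS = 3_500_000            # digitised items (stated as "over 3.5 million")
SCAN_MB = (80, 120)          # size range of one archival TIFF scan, in MB
DUPLICATION_PERCENT = 35     # "conservative" duplication rate
CHF_PER_TB_MONTH = (8, 15)   # institutional storage price range
MB_PER_TB = 1_000_000        # decimal terabyte (assumption)


def redundant_terabytes(items, scan_mb, duplication_percent):
    """Storage occupied by duplicate scans, in terabytes."""
    redundant_mb = items * scan_mb * duplication_percent / 100
    return redundant_mb / MB_PER_TB


def monthly_cost_chf(terabytes, chf_per_tb):
    """Monthly storage cost in CHF for a given volume."""
    return terabytes * chf_per_tb


# Low estimate: small scans at the cheapest rate.
low_tb = redundant_terabytes(ITEMS, SCAN_MB[0], DUPLICATION_PERCENT)
low_cost = monthly_cost_chf(low_tb, CHF_PER_TB_MONTH[0])

# High estimate: large scans at the most expensive rate.
high_tb = redundant_terabytes(ITEMS, SCAN_MB[1], DUPLICATION_PERCENT)
high_cost = monthly_cost_chf(high_tb, CHF_PER_TB_MONTH[1])

print(f"Redundant data: {low_tb:.0f} to {high_tb:.0f} TB")
print(f"Monthly cost:   CHF {low_cost:,.0f} to CHF {high_cost:,.0f}")
```

These totals cover one copy of the redundant data in a single collection. Backup replicas and additional resolution variants, which the quoted price range only partly reflects, would multiply them.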
ETH Zurich's Computer Science department released a working paper in March 2026 examining perceptual hashing algorithms — tools that identify visually identical or near-identical images even when file metadata differs. The paper tested the approach against a sample dataset from a Swiss cantonal archive and found that automated deduplication reduced storage consumption by 31 percent in a controlled environment, with a false-positive rate of under 0.4 percent. That precision matters enormously in archival contexts, where accidentally flagging two genuinely distinct but visually similar photographs as duplicates and deleting one would represent an irreversible cultural loss.
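To illustrate the general idea behind perceptual hashing, here is a minimal sketch of an average hash, one of the simplest algorithms in that family. It is not the method tested in the working paper, which this article does not specify. The function names, the 8 by 8 grid, and the distance threshold of 5 bits are all assumptions chosen for illustration. The input is a plain grid of grayscale values; in practice an image library would first decode the scan and convert it to grayscale.

```python
def average_hash(pixels, hash_size=8):
    """Compute an average hash from a 2D list of grayscale values (0-255).

    Assumes the image height and width are divisible by hash_size.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // hash_size, width // hash_size

    # Shrink the image to hash_size x hash_size by averaging each block.
    cells = []
    for by in range(hash_size):
        for bx in range(hash_size):
            total = 0
            for y in range(by * block_h, (by + 1) * block_h):
                for x in range(bx * block_w, (bx + 1) * block_w):
                    total += pixels[y][x]
            cells.append(total / (block_h * block_w))

    # One bit per cell: 1 if the cell is brighter than the overall mean.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_probable_duplicate(hash_a, hash_b, max_distance=5):
    """Flag a pair for human review; the threshold is a hypothetical choice."""
    return hamming_distance(hash_a, hash_b) <= max_distance


# Synthetic 16x16 test images: a gradient, a brightened copy, and its inverse.
original = [[(x + y) * 8 for x in range(16)] for y in range(16)]
brightened = [[value + 3 for value in row] for row in original]
inverted = [[240 - value for value in row] for row in original]

hash_original = average_hash(original)
hash_brightened = average_hash(brightened)
hash_inverted = average_hash(inverted)

print(is_probable_duplicate(hash_original, hash_brightened))
print(is_probable_duplicate(hash_original, hash_inverted))
```

Because a false positive in an archive could mean deleting a distinct photograph, a sketch like this is better suited to flagging candidate pairs for a curator than to deleting anything automatically.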
The city of Zurich's own IT services division, Informatik Stadt Zürich on Hagenholzstrasse in Oerlikon, began a pilot deduplication review of the municipal photo database in January 2026. The project covers roughly 1.2 million images collected by city departments between 1995 and 2024, ranging from construction permits to public event documentation. Early internal assessments, cited in a canton-level digital governance report published in April 2026, put the duplication rate in that specific collection at 44 percent.
What Comes Next for Institutions and Individuals
The practical response is taking shape along two tracks. At the institutional level, the Swiss Federal Archives in Bern issued updated guidelines in May 2026 requiring all federally funded digitisation projects above CHF 500,000 to include a deduplication audit as a funded deliverable, not an optional add-on. Zurich's cantonal cultural institutions are expected to align with those standards by the end of 2026.
For smaller organisations — the dozens of neighbourhood historical societies, professional photographers, and design studios clustered around Zurich West's Kreis 5 and the creative businesses near Viadukt — the picture is less structured. Open-source tools such as dupeGuru and rmlint handle consumer-scale deduplication at no cost, and several Zurich-based IT consultancies operating out of the Technopark on Technoparkstrasse now offer archival audits starting at CHF 1,200 for collections under 500 gigabytes.
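For a very small collection, the simplest form of deduplication, finding byte-for-byte identical files stored under different names, can be sketched with Python's standard library alone. This is an illustration of the technique, not a description of how dupeGuru or rmlint work internally; the function names and the sample filenames are hypothetical. It will not catch resized or re-encoded copies, which need a perceptual approach.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group files under root whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Keep only the groups that contain more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]


# Demonstration on a throwaway directory with hypothetical filenames.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "dept_a").mkdir()
    (root / "dept_b").mkdir()
    (root / "dept_a" / "scan_0001.tif").write_bytes(b"same image bytes")
    (root / "dept_b" / "stadtarchiv_copy.tif").write_bytes(b"same image bytes")
    (root / "dept_b" / "scan_0002.tif").write_bytes(b"different image bytes")

    groups = find_exact_duplicates(root)
    duplicate_names = [sorted(p.name for p in group) for group in groups]

print(duplicate_names)
```

The function only reports groups of identical files. Deciding which copy to keep, and whether the others can be removed, is left to whoever maintains the collection.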
The underlying message from the data is simple: digital storage feels free until an institution reaches the scale at which redundancy becomes a budget line. Zurich's archival institutions are already past that threshold. The question for 2026 and beyond is whether deduplication becomes standard practice from the moment of ingestion, or whether the storage bills keep compounding.