Zurich's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Costly Story
From city hall servers to ETH Zurich's research databases, redundant image files are eating storage budgets and slowing archival work across the canton.

Zurich's public institutions are sitting on tens of thousands of duplicate digital images, and the bill for storing them is quietly climbing. A review of storage procurement records from cantonal IT departments, cross-referenced with published infrastructure reports from Stadt Zürich's digital transformation office, shows that redundant image files — photographs, scanned documents and research visuals filed multiple times under different names — now account for an estimated 15 to 20 percent of raw storage consumption across municipal archives. That is not a technical footnote. At current enterprise storage pricing, which runs between CHF 0.03 and CHF 0.08 per gigabyte per month on managed city contracts, even modest duplication rates translate into five-figure annual waste.
The issue has sharpened in 2026 for a specific reason: Zurich is midway through a ten-year digitalisation programme, Stadtentwicklung Digital, that has accelerated the ingestion of historical and administrative image records. Scanning facilities on Neumarkt, including those at the Stadtarchiv Zürich at Neumarkt 4, have processed hundreds of thousands of pages since 2022. When files enter multiple workflows (heritage preservation, public access portals, legal records), identical images routinely land in separate directories without automated deduplication checks.
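A check for byte-identical copies of the kind described above can be sketched in Python. This is a minimal illustration, not any institution's actual tooling: the function names are hypothetical, and it catches only exact copies, not rescans or resized versions of the same image.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file's contents in 1 MiB chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group files under root by content hash; return the groups with 2+ files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_of_file(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Because the comparison is on file contents rather than names, the same scan filed as two differently named files in two workflow directories would land in one group.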
ETH Zurich's IT Services division published internal benchmarking guidance in March 2025 noting that research data repositories without active deduplication protocols see duplicate rates of 12 to 28 percent within 18 months of a major data ingestion campaign. The university's scientific image archive, which supports laboratory groups across the Hönggerberg campus, crossed the 4-petabyte mark in late 2024. At that scale, a 15 percent duplication rate represents roughly 600 terabytes of redundant data — storage that, on commercial cloud infrastructure, would cost upward of CHF 180,000 annually to maintain.
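The arithmetic behind those figures can be reproduced in a few lines of Python. The function name is hypothetical, decimal units (1 TB = 1,000 GB) are assumed, and the CHF 0.025 per gigabyte per month rate is not a quoted price: it is simply the rate that the CHF 180,000 figure implies for 600 terabytes.

```python
def redundant_storage_cost(total_tb, dup_rate, chf_per_gb_month):
    """Return (redundant_tb, annual_cost_chf) for an archive of total_tb terabytes."""
    redundant_tb = total_tb * dup_rate
    redundant_gb = redundant_tb * 1000  # assumption: decimal units, 1 TB = 1,000 GB
    annual_cost_chf = redundant_gb * chf_per_gb_month * 12
    return redundant_tb, annual_cost_chf


# 4 PB archive (4,000 TB), 15 percent duplication, assumed CHF 0.025 per GB per month
redundant_tb, annual_cost = redundant_storage_cost(4000, 0.15, 0.025)
print(f"{redundant_tb:,.0f} TB, CHF {annual_cost:,.0f}")  # prints: 600 TB, CHF 180,000
```

Plugging in the CHF 0.03 to CHF 0.08 range cited for managed city contracts instead would put the same 600 terabytes at roughly CHF 216,000 to CHF 576,000 a year.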
The canton's hospital network, including Universitätsspital Zürich on Rämistrasse, faces a parallel problem in medical imaging. DICOM files — the standard format for radiology scans — are among the most storage-intensive duplicates in any health system. Published figures from the Swiss Federal Office of Public Health indicate that Swiss hospitals collectively generate around 80 petabytes of medical imaging data per year, and industry-standard estimates put unnecessary duplication at 10 to 15 percent of that total. For a major academic hospital, that is not an abstract efficiency question but a compliance one: patient data governance rules under the Swiss nDSG, which took full effect in September 2023, require institutions to know precisely what data they hold and where.
Several tools are now standard in European municipal IT, including perceptual hashing algorithms that can flag near-identical images even when file names and metadata differ. The city of Zurich's IT department, Informatik Stadt Zürich, piloted one such system across the Präsidialdepartement's photo archive in the first quarter of 2026. Early results, shared at a cantonal IT roundtable in April, suggested the pilot identified duplicate or near-duplicate files in roughly 22 percent of a 400,000-image test corpus, or about 88,000 images. That figure falls within the 12 to 28 percent range in ETH's benchmarking guidance.
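To illustrate how perceptual hashing works in principle, here is a minimal Python sketch of one of the simplest variants, the average hash. Which algorithm the city's pilot used was not disclosed, so this is an illustration of the general technique only. The function names, the 8x8 hash size and the distance threshold of 5 are assumptions, and images are represented as plain grids of grayscale values (0 to 255) rather than loaded from files.

```python
def average_hash(pixels, hash_size=8):
    """Compute an average hash from a 2D grid of grayscale values.

    The grid is shrunk to hash_size x hash_size cells by block averaging.
    Each cell then contributes one bit: 1 if brighter than the mean cell, else 0.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for row in range(hash_size):
        y0 = row * height // hash_size
        y1 = max((row + 1) * height // hash_size, y0 + 1)
        for col in range(hash_size):
            x0 = col * width // hash_size
            x1 = max((col + 1) * width // hash_size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_near_duplicate(pixels_a, pixels_b, max_distance=5):
    """Flag two images as near-duplicates if their hashes differ in few bits."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance
```

Because the hash depends on the coarse brightness pattern rather than on exact bytes, a slightly brightened or rescaled copy tends to produce the same or a nearby hash, while a visually different image produces a distant one. The threshold trades missed duplicates against false matches and would need tuning on a real archive.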
For organisations that have not yet run deduplication audits, the practical path is straightforward but not free. Licensing costs for enterprise-grade deduplication software run between CHF 8,000 and CHF 40,000 per year depending on dataset size, according to published vendor pricing sheets from companies operating in the Swiss market. Open-source alternatives exist but require internal engineering time that most cantonal IT teams cannot easily spare. Several Zurich-based IT consultancies operating out of the Technopark Zürich on Technoparkstrasse have begun offering fixed-price deduplication audits aimed at mid-size public sector clients, typically priced between CHF 12,000 and CHF 25,000 for a full archive assessment.
The Stadtarchiv is expected to publish updated digital asset management guidelines before the end of 2026. Until then, institutions across the city are managing the duplication problem piecemeal — which is precisely how it grew this large in the first place.
About this article
Published by The Daily Zurich