Zurich's public institutions collectively store an estimated 40 to 60 percent of their digital image assets as near-identical or exact duplicates, according to benchmark figures published by the European Commission's digital infrastructure working group in March 2025. The problem is not unique to Switzerland, but the city's concentration of data-heavy organisations — pharmaceutical firms in the Münchenbuchlaan corridor, federally funded university labs, cantonal archives — makes the cost unusually visible here.
The timing matters. The Canton of Zurich's IT directorate, the Amt für Informatik, began a three-year digital consolidation programme in January 2026. One early internal audit task, identifying redundant visual assets across departmental servers, surfaced a figure that gave administrators pause: a single mid-sized cantonal department was maintaining over 1.2 million image files, and preliminary de-duplication scans suggested that roughly 430,000 of those, about 36 percent, were functional duplicates. No official cost figure has been attached to that finding yet, but comparable European municipal audits have pegged the wasted storage spend at between CHF 15 and CHF 40 per gigabyte annually when cloud hosting is factored in.
What the Data Actually Looks Like
The duplicate image problem has two distinct dimensions: raw storage waste and downstream data quality degradation. Storage is the easier number to grasp. A 2024 study by Zurich-based data management consultancy Nexible, which works with several Swiss financial institutions in the Paradeplatz banking district, found that organisations migrating legacy file systems to cloud infrastructure discovered duplicate rates averaging 52 percent across all file types, with images specifically running closer to 65 percent. Swiss cloud-hosting rates currently range from CHF 0.02 to CHF 0.05 per gigabyte per month, depending on the provider and redundancy tier. At a 65 percent duplicate rate, a department sitting on 10 terabytes of image data could be paying for 6.5 terabytes it does not need: roughly CHF 130 to CHF 325 a month, or CHF 1,560 to CHF 3,900 a year, counting 1 terabyte as 1,000 gigabytes.
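The arithmetic behind that estimate can be sketched in a few lines of Python. This is an illustrative calculation only: the function name `wasted_storage_cost` is hypothetical, and the conversion of 1 terabyte to 1,000 gigabytes is an assumption, since providers bill in different units.

```python
def wasted_storage_cost(total_tb, duplicate_rate, chf_per_gb_month, gb_per_tb=1000):
    """Return (wasted_gb, monthly_chf, annual_chf) for duplicated data.

    gb_per_tb=1000 is an assumption; some providers bill per 1024-based unit.
    """
    wasted_gb = total_tb * gb_per_tb * duplicate_rate
    monthly_chf = wasted_gb * chf_per_gb_month
    return wasted_gb, monthly_chf, monthly_chf * 12


# Figures quoted in the article: 10 TB of images, 65 percent duplicates,
# CHF 0.02 to CHF 0.05 per gigabyte per month.
low = wasted_storage_cost(10, 0.65, 0.02)
high = wasted_storage_cost(10, 0.65, 0.05)

print(f"Duplicate volume: {low[0]:,.0f} GB")
print(f"Monthly cost: CHF {low[1]:,.0f} to CHF {high[1]:,.0f}")
print(f"Annual cost:  CHF {low[2]:,.0f} to CHF {high[2]:,.0f}")
```

The same function can be rerun with an organisation's own volume, duplicate rate and tariff, which is the more useful exercise, since the 65 percent figure is an average from one study and not a measurement of any particular department.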
ETH Zurich's central IT services division, known internally as ID ETH, has been grappling with a related challenge in research data management. The university's open research data mandate, which took full effect in September 2024, requires that published datasets be deposited in the institutional repository. Imaging-heavy disciplines such as materials science, earth observation and biomedical research generate enormous raw file volumes. Researchers across departments at the Hönggerberg campus routinely export identical or near-identical microscopy frames at different compression settings, creating version bloat. Automated hash-comparison tools address this only partially: a cryptographic hash matches byte-identical files, and the same frame saved at a different compression setting produces different bytes. The university has not published a specific duplicate-rate figure, but the data stewardship team has been piloting perceptual hashing tools since February 2026 to tackle the problem at scale.
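To show why perceptual hashing catches what byte-level hashing misses, here is a minimal sketch of one common variant, the average hash. It is not a description of the tools ID ETH is piloting, which the university has not detailed. The function names, the tiny 4x4 grids and the distance threshold of 2 are all assumptions for illustration; a real pipeline would first use an image library to convert each file to a small grayscale grid, typically 8x8.

```python
def average_hash(pixels):
    """Build a bit string hash from a small grayscale grid (rows of 0-255 values).

    Each bit records whether a pixel is brighter than the grid's mean,
    so small brightness shifts from recompression rarely flip any bits.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two grids as near-duplicates if their hashes differ in few bits.

    max_distance=2 is an arbitrary illustrative threshold, not a recommendation.
    """
    return hamming_distance(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance


# Hypothetical data: one frame, the same frame with slight compression noise,
# and an unrelated frame with the bright and dark halves swapped.
frame = [[10, 20, 200, 210], [15, 25, 205, 215], [10, 20, 200, 210], [15, 25, 205, 215]]
recompressed = [[12, 18, 203, 208], [14, 27, 202, 216], [9, 22, 198, 211], [17, 24, 206, 213]]
unrelated = [[200, 210, 10, 20], [205, 215, 15, 25], [200, 210, 10, 20], [205, 215, 15, 25]]

print(is_near_duplicate(frame, recompressed))
print(is_near_duplicate(frame, unrelated))
```

The threshold is the hard part in practice: set too low, recompressed copies slip through; set too high, genuinely different frames from the same experiment get flagged as duplicates.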
Zurich's Practical Response — and What Comes Next
The City of Zurich's Stadtarchiv, located on Neumarkt in the Altstadt, is running a parallel programme. The archive digitised roughly 2.3 million physical documents and photographs between 2018 and 2023 as part of a CHF 4.2 million digitisation contract. Post-digitisation quality checks identified duplicate scans, generated when operators rescanned unclear originals, as a persistent category of waste. The archive's working solution combines MD5 checksums for exact duplicates with a visual similarity threshold tool to catch near-matches, a process that reportedly takes staff around three weeks to run across a full project batch.
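The exact-duplicate half of that approach is simple to sketch with Python's standard library. The following is an illustration of grouping files by MD5 checksum, not the archive's actual tooling. The function names `md5_of_file` and `find_exact_duplicates` are hypothetical, and the folder path in the usage comment is a placeholder.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group all files under `root` by checksum and return groups of two or more."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[md5_of_file(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


# Usage (placeholder path):
# for group in find_exact_duplicates("/path/to/scan/batch"):
#     print([str(p) for p in group])
```

MD5 is adequate here because the goal is to spot accidental byte-identical copies, not to resist deliberate tampering. A rescan of the same original will almost never be byte-identical to the first scan, which is why a checksum pass has to be paired with a visual similarity check.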
For private organisations in Zurich, the calculus is increasingly financial. The housing shortage (Wohnungsnot) has pushed property portals and estate agents, several of which cluster around Seefeld and Enge, to publish listings faster. In the rush, staff often upload the same image sets several times through different accounts, generating duplicate listings that inflate search databases and skew pricing analytics. PropTech firms working the Zurich market have started quoting de-duplication as a line item in their data hygiene contracts.
The immediate practical step for any Zurich organisation managing image-heavy workflows is a baseline audit, using open-source tools such as dupeGuru or institutional equivalents, before committing to cloud migration contracts. The Amt für Informatik's consolidation programme is due to publish interim methodology guidelines in the fourth quarter of 2026. Those guidelines are expected to include recommended de-duplication checkpoints at each migration stage. For institutions on the Hönggerberg or in the Hochschulgebiet more broadly, ID ETH's pilot results from the perceptual hashing programme should be available internally by October. The numbers, when they arrive, will almost certainly be larger than anyone officially expects.