The Daily Zurich


Zurich's Digital Archives Are Drowning in Duplicate Images — and the Numbers Show Why It's Getting Worse

From city hall servers to ETH Zurich's research repositories, redundant image files are quietly consuming storage budgets and distorting data integrity across Switzerland's largest city.

By Zurich News Desk · Published 4 July 2026, 8:45 pm


Photo: Ana Kenk on Pexels

Zurich's public institutions collectively store an estimated 40 to 60 percent of their digital image assets as near-identical or exact duplicates, according to benchmark figures published by the European Commission's digital infrastructure working group in March 2025. The problem is not unique to Switzerland, but the city's concentration of data-heavy organisations — pharmaceutical firms in the Münchenbuchlaan corridor, federally funded university labs, cantonal archives — makes the cost unusually visible here.

The timing matters. The Canton of Zurich's IT directorate, the Amt für Informatik, began a three-year digital consolidation programme in January 2026. One early internal audit task, identifying redundant visual assets across departmental servers, surfaced a figure that gave administrators pause: a single mid-sized cantonal department was maintaining over 1.2 million image files, and preliminary de-duplication scans suggested that roughly 430,000 of them, more than a third, were functional duplicates. No official cost figure has been attached to that finding yet, but comparable European municipal audits have pegged the wasted storage spend at between CHF 15 and CHF 40 per gigabyte annually when cloud hosting is factored in.

What the Data Actually Looks Like

The duplicate image problem has two distinct dimensions: raw storage waste and downstream data quality degradation. Storage is the easier number to grasp. A 2024 study by Zurich-based data management consultancy Nexible, which works with several Swiss financial institutions in the Paradeplatz banking district, found that organisations migrating legacy file systems to cloud infrastructure discovered duplicate rates averaging 52 percent across all file types, with images specifically running closer to 65 percent. At current Swiss cloud-hosting rates, which range from CHF 0.02 to CHF 0.05 per gigabyte per month depending on the provider and redundancy tier, a department sitting on 10 terabytes of image data could be paying to store 6.5 terabytes of duplicates.
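The storage arithmetic in that example can be worked through in a few lines. This is an illustrative sketch only: the function name is hypothetical, the figures are the ones quoted above, and treating 1 terabyte as 1,000 gigabytes is an assumption (some providers bill in binary units).

```python
# Illustrative arithmetic only. Figures are those quoted in the article;
# 1 TB = 1,000 GB is an assumption (some providers bill in TiB instead).
GB_PER_TB = 1_000


def wasted_storage_cost(total_tb, duplicate_share, chf_per_gb_month):
    """Return (wasted GB, monthly CHF, annual CHF) spent storing duplicates."""
    wasted_gb = total_tb * GB_PER_TB * duplicate_share
    monthly = wasted_gb * chf_per_gb_month
    return round(wasted_gb, 2), round(monthly, 2), round(monthly * 12, 2)


# 10 TB of images, 65 percent duplicates, at both ends of the quoted rate range.
low = wasted_storage_cost(10, 0.65, 0.02)
high = wasted_storage_cost(10, 0.65, 0.05)
print("low rate: ", low)
print("high rate:", high)
```

At the quoted list rates this works out to roughly CHF 130 to CHF 325 a month for the duplicate 6.5 terabytes. The calculation covers the per-gigabyte storage rate only, not any other cost an institution may attach to holding the data.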

ETH Zurich's central IT services division, known internally as ID ETH, has been grappling with a related challenge in research data management. The university's open research data mandate, which took full effect in September 2024, requires that published datasets be deposited in the institutional repository. Imaging-heavy disciplines such as materials science, earth observation and biomedical research generate enormous raw file volumes. Researchers across departments at the Hönggerberg campus routinely export identical or near-identical microscopy frames at different compression settings, creating version bloat. Automated hash-comparison tools address this only partially: a checksum matches byte-identical files, so the same frame saved at a different compression setting registers as a new file. The university has not published a specific duplicate-rate figure, but the data stewardship team has been piloting perceptual hashing tools since February 2026 to tackle the problem at scale.
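The difference between exact hash comparison and perceptual hashing can be shown with a small sketch. It is not ID ETH's tooling, which has not been described publicly. It uses a simple "average hash" on a made-up 4x4 grayscale grid, and the pixel values, the noise pattern and the distance threshold are all assumptions chosen for illustration.

```python
import hashlib


def average_hash(pixels):
    """Average hash: one bit per pixel, set when the pixel is above the mean.

    `pixels` is a small grayscale grid (list of rows, values 0-255). A real
    pipeline would first decode and downscale each image to such a grid.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(bits_a, bits_b):
    """Count the positions at which two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


def sha256_of(pixels):
    """Cryptographic digest of the raw pixel bytes (exact-match check)."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Hypothetical rule: treat grids within `max_distance` bits as duplicates."""
    return hamming_distance(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance


# Made-up frame, a copy with slight compression-like noise, and an inverted copy.
original = [
    [200, 210, 40, 30],
    [205, 215, 35, 25],
    [50, 45, 220, 230],
    [55, 40, 225, 235],
]
recompressed = [
    [v + ((r + c) % 3) - 1 for c, v in enumerate(row)]
    for r, row in enumerate(original)
]
inverted = [[255 - v for v in row] for row in original]

exact_match = sha256_of(original) == sha256_of(recompressed)
distance_recompressed = hamming_distance(average_hash(original), average_hash(recompressed))
distance_inverted = hamming_distance(average_hash(original), average_hash(inverted))
print(exact_match, distance_recompressed, distance_inverted)
```

The noisy copy fails the exact digest comparison yet produces the same perceptual bits as the original, while the inverted frame differs in every bit. Where to set the threshold between those extremes is a judgment call that depends on the image collection.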

Zurich's Practical Response — and What Comes Next

The City of Zurich's Stadtarchiv, located on Neumarkt in the Altstadt, is running a parallel programme. The archive digitised roughly 2.3 million physical documents and photographs between 2018 and 2023 as part of a CHF 4.2 million digitisation contract. Post-digitisation quality checks identified duplicate scans, generated when operators rescanned unclear originals, as a persistent category of waste. The archive's working solution combines MD5 checksums for exact duplicates with a visual similarity threshold tool to catch near-matches, a process that reportedly takes staff around three weeks to run across a full project batch.
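The exact-duplicate half of such a workflow is simple to sketch with Python's standard library. The following is a minimal illustration, not the Stadtarchiv's actual process: the function names and the sample file names are hypothetical, and it only catches byte-identical files, so a rescan of the same page would normally need the similarity step as well.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group files under `root` by checksum and return groups with 2+ members."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[md5_of_file(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


# Demo on a throwaway directory with hypothetical scan file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "scan_001.tif").write_bytes(b"same bytes")
    (root / "scan_001_rescan.tif").write_bytes(b"same bytes")
    (root / "scan_002.tif").write_bytes(b"other bytes")
    duplicate_names = [
        [path.name for path in group] for group in find_exact_duplicates(root)
    ]

print(duplicate_names)
```

MD5 is used here because it is the checksum the archive is reported to use. It is adequate for spotting accidental duplicates, though it is not suitable where someone might deliberately craft colliding files.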

For private organisations in Zurich, the calculus is increasingly financial. The city's housing shortage (Wohnungsnot) has pushed property portals and estate agents, several of which cluster around Seefeld and Enge, to publish listings faster. Staff often upload the same image sets through different accounts, generating duplicate listings that inflate search databases and skew pricing analytics. PropTech firms working the Zurich market have started quoting de-duplication as a line item in their data hygiene contracts.

The immediate practical step for any Zurich organisation managing image-heavy workflows is a baseline audit, using open-source tools such as dupeGuru or institutional equivalents, before committing to cloud migration contracts. The Amt für Informatik's consolidation programme is due to publish interim methodology guidelines in the fourth quarter of 2026. Those guidelines are expected to include recommended de-duplication checkpoints at each migration stage. For institutions on the Hönggerberg or in the Hochschulgebiet more broadly, ID ETH's pilot results from the perceptual hashing programme should be available internally by October. The numbers, when they arrive, will almost certainly be larger than anyone officially expects.


About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
