The Daily Zurich


Zurich's Digital Archives Are Drowning in Duplicate Images — and the Numbers Show Why It's a Growing Crisis

From the Stadtarchiv to ETH Zurich's image libraries, redundant digital files are quietly eating storage budgets and distorting public records.

By Zurich News Desk · Published 4 July 2026, 9:23 pm


Photo: Kemal Kartal on Pexels

Zurich's public institutions collectively store tens of millions of digital image files across municipal and academic systems, by current estimates, and a significant share of them are exact or near-exact duplicates. That is not an abstract IT problem. It carries a measurable financial cost, and the reckoning is arriving faster than most administrators anticipated.

The pressure is sharpest right now for two reasons. Swiss federal data governance guidelines that took effect in January 2026 require cantonal institutions to conduct formal digital asset audits by the end of this calendar year. At the same time, cloud storage costs for the public sector in Switzerland have climbed sharply since 2023, with enterprise-tier object storage from major European providers now running between CHF 0.02 and CHF 0.04 per gigabyte per month — a figure that compounds painfully when duplicate files inflate total storage volume by 30 percent or more.
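To make the compounding concrete, the arithmetic can be sketched in a few lines of Python. The 100-terabyte archive size is a hypothetical figure chosen for illustration, not one reported by any Zurich institution; the per-gigabyte prices and the 30 percent inflation are the figures cited above.

```python
def duplicate_storage_cost(unique_gb, inflation, price_per_gb_month):
    """Monthly cost in CHF of the extra volume that duplicates add.

    inflation is the fraction by which duplicates inflate the unique
    volume (0.30 means total volume is 1.3 times the unique data).
    """
    return round(unique_gb * inflation * price_per_gb_month, 2)


UNIQUE_GB = 100_000              # hypothetical archive: 100 TB of unique data
INFLATION = 0.30                 # duplicates add 30 percent on top
PRICE_LOW, PRICE_HIGH = 0.02, 0.04   # CHF per GB per month, as cited

low = duplicate_storage_cost(UNIQUE_GB, INFLATION, PRICE_LOW)
high = duplicate_storage_cost(UNIQUE_GB, INFLATION, PRICE_HIGH)

print(f"Monthly: CHF {low:,.0f} to CHF {high:,.0f}")
print(f"Yearly:  CHF {low * 12:,.0f} to CHF {high * 12:,.0f}")
```

Under those assumptions, duplicates alone would cost between CHF 7,200 and CHF 14,400 a year, and the bill scales linearly with archive size.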

Where the Redundancy Lives

ETH Zurich's library and research data infrastructure is among the largest single repositories of image data in the German-speaking world. The university's e-collection system, housed at its Rämistrasse campus, ingests thousands of new image assets weekly from researchers across departments ranging from architecture to materials science. Internal presentations at the institution — referenced in publicly available ETH Board meeting summaries — have acknowledged that deduplication of legacy image archives is a standing technical challenge, though the university has not published specific duplication-rate figures.

The Stadtarchiv Zürich, located at Neumarkt 4 in the Niederdorf district, faces a parallel problem on the municipal side. The archive digitised roughly 1.2 million photographs, maps and printed documents as part of a project completed in 2022. Archivists and digital preservation specialists working in the field have noted publicly that mass digitisation projects of that scale routinely generate duplication rates of between 15 and 40 percent, driven by multiple scanning passes, format conversions and transfers across network storage systems. Applied to 1.2 million files, even a conservative 15 percent rate represents 180,000 redundant items; at 40 percent, the figure would be 480,000.

Zurich's housing shortage has an odd parallel here. Just as the Wohnungsnot crisis is partly a problem of units occupied inefficiently rather than purely of total supply, the city's digital storage crisis is partly a problem of space wasted on files that should not exist twice — or ten times.

The Cost in Francs and Processing Time

Deduplication is not free. Specialist software capable of perceptual hashing — the technique used to catch near-duplicate images that differ by a few pixels or a colour profile shift — typically costs between CHF 4,000 and CHF 25,000 for an institutional licence, depending on the volume tier. Running a full deduplication pass across a multi-terabyte image archive can consume weeks of processing time on standard archival hardware. For a mid-sized cantonal institution, the labour cost of reviewing flagged matches and making deletion decisions can run to several hundred staff hours.
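For readers curious how perceptual hashing works in principle, here is a minimal sketch of an average hash, one of the simplest techniques in the family. It is an illustration only: commercial tools may use different and more robust algorithms, and the sketch assumes each image has already been decoded and shrunk to an 8x8 grid of grayscale values. The function names and the distance threshold of 5 are hypothetical choices.

```python
from typing import List

Grid = List[List[int]]  # assumed input: 8x8 grayscale values, 0 to 255


def average_hash(grid: Grid) -> int:
    """Build a 64-bit fingerprint: one bit per pixel, set if above the mean."""
    pixels = [p for row in grid for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(g1: Grid, g2: Grid, max_distance: int = 5) -> bool:
    """Flag two images as near-duplicates if few fingerprint bits differ."""
    return hamming_distance(average_hash(g1), average_hash(g2)) <= max_distance


# A synthetic gradient, a slightly brightened copy, and an inverted copy.
gradient = [[r * 32 + c * 4 for c in range(8)] for r in range(8)]
brighter = [[min(255, p + 3) for p in row] for row in gradient]
inverted = [[255 - p for p in row] for row in gradient]

print(is_near_duplicate(gradient, brighter))  # True: brightness shift is ignored
print(is_near_duplicate(gradient, inverted))  # False: every bit flips
```

Because the fingerprint records only whether each pixel sits above or below the image's own average, a uniform brightness change leaves it untouched, which is why this approach catches copies that a byte-for-byte comparison would miss.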

The Swiss Federal Archives in Bern completed a pilot deduplication exercise across a subset of its born-digital holdings in late 2024 and found that the exercise freed up storage equivalent to several years of projected new ingestion at then-current rates. That result has circulated among cantonal archivists in Zurich as a practical benchmark for what a properly resourced programme can achieve.

Some institutions are moving. The Zentralbibliothek Zürich, at Zähringerplatz 6, has been expanding its digital infrastructure under a multi-year modernisation programme and has publicly listed deduplication tooling as part of its roadmap for the current fiscal period. Whether the broader network of city and cantonal agencies coordinates that effort — or each institution reinvents the same solution independently — will determine whether Zurich captures the full efficiency dividend or simply relocates the redundancy problem from one server room to another.

For institutions facing the December 2026 audit deadline, digital preservation consultants active in the Swiss market advise beginning with a full file inventory before purchasing any deduplication tool, since buying software before understanding actual duplication rates frequently leads to over-specification and wasted procurement spend. The audit requirement alone, separate from any efficiency argument, gives every Zurich institution a concrete deadline to meet, and six months is not much runway.
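A first-pass inventory of that kind does not require specialist software. The sketch below, written with Python's standard library, groups files by SHA-256 checksum and reports the share that are byte-for-byte copies. It is a rough starting point under stated assumptions: the function names are hypothetical, it detects exact duplicates only, and near-duplicates such as re-scans or format conversions would not be counted.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in chunks so large scans do not exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(root: Path) -> dict:
    """Map each checksum to the list of files under root that share it."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return dict(groups)


def duplication_rate(groups: dict) -> float:
    """Share of files that are redundant copies of another file."""
    total = sum(len(paths) for paths in groups.values())
    redundant = sum(len(paths) - 1 for paths in groups.values())
    return redundant / total if total else 0.0
```

Calling `duplication_rate(inventory(Path("/path/to/archive")))` on a sample directory yields a lower bound on the true duplication rate, which is the number a procurement decision would need first.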


About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
