The Daily Zurich

Zurich's Digital Archives Are Drowning in Duplicate Images — and the Numbers Are Staggering

A wave of redundant visual data is clogging public and private databases across the city, costing institutions real money and raising hard questions about how Switzerland manages its digital heritage.

By Zurich News Desk · Published 4 July 2026, 8:28 pm

Zurich's public institutions are sitting on a quiet data crisis. Across municipal archives, university libraries, and corporate image banks, duplicate digital images now account for a measurable share of total storage overhead — and the bill for managing that redundancy is climbing steadily. Estimates from European digital preservation research suggest that between 20 and 35 percent of images held in large institutional repositories are functionally identical or near-identical copies, a figure that translates directly into wasted server capacity and staff hours.

The issue has sharpened in 2026 for a specific reason. The ongoing consolidation of digital assets following the UBS-Credit Suisse merger has forced IT teams to reconcile image libraries from two of the world's largest private banks, both headquartered within a few hundred metres of Paradeplatz. When two institutions with decades of marketing materials, internal communications archives, and compliance photography merge, the duplicate problem compounds: every file in one library is a potential match for every file in the other, so the number of candidate comparisons grows with the product of the two collection sizes. Technology managers involved in comparable European bank mergers have described deduplication exercises running to millions of individual file comparisons.

What the Data Actually Shows

The scale of the problem becomes clearer when you look at specific institutions. ETH Zürich, consistently ranked among the world's top ten technical universities, operates one of Switzerland's largest research image repositories. Its library system manages collections running into the tens of millions of digital objects. Industry benchmarks for repositories of that size — drawn from the Digital Preservation Coalition's published guidelines — put the share of duplicate or near-duplicate files at roughly 22 to 28 percent under standard ingestion workflows that lack automated deduplication at the point of upload.
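What deduplication at the point of upload can look like is sketched below for the simplest case, byte-identical copies. This is an illustration only, not a description of any named institution's system: the `IngestIndex` class and its in-memory dictionary are hypothetical stand-ins for a repository's real catalogue.

```python
import hashlib
from typing import Dict, Optional


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class IngestIndex:
    """Hypothetical in-memory index: content digest -> first stored file name."""

    def __init__(self) -> None:
        self._seen: Dict[str, str] = {}

    def ingest(self, name: str, data: bytes) -> Optional[str]:
        """Register a new file.

        Returns None if the content is new, or the name of the
        already-stored file if the bytes are identical.
        """
        digest = content_digest(data)
        if digest in self._seen:
            return self._seen[digest]
        self._seen[digest] = name
        return None


index = IngestIndex()
print(index.ingest("scan_0001.tif", b"example image bytes"))  # new content
print(index.ingest("scan_0001_copy.tif", b"example image bytes"))  # duplicate
```

A check like this catches only exact copies. A file that has been resized or re-saved produces a different digest and passes through as new.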

Storage costs in Swiss data centres are not trivial. Commercial colocation pricing in the greater Zurich area, in facilities clustered around the Glatttal corridor near Wallisellen and in the industrial zones off Thurgauerstrasse in Oerlikon, runs between CHF 80 and CHF 140 per terabyte per month for enterprise-grade redundant storage, according to publicly available pricing sheets from Swiss data centre operators. An archive holding 500 terabytes with a 25 percent duplication rate is effectively paying for 125 terabytes it does not need. At the CHF 110 midpoint of that range, that is CHF 13,750 per month in avoidable expenditure, or CHF 165,000 per year.
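The cost arithmetic can be checked with a few lines of Python. The function name is hypothetical, and treating CHF 110, the midpoint of the quoted CHF 80 to CHF 140 range, as "mid-range" pricing is an assumption.

```python
def avoidable_cost(total_tb, duplicate_share, chf_per_tb_month):
    """Return (redundant TB, CHF per month, CHF per year) spent on duplicates."""
    redundant_tb = total_tb * duplicate_share
    monthly_chf = redundant_tb * chf_per_tb_month
    return redundant_tb, monthly_chf, monthly_chf * 12


# Low end, assumed midpoint, and high end of the quoted price range.
for rate in (80, 110, 140):
    redundant, monthly, yearly = avoidable_cost(500, 0.25, rate)
    print(f"CHF {rate}/TB: {redundant:.0f} TB redundant, "
          f"CHF {monthly:,.0f} per month, CHF {yearly:,.0f} per year")
```

For the 500-terabyte example, the avoidable spend works out to between CHF 10,000 and CHF 17,500 per month across the quoted price range.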

The Stadt Zürich's own digital infrastructure programme, operating under the city's Smart City Zürich initiative launched formally in 2018, has flagged data quality and redundancy as a standing agenda item. Municipal digitisation projects, including the ongoing work to bring historical records from the Stadtarchiv on Neumarkt into searchable digital form, have had to build deduplication steps into their workflows from the outset precisely because earlier scanning rounds produced overlapping file sets.

Why Automated Detection Has Limits

Replacing or removing duplicate images is not simply a matter of running a comparison script. Detecting near-duplicates, meaning images that look identical but differ in resolution, colour profile, or metadata, requires more sophisticated tooling than exact hash-matching, because any change to a file's bytes produces a completely different cryptographic hash. ETH Zürich's computer vision research groups have published work on perceptual hashing and content-based image retrieval, techniques directly applicable to this problem, though the gap between academic method and institutional deployment remains wide.
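The difference between exact matching and perceptual hashing can be shown with a minimal sketch. It uses the "average hash", one of the simplest perceptual hashing schemes, chosen here for illustration and not taken from any ETH Zürich publication. Plain lists of grayscale values stand in for decoded images (an assumption; a real pipeline would decode files with an imaging library), and all function names are hypothetical.

```python
import hashlib


def exact_digest(pixels):
    """SHA-256 over raw grayscale values (0-255): changes if any value changes."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()


def downsample(pixels, size):
    """Shrink a grayscale grid to size x size by averaging blocks.

    Assumes the grid is at least size pixels wide and high.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for r in range(size):
        r0, r1 = r * h // size, (r + 1) * h // size
        row = []
        for c in range(size):
            c0, c1 = c * w // size, (c + 1) * w // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels, size=8):
    """Average hash: one bit per cell, set when the cell is brighter than the mean."""
    flat = [v for row in downsample(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# A 16x16 diagonal gradient and three variants of it.
original = [[(x + y) * 8 for x in range(16)] for y in range(16)]
upscaled = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
brighter = [[min(255, v + 5) for v in row] for row in original]
inverted = [[240 - v for v in row] for row in original]

print("exact match after brightening:",
      exact_digest(original) == exact_digest(brighter))
for label, img in (("upscaled", upscaled), ("brighter", brighter),
                   ("inverted", inverted)):
    print(label, "hamming distance:",
          hamming(average_hash(original), average_hash(img)))
```

In practice a repository would flag pairs whose Hamming distance falls below a chosen threshold for human review, since a small distance signals visual similarity, not proof of duplication.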

Pharmaceutical companies based in the greater Zurich and Basel corridor, including firms with significant research campuses in Schlieren and Dübendorf, face a regulatory dimension on top of the cost question. Duplicate images in clinical trial documentation or product approval filings can trigger compliance queries from Swissmedic, Switzerland's therapeutic products authority. A single contested image record in a regulatory submission can stall processes measured in months, not days.

For institutions ready to act, the practical starting point is an audit. Several Zurich-based IT consultancies now offer storage analytics services specifically scoped to image repositories, and the Federal Office of Communications has published guidance on digital asset management standards applicable to public bodies. Running a baseline deduplication audit before year-end budget cycles close in 2026 lets organisations quantify the problem precisely and present a cost-saving case for remediation investment in the 2027 planning rounds. The numbers, once visible, tend to make the argument themselves.

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
