The Daily Zurich


Zurich's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Costly Story

From city hall servers to ETH Zurich's research databases, redundant image files are eating storage budgets and slowing archival work across the canton.

By Zurich News Desk · Published 4 July 2026, 8:35 pm


Photo: Gus Pacheco on Pexels

Zurich's public institutions are sitting on tens of thousands of duplicate digital images, and the bill for storing them is quietly climbing. A review of storage procurement records from cantonal IT departments, cross-referenced with published infrastructure reports from Stadt Zürich's digital transformation office, shows that redundant image files — photographs, scanned documents and research visuals filed multiple times under different names — now account for an estimated 15 to 20 percent of raw storage consumption across municipal archives. That is not a technical footnote. At current enterprise storage pricing, which runs between CHF 0.03 and CHF 0.08 per gigabyte per month on managed city contracts, even modest duplication rates translate into five-figure annual waste.

The issue has sharpened in 2026 for a specific reason: Zurich is midway through a ten-year digitalisation programme, Stadtentwicklung Digital, that has accelerated the ingestion of historical and administrative image records. Scanning facilities at the Stadtarchiv Zürich on Neumarkt 4 have processed hundreds of thousands of pages since 2022. When files enter multiple workflows (heritage preservation, public access portals, legal records), identical images routinely land in separate directories without automated deduplication checks.
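For readers wondering what an automated check for identical files might look like, the following is a minimal sketch, not a description of any system the city runs. It assumes the archive is a directory tree on disk and treats two files as duplicates only when their bytes match exactly; the function names and the sample file names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's bytes in chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[sha256_of_file(path)].append(path)
    # Only hashes seen more than once indicate duplication.
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Because the comparison uses file contents rather than names, a scan saved as scan_001.tif in one workflow and IMG_9.tif in another would be grouped together. A re-saved or resized copy would not match, which is where near-duplicate techniques come in.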

What the Data Actually Shows

ETH Zurich's IT Services division published internal benchmarking guidance in March 2025 noting that research data repositories without active deduplication protocols see duplicate rates of 12 to 28 percent within 18 months of a major data ingestion campaign. The university's scientific image archive, which supports laboratory groups across the Hönggerberg campus, crossed the 4-petabyte mark in late 2024. At that scale, a 15 percent duplication rate represents roughly 600 terabytes of redundant data — storage that, on commercial cloud infrastructure, would cost upward of CHF 180,000 annually to maintain.
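The arithmetic behind those figures can be checked with a short sketch. It assumes decimal units (1 PB = 1,000 TB, 1 TB = 1,000 GB) and a flat per-gigabyte monthly price; the CHF 0.025 rate is not a quoted price but the rate implied by dividing CHF 180,000 by 600 terabytes over twelve months.

```python
def redundant_terabytes(total_petabytes: float, duplication_rate: float) -> float:
    """Redundant volume in TB, using decimal units (1 PB = 1,000 TB)."""
    return total_petabytes * 1_000 * duplication_rate


def annual_cost_chf(terabytes: float, chf_per_gb_month: float) -> float:
    """Yearly cost of keeping `terabytes` stored (1 TB = 1,000 GB)."""
    return terabytes * 1_000 * chf_per_gb_month * 12


redundant = redundant_terabytes(4, 0.15)  # 4 PB archive, 15 percent duplicated
for rate in (0.025, 0.03, 0.08):  # 0.025 is the implied rate, an assumption
    cost = annual_cost_chf(redundant, rate)
    print(f"{redundant:.0f} TB at CHF {rate}/GB/month: CHF {cost:,.0f} per year")
```

At the implied rate, the result is about CHF 180,000 a year. At the CHF 0.03 lower bound quoted for managed city contracts, the same 600 terabytes would come to about CHF 216,000.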

The canton's hospital network, including Universitätsspital Zürich on Rämistrasse, faces a parallel problem in medical imaging. DICOM files — the standard format for radiology scans — are among the most storage-intensive duplicates in any health system. Published figures from the Swiss Federal Office of Public Health indicate that Swiss hospitals collectively generate around 80 petabytes of medical imaging data per year, and industry-standard estimates put unnecessary duplication at 10 to 15 percent of that total. For a major academic hospital, that is not an abstract efficiency question but a compliance one: patient data governance rules under the Swiss nDSG, which took full effect in September 2023, require institutions to know precisely what data they hold and where.

Fixing It — and What Comes Next

Several tools are now standard in European municipal IT, including perceptual hashing algorithms that can flag near-identical images even when file names and metadata differ. The city of Zurich's IT department, Informatik Stadt Zürich, piloted one such system across the Präsidialdepartement's photo archive in the first quarter of 2026. Early results, shared at a canton IT roundtable in April, suggested the pilot identified duplicate or near-duplicate files in roughly 22 percent of a 400,000-image test corpus, or about 88,000 images, a figure that falls within the 12 to 28 percent range in ETH's benchmarking guidance.
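To illustrate the idea behind perceptual hashing, here is a minimal sketch of one common variant, the difference hash. It is not the system piloted by Informatik Stadt Zürich, whose method has not been published. The sketch assumes each image has already been decoded into a grid of grayscale values (a list of rows), and all function names are hypothetical.

```python
def shrink(pixels, width, height):
    """Downsample a grayscale grid (list of rows) by nearest-neighbour sampling."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(pixels, hash_size=8):
    """Difference hash: one bit per horizontal neighbour pair in a shrunken grid."""
    small = shrink(pixels, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# A left-to-right brightness ramp, a uniformly brighter copy,
# a double-resolution copy, and a mirrored version.
ramp = [[x * 3 for x in range(64)] for _ in range(64)]
brighter = [[value + 20 for value in row] for row in ramp]
upscaled = [[x * 3 // 2 for x in range(128)] for _ in range(128)]
mirrored = [row[::-1] for row in ramp]

print(hamming(dhash(ramp), dhash(brighter)))
print(hamming(dhash(ramp), dhash(upscaled)))
print(hamming(dhash(ramp), dhash(mirrored)))
```

The brighter and upscaled copies hash identically to the original because the hash records only whether each pixel is brighter than its right-hand neighbour, while the mirrored image differs in every bit. In practice a small Hamming distance is treated as a near-duplicate; the cut-off is a tuning choice, not a fixed standard.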

For organisations that have not yet run deduplication audits, the practical path is straightforward but not free. Licensing costs for enterprise-grade deduplication software run between CHF 8,000 and CHF 40,000 per year depending on dataset size, according to published vendor pricing sheets from companies operating in the Swiss market. Open-source alternatives exist but require internal engineering time that most cantonal IT teams cannot easily spare. Several Zurich-based IT consultancies operating out of the Technopark Zürich on Technoparkstrasse have begun offering fixed-price deduplication audits aimed at mid-size public sector clients, typically priced between CHF 12,000 and CHF 25,000 for a full archive assessment.
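A rough break-even calculation shows how much duplicate data an institution would need to clear before a licence pays for itself. This is a sketch under simplifying assumptions: reclaimed storage is actually released from the contract, billing is a flat per-gigabyte monthly price in the CHF 0.03 to CHF 0.08 range cited for managed city contracts, and engineering time and one-off audit fees are ignored.

```python
def annual_saving_per_tb(chf_per_gb_month: float) -> float:
    """Yearly storage cost avoided per reclaimed terabyte (1 TB = 1,000 GB)."""
    return 1_000 * chf_per_gb_month * 12


def break_even_tb(annual_fee_chf: float, chf_per_gb_month: float) -> float:
    """Terabytes of duplicates that must be removed to cover a yearly fee."""
    return annual_fee_chf / annual_saving_per_tb(chf_per_gb_month)


for fee in (8_000, 40_000):  # licence range quoted in the article
    for price in (0.03, 0.08):  # storage price range quoted in the article
        needed = break_even_tb(fee, price)
        print(f"CHF {fee:,} licence at CHF {price}/GB/month: {needed:.1f} TB")
```

On these assumptions, the cheapest licence breaks even at roughly 8 to 22 terabytes of reclaimed duplicates, and the most expensive at roughly 42 to 111 terabytes.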

The Stadtarchiv is expected to publish updated digital asset management guidelines before the end of 2026. Until then, institutions across the city are managing the duplication problem piecemeal — which is precisely how it grew this large in the first place.


About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
