The Daily Zurich

The Hidden Cost of Duplicate Images: What Zurich's Digital Archive Numbers Actually Reveal

From city hall databases to ETH Zurich's research repositories, redundant image files are quietly consuming server space, budget, and staff hours across the canton.

By Zurich News Desk · Published 4 July 2026, 9:28 pm

Photo: Mâide Arslan on Pexels

Zurich's public institutions are sitting on a measurable and largely unaddressed problem: duplicate images scattered across digital archives, municipal databases, and institutional websites collectively occupy hundreds of terabytes of redundant storage. Figures drawn from IT audits conducted across several cantonal bodies point to a cost in the low seven figures annually once server licensing, energy, and staff remediation time are factored in.

The issue has gained fresh urgency in 2026 because several major digitisation drives are now converging at once. The Stadtarchiv Zürich on Neumarkt completed a large-scale document upload phase in the first quarter of this year. ETH Zurich's research data management office expanded its open-access repository requirements for doctoral candidates starting in January 2026, meaning thousands of new image-heavy submissions are arriving monthly. And the city's own portal, Stadt Zürich Digital, launched a consolidated citizen services interface last autumn that pulled image assets from at least four legacy databases — creating duplication by structural design.

What the Numbers Show

Duplicate image files — defined as identical or near-identical image data stored under different filenames or in separate folder trees — typically account for between 20 and 40 percent of total image storage in large institutional repositories, according to published benchmarks from European digital infrastructure bodies. For a repository holding 50 terabytes of image data, that range translates to 10 to 20 terabytes of files that could, in principle, be safely eliminated or deduplicated without any loss of content.
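The distinction between identical and near-identical files can be made concrete with a short Python sketch. It is illustrative only: it assumes each image has already been decoded and downscaled to a small grayscale grid, and the function names and the `max_distance` threshold are hypothetical rather than drawn from any audit cited here. A byte-level digest catches exact copies, while a simple average hash tolerates small changes in brightness or compression.

```python
import hashlib


def exact_fingerprint(data: bytes) -> str:
    """SHA-256 of the raw file bytes: equal only for byte-identical files."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels) -> int:
    """Average hash of a small grayscale grid (rows of 0-255 integers).

    Each bit records whether a pixel is brighter than the grid's mean,
    so uniform brightness shifts leave the hash largely unchanged.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(pixels_a, pixels_b, max_distance: int = 2) -> bool:
    # max_distance is an illustrative threshold, not a recommended setting.
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance
```

In practice the grid would come from an imaging library that decodes and resizes each file, a step this sketch leaves out.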

Storage pricing on Swiss enterprise cloud contracts currently runs between CHF 0.02 and CHF 0.05 per gigabyte per month, depending on redundancy tier. At those rates, 15 terabytes of unnecessary duplication (15,000 gigabytes, the midpoint of that range) costs roughly CHF 300 to CHF 750 per month in raw storage alone, or CHF 3,600 to CHF 9,000 per year. Multiply that across a dozen cantonal and municipal entities, add the electricity overhead at Zurich's data centre facilities at the Hürlimann Areal and at the city-operated infrastructure nodes near Altstetten, and the annual figure becomes difficult to ignore.
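The arithmetic behind these figures can be reproduced in a few lines of Python. This is a rough sketch under stated assumptions: it uses decimal terabytes (1 TB = 1,000 GB) to match the round numbers above, and the function names are hypothetical. It covers raw storage only, not licensing, energy, or staff time.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes, matching the round figures


def duplicate_range_tb(total_tb, low_share=0.20, high_share=0.40):
    """Return the (low, high) volume of duplicate data for a repository."""
    return total_tb * low_share, total_tb * high_share


def monthly_storage_cost(terabytes, chf_per_gb_month):
    """Raw monthly storage cost in CHF for a given volume and price."""
    return terabytes * GB_PER_TB * chf_per_gb_month


low_tb, high_tb = duplicate_range_tb(50)
print(f"Duplicates in a 50 TB repository: {low_tb:.0f} to {high_tb:.0f} TB")

cheap = monthly_storage_cost(15, 0.02)
costly = monthly_storage_cost(15, 0.05)
print(f"15 TB per month: CHF {cheap:.0f} to CHF {costly:.0f}")
print(f"15 TB per year: CHF {cheap * 12:.0f} to CHF {costly * 12:.0f}")
```

Changing the repository size or the price per gigabyte shows how sensitive the total is to each input.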

The more stubborn cost is human. IT staff at institutions such as the Zentralbibliothek Zürich on Zähringerplatz spend measurable portions of their working week manually resolving image metadata conflicts — a problem that arises precisely because duplicate files often carry different timestamps, licensing tags, or access permissions, even when the pixel data is identical. One published workflow analysis from a comparable northern European municipal archive — the Helsinki City Archives, in a 2024 report — found that deduplication reduced metadata conflict resolution time by 34 percent in the twelve months following implementation.

The Remediation Gap in Zurich

Switzerland's federal data management guidelines, updated in 2023 under the framework of the Bundesgesetz über den Datenschutz revision cycle, do not yet mandate automated deduplication audits for cantonal institutions. That regulatory gap means the decision to act is discretionary. Some cantonal IT offices have begun piloting deduplication tools — the Kanton Zürich's central IT unit, KITT, has referenced image-hash comparison protocols in internal documentation circulated in late 2025 — but no binding canton-wide standard is in force.

The practical consequence is patchwork compliance. An institution like ETH Zurich, which publishes its data management plans publicly and operates under Swiss National Science Foundation grant conditions, has stronger structural incentives to address the problem than a smaller municipal body managing, say, planning permit image archives in a district office in Schwamendingen or Oerlikon.

For organisations looking to act, the steps are sequential and increasingly well-documented: first, run a perceptual hash audit across the full image repository to identify near-duplicates as well as exact matches; second, establish a canonical file registry that designates a single authoritative version of each image; third, replace remaining instances with pointers or symbolic links rather than discrete copies. Tools capable of running this process across mixed-format archives — JPEG, TIFF, PNG, WebP — exist as open-source options and are in active use at several European national libraries. The initial audit alone, run against a 50-terabyte repository, typically completes within 72 hours on standard server hardware. The savings begin immediately after.
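The second and third steps can be sketched in Python using only the standard library. The sketch makes several assumptions that should be read as such: it matches files by an exact SHA-256 digest rather than a perceptual hash (so it finds only byte-identical copies), it picks the first path in sorted order as the canonical version, and the function names are hypothetical. It is not the method used by any institution named here.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large TIFFs fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_registry(root):
    """Map each content digest to one canonical path and its duplicates."""
    registry = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        entry = registry.setdefault(
            file_digest(path), {"canonical": path, "duplicates": []}
        )
        # Assumption: the first path in sorted order is the canonical copy.
        if entry["canonical"] != path:
            entry["duplicates"].append(path)
    return registry


def replace_duplicates_with_links(registry, dry_run=True):
    """Swap each duplicate for a symbolic link to its canonical file.

    With dry_run=True nothing is changed; the planned swaps are returned.
    """
    planned = []
    for entry in registry.values():
        for duplicate in entry["duplicates"]:
            planned.append((duplicate, entry["canonical"]))
            if not dry_run:
                duplicate.unlink()
                os.symlink(entry["canonical"].resolve(), duplicate)
    return planned
```

Running with `dry_run=True` first gives a list of planned swaps to review. Because a symbolic link does not preserve the duplicate's own timestamps, licensing tags, or permissions, any metadata conflicts would need to be reconciled before the links are created.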

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
