The Daily Zurich

Zurich news, every day


Zurich's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Damaging Story

From city hall servers to ETH Zurich's research repositories, redundant image files are consuming storage budgets and slowing down Switzerland's push toward leaner public-sector data infrastructure.

By Zurich News Desk · Published 4 July 2026, 8:44 pm


Photo: Malte Luk on Pexels

Duplicate images now account for an estimated 23 percent of all stored visual assets across Swiss public-sector digital archives, according to a benchmarking survey of cantonal IT departments published in May 2026 by the Swiss Federal Archives in Bern. For a city like Zurich, which manages dozens of digitised collections spanning everything from urban planning records in Stadtarchiv Zürich to photographic holdings at the Zentralbibliothek on Zähringerplatz, that figure translates directly into wasted money and slower retrieval systems.

The timing matters. Switzerland's Federal Council approved a revised e-Government strategy in January 2026 that commits cantons to a 30-percent reduction in redundant data holdings by the end of 2028. Zurich, as the country's most populous canton and its financial centre, is under particular pressure to demonstrate progress. The UBS–Credit Suisse merger fallout has already put Swiss institutions under a microscope for operational inefficiency; public-sector IT is the next frontier regulators and auditors are watching closely.

What the Numbers Actually Look Like on Zurich's Servers

At Stadtarchiv Zürich, which holds more than 4.5 million digitised records and images, internal audits conducted in late 2025 found that near-duplicate photographs — images that differ only in resolution, watermark or minor colour correction — were stored in parallel across at least three separate content management systems. Each high-resolution archival image averages roughly 45 megabytes. Multiply that across tens of thousands of duplicate pairs and the dead storage runs into terabytes.
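The scale of that dead storage can be checked with a few lines of arithmetic. The sketch below uses the 45-megabyte average quoted above; the pair counts are illustrative stand-ins for "tens of thousands", not figures from the audit, and the function name is hypothetical.

```python
MB_PER_IMAGE = 45  # average high-resolution image size quoted in the article


def dead_storage_tb(duplicate_pairs: int, mb_per_image: float = MB_PER_IMAGE) -> float:
    """Storage taken up by the one redundant copy in each pair, in decimal terabytes."""
    return duplicate_pairs * mb_per_image / 1_000_000  # 1 TB = 1,000,000 MB


# Illustrative pair counts only (assumed, not taken from the audit).
for pairs in (20_000, 50_000, 90_000):
    print(f"{pairs:>6} duplicate pairs -> {dead_storage_tb(pairs):.2f} TB")
```

On these assumptions, 50,000 duplicate pairs would tie up about 2.25 terabytes, and the total grows further where the same image sits in three systems rather than two.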

ETH Zurich's research data infrastructure presents a different but related problem. The university's main data repository, ETH Research Collection, hosted more than 1.1 million files as of its 2025 annual report. Researchers uploading datasets frequently deposit multiple versions of the same image during iterative experiments, with deletion rates remaining low because file removal triggers compliance review. The result is a sprawl of redundant visual data that the library directorate described in its 2025 report as a growing maintenance burden, without yet attaching a specific remediation cost.

Storage pricing adds urgency. Enterprise cold-storage contracts for Swiss public institutions currently run at approximately CHF 0.018 per gigabyte per month for tier-two archival services, based on published rates from Swiss data centre operators in the Zurich-West corridor near Hardbrücke. That sounds trivial in isolation. Scale it to a canton managing petabytes of material and the annual bill climbs sharply — estimates from the cantonal IT office, published in its 2025 budget annex, put redundant storage costs for Zurich alone at over CHF 1.2 million per year, a figure the office said it wants to halve before the 2028 federal deadline.
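A rough calculation shows how a per-gigabyte rate that small adds up. The sketch below assumes, purely for illustration, that storage is billed at the flat tier-two rate quoted above; the budget annex may well count other costs, and the function names are hypothetical.

```python
CHF_PER_GB_MONTH = 0.018  # tier-two archival rate quoted in the article


def annual_cost_chf(gigabytes: float, rate: float = CHF_PER_GB_MONTH) -> float:
    """Yearly cost of keeping a given volume in storage at a flat monthly rate."""
    return gigabytes * rate * 12


def implied_volume_gb(annual_chf: float, rate: float = CHF_PER_GB_MONTH) -> float:
    """Volume that a yearly bill would correspond to at the same flat rate."""
    return annual_chf / (rate * 12)


one_petabyte_gb = 1_000_000  # decimal petabyte expressed in gigabytes
print(f"1 PB for a year: CHF {annual_cost_chf(one_petabyte_gb):,.0f}")
print(f"CHF 1.2m a year implies {implied_volume_gb(1_200_000) / 1_000_000:.2f} PB")
```

At that rate a single petabyte costs about CHF 216,000 a year, so a CHF 1.2 million bill would correspond to roughly 5.6 petabytes of redundant material if it were storage charges alone.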

Detection Tools and What Comes Next

The practical response is taking shape at Binzmühlestrasse, where the cantonal statistics office has been piloting a perceptual hashing tool since March 2026. Perceptual hashing assigns a compact digital fingerprint to each image based on visual content rather than file metadata, meaning that two copies of the same photograph of the Limmatquai saved at different resolutions still register as duplicates. Early results from the pilot, presented at a Zurich digital-government forum in June, showed a 31-percent reduction in flagged redundant files within the first 90 days of scanning a 600,000-image test corpus.
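The pilot's tool and algorithm have not been named, but the idea can be illustrated with one of the simplest perceptual hashes, the average hash. The sketch below is an assumption-laden illustration, not the canton's method: it works on plain grids of grayscale values rather than real image files, and the helper names and test patterns are made up for the example.

```python
def average_hash(pixels, hash_size=8):
    """Average hash of a grayscale image given as rows of pixel values.

    Assumes width and height are multiples of hash_size.
    """
    block_h = len(pixels) // hash_size
    block_w = len(pixels[0]) // hash_size
    cells = []
    for r in range(hash_size):
        for c in range(hash_size):
            # Shrink the image by averaging each block of pixels into one cell.
            block = [pixels[y][x]
                     for y in range(r * block_h, (r + 1) * block_h)
                     for x in range(c * block_w, (c + 1) * block_w)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: is it brighter than the image-wide mean?
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a, b):
    """Number of fingerprint bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def render(pattern, size):
    """Render a hypothetical test pattern at a given resolution."""
    return [[pattern((x + 0.5) / size, (y + 0.5) / size) for x in range(size)]
            for y in range(size)]


def scene(u, v):
    # Made-up "photograph": bright and dark regions in a fixed layout.
    return 200 if (u < 0.5) != (v < 0.25) else 50


def stripes(u, v):
    # A visually different image: horizontal bands.
    return 200 if int(v * 8) % 2 == 0 else 50


small, large = render(scene, 64), render(scene, 256)
other = render(stripes, 64)

print("same scene, two resolutions:", hamming(average_hash(small), average_hash(large)))
print("different images:", hamming(average_hash(small), average_hash(other)))
```

A small distance between fingerprints marks two files as likely duplicates, while a large one marks them as different images. Where to set the cut-off is a tuning decision, and production tools typically use more robust hashes than this one.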

The Zentralbibliothek is watching that pilot closely. Its own digitisation programme, which received CHF 800,000 in cantonal funding for the 2025–2027 cycle, is currently ingesting historical newspaper photography and postcard collections. Librarians there have adopted a manual deduplication protocol for now, reviewing flagged items before deletion to avoid accidentally purging images that differ in legally significant ways — caption metadata, rights status, or archival provenance.

For institutions not yet in any pilot programme, the federal guidance issued in May 2026 recommends beginning with a baseline audit using open-source tools before committing to vendor contracts. The deadline for cantons to submit preliminary deduplication plans to Bern is 1 October 2026, leaving Zurich's various agencies roughly 90 days to get their inventories in order and demonstrate, in numbers, that they are serious about cleaning house.


About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
