The Daily Zurich

Zurich's Duplicate Image Problem: The Numbers Revealing a Hidden Storage Crisis

Redundant digital images are costing Swiss institutions millions of francs in storage, an expense they can no longer afford to ignore.

By Zurich News Desk · Published 4 July 2026, 9:06 pm

Photo: Mâide Arslan on Pexels

Swiss institutions collectively store an estimated 40 percent of their digital image archives as exact or near-exact duplicates, a figure that researchers at ETH Zurich's Distributed Systems Group have been working to quantify with computational auditing methods developed over the past three years. The problem is not new. But as databases grow and cloud storage contracts balloon, its scale is forcing organisations from Zurich's Stadtarchiv on Neumarkt to the cantonal hospital complex at UniversitätsSpital Zürich to confront a measurable and expensive inefficiency.

The timing matters because storage is no longer cheap in the way institutions once assumed. Data infrastructure costs in Zurich have risen alongside wider European energy prices, and the city's push toward a greener municipal footprint, anchored in the Stadtrat's 2035 climate target, has added a second layer of pressure. Every redundant terabyte of image data stored in a server room consumes electricity. For a city that has committed to net-zero municipal operations within nine years, that is a political problem as much as a technical one.

What the Data Actually Shows

ETH Zurich's research group ran computational deduplication scans across anonymised datasets provided by three partner institutions in the canton between January and June 2025. Across a combined archive of roughly 18 million image files, duplicate or near-duplicate images accounted for between 34 and 47 percent of total stored volume, depending on the deduplication threshold applied. At current colocation rates in Zurich — which hover around CHF 180 to CHF 220 per rack unit per month at facilities including the InterXion campus in Glattbrugg — that level of redundancy translates to substantial recurring cost that delivers zero informational value.
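Why the duplicate share moves with the threshold can be shown with a small sketch. It is not the ETH group's method, which the article does not describe. It assumes each image has already been reduced to a short perceptual hash, and it treats two files as duplicates when their hashes differ by at most a chosen number of bits. The file names, the 16-bit hash values and the greedy grouping rule are all illustrative assumptions.

```python
def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two integer hashes."""
    return bin(a ^ b).count("1")


def duplicate_share(hashes: dict[str, int], threshold: int) -> float:
    """Fraction of files flagged as redundant at a given bit threshold.

    Greedy grouping (an assumption, chosen for brevity): a file is redundant
    if its hash is within `threshold` bits of an earlier, non-redundant file.
    """
    representatives: list[int] = []
    redundant = 0
    for name in sorted(hashes):
        h = hashes[name]
        if any(hamming(h, r) <= threshold for r in representatives):
            redundant += 1
        else:
            representatives.append(h)
    return redundant / len(hashes) if hashes else 0.0


# Hypothetical 16-bit hashes for five files (made-up values).
SAMPLE = {
    "a.tif":        0b1111000011110000,
    "a_copy.tif":   0b1111000011110000,  # identical to a.tif
    "a_rescan.tif": 0b1111000011110001,  # 1 bit away from a.tif
    "a_crop.tif":   0b1111000011000011,  # 4 bits away from a.tif
    "b.tif":        0b0000111100001111,  # 16 bits away from a.tif
}

for t in (0, 1, 4):
    print(t, duplicate_share(SAMPLE, t))
```

On this toy set the flagged share rises from 20 percent at a threshold of 0 bits to 40 percent at 1 bit and 60 percent at 4 bits, which is the same kind of spread the 34 to 47 percent range reflects.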

The pharmaceutical sector, clustered around the greater Zurich area and with deep ties to firms operating out of the Technopark Zürich on Technoparkstrasse in Zürich-West, generates particularly dense image archives. Clinical trial documentation, compound imaging, and regulatory submission packages all involve image-heavy workflows where file versioning and multi-team handoffs reliably produce duplicates. One sector estimate, published by the Swiss Institute of Bioinformatics in Lausanne in March 2026, put avoidable image storage costs across Swiss life-sciences firms at upward of CHF 60 million annually, though that figure covers the national sector as a whole, not Zurich alone.

The Stadtarchiv situation illustrates a different dimension. Digitisation projects running since 2019 have ingested historical photographic collections that pre-date any consistent naming convention, meaning that the same physical photograph scanned at different resolutions or by different contractors over successive budget cycles ends up stored as multiple discrete files. Staff there have no automated tool to flag these overlaps; identification remains largely manual, a process that archivists describe as unsustainable as collection volumes grow.
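The overlap described here, one photograph stored at several resolutions, is the case a perceptual hash is designed to catch. The sketch below is a minimal average-hash implementation in plain Python, not a tool the Stadtarchiv uses. It assumes a grayscale image supplied as rows of brightness values whose width and height are multiples of 8. The `synthetic_scan` helper is a hypothetical stand-in for a real scan.

```python
def average_hash(pixels: list[list[float]], size: int = 8) -> int:
    """Average hash: shrink to size x size cells, then set one bit per cell
    that is brighter than the mean of all cells."""
    height, width = len(pixels), len(pixels[0])
    bh, bw = height // size, width // size  # pixels per cell
    cells = []
    for r in range(size):
        for c in range(size):
            block = [pixels[y][x]
                     for y in range(r * bh, (r + 1) * bh)
                     for x in range(c * bw, (c + 1) * bw)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def synthetic_scan(side: int) -> list[list[float]]:
    """Hypothetical test image: eight vertical stripes, dark to bright,
    rendered at side x side pixels."""
    return [[int(8 * (x + 0.5) / side) * 32 for x in range(side)]
            for _ in range(side)]


low_res = average_hash(synthetic_scan(16))
high_res = average_hash(synthetic_scan(32))
print(low_res == high_res)  # True: same picture, different resolution
```

Because both scans reduce to the same 64-bit value, a byte-level comparison would treat them as different files while the hash comparison flags them as one image. Real scans differ in noise and cropping, so a production tool would compare hashes within a bit tolerance rather than for equality.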

Where Solutions Stand and What Comes Next

Three Zurich-area software firms — including one operating out of the Impact Hub on Sihlquartier — are developing or commercialising deduplication pipelines specifically tuned for institutional rather than consumer use. Consumer tools built for personal photo libraries handle perceptual hashing competently but struggle with the metadata integrity requirements that archives, hospitals, and regulated industries impose. An institutional tool must confirm not just that two images look identical but that deleting one will not break a referential chain in a records management system.
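The referential check can be sketched as a planning step that runs before anything is deleted. This is an illustration under assumptions, not any vendor's pipeline: the function name `plan_deletions`, the record IDs and the rule of keeping the most-cited copy are all hypothetical.

```python
def plan_deletions(duplicate_groups, references):
    """Sort each group of identical images into keep, delete and blocked.

    duplicate_groups: list of lists of file paths judged identical.
    references: dict mapping a file path to the set of record IDs citing it.
    A copy that is still cited by a record is never marked deletable; it is
    listed as blocked until the record is repointed to the kept copy.
    """
    plan = {"keep": [], "delete": [], "blocked": []}
    for group in duplicate_groups:
        if not group:
            continue
        # Keep the copy with the most citations; break ties by name.
        ordered = sorted(group, key=lambda p: (-len(references.get(p, ())), p))
        plan["keep"].append(ordered[0])
        for path in ordered[1:]:
            target = "blocked" if references.get(path) else "delete"
            plan[target].append(path)
    return plan


# Hypothetical example: three copies of one scan, two of them cited.
groups = [["scan_001.tif", "scan_001_v2.tif", "scan_001_final.tif"]]
refs = {
    "scan_001_v2.tif": {"REC-17"},
    "scan_001_final.tif": {"REC-17", "REC-42"},
}
print(plan_deletions(groups, refs))
```

In this example only the uncited `scan_001.tif` is cleared for deletion. The cited `scan_001_v2.tif` stays blocked, which is the behaviour that separates an institutional tool from a consumer one.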

The Federal Data Protection and Information Commissioner's office updated its guidance on data minimisation in January 2026, a development that gives institutions a compliance hook as well as a cost argument for undertaking systematic deduplication reviews. Under the revised Swiss Data Protection Act, storing data beyond operational necessity — including redundant copies — requires documented justification.

For organisations in Zurich starting from scratch, the practical first step is an audit before any deletion. Running a perceptual hash comparison across an image archive produces a report within hours on modern infrastructure; deleting files without that report first has caused recoverable but embarrassing data-loss incidents at smaller institutions. The ETH group recommends a phased approach: flag duplicates in the first quarter, review flagged files against reference databases in the second, and only execute deletions once a restoration snapshot is confirmed. Given current storage pricing and the city's climate commitments, doing nothing has stopped being a neutral option.
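An audit-first pass can be as small as the sketch below. Note the substitution: it uses exact SHA-256 content hashing rather than perceptual hashing, so it finds only byte-identical copies, and it is read-only by design. The function names are hypothetical and the directory layout is an assumption.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large scans fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_exact_duplicates(root: Path) -> list[list[str]]:
    """Report groups of byte-identical files under root. Deletes nothing."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(str(path.relative_to(root)))
    # Only groups with more than one member are duplicates.
    return sorted(group for group in by_digest.values() if len(group) > 1)
```

Calling `audit_exact_duplicates(Path("/path/to/archive"))` returns the report as a list of groups. Review and deletion remain separate, later steps, in line with the phased approach the ETH group recommends.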

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
