The Daily Zurich


Zurich's Digital Archives Are Full of the Same Photo Twice — Here's What Officials and Experts Are Saying About It

From city hall servers to university image libraries, duplicate digital assets are costing institutions time and money, and the pressure to fix the problem is finally producing results.

By Zurich News Desk · Published 4 July 2026, 9:00 pm


Photo: Mâide Arslan on Pexels

Zurich's public institutions are sitting on a quiet but expensive problem. Across municipal databases, university research archives, and the digital collections of cultural foundations along the Limmat, the same images are being stored, managed, and backed up multiple times over — and the bill for that redundancy is starting to draw serious attention from administrators and IT specialists alike.

The issue has come into sharper focus this summer as the City of Zurich's Stadtarchiv, based on Neumarkt in the Altstadt district, finalises a multi-year digitisation push that has seen tens of thousands of historical photographs converted from analogue originals. Staff working through the collection have flagged that a significant share of newly scanned assets duplicate images already held in the archive's existing digital catalogue — in some cases the same photograph appearing under three or four separate file entries.

Why the Problem Is Harder to Solve Than It Looks

Removing duplicate images sounds straightforward: identify the copies, keep the best version, delete the rest. In practice, the process is considerably more complicated. Metadata attached to different versions of the same image often diverges: different file names, different resolution tags, different rights annotations. Deleting the wrong file can strip out contextual information that took years to compile.

ETH Zurich's Image Archive, one of the most substantial institutional collections in the German-speaking world, has been developing automated deduplication tools since at least 2023 as part of its broader digital infrastructure programme. Specialists working in computational archiving have noted publicly that purely hash-based matching, the simplest approach, flags only files with identical binary data and so catches just a fraction of real-world duplicates. Images are routinely re-exported, cropped slightly, or compressed differently at each stage of a workflow, producing files that look identical to the human eye but register as distinct to a basic algorithm.
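To illustrate why exact matching misses so much, here is a minimal Python sketch of binary hash matching. It is not taken from any institution's tooling; the function name, file names, and byte strings are hypothetical stand-ins for real files on disk.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files):
    """Group file names whose bytes are identical.

    `files` is a hypothetical mapping of file name -> raw bytes,
    standing in for files that would normally be read from disk.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append(name)
    # Only digests shared by more than one file indicate duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Illustrative data: the "re-export" differs by a single byte,
# as a re-compressed image would differ from its source file.
files = {
    "scan_001.tif": b"\x00\x01\x02\x03",
    "scan_001_copy.tif": b"\x00\x01\x02\x03",
    "scan_001_reexport.tif": b"\x00\x01\x02\x04",
}

duplicates = group_exact_duplicates(files)
print(duplicates)
```

Only the byte-for-byte copy is grouped with the original. The re-exported file, which a person would regard as the same picture, is not flagged.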

Perceptual hashing, which compares visual content rather than binary data, is the technique now being recommended by digital preservation bodies including Memoriav, the Swiss association for the safeguarding of audiovisual heritage, which is headquartered in Bern but works closely with Zurich institutions. Memoriav published updated guidance on the topic in late 2025 as part of its ongoing series of technical recommendations for Swiss cultural memory organisations.
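As a rough illustration of the idea, the sketch below implements one simple perceptual hash, the so-called average hash, in plain Python. It is an assumption for illustration only: the Memoriav guidance is not quoted here, and the article does not say which algorithm any institution uses. The sketch also assumes each image has already been reduced to an 8×8 grid of grayscale values, a step real tools perform with an imaging library.

```python
def average_hash(pixels):
    """Return an integer with one bit per pixel.

    A bit is 1 when the pixel is brighter than the grid's mean value.
    `pixels` is assumed to be a small grayscale grid (list of rows).
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


# Hypothetical 8x8 test images.
original = [[(x + y) * 16 for x in range(8)] for y in range(8)]
# Simulated re-export: every pixel slightly brighter.
reexport = [[min(255, value + 3) for value in row] for row in original]
# A visually different image: the inverted gradient.
other = [[255 - value for value in row] for row in original]

near = hamming_distance(average_hash(original), average_hash(reexport))
far = hamming_distance(average_hash(original), average_hash(other))
print(near, far)
```

The brightened copy has different pixel values, so a binary hash would treat it as a new file, yet its perceptual hash stays close to the original's. The unrelated image lands far away. Where to draw the line between "close" and "far" is a tuning decision for each collection.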

Costs, Capacity, and the Pressure to Act

Storage costs are not trivial at institutional scale. Cloud and on-premises backup pricing for high-resolution archival image files — typically TIFF files at 400 DPI or above — can run to several thousand Swiss francs annually per terabyte for properly redundant, geographically distributed backup configurations. When collections run into the hundreds of thousands of files, eliminating genuine duplicates can meaningfully reduce those costs.

Zentralbibliothek Zürich, whose main building sits on Zähringerplatz, has a digital collections team that has been working through a comparable deduplication exercise for its photographic holdings. Librarians and archivists in the field have spoken in general terms at professional conferences about the tension between thoroughness and speed: a careful manual review of flagged duplicates takes time that understaffed teams do not always have, but automated deletion without human oversight carries real risks of data loss.

The broader context matters here too. Zurich is in the middle of a city-wide push toward more efficient use of public IT infrastructure, part of the Smart City Zurich programme run out of the Department of Buildings. More efficient digital asset management — including deduplication — has been identified within that programme as one of several areas where administrative savings are achievable without cuts to public-facing services.

For institutions still working through the problem, specialists in digital preservation advise a phased approach: run perceptual hash comparisons to generate a candidate list, route flagged pairs to a human reviewer for a final decision, and document every deletion with a log that preserves the original metadata. The process is slow the first time. Once the backlog is cleared, maintaining a duplicate-free collection going forward is considerably cheaper than the initial clean-up — and far less expensive than continuing to pay for redundant storage indefinitely.
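The three phases can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not any archive's actual workflow: the hash values, the distance threshold of 5, the metadata fields, and the `decide` callback (a stand-in for the human reviewer) are all hypothetical.

```python
import json
from itertools import combinations


def candidate_pairs(hashes, max_distance=5):
    """Phase 1: list file pairs whose perceptual hashes are close."""
    pairs = []
    for (name_a, hash_a), (name_b, hash_b) in combinations(sorted(hashes.items()), 2):
        if bin(hash_a ^ hash_b).count("1") <= max_distance:
            pairs.append((name_a, name_b))
    return pairs


def review_and_log(pairs, metadata, decide):
    """Phases 2 and 3: ask the reviewer, then log each removal.

    `decide(a, b)` returns the name to remove, or None to keep both.
    The log entry keeps the removed file's metadata in full.
    """
    log = []
    removed = set()
    for name_a, name_b in pairs:
        if name_a in removed or name_b in removed:
            continue  # one side is already gone; nothing left to compare
        to_remove = decide(name_a, name_b)
        if to_remove is None:
            continue
        kept = name_b if to_remove == name_a else name_a
        log.append({
            "removed": to_remove,
            "kept": kept,
            "original_metadata": metadata[to_remove],
        })
        removed.add(to_remove)
    return log


# Hypothetical 8-bit hashes and metadata for three files.
hashes = {"a.tif": 0b11110000, "b.tif": 0b11110001, "c.tif": 0b00001111}
metadata = {
    "a.tif": {"dpi": 600, "rights": "public domain"},
    "b.tif": {"dpi": 300, "rights": "unknown"},
    "c.tif": {"dpi": 600, "rights": "public domain"},
}

pairs = candidate_pairs(hashes)
# Stand-in for a human decision: remove the lower-resolution file.
log = review_and_log(
    pairs, metadata,
    decide=lambda a, b: a if metadata[a]["dpi"] < metadata[b]["dpi"] else b,
)
print(json.dumps(log, indent=2))
```

In this sketch nothing is deleted automatically. The function only records what the reviewer chose, along with the metadata that would otherwise be lost.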


About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
