The Daily Zurich

Zurich Archivists and Designers Race to Solve a Digital Duplicate Crisis

A surge in duplicate image files is clogging institutional databases across the city, prompting a coordinated response from ETH Zurich, the Stadtarchiv, and several creative agencies in Kreis 5.

By Zurich News Desk · Published 4 July 2026, 8:28 pm

Photo: Mehmet Turgut Kirkgoz on Pexels

Thousands of redundant image files have been quietly choking the digital storage systems of some of Zurich's largest cultural and research institutions, and this week a working group finally put a number on the problem. ETH Zurich's library services division confirmed it has identified more than 340,000 duplicate image assets accumulated since its digitisation drive began in earnest in 2019. The clean-up effort, which entered an active phase on Monday, is the most significant audit of the institution's digital image holdings to date.

The timing matters. Zurich's institutions are not alone. Across Europe, cultural memory organisations that rushed to digitise during the pandemic years are now confronting storage bloat, inflated licensing costs, and retrieval errors caused by near-identical files sitting under different metadata tags. For a city that positions ETH Zurich as a global research flagship (the institute ranked seventh in the QS World University Rankings 2025), the integrity of its digital infrastructure carries reputational weight beyond the server room.

What Happened This Week

The immediate trigger was a procurement review. ETH's central IT services, based at Rämistrasse 101, flagged that cloud storage expenditure for the university's image archive had risen by roughly 18 percent year-on-year, a figure that prompted an internal audit beginning in late June. By Wednesday, the audit team had confirmed that duplicate imagery, much of it produced by exporting the same source scan in multiple formats, accounted for a disproportionate share of that cost growth.

The Stadtarchiv Zürich, housed at Neumarkt 4 in the Altstadt, launched a parallel review the same week. Archivists there are working through approximately 1.2 million digitised photographs, maps, and plans held in its public collections. The challenge is not simply finding doubles. Many duplicates exist in slightly different resolutions or with marginal colour corrections, making automated detection tools unreliable without human verification at key decision points.

Several graphic design studios in the Kreis 5 district, notably around the Viadukt arches along Viaduktstrasse, have been grappling with the same problem on a commercial scale. Asset management firm PixelSorted, which serves about 40 agencies in German-speaking Switzerland, told clients in a newsletter circulated on 1 July that it would begin rolling out duplicate-detection tooling built on perceptual hashing algorithms by mid-August. The enhanced service tier is priced at CHF 290 per month for studios with fewer than ten staff.
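
For readers unfamiliar with the term, perceptual hashing reduces an image to a short fingerprint that changes little when the picture is resized or lightly colour-corrected. The sketch below shows one simple variant, an average hash, in plain Python. It is an illustration only and not PixelSorted's tooling, whose internals the newsletter does not describe. The grayscale list-of-rows image format, the function names, and the toy gradient images are all assumptions made for this example, and the input is assumed to be at least 8 by 8 pixels.

```python
from typing import List

# Assumed in-memory form: rows of grayscale pixel values, 0-255.
Image = List[List[int]]


def shrink(image: Image, size: int = 8) -> Image:
    """Downsample to size x size by averaging equal blocks of pixels."""
    height, width = len(image), len(image[0])
    small = []
    for row in range(size):
        top, bottom = row * height // size, (row + 1) * height // size
        out_row = []
        for col in range(size):
            left, right = col * width // size, (col + 1) * width // size
            block = [image[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            out_row.append(sum(block) // len(block))
        small.append(out_row)
    return small


def average_hash(image: Image, size: int = 8) -> int:
    """Set one bit per cell: 1 if the cell is brighter than the mean."""
    pixels = [p for row in shrink(image, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


# Toy data: a 16x16 horizontal gradient, a slightly brightened copy,
# and an unrelated vertical gradient.
scan = [[x * 15 for x in range(16)] for _ in range(16)]
brighter = [[min(255, p + 10) for p in row] for row in scan]
other = [[y * 15] * 16 for y in range(16)]

print(hamming_distance(average_hash(scan), average_hash(brighter)))
print(hamming_distance(average_hash(scan), average_hash(other)))
```

A small distance marks a pair as a candidate duplicate and a large one as unrelated. Where to set the threshold is a policy choice, which is one reason flagged pairs still go to a human reviewer.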

Why Automated Tools Alone Won't Fix It

The core difficulty is definitional. A pixel-perfect duplicate is trivial to catch. But institutions like the Stadtarchiv routinely hold versions of the same photograph printed at different points in time, scanned from different physical copies, each with documentary value of its own. Deleting the wrong file can destroy provenance. ETH's library team has reportedly adopted a three-stage triage model — auto-flag, human review, archive-versus-delete decision — that it will present to the Swiss Federal Archives in Bern later this month as a possible template for national adoption.
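
The three-stage idea can be sketched in a few lines of Python. This is a hypothetical illustration and not ETH's implementation, which has not been published. Stage one groups byte-identical files by SHA-256 digest (the trivial, pixel-perfect case), stage two hands each group to a reviewer callback that stands in for a human, and stage three records an archive or delete decision for the surplus copies. The names `auto_flag`, `triage`, `Decision`, and the example file names are invented for this sketch.

```python
import hashlib
from collections import defaultdict
from enum import Enum
from typing import Callable, Dict, List


class Decision(Enum):
    ARCHIVE = "archive"
    DELETE = "delete"


def auto_flag(files: Dict[str, bytes]) -> List[List[str]]:
    """Stage 1: group byte-identical files by SHA-256 digest."""
    groups = defaultdict(list)
    for name, content in files.items():
        groups[hashlib.sha256(content).hexdigest()].append(name)
    # Only groups with more than one member are duplicate candidates.
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


def triage(
    files: Dict[str, bytes],
    reviewer: Callable[[List[str]], bool],
    archive_instead: Callable[[str], bool],
) -> Dict[str, Decision]:
    """Run all three stages and return a decision per surplus file."""
    decisions: Dict[str, Decision] = {}
    for group in auto_flag(files):
        # Stage 2: a human confirms the group is a true duplicate set.
        if not reviewer(group):
            continue  # treated as distinct versions, nothing is touched
        # Stage 3: keep the first file, decide the fate of the rest.
        for name in group[1:]:
            decisions[name] = (
                Decision.ARCHIVE if archive_instead(name) else Decision.DELETE
            )
    return decisions


# Hypothetical holdings: two exact copies of one scan, one unique file.
holdings = {
    "plan_001.tif": b"scan-1",
    "plan_001_copy.tif": b"scan-1",
    "plan_002.tif": b"scan-2",
}
result = triage(holdings, reviewer=lambda g: True, archive_instead=lambda n: False)
print(result)
```

Nothing is removed unless the reviewer confirms a group, so a mistaken automatic flag costs review time and not provenance.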

The Swiss Federal Archives manages records under the Archivierungsgesetz of 1998, which does not yet contain explicit provisions for born-digital or duplicate-digital materials. A consultation on updated guidance was launched in spring 2026, with responses due by 31 August. Zurich's institutions are expected to submit coordinated input drawing on this week's audit findings.

For designers and archivists alike, the practical path forward involves three immediate steps: running a baseline audit using open-source tools such as dupeGuru before any paid solution is contracted, establishing a written deduplication policy that defines what counts as a true duplicate versus a version, and tagging surviving files with provenance metadata before the originals are removed. The Stadtarchiv has said it will publish its internal methodology as a public-domain document once the current review concludes, likely in September. That guidance could prove useful well beyond the Altstadt.
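
As a rough illustration of the third step, a provenance record can be as simple as a small JSON document stored alongside each surviving file, listing what it replaced and under which policy. The Python sketch below is one possible shape for such a record. It is not the Stadtarchiv's methodology, which has yet to be published, and the field names, file names, and policy reference are hypothetical.

```python
import hashlib
import json
from datetime import date
from typing import Dict


def provenance_record(
    kept_name: str,
    kept_bytes: bytes,
    removed: Dict[str, bytes],
    policy_ref: str,
    decided_on: date,
) -> dict:
    """Describe a surviving file and the duplicates it supersedes."""
    return {
        "file": kept_name,
        "sha256": hashlib.sha256(kept_bytes).hexdigest(),
        # Checksums of removed files let a later audit confirm
        # that they really were byte-identical to the survivor.
        "supersedes": [
            {"file": name, "sha256": hashlib.sha256(content).hexdigest()}
            for name, content in sorted(removed.items())
        ],
        "policy": policy_ref,
        "decided_on": decided_on.isoformat(),
    }


# Hypothetical example: one surviving scan and one removed export.
record = provenance_record(
    kept_name="map_0042.tif",
    kept_bytes=b"scan-data",
    removed={"map_0042_export.tif": b"scan-data"},
    policy_ref="dedup-policy-v1",
    decided_on=date(2026, 7, 4),
)
print(json.dumps(record, indent=2))
```

Writing the record before anything is deleted keeps the order of operations the same as in the steps above.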

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.