The Daily Zurich

Zurich Archivists and Researchers Tackle Duplicate Image Crisis in City's Digital Collections

A surge in duplicated photographs across municipal and institutional databases is prompting urgent coordinated action from ETH Zurich and the city's main archives this week.

By Zurich News Desk · Published 4 July 2026, 9:16 pm

Zurich's digital heritage managers are moving faster than usual. This week, the city's main archival institutions confirmed they are deploying new automated detection tools to address a growing problem of duplicate images clogging shared databases — a technical and curatorial headache that has quietly ballooned over the past eighteen months.

The issue matters because it sits at the intersection of two pressures converging on Zurich's institutional memory right now. First, a major digitisation push across the Stadtarchiv Zürich, housed near Neumarkt in the Altstadt, has flooded shared repositories with tens of thousands of scanned photographs, maps and documents since 2024. Second, ETH Zürich's Image Archive, one of the largest scientific image repositories in German-speaking Europe, has been integrating collections from partner universities, pulling in material that sometimes duplicates what already exists locally. When duplicates pile up undetected, search results degrade, storage costs climb and cataloguers waste hours describing the same image more than once.

What Happened This Week

On Tuesday, the ETH Zürich Library confirmed it had begun a phased rollout of a perceptual hashing system — a technique that converts images into compact numerical fingerprints to flag near-identical files — across its digitised holdings. The process is being applied first to the roughly 1.2 million photographs held in the ETH-Bibliothek's Bildarchiv, which documents Swiss scientific, engineering and architectural history going back to the mid-nineteenth century. Stadtarchiv Zürich staff, meanwhile, met with technical partners this week at their offices on Neumarkt to align metadata standards so that deduplicated records can eventually be cross-searched without the two institutions running redundant versions of the same file.
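For readers curious about how such fingerprints work, the following is a minimal sketch of one common perceptual hashing variant, the "average hash". It is an illustration only: the article does not say which algorithm the ETH Zürich Library uses, and the function names, the 4x4 pixel grids and the distance threshold here are all assumptions made for the example. Real systems typically shrink each image to a small grayscale grid first; the sketch assumes that step has already been done.

```python
from typing import List

# A tiny grayscale image: rows of pixel values from 0 (black) to 255 (white).
Grid = List[List[int]]


def average_hash(pixels: Grid) -> int:
    """Build a fingerprint with one bit per pixel: 1 if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Treat two images as near-identical if their fingerprints almost match.

    max_distance is a hypothetical threshold chosen for this example.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical sample data: dark on the left, bright on the right.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
# The same picture rescanned slightly brighter.
rescanned = [[value + 8 for value in row] for row in original]
# A different picture: bright on the left, dark on the right.
different = [list(reversed(row)) for row in original]

print(is_near_duplicate(original, rescanned))  # True
print(is_near_duplicate(original, different))  # False
```

Because the fingerprint records only whether each pixel is above or below the image's own average, a uniformly brighter rescan produces the same bits, while a genuinely different picture does not. That tolerance is what separates perceptual hashing from an exact checksum, which would treat the rescan as an unrelated file.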

The practical consequences of uncontrolled duplication are not trivial. In one internal review completed in late June, archivists found that a batch of around 4,000 photographs taken during the construction of the Zürichsee waterfront promenade in the 1930s existed in at least three separate versions across two institutional servers, some with conflicting dates and location tags. Each duplicate had been independently catalogued, meaning the errors multiplied rather than cancelling one another out. Similar problems have been flagged in the Zentralbibliothek Zürich, located on Zähringerplatz, where staff have been working since spring 2026 to reconcile a backlog of mislabelled scans imported from a now-discontinued cantonal digitisation grant programme.
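The kind of conflict described above can be surfaced automatically once each record carries an image fingerprint. The sketch below is a hypothetical illustration, not the archives' actual workflow: the record layout, the field names and every sample value are invented for the example. It groups records that share a fingerprint and reports any field on which the copies disagree.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set

# Hypothetical record layout: every key and value below is invented.
Record = Dict[str, str]


def find_conflicts(
    records: List[Record],
    fields: Sequence[str] = ("date", "location"),
) -> Dict[str, Dict[str, Set[str]]]:
    """Group records by fingerprint and list fields whose values disagree."""
    groups: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        groups[record["fingerprint"]].append(record)

    conflicts: Dict[str, Dict[str, Set[str]]] = {}
    for fingerprint, group in groups.items():
        if len(group) < 2:
            continue  # a single copy cannot conflict with itself
        disagreeing = {}
        for field in fields:
            values = {record[field] for record in group}
            if len(values) > 1:
                disagreeing[field] = values
        if disagreeing:
            conflicts[fingerprint] = disagreeing
    return conflicts


# Invented sample: three copies of one image on two servers, plus one unique image.
sample = [
    {"id": "a-001", "fingerprint": "3333", "date": "1931", "location": "Zürich"},
    {"id": "b-417", "fingerprint": "3333", "date": "1933", "location": "Zürich"},
    {"id": "b-902", "fingerprint": "3333", "date": "1931", "location": "Zürich"},
    {"id": "a-002", "fingerprint": "cccc", "date": "1935", "location": "Zürich"},
]

conflicts = find_conflicts(sample)
print(conflicts)
```

This version matches only identical fingerprints. Catching near-duplicates would require comparing fingerprints by distance instead, and deciding which of the conflicting dates is correct remains a job for a cataloguer.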

Why the Timing Matters

The current push is also driven by money. Swiss federal digitisation subsidies channelled through Memoriav, the national association for the preservation of audiovisual heritage, require participating institutions to demonstrate clean, non-duplicated metadata before the next funding cycle opens in early 2027. Institutions that cannot show a verified deduplicated catalogue risk losing access to grants that, in previous cycles, have run to several hundred thousand Swiss francs per recipient.

There is a broader European dimension too. The Europeana network — which aggregates digital cultural heritage from institutions across the continent — has tightened its ingestion rules this year, rejecting uploads where duplicate detection scores fall below a defined threshold. Zurich's institutions contribute regularly to Europeana, and a failure to clean house before the autumn submission window would mean fewer Swiss items surfacing in pan-European searches.

For researchers working at ETH Zürich's main campus on Rämistrasse or at the Zentralbibliothek, the practical advice right now is straightforward: if you are building a project that depends on image metadata from any of the city's major digital collections, treat the current catalogues as provisional. Archivists expect the first round of deduplicated, verified records to be available through the swisscollections portal by late September 2026. Anyone relying on older exports should re-query after that date to avoid basing work on records that may already have been merged, corrected or retired.

The Stadtarchiv is also asking researchers who spot obvious duplicates while working in its online portal to flag them using a feedback form introduced in April. It is a modest crowdsourcing measure, but in a city that takes direct participation seriously, it fits the culture.

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
