The Daily Zurich

Zurich Takes a Systematic Approach to Duplicate Images in City Archives — and Other Metropolises Are Watching

As municipal digitisation drives flood public databases with redundant visual records, Zurich's archivists are testing automated detection tools that Amsterdam and Vienna have so far only discussed.

By Zurich News Desk · Published 4 July 2026, 8:45 pm

Zurich's Stadtarchiv has begun a structured programme to identify and remove duplicate digital images from its public records database, a problem that has quietly ballooned since the city accelerated its digitisation push in 2022. The effort, centred on the Neumarkt-based archive facility, involves semi-automated flagging software that cross-references metadata and pixel-hash values to surface identical or near-identical image files before they propagate further through the city's open-data portal on data.stadt-zuerich.ch.

The timing is not accidental. Swiss municipalities face a federal deadline under the revised Archivierungsgesetz — the national archiving law updated in 2024 — that requires cantonal and city-level institutions to demonstrate data hygiene standards by the end of 2026. Duplicate image files are not merely a storage nuisance; they distort search results, inflate reported collection sizes, and can create legal complications when images carry differing licensing tags despite being visually identical. For a city whose administrative culture prizes precision, the problem landed badly once it was properly quantified.

What Zurich Is Actually Doing

The programme involves two institutional partners. The Stadtarchiv is running the detection pipeline on historical photographic collections — some dating to the late nineteenth century — while ETH Zürich's chair for information science has provided an advisory framework for the hash-comparison methodology. The Pestalozzistrasse repository of the Zentralbibliothek Zürich is separately auditing its digitised postcard and map collections, where duplication rates from repeated batch-scanning runs have been described internally as a known structural issue.
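The article does not say which hashing scheme the Stadtarchiv or ETH Zürich actually use. As a rough illustration of how pixel-hash comparison can surface near-identical scans, the sketch below assumes a simple "average hash" over a small grayscale grid that has already been downscaled. The function names and the `max_distance` threshold are hypothetical and chosen only for this example.

```python
from typing import List

Grid = List[List[int]]  # grayscale values 0-255, already downscaled (assumption)


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 1) -> bool:
    """Flag two grids whose hashes differ in at most max_distance bits.

    The threshold is an illustrative assumption, not a documented setting.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Two scans of the same picture with slightly different brightness,
# plus an unrelated image with the opposite light/dark pattern.
scan_a = [[10, 200], [220, 30]]
scan_b = [[12, 198], [225, 28]]
other = [[200, 10], [30, 220]]

print(is_near_duplicate(scan_a, scan_b))
print(is_near_duplicate(scan_a, other))
```

Because the hash records only whether each pixel is above or below the mean, small exposure differences between repeated scanning runs tend to leave it unchanged, which is why this family of hashes is often used for near-duplicate detection rather than exact byte comparison.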

City officials responsible for digital infrastructure have not published a final duplicate count, but the open-data portal currently lists more than 1.4 million individual image assets across all municipal collections, a figure that grew by roughly 18 percent between 2023 and early 2026 as scanning contracts were fulfilled. Even a duplication rate of five percent, considered conservative by archival standards, would imply at least 70,000 redundant files requiring review or deletion.
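The scale implied by those figures can be checked with a few lines of arithmetic. The sketch below takes the portal's 1.4 million assets as a lower bound and applies the five percent rate from the paragraph above. The other rates are purely hypothetical, included only to show how the estimate moves.

```python
TOTAL_ASSETS = 1_400_000  # lower bound: the portal lists "more than" this


def estimated_duplicates(total_assets: int, rate_percent: int) -> int:
    """Estimate redundant files using integer arithmetic (no float rounding)."""
    return total_assets * rate_percent // 100


# 5 percent is the rate cited in the article; 2 and 10 are hypothetical.
for rate in (2, 5, 10):
    print(f"{rate}% duplication: about {estimated_duplicates(TOTAL_ASSETS, rate):,} files")
```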

The practical challenge is not purely technical. Many duplicate images carry different descriptive metadata applied by different archivists at different times, meaning an automated delete would discard potentially useful cataloguing work. The current protocol at the Stadtarchiv involves human review of any flagged pair where metadata diverges, a step that slows throughput but protects institutional knowledge.
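The review rule described above can be sketched as a small routing step. This is an illustration under stated assumptions, not the Stadtarchiv's actual pipeline: the `ImageRecord` fields, the `route_flagged_pairs` function, and the `auto_candidate` label are hypothetical, and the article does not say what happens to flagged pairs whose metadata matches.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple


@dataclass
class ImageRecord:
    asset_id: str
    pixel_hash: str           # assumed to be precomputed elsewhere
    metadata: Dict[str, str]  # descriptive fields such as title or licence


def route_flagged_pairs(records: List[ImageRecord]) -> List[Tuple[str, str, str]]:
    """Pair up records sharing a pixel hash and decide how each pair is handled.

    Pairs with diverging metadata go to human review, as the article describes.
    Pairs with identical metadata are only labelled "auto_candidate" here,
    which is an assumption made for this sketch.
    """
    by_hash: Dict[str, List[ImageRecord]] = defaultdict(list)
    for record in records:
        by_hash[record.pixel_hash].append(record)

    routed = []
    for group in by_hash.values():
        for first, second in combinations(group, 2):
            same = first.metadata == second.metadata
            decision = "auto_candidate" if same else "human_review"
            routed.append((first.asset_id, second.asset_id, decision))
    return routed


records = [
    ImageRecord("A", "h1", {"title": "Langstrasse", "licence": "CC0"}),
    ImageRecord("B", "h1", {"title": "Langstrasse", "licence": "CC0"}),
    ImageRecord("C", "h1", {"title": "Langstrasse, Kreis 4", "licence": "CC BY"}),
    ImageRecord("D", "h2", {"title": "Aussersihl", "licence": "CC0"}),
]

for pair in route_flagged_pairs(records):
    print(pair)
```

Nothing is deleted at this stage: the function only labels pairs, so the cataloguing work attached to either copy stays available to the reviewer.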

How Zurich Compares With Amsterdam, Vienna, and Singapore

Zurich's approach sits notably ahead of several comparable cities in Europe and beyond. Amsterdam's Stadsarchief, which manages one of the continent's largest urban photographic collections, announced a duplicate-review initiative in late 2024 but has not yet publicly reported results or methodology. Vienna's Wienbibliothek im Rathaus acknowledged the duplication problem in a 2025 annual report but framed it as a medium-term priority rather than an active remediation project.

Singapore's National Archives, often cited as a benchmark for Asian public digitisation programmes, has invested heavily in automated deduplication since 2021, and its public portal explicitly reports a verified unique-asset count rather than a raw file count — a transparency standard that Zurich's data.stadt-zuerich.ch does not yet match. That gap has drawn quiet attention from the Swiss Federal Archives in Bern, which is evaluating whether to recommend Singapore's reporting model as a best-practice template for Swiss institutions.

London's Metropolitan Archives took a different route, contracting a private vendor in 2023 to run a one-time deduplication sweep rather than building in-house capacity. The approach cleared the backlog faster but left the institution without a repeatable process for future ingestion cycles — a trade-off that archival professionals elsewhere have noted with some caution.

For Zurich residents who use the open-data portal to access historical neighbourhood photographs (popular searches include Kreis 4, the Langstrasse area, and pre-war Aussersihl), the practical effect of the clean-up will be less cluttered search results and more reliable licensing information. The Stadtarchiv has indicated it expects to publish a progress report on the deduplication programme by the fourth quarter of 2026, which would be the first public accounting of how large the duplicate problem actually was and how much of it has been resolved. That report, whenever it arrives, will be worth reading closely.

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
