Zurich's Digital Archives Strike Back Against the Duplicate Image Problem
A wave of image-deduplication work sweeping through Swiss cultural institutions this week is reshaping how Zurich manages its vast visual heritage — and who gets to access it.

Swiss digital archivists confirmed this week that a coordinated push to identify and remove duplicate image files from public-facing databases has cleared more than 40,000 redundant entries from the combined collections held by Stadt Zürich's Stadtarchiv on Neumarkt and the Zentralbibliothek Zürich on Zähringerplatz. The cleanup, months in preparation, went live on Monday, July 1.
The timing matters. Zurich's cultural institutions have spent the better part of three years digitising physical collections at an accelerated pace, a drive partly funded through the city's 2023–2026 digital transformation budget. Speed created clutter. The same photograph, scanned at different resolutions or uploaded by different departments, ended up catalogued as separate items, inflating search results, confusing researchers, and consuming costly server storage. With those budgets now under review ahead of the 2027 municipal spending cycle, institutions have a strong financial motive to demonstrate efficiency.
The process is more technically demanding than it sounds. Simple file-name matching catches almost nothing; archivists rely instead on perceptual hashing algorithms, which reduce each image to a compact fingerprint of its visual structure and compare those fingerprints rather than raw pixels, flagging near-identical files even when resolution, colour profile, or metadata differ. ETH Zurich's Scientific IT Services unit has been advising several institutions on the method since early 2025, drawing on computer vision work developed in-house at the university's main campus on Rämistrasse.
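For readers curious how a perceptual hash works in practice, here is a minimal sketch of "average hash", one of the simplest members of the family. The archives have not said which algorithm they use, so this is an illustration of the general idea rather than their method. The function names, the 8x8 grid size, and the toy gradient images are all assumptions made for the example, and images are represented as plain lists of grayscale values to keep it free of external libraries.

```python
def average_hash(pixels, hash_size=8):
    """Return an integer fingerprint for a grayscale image.

    `pixels` is a 2D list of brightness values (rows of ints, 0-255).
    The image must be at least hash_size pixels wide and high.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to a hash_size x hash_size grid by averaging blocks,
    # so the same picture at different resolutions yields similar cells.
    for gy in range(hash_size):
        y0, y1 = gy * height // hash_size, (gy + 1) * height // hash_size
        for gx in range(hash_size):
            x0, x1 = gx * width // hash_size, (gx + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two fingerprints."""
    return bin(hash_a ^ hash_b).count("1")


# Toy data (invented for illustration): a 16x16 horizontal gradient,
# the same picture upscaled to 32x32, and an inverted version.
small = [[x * 16 for x in range(16)] for _ in range(16)]
large = [[small[y // 2][x // 2] for x in range(32)] for y in range(32)]
inverted = [[255 - value for value in row] for row in small]

same_picture = hamming_distance(average_hash(small), average_hash(large))
different_picture = hamming_distance(average_hash(small), average_hash(inverted))
print(same_picture, different_picture)
```

A small distance suggests two files show the same picture; a large one suggests they do not. Where to draw the line is a judgment call that depends on the collection, which is one reason flagged pairs typically still need human review.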
The Stadtarchiv's digital collection alone held roughly 320,000 images as of June, according to figures the institution published in its 2025 annual report. Staff estimate that between eight and twelve percent of those entries were duplicates of some kind — a proportion consistent with what peer institutions in Basel and Bern have reported after similar audits. Removing confirmed duplicates does not mean destroying originals: the policy at both the Stadtarchiv and the Zentralbibliothek is to retain the highest-resolution master file and formally delist the rest, keeping a reconciliation log for researchers who need to trace provenance.
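The retain-and-delist policy can be sketched in a few lines. This is an assumption-laden illustration, not the institutions' actual system: the `ImageRecord` fields, the `reconcile` function, the log format, and the accession numbers are all hypothetical, and "highest resolution" is taken here to mean the largest pixel count.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    # Hypothetical catalogue fields, chosen for the example.
    accession: str
    width: int
    height: int


def reconcile(group):
    """Pick a master from a group of confirmed duplicates.

    Returns the record with the most pixels, plus a log entry for every
    delisted record that points back to the retained master.
    """
    master = max(group, key=lambda record: record.width * record.height)
    log = [
        {"delisted": record.accession, "master": master.accession}
        for record in group
        if record is not master
    ]
    return master, log


# Made-up accession numbers for three scans of the same photograph.
duplicates = [
    ImageRecord("A-001", 800, 600),
    ImageRecord("A-002", 3200, 2400),
    ImageRecord("A-003", 1600, 1200),
]
master, log = reconcile(duplicates)
print(master.accession, log)
```

Nothing is deleted in this sketch: the log simply records which catalogue entries were withdrawn and which master file replaces them, which is what lets a researcher trace an old accession number to the surviving record.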
The practical effect for users of the public search portal is cleaner, faster results. A researcher querying images of the 1910 Landesausstellung site along the lake, for instance, was previously liable to encounter the same postcard scan appearing four or five times under different accession numbers. That kind of noise made automated research tools — the kind increasingly used by historians and journalists pulling data in bulk — unreliable.
The duplicate problem is not unique to Zurich, but the city's scale makes it a test case. The Swiss Federal Archives in Bern and the Memoriav association, which coordinates audiovisual preservation across the country, have both signalled that they will watch Zurich's reconciliation-log approach closely before deciding whether to adopt it nationally.
There is a democratic dimension, too. Switzerland's tradition of direct access to public records means citizens can and do request archival materials for referendum campaigns, planning disputes, and legal proceedings. When the same document or image appears under multiple catalogue entries, it creates ambiguity about which version is authoritative. The canton of Zurich's Öffentlichkeitsgesetz — the cantonal transparency law — places obligations on institutions to provide clear, definitive records. Duplicate entries, archivists argue, are a quiet but genuine compliance risk.
The deduplication sprint wraps up its first phase on July 31. The Stadtarchiv has scheduled a public information session at its Neumarkt premises for July 22, where staff will walk through the methodology and explain what researchers can expect to find changed in the online catalogue. Institutions running their own digitisation projects — local historical societies, parish archives, the Baugeschichtliches Archiv on Neumarkt — have been invited to attend and consider whether similar audits make sense for their own holdings. For anyone who relies on Zurich's public image databases, booking a slot is time well spent.
About this article
Published by The Daily Zurich