The Daily Zurich

Zurich news, every day

Zurich Archivists Battle Thousands of Duplicate Images in Digital Collections

A coordinated effort this week to clean up thousands of redundant photographs in Zurich's municipal and institutional archives is exposing the messy reality behind the city's digital preservation push.

By Zurich News Desk · Published 4 July 2026, 8:36 pm

Photo: Arber, E. A. Newell (Edward Alexander Newell), 1870-1918 / Public domain (Wikimedia Commons)

Zurich's archive managers moved this week to tackle a problem that has quietly grown for years: duplicate images, possibly numbering in the tens of thousands, that clog the city's digital heritage systems, make reliable retrieval difficult and inflate storage costs at a time when budgets are under pressure.

The issue came to a head on Tuesday, when representatives from Stadtarchiv Zürich on Alfred-Escher-Strasse and the Zentralbibliothek on Zähringerplatz met to coordinate a joint deduplication protocol — the first formal attempt to align their separate workflows since both institutions migrated to upgraded digital asset management platforms in 2024.

Why It Matters Now

The timing is not coincidental. Both institutions have spent the past 18 months ingesting large batches of newly digitised materials — glass-plate negatives from the early 20th century, municipal planning photographs from the postwar Zürich-Nord expansion, and thousands of images donated by private collectors. Ingesting material at speed is how duplicates multiply. A single photographic negative scanned at different resolutions by two separate departments can appear as four or five distinct file entries, none of them flagged as redundant.

ETH Zürich's Institute for Information Security and Cryptography has separately been developing automated image-hashing tools that can identify visually identical files even when their metadata differs — a common problem when images cross institutional boundaries. Researchers there have been in informal discussions with Stadtarchiv staff about a pilot deployment, though no contract has been signed.
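To illustrate the general idea of hashing an image by its visual content rather than its metadata, here is a minimal sketch of an "average hash". It is a generic textbook technique, not the ETH tool described above, whose design has not been made public. The function names and the toy pixel grids are assumptions made for this example.

```python
# Illustrative sketch only: a generic "average hash", not the ETH researchers' tool.
# Images are modelled as 2D lists of grayscale values (0-255) whose width and
# height are multiples of `size`; real files would first be decoded and converted.

def average_hash(pixels, size=8):
    """Reduce an image to a size*size-bit fingerprint based on brightness."""
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // size, width // size
    cells = []
    for row in range(size):
        for col in range(size):
            # Average every pixel that falls into this grid cell.
            block = [
                pixels[y][x]
                for y in range(row * block_h, (row + 1) * block_h)
                for x in range(col * block_w, (col + 1) * block_w)
            ]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: brighter than the overall mean or not.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two fingerprints."""
    return bin(hash_a ^ hash_b).count("1")


# Toy data: the same gradient "scanned" at two resolutions, plus its inverse.
low_res = [[(x + y) * 8 for x in range(16)] for y in range(16)]
high_res = [[low_res[y // 2][x // 2] for x in range(32)] for y in range(32)]
inverted = [[255 - value for value in row] for row in low_res]

same_image_distance = hamming_distance(average_hash(low_res), average_hash(high_res))
different_image_distance = hamming_distance(average_hash(low_res), average_hash(inverted))
```

In this toy case the two resolutions of the same gradient produce identical fingerprints, while the inverted image differs in most bits. A production system would need to decide how small a distance counts as "the same image", which is a policy choice this sketch does not settle.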

The scale of the problem is not trivial. Stadtarchiv Zürich holds a digitised collection that, according to its publicly available annual report for 2024, runs to more than 1.2 million image files. Archive professionals generally estimate that deduplication exercises in comparable European municipal collections remove between eight and fifteen percent of files as genuine duplicates — which would put Zurich's potential redundancy count somewhere between 96,000 and 180,000 files. Those figures, applied locally, are illustrative rather than confirmed; Stadtarchiv has not published its own duplication rate.
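The arithmetic behind that range can be checked in a few lines. The variable names are illustrative, and the 1.2 million figure is treated as a round lower bound, as in the annual report cited above.

```python
# Reproduces the illustrative range quoted in the article; not a measured result.
total_files = 1_200_000          # "more than 1.2 million" image files
low_percent, high_percent = 8, 15  # typical share removed in comparable collections

# Integer arithmetic keeps the result exact.
low_estimate = total_files * low_percent // 100
high_estimate = total_files * high_percent // 100

print(f"Estimated duplicates: {low_estimate:,} to {high_estimate:,}")
```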

For users, the consequence is practical and frustrating. A researcher at the Kunsthaus Zürich on Heimplatz trying to license a historical image of Bahnhofstrasse for a 2026 exhibition catalogue can currently encounter the same photograph listed under three different accession numbers, each with slightly different rights metadata. Clarifying which entry is authoritative can take days.

What the Deduplication Drive Involves

This week's protocol work centres on agreeing a shared file-naming convention and a unified hash-based identification standard before either institution runs bulk deletions. The caution is deliberate. A poorly managed deduplication sweep in Vienna's Wienbibliothek in 2022 accidentally removed master-quality files while retaining lower-resolution copies — a cautionary episode that Swiss archivists cite regularly in professional forums.

The Zentralbibliothek confirmed in a brief public notice posted to its website on Wednesday that it is running a test batch of approximately 40,000 image files through a deduplication review this month, with results to be assessed before any permanent deletions are authorised. Staff have been asked to flag any case where two apparently identical files carry different provenance notes, since provenance differences can mean the files are genuinely distinct records even if the image content looks the same.
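The two safeguards described above, keeping the master-quality file and holding back records whose provenance differs, could be sketched roughly as follows. This is an assumption-laden illustration, not either institution's actual workflow: the `ImageRecord` fields, the `plan_review` function and the sample accession numbers are all hypothetical, and the sketch only proposes actions without deleting anything.

```python
# Hypothetical review planner; field names and rules are assumptions for illustration.
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    accession: str      # catalogue identifier (made-up format)
    content_hash: str   # fingerprint of the image content
    width: int
    height: int
    provenance: str     # free-text provenance note


def plan_review(records):
    """Group records by content hash and propose an action for each duplicate group."""
    groups = defaultdict(list)
    for record in records:
        groups[record.content_hash].append(record)

    plan = []
    for content_hash, members in groups.items():
        if len(members) < 2:
            continue  # no duplicate candidates for this hash
        if len({m.provenance for m in members}) > 1:
            # Differing provenance may mean genuinely distinct records: keep all.
            plan.append({
                "hash": content_hash,
                "action": "manual_review",
                "keep": [m.accession for m in members],
                "candidates": [],
            })
            continue
        # Same provenance: treat the highest-resolution file as the master.
        master = max(members, key=lambda m: m.width * m.height)
        plan.append({
            "hash": content_hash,
            "action": "propose_merge",
            "keep": [master.accession],
            "candidates": [m.accession for m in members if m is not master],
        })
    return plan


sample_records = [
    ImageRecord("A-001", "h1", 4000, 3000, "Donation 1987"),
    ImageRecord("A-002", "h1", 1000, 750, "Donation 1987"),
    ImageRecord("B-010", "h2", 2000, 1500, "Municipal planning office"),
    ImageRecord("B-011", "h2", 2000, 1500, "Private collector"),
    ImageRecord("C-100", "h3", 800, 600, "Donation 1990"),
]
review_plan = plan_review(sample_records)
```

Choosing the master by pixel count guards against the failure mode attributed to the Vienna sweep, where lower-resolution copies survived; a real protocol would likely weigh file format and scan quality as well.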

Digital preservation professionals across Switzerland are watching. The Swiss National Library in Bern runs its own digitised image holdings under different standards, and a successful Zurich pilot could become the basis for a national recommendation through Memoriav, the Swiss association for audiovisual heritage, which has been pressing for more interoperability across cantonal institutions since at least 2023.

For institutions and researchers working with these collections, the practical advice for the coming weeks is straightforward: if you are relying on a specific image accession number from either Stadtarchiv or Zentralbibliothek for publication or licensing purposes, confirm its status before going to press. Both institutions have indicated that no files will be permanently deleted before September at the earliest, but metadata attached to individual records may be updated or merged as the review proceeds. Direct contact with the respective reading-room teams on Alfred-Escher-Strasse and Zähringerplatz is the safest route to getting a definitive answer.

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
