The Daily Zurich


Zurich's Digital Archives Push Forward on Duplicate Image Problem — Here's What Changed This Week

Swiss institutions racing to clean up overlapping photo records are hitting both breakthroughs and bureaucratic walls as a city-wide digitisation drive enters a critical phase.

By Zurich News Desk · Published 4 July 2026, 9:16 pm


Photo: Elijah Cobb on Pexels

A coordinated effort to purge thousands of duplicate images from Zurich's public digital collections moved into a new operational stage this week, with the Stadtarchiv Zürich and ETH-Bibliothek both confirming internal reviews of their holdings are underway. The immediate trigger: a joint working session held Tuesday at the ETH Zürich main building on Rämistrasse that brought together archivists, computer scientists and cantonal records officials to agree on shared detection protocols.

The problem has been quietly accumulating for years. As Zurich's cultural institutions accelerated their digitisation programs — particularly after the 2020-era funding injections tied to the city's Smart Zurich strategy — photo collections from different sources were ingested into overlapping databases without consistent deduplication checks. The result is redundant storage costs, cataloguing confusion, and, crucially, broken metadata chains that make historical photographs harder to find and attribute correctly.

What the Week's Sessions Actually Produced

Tuesday's working session, which ran for roughly five hours according to the agenda circulated to participating institutions, produced a draft technical specification for a shared hashing standard. The approach would assign each image a perceptual fingerprint derived from its visual content rather than its raw bytes, so that automated systems can flag near-identical files even when file names, formats or compression levels differ. The Schweizerisches Nationalmuseum, which holds a substantial Zurich-related photographic collection at its Museumstrasse site, was represented in the discussions and is expected to run a pilot comparison of its digital holdings against the Stadtarchiv's catalogue before the end of August 2026.
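How a perceptual fingerprint differs from an ordinary checksum can be sketched in a few lines. The draft specification's actual algorithm has not been published, so the example below uses average hashing, a simple and widely known technique, purely as a stand-in. The function names, the 8x8 grid and the distance threshold of 5 are all illustrative assumptions, not details from the working session.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def shrink(img: Gray, size: int = 8) -> List[List[float]]:
    """Downscale to size x size by averaging equal blocks of pixels.

    Assumes the image is at least size pixels wide and tall.
    """
    h, w = len(img), len(img[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(img: Gray, size: int = 8) -> int:
    """Fingerprint: one bit per cell, set when the cell is brighter than the mean."""
    cells = [v for row in shrink(img, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Flag two images as likely duplicates (threshold is an assumed value)."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 32x32 test images (hypothetical data, not archive material).
original = [[x * 8 for x in range(32)] for _ in range(32)]
# Same picture, slightly brighter and with small pixel noise, as after re-compression.
recompressed = [[min(255, x * 8 + 4 + ((x + y) % 3) - 1) for x in range(32)]
                for y in range(32)]
# A visually different picture: the gradient runs top to bottom instead.
unrelated = [[y * 8 for _ in range(32)] for y in range(32)]

print(hamming(average_hash(original), average_hash(recompressed)))  # 0
print(hamming(average_hash(original), average_hash(unrelated)))     # 32
```

The two versions of the same picture differ at the byte level, so a conventional checksum would treat them as unrelated files, yet their fingerprints are identical. Where to set the distance threshold is a policy choice: too loose and distinct photographs are merged, too tight and re-scanned copies slip through.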

For the ETH-Bibliothek, the stakes are particularly high. The library's e-manuscripta and e-rara platforms together host hundreds of thousands of digitised items, and an internal audit completed in spring 2026 identified a duplication rate of roughly 4.2 percent across image-based holdings — meaning tens of thousands of files that either replicate existing records or overlap substantially with partner institution uploads. Staff at the Hauptbibliothek on Zähringerplatz have been working through a backlog of flagged items since April.

The financial dimension matters too. Cloud storage costs for Swiss public-sector institutions have risen sharply since 2023, and the canton of Zurich's IT directorate has signalled that departments will face tighter per-gigabyte allocations from the 2027 budget cycle. Eliminating verified duplicates is now framed internally not just as a cataloguing hygiene measure but as a cost-containment priority. Rough estimates from comparable European digitisation programs suggest that duplication rates in the 3 to 5 percent range can translate to annual storage costs running into six figures for large collections, though Zurich institutions have not published their own figures.

Why This Matters Beyond the Archives

The push is not confined to institutional back-offices. The city's open-data portal, accessible through the Stadt Zürich Open Government Data platform, draws on several of these same image repositories to populate public-facing historical maps and neighbourhood documentation tools — resources used by schools, journalists and researchers. Duplicated or misattributed images in the source databases surface as errors in those public tools, undermining the portal's credibility.

The Zurich-based digital preservation nonprofit IG digitale Langzeitarchivierung has been lobbying for a cantonal-level deduplication standard since at least 2024, arguing that voluntary coordination between institutions is too slow and that a binding technical framework is needed. Whether the draft specification from Tuesday's working session eventually becomes that framework will depend on sign-off from the Stadtrat and the cantonal Bildungsdirektion. Given Zurich's direct-democracy procedural requirements, that process is unlikely to conclude before early 2027.

For anyone using Zurich's digital collections in the meantime, archivists at the Stadtarchiv on Neumarkt have confirmed that queries about specific image duplications can be submitted through the existing public research request system. The ETH-Bibliothek has also indicated it will publish a progress update on its deduplication work in its next quarterly report, expected in September 2026.


About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
