The Daily Zurich

Zurich news, every day


Zurich Researchers and Archivists Push Forward on Duplicate Image Detection This Week

New tools for identifying and replacing redundant digital images are gaining traction across Zurich's cultural institutions and tech sector, with ETH Zurich and the Stadtarchiv among those driving adoption.

By Zurich News Desk · Published 4 July 2026, 8:45 pm



A quiet but consequential shift is underway in how Zurich's institutions manage their digital collections. This week, archivists, software developers and academic researchers across the city have been testing and refining automated tools designed to detect duplicate images in large digital repositories — and, critically, to replace degraded or redundant copies with higher-quality originals. The push reflects years of accumulated pressure on institutions sitting on sprawling, poorly deduplicated image libraries.

The timing is not arbitrary. Switzerland's Federal Archives Law, last revised substantively in 2009, is again under discussion in Bern, with a consultation period running through late summer 2026. Cantonal institutions in Zurich — including the Stadtarchiv on Alfred-Escher-Strasse and the Zentralbibliothek at Zähringerplatz — have been quietly auditing their holdings ahead of any new federal guidance on digital asset standards. Duplicate image accumulation has emerged as one of the more costly and unglamorous problems in that audit work.

What the Tools Actually Do

The core technical challenge is not simply spotting identical files. Modern duplicate-detection pipelines have to recognise near-duplicates: images that were scanned twice at different resolutions, photographs that were cropped or slightly re-exposed, or JPEG copies of original TIFFs that have degraded over successive saves. Perceptual hashing — a method that generates a compact fingerprint from image content rather than raw file data — has become the standard approach. ETH Zurich's Computer Vision Lab, based on Rämistrasse, has been publishing work on improved perceptual hashing benchmarks this year, with a preprint circulated internally this week drawing interest from institutions in Berlin and Vienna.
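For readers who want to see the idea in miniature, the sketch below shows one common family of perceptual hash, a difference hash, in plain Python. It is an illustration only and is not the method used by ETH Zurich or any institution named here. The function names (shrink, dhash, hamming), the 8-by-8 hash size and the use of nested lists as stand-ins for grayscale images are all assumptions made for the example.

```python
from typing import List

Grid = List[List[float]]  # rows of grayscale pixel values, 0-255


def shrink(pixels: Grid, width: int, height: int) -> Grid:
    """Downsample by averaging the source pixels that fall into each output cell."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max((y + 1) * src_h // height, y0 + 1)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max((x + 1) * src_w // width, x0 + 1)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels: Grid, size: int = 8) -> int:
    """Difference hash: one bit per comparison of horizontally adjacent cells."""
    small = shrink(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic stand-ins for scans: a left-to-right gradient, the same picture
# "rescanned" at half resolution, and a mirrored (visually different) picture.
scan = [[x * 4 for x in range(64)] for _ in range(64)]
rescan = shrink(scan, 32, 32)
mirrored = [row[::-1] for row in scan]

print(hamming(dhash(scan), dhash(rescan)))    # small distance: probable duplicate
print(hamming(dhash(scan), dhash(mirrored)))  # large distance: different image
```

Because the hash is built from relative brightness rather than raw bytes, a rescan at a different resolution lands close to the original, while a byte-level checksum would treat the two files as unrelated. The distance threshold that separates "duplicate" from "different" has to be tuned per collection.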

The replacement question is harder than the detection question. Once a duplicate is flagged, a decision framework has to determine which copy is the authoritative version, whether the metadata chains are intact, and whether deleting the lower-quality copy creates any provenance gaps. For the Zentralbibliothek, which holds digitised images dating to nineteenth-century glass plate originals, that provenance question is not trivial. A single mis-identified replacement could sever the documented chain of custody for an irreplaceable object.
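A rough sketch of what such a decision framework might look like follows. Everything in it is hypothetical: the record fields, the rule that lossless formats outrank lossy ones, and the idea of modelling the chain of custody as a derived_from link between copies are assumptions for illustration, not a description of the Zentralbibliothek's systems.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

LOSSLESS_FORMATS = {"tiff", "png"}  # assumption: these outrank lossy copies


@dataclass(frozen=True)
class ImageCopy:
    copy_id: str
    fmt: str
    width: int
    height: int
    derived_from: Optional[str] = None  # parent copy_id; None marks a first-generation scan


@dataclass
class Decision:
    keeper: Optional[ImageCopy]
    retire: List[str] = field(default_factory=list)  # safe to remove automatically
    review: List[str] = field(default_factory=list)  # must go to a human


def chain_intact(copy: ImageCopy, catalogue: Dict[str, ImageCopy]) -> bool:
    """Follow derived_from links; intact means the chain ends at a first-generation scan."""
    seen = set()
    current = copy
    while current.derived_from is not None:
        if current.copy_id in seen or current.derived_from not in catalogue:
            return False  # a cycle, or a parent record that no longer exists
        seen.add(current.copy_id)
        current = catalogue[current.derived_from]
    return True


def decide(duplicates: List[ImageCopy], catalogue: Dict[str, ImageCopy]) -> Decision:
    """Pick the authoritative copy and sort the rest into retire or review."""
    intact = [c for c in duplicates if chain_intact(c, catalogue)]
    if not intact:
        return Decision(keeper=None, review=[c.copy_id for c in duplicates])
    keeper = max(
        intact,
        key=lambda c: (c.fmt.lower() in LOSSLESS_FORMATS, c.width * c.height),
    )
    # Copies that other records still point to cannot be removed without a gap.
    still_referenced = {c.derived_from for c in catalogue.values() if c.derived_from}
    decision = Decision(keeper=keeper)
    for c in duplicates:
        if c is keeper:
            continue
        safe = chain_intact(c, catalogue) and c.copy_id not in still_referenced
        (decision.retire if safe else decision.review).append(c.copy_id)
    return decision
```

The design choice worth noting is that the function never deletes anything itself. A copy with a broken chain, or one that other records still derive from, is routed to review rather than retired.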

Storage costs are a concrete driver here. Enterprise cold-storage pricing in Switzerland has fallen substantially over the past five years, but institutions running collections in the hundreds of terabytes still face meaningful bills. One widely cited industry figure puts the annual cost of redundant image storage across mid-sized European cultural institutions at several million euros in aggregate, though precise Zurich-specific figures are not publicly available. What is known is that the Stadtarchiv processed over 1.2 million new digital image objects in 2024 alone, according to its annual report for that year. At that volume, even a five percent duplication rate would mean more than 60,000 redundant objects from a single year's intake.

What Comes Next for Zurich's Institutions

Practically speaking, the institutions most affected are preparing for a two-phase approach. Phase one — automated flagging of probable duplicates using perceptual hashing and file-metadata cross-referencing — is already operational in pilot form at ETH Zurich's library services division on Rämistrasse. Phase two, which involves human review of flagged pairs and authorised replacement or deletion, is scheduled to roll out more broadly before the end of the third quarter of 2026.
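The two phases described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the ETH pilot's code: the Record fields, the rule that a pair is flagged only when both the hash distance and the capture date agree, and the threshold of 6 bits are all invented for the example.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


@dataclass(frozen=True)
class Record:
    record_id: str
    phash: int         # perceptual hash, computed elsewhere
    capture_date: str  # hypothetical metadata field used for cross-referencing


def flag_probable_duplicates(records: List[Record], max_distance: int = 6) -> List[Pair]:
    """Phase one: flag pairs whose hashes are close AND whose metadata agrees."""
    flagged = []
    for a, b in combinations(records, 2):
        distance = bin(a.phash ^ b.phash).count("1")
        if distance <= max_distance and a.capture_date == b.capture_date:
            flagged.append((a.record_id, b.record_id))
    return flagged


def apply_review(flagged: List[Pair], verdicts: Dict[Pair, str]) -> List[str]:
    """Phase two: return only the ids a reviewer explicitly approved for removal."""
    approved = []
    for pair in flagged:
        verdict = verdicts.get(pair)  # id of the copy to remove, or None if unreviewed
        if verdict in pair:
            approved.append(verdict)
    return approved


records = [
    Record("r1", 0b11110000, "1998-05-01"),
    Record("r2", 0b11110001, "1998-05-01"),
    Record("r3", 0b00001111, "1998-05-01"),
    Record("r4", 0b11110000, "2003-11-12"),
]
flagged = flag_probable_duplicates(records)
print(flagged)
print(apply_review(flagged, {("r1", "r2"): "r2"}))
```

Comparing every pair is quadratic in the number of records, so a collection of millions of objects would need an index over the hashes rather than this brute-force loop. The point of the sketch is the separation of duties: flagging is automatic, removal requires a recorded human verdict.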

For smaller cultural organisations without dedicated digital preservation staff — the kind found along Limmatquai or in the cluster of independent galleries in Zurich-West — the practical advice from archivists this week has been to prioritise metadata hygiene before any bulk deduplication run. Removing a duplicate without preserving the associated rights information, acquisition date or source notation can create legal and provenance problems that are harder to fix than the original storage inefficiency.
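For a small organisation, that advice could be turned into a simple gate that runs before any deduplication tool is allowed to touch a record. The sketch below is one hypothetical way to do it. The field names mirror the three items mentioned above (rights, acquisition date, source), but the dictionary layout and function names are assumptions, not a standard.

```python
from typing import Dict, List, Tuple

# Assumed field names for the three items archivists said must survive deduplication.
REQUIRED_FIELDS = ("rights", "acquisition_date", "source")


def missing_fields(record: Dict[str, object]) -> List[str]:
    """List required metadata fields that are absent or blank on a record."""
    return [
        name for name in REQUIRED_FIELDS
        if not str(record.get(name) or "").strip()
    ]


def split_for_dedup(records: List[Dict[str, object]]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Separate records that are ready for deduplication from those needing cleanup."""
    ready: List[str] = []
    needs_cleanup: Dict[str, List[str]] = {}
    for record in records:
        gaps = missing_fields(record)
        if gaps:
            needs_cleanup[str(record["id"])] = gaps
        else:
            ready.append(str(record["id"]))
    return ready, needs_cleanup


records = [
    {"id": "img-001", "rights": "CC BY 4.0", "acquisition_date": "2019-03-14", "source": "donor scan"},
    {"id": "img-002", "rights": "", "acquisition_date": "2019-03-14", "source": "donor scan"},
    {"id": "img-003", "rights": "in copyright", "source": "   "},
]
ready, needs_cleanup = split_for_dedup(records)
print(ready)
print(needs_cleanup)
```

Only the records in the ready list would be passed on to a deduplication run. The rest are reported with the specific fields that need attention first.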

The broader policy conversation will continue in Bern through September. Zurich's cantonal institutions are expected to submit formal responses to the federal consultation before the August deadline, and the outcomes of this week's pilot audits will likely inform those submissions directly. The unglamorous mechanics of image management, it turns out, are increasingly a matter of institutional record and public accountability.


About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
