The Daily Zurich

Zurich Archives Push Forward on Duplicate Image Cleanup as Digital Catalogue Hits Critical Scale

A week of technical fixes and policy decisions at city-level institutions signals a turning point in how Zurich manages its growing digital visual archives.

By Zurich News Desk · Published 4 July 2026, 9:16 pm

Photo: Magda Ehlers on Pexels

Zurich's main public institutions are racing to resolve a problem that has quietly ballooned inside their digital collections: thousands of duplicate images clogging catalogues, distorting search results, and eating into storage budgets that were never designed to absorb this kind of waste. This week, the issue moved from back-office headache to institutional priority.

The Zurich City Archive, housed in a building off Neumarkt in the Altstadt, confirmed it has been running a deduplication audit across its photographic holdings since June. Staff there are cross-referencing scanned historical prints against digital acquisitions made after 2019, a process that has already flagged several hundred redundant image files in the first phase alone. The audit draws on open-source image-fingerprinting tools — specifically perceptual hashing software — that compare visual content rather than just file names or metadata tags.
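For readers curious how comparing "visual content rather than just file names" can work, here is a minimal sketch of one well-known perceptual hashing technique, the average hash. The article does not say which tool or algorithm the City Archive uses, so this is an illustration only: the function names, the 8x8 grid, and the distance threshold of 5 bits are all assumptions, and images are represented as plain lists of brightness values to keep the example dependency-free.

```python
def average_hash(pixels, hash_size=8):
    """Return a hash_size*hash_size-bit integer fingerprint of a grayscale image.

    `pixels` is a list of rows of 0-255 brightness values. For simplicity this
    sketch assumes both dimensions are multiples of hash_size.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // hash_size, width // hash_size

    # Shrink the image to hash_size x hash_size by averaging each block.
    small = []
    for by in range(hash_size):
        for bx in range(hash_size):
            block = [
                pixels[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            ]
            small.append(sum(block) / len(block))

    # One bit per cell: 1 if the cell is brighter than the overall mean.
    mean = sum(small) / len(small)
    bits = 0
    for value in small:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two fingerprints."""
    return bin(hash_a ^ hash_b).count("1")


def likely_duplicates(hash_a, hash_b, threshold=5):
    """Treat two images as probable duplicates if few bits differ (assumed threshold)."""
    return hamming_distance(hash_a, hash_b) <= threshold


# A synthetic 16x16 left-to-right gradient stands in for a scanned print.
original = [[x * 15 for x in range(16)] for _ in range(16)]
# A slightly brighter rescan of the same picture.
brighter = [[value + 10 for value in row] for row in original]
# A tonally inverted image, which should not match.
inverted = [[255 - value for value in row] for row in original]

h_original = average_hash(original)
h_brighter = average_hash(brighter)
h_inverted = average_hash(inverted)

print(likely_duplicates(h_original, h_brighter))
print(likely_duplicates(h_original, h_inverted))
```

The point of the design is that a rescan with slightly different brightness still produces the same fingerprint, whereas a byte-for-byte file comparison would treat the two files as unrelated.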

Why This Week Changed the Conversation

The timing is not incidental. ETH Zurich's Institute for Information Security published a technical assessment in late June examining data integrity risks inside Swiss public sector digital repositories. The report, circulated internally before public release, described duplicate-image accumulation as a systemic issue that compounds over time when institutions migrate data between platforms without cleaning legacy content first. ETH Zurich researchers have been working with the Swiss Federal Archives in Bern on protocols that could eventually become a national standard — but individual city institutions are not waiting.

The Zentralbibliothek Zürich, on Zähringerplatz, took a separate step this week by updating its internal content management guidelines to require a mandatory deduplication check before any batch upload to its digitised collections portal. The library manages more than 5 million catalogue entries across text and image formats, and its digital team has been dealing with the downstream effects of three separate platform migrations carried out between 2015 and 2023. Those migrations left clusters of duplicate image files in multiple subject areas, particularly in its historical map and postcard collections.
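The library has not described how its pre-upload check works. As a rough sketch of what such a gate could look like, the example below rejects any file in a batch whose bytes exactly match something already catalogued or something earlier in the same batch. The SHA-256 digest comparison, the function names, and the file names are all assumptions made for illustration; exact-match hashing of this kind catches only byte-identical copies, not rescans of the same print.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def split_batch(batch, known_digests):
    """Split a batch into files to upload and files to hold back.

    `batch` maps hypothetical file names to file contents (bytes).
    `known_digests` is the set of digests already in the catalogue.
    Returns (accepted_names, rejected_names) in batch order.
    """
    seen = set(known_digests)
    accepted, rejected = [], []
    for name, data in batch.items():
        digest = content_digest(data)
        if digest in seen:
            rejected.append(name)  # already catalogued or repeated in this batch
        else:
            seen.add(digest)
            accepted.append(name)
    return accepted, rejected


# Hypothetical catalogue containing one previously uploaded map scan.
catalogue = {content_digest(b"map-1890")}

# Hypothetical batch: one re-upload, one new postcard, one in-batch repeat.
batch = {
    "a.tif": b"map-1890",
    "b.tif": b"postcard-1912",
    "c.tif": b"postcard-1912",
}

accepted, rejected = split_batch(batch, catalogue)
print(accepted)
print(rejected)
```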

Storage costs are part of what is driving the urgency. Cloud storage pricing for public institutions in Switzerland has risen significantly over the past two years, and every duplicate adds directly to the bill. A single high-resolution archival scan can run to 80 megabytes or more. Multiply that across hundreds of undetected duplicates and the waste becomes material. The Zentralbibliothek has not published its specific storage spend, but comparable institutions in Germany have reported that deduplication pilots completed in 2024 and 2025 reduced storage consumption by between 12 and 18 percent.
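To put rough numbers on that arithmetic, the sketch below works through the figures quoted above. Only the 80 megabytes per scan and the 12 to 18 percent range come from the reporting; the count of 500 duplicates and the 100-terabyte collection size are assumed purely for illustration, and decimal units (1 GB = 1,000 MB) are used throughout.

```python
MB_PER_SCAN = 80  # per-scan size quoted in the article ("80 megabytes or more")


def wasted_gb(duplicate_count, mb_per_scan=MB_PER_SCAN):
    """Storage taken up by duplicate scans, in decimal gigabytes."""
    return duplicate_count * mb_per_scan / 1000


def savings_range_tb(total_tb, low_pct=12, high_pct=18):
    """Low and high storage savings, in terabytes, for a given collection size."""
    return total_tb * low_pct / 100, total_tb * high_pct / 100


# Assumed figure: 500 undetected duplicates.
print(wasted_gb(500))         # 40.0 GB
# Assumed figure: a 100 TB collection.
print(savings_range_tb(100))  # (12.0, 18.0) TB
```

At 80 megabytes each, 500 duplicates would occupy about 40 gigabytes, which is small on its own but grows in step with every further batch of duplicates and with any larger scan sizes.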

What Comes Next for Institutions and Users

Researchers and members of the public who use Zurich's digital image portals may begin to notice cleaner search results in the coming weeks as the Neumarkt archive rolls out the first tranche of its cleaned catalogue. The practical effect should be fewer duplicate hits when searching by subject tag or date range — a persistent frustration flagged in user feedback collected by the City Archive over the past two years.

Longer term, both institutions are watching whether the ETH-Federal Archives protocol work produces binding guidance. If the Swiss Federal Chancellery adopts a national standard, city-level archives in Zurich, Basel, and Bern would likely be expected to align their own deduplication procedures within a defined transition period. That process, if it follows the pattern of earlier Swiss digital governance initiatives, could take 18 to 24 months from publication to implementation deadline.

For now, the practical advice to anyone accessing digitised collections through the Zentralbibliothek's online portal or the city's own e-archive is to refine search parameters with specific date ranges rather than broad subject terms. The duplicates that remain in the system until the audit concludes tend to cluster around the most heavily indexed subject categories — urban development, wartime photography, and transport history among them. Narrower queries return cleaner results in the interim.

The audit at the Neumarkt site is expected to complete its first phase by the end of July.

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
