The Daily Zurich


Zurich Archives Push to Tackle Duplicate Image Crisis as Digital Collections Near Breaking Point

A week of intensive audits at city institutions has exposed the scale of redundant digital assets clogging storage systems and slowing public access to historical records.

By Zurich News Desk · Published 4 July 2026, 9:16 pm


Photo: Mâide Arslan on Pexels

Zurich's major cultural memory institutions moved this week to confront a problem that has quietly accumulated for years: tens of thousands of duplicate digital images sitting across fragmented servers, consuming storage budgets and making collections harder to search. The Stadt Zürich Stadtarchiv on Alfred-Escher-Strasse and the Zentralbibliothek Zürich on Zähringerplatz both confirmed ongoing internal audits aimed at identifying and replacing redundant image files with clean, single canonical versions linked across databases.

The timing is not coincidental. A rolling deadline set under the city's Digitale Verwaltung Zürich programme requires participating civic bodies to demonstrate clean, deduplicated asset registers by the end of the third quarter of 2026. Institutions that fail the audit face a freeze on new digitisation project funding — a serious consequence at a moment when demand for online historical access has surged.

Why Duplicate Images Became Such a Problem

The duplication crisis has roots in more than a decade of ad hoc scanning campaigns. Between 2010 and 2022, individual departments within the Stadtarchiv digitised their own holdings using different naming conventions, resolution standards and metadata schemas. When those collections were later migrated onto shared platforms, the same photograph — a street scene from Langstrasse in 1923, for example, or an aerial view of the Limmat taken in the 1950s — might exist in four or five versions, each with a slightly different filename and incomplete provenance data.

The Zentralbibliothek estimates that duplicate or near-duplicate image files account for roughly 18 percent of its total digital image holdings, based on a preliminary internal scan completed in late June 2026. Removing those redundant files would free an estimated 14 terabytes of primary storage, reducing annual infrastructure costs and improving search result quality for researchers using the library's public portal.

ETH Zürich's Data Archive unit, based at the main campus on Rämistrasse, has been running a parallel deduplication exercise since May. The university uses a perceptual hashing workflow (software that detects visually identical or near-identical images even when file sizes or formats differ) to flag candidates for review before any file is deleted. That cautious approach reflects a hard lesson: in 2023, an automated cleanup at a European institution outside Switzerland accidentally deleted unique historical photographs that had been incorrectly flagged as duplicates.
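
For readers curious how perceptual hashing can flag a resized or re-saved copy, the following is a minimal sketch of one of the simplest variants, the average hash. The article does not say which algorithm ETH Zürich uses, so the choice of hash, the 8x8 grid, the function names and the distance threshold are all assumptions made for illustration. Images are plain lists of grayscale values so the example runs without any imaging library.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels (0-255), row-major


def downscale(img: Image, size: int = 8) -> Image:
    """Shrink an image to size x size by averaging rectangular pixel blocks."""
    h, w = len(img), len(img[0])
    small = []
    for r in range(size):
        y0 = r * h // size
        y1 = max((r + 1) * h // size, y0 + 1)
        row = []
        for c in range(size):
            x0 = c * w // size
            x1 = max((c + 1) * w // size, x0 + 1)
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) // len(block))
        small.append(row)
    return small


def average_hash(img: Image, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set if above the mean."""
    pixels = [p for row in downscale(img, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def is_candidate(a: Image, b: Image, max_distance: int = 5) -> bool:
    """Flag a pair for human review; max_distance is a hypothetical threshold."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Synthetic test images (assumptions, not archive data).
original = [[8 * x + 4 * y for x in range(16)] for y in range(16)]
upscaled = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
brighter = [[p + 20 for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

print(is_candidate(original, upscaled))  # same picture at double resolution
print(is_candidate(original, brighter))  # same picture, brightened
print(is_candidate(original, inverted))  # visually different picture
```

The function only flags candidates. Consistent with the approach described above, nothing in the sketch deletes a file; a flagged pair would still go to a person for review.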

What the Audits Found This Week

Staff at the Stadtarchiv completed a block review of approximately 40,000 image records this week, covering holdings from the Stadtentwicklung collection. Preliminary findings show that around 6,200 of those files are pixel-identical or near-identical to at least one other file in the same repository. The next step is human review of the flagged pairs: a team of archivists, rather than an automated script, makes the final decision to delete or merge.
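
The figures above put the flagged share at about 15.5 percent of the reviewed block (6,200 of 40,000). Because a file can match more than one other file, flagged pairs naturally form groups. The sketch below shows one way such pairs could be merged into review clusters so that each group of look-alike files is examined once. It is an illustration only: the union-find approach, the function name and the record identifiers are assumptions, not a description of the Stadtarchiv's tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Share of reviewed records flagged, using the figures reported in the article.
FLAGGED_SHARE = 6_200 / 40_000  # 0.155, i.e. 15.5 percent


def cluster_flagged_pairs(pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Merge flagged pairs into clusters using a simple union-find structure."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: Dict[str, List[str]] = defaultdict(list)
    for record in list(parent):
        groups[find(record)].append(record)
    # Sorted output gives archivists a stable review queue.
    return sorted(sorted(group) for group in groups.values())


# Hypothetical record identifiers, for illustration only.
flagged = [("rec-001", "rec-017"), ("rec-017", "rec-230"), ("rec-404", "rec-405")]
review_queue = cluster_flagged_pairs(flagged)
print(review_queue)
# [['rec-001', 'rec-017', 'rec-230'], ['rec-404', 'rec-405']]
```

The output is a queue for people to work through, not a deletion list, which matches the division of labour the archive describes.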

The Zentralbibliothek, meanwhile, piloted a new replacement workflow on Friday. Rather than deleting a duplicate outright, cataloguers now create a redirect record that points legacy URLs — including those already embedded in published academic papers and external websites — to the retained canonical file. This matters because broken image links in citations are increasingly common as old digitisation URLs expire, and the library's own analysis found that at least 1,200 external academic papers hosted on Swiss university repositories contain direct image links to its collections.
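
A redirect record of this kind can be pictured as a lookup table from retired URLs to the retained file. The following is a rough sketch under stated assumptions: the class name, method names and example.org URLs are hypothetical, and the library's actual cataloguing system is not described in the article.

```python
from typing import Dict


class RedirectRegistry:
    """Maps URLs of retired duplicates to the URL of the retained canonical file."""

    def __init__(self) -> None:
        self._redirects: Dict[str, str] = {}

    def resolve(self, url: str) -> str:
        """Follow redirect records until a URL with no redirect is reached."""
        while url in self._redirects:
            url = self._redirects[url]
        return url

    def retire(self, duplicate_url: str, canonical_url: str) -> None:
        """Record that duplicate_url should now serve the canonical file.

        The target is resolved first, so every stored record points at a URL
        that has no redirect of its own and loops cannot form.
        """
        target = self.resolve(canonical_url)
        if target == duplicate_url:
            raise ValueError("a URL cannot redirect to itself")
        self._redirects[duplicate_url] = target


# Hypothetical URLs, for illustration only.
registry = RedirectRegistry()
canonical = "https://example.org/images/langstrasse-1923.tif"
registry.retire("https://example.org/scans/2011/img_0042.tif", canonical)
registry.retire("https://example.org/legacy/langstr_1923_v2.jpg", canonical)

# A link embedded in an older paper still reaches the retained file.
print(registry.resolve("https://example.org/scans/2011/img_0042.tif"))
```

On a public web server the same idea would typically be expressed as a permanent HTTP redirect, so that links in published papers keep working without any change on the citing side.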

For Zurich residents and researchers, the practical change should become visible through the city's online portal, Zürich Geschichte, by autumn 2026. Search results for historical photographs are expected to contain far fewer duplicate entries, and image download times should improve as storage infrastructure is consolidated onto fewer, better-maintained servers. Institutions involved in the audits plan to publish a joint methodology document through the Verein Schweizerischer Archivarinnen und Archivare by September. It is intended as a replicable framework for smaller cantonal archives across German-speaking Switzerland that face the same challenge but lack dedicated technical staff to address it.


About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
