The Daily Zurich


Zurich's Digital Archives Have a Duplicate Image Problem — Here's What Officials and Experts Are Saying

From cantonal records offices to ETH Zurich's research databases, the city's institutions are grappling with a growing crisis of redundant imagery that wastes storage, distorts catalogues, and costs real money.

By Zurich News Desk · Published 4 July 2026, 8:45 pm


Photo: Marcel Biegger on Pexels

Zurich's public and academic institutions are sitting on a problem that has quietly ballooned over the past decade: duplicate images clogging digital archives, inflating storage costs, and undermining the integrity of records that range from urban planning documents to scientific datasets. The issue moved from background nuisance to front-of-mind concern this spring, when the cantonal government's digital infrastructure working group flagged it as a budget-relevant inefficiency ahead of the 2027 fiscal planning cycle.

The timing is pointed. Swiss federal data governance standards updated in January 2026 now require public bodies to demonstrate active deduplication protocols as part of routine audits. For institutions that have not yet complied, the window is narrowing. The State Chancellery of the Canton of Zurich has confirmed it is reviewing its document management systems, though officials have not provided a public timeline for full compliance.

Who Is Raising the Alarm — and Where

The loudest voices come from two corners of the city. At ETH Zurich on Rämistrasse, data scientists working within the university's Scientific IT Services division have described duplicate image accumulation as a structural byproduct of collaborative research workflows, where multiple teams independently archive the same experimental photographs or satellite imagery without cross-referencing. The university's own storage infrastructure runs to petabyte scale, and even a modest duplication rate compounds rapidly at that volume.

Across town, the Stadtarchiv Zürich on Neumarkt has faced parallel pressures. The archive, which holds digitised photographic collections spanning more than a century, began a systematic deduplication review in late 2025 after a cataloguing project revealed that some images had been scanned and filed more than three times under different reference numbers. Archivists have described the correction process as time-intensive but essential for public access quality.
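The simplest case, one file copied and filed under several reference numbers, can be caught by comparing content hashes. The sketch below is illustrative only and does not describe the Stadtarchiv's actual tooling; the reference numbers and file contents are hypothetical. It catches byte-identical copies only. A photograph that was physically rescanned produces different bytes each time and needs a similarity-based method instead.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(records):
    """Group reference numbers whose image bytes are identical.

    `records` maps a (hypothetical) reference number to raw file bytes.
    Returns only the groups that contain more than one reference number.
    """
    by_digest = defaultdict(list)
    for ref, data in records.items():
        # Identical bytes always yield the same SHA-256 digest.
        digest = hashlib.sha256(data).hexdigest()
        by_digest[digest].append(ref)
    return [sorted(refs) for refs in by_digest.values() if len(refs) > 1]


# Hypothetical catalogue: one file filed under three reference numbers.
catalogue = {
    "REF-001": b"scan-of-photo-A",
    "REF-417": b"scan-of-photo-A",
    "REF-982": b"scan-of-photo-A",
    "REF-002": b"scan-of-photo-B",
}
duplicate_groups = group_exact_duplicates(catalogue)
print(duplicate_groups)
```

Grouping by digest rather than comparing every pair of files keeps the work proportional to the number of files, which matters for collections spanning more than a century.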

Independent digital preservation specialists based at the Zentralbibliothek Zürich on Zähringerplatz have been consulting with several municipal departments on implementing perceptual hashing — a technique that identifies visually similar images even when file names or metadata differ. The library has hosted two practitioner workshops on the subject since March 2026, drawing participants from city planning, the Bauarchiv, and the health department.
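To make the idea concrete, here is a minimal sketch of one simple perceptual hash, the "average hash". It is an assumption chosen for illustration; the specialists have not said which algorithm they recommend. The function names and the tiny synthetic images are hypothetical, and images are represented as plain lists of grayscale values so the example runs without any imaging library.

```python
def average_hash(pixels, size=8):
    """Return a size*size-bit average hash of a grayscale image.

    `pixels` is a list of rows of brightness values (0-255). Its height
    and width are assumed to be multiples of `size`.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // size, width // size
    # Shrink the image to size x size by averaging each block of pixels.
    small = []
    for by in range(size):
        for bx in range(size):
            block = [
                pixels[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            ]
            small.append(sum(block) / len(block))
    mean = sum(small) / len(small)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in small:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 images: a horizontal gradient, a slightly brighter copy
# of it, and an unrelated vertical gradient.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[min(255, v + 10) for v in row] for row in original]
unrelated = [[y * 16 for _ in range(16)] for y in range(16)]

near_duplicate_distance = hamming_distance(
    average_hash(original), average_hash(brighter)
)
unrelated_distance = hamming_distance(
    average_hash(original), average_hash(unrelated)
)
print(near_duplicate_distance, unrelated_distance)
```

A small Hamming distance between two hashes suggests the images look alike, whatever their file names or metadata say. Where to set the cut-off is a tuning decision that would have to be tested against each collection.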

What the Data Shows

Precise city-wide figures are not yet public, but the contours of the problem are visible in procurement records. The Canton of Zurich's IT directorate allocated CHF 4.2 million in its 2025 budget for general digital storage expansion, a figure that specialists say could be meaningfully reduced if deduplication were applied systematically before new capacity is purchased. European benchmarks cited in a 2025 study by the Paris-based International Council on Archives suggest that poorly managed public digital collections carry duplication rates of between 15 and 30 percent, depending on the digitisation era and workflow maturity.

For Zurich's pharmaceutical and research corridor — which stretches from the Hönggerberg campus down through the Technopark on Technoparkstrasse — the stakes extend beyond storage bills. Duplicate images in clinical or materials research archives can introduce errors into training datasets for machine learning models, a concern that has grown sharper as both ETH spin-offs and established firms embed AI tools into their analytical pipelines.

Zurich's housing shortage, the Wohnungsnot, adds an indirect dimension: urban planners at the Amt für Städtebau rely on georeferenced photographic records to track building stock changes in densifying neighbourhoods like Altstetten and Oerlikon. Duplicate or mislabelled imagery in those records can delay permitting reviews, a bottleneck that residents and developers have complained about for years.

The practical path forward, according to specialists consulted by the Zentralbibliothek, involves three steps: an institution-by-institution audit using automated detection tools, a shared deduplication protocol agreed across cantonal agencies, and a procurement rule requiring vendors of document management software to demonstrate native deduplication capability. Whether the 2027 budget process provides the political push to turn those recommendations into binding policy is the question officials in the Rathaus on Limmatquai will need to answer before the year is out.


About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
