The Daily Zurich

Zurich Archives Overhaul: What Happened This Week in the City's Duplicate Image Crisis

A push to clean up redundant digital images across Zurich's public institutions is exposing deeper questions about data governance, archival costs, and who pays for the mess.

By Zurich News Desk · Published 4 July 2026, 9:16 pm

Zurich's municipal digital archive system flagged more than 340,000 duplicate image files during a routine audit completed this week, according to city administration records reviewed by The Daily Zurich. The finding, which covers holdings managed under the Stadtarchiv Zürich on Alfred-Escher-Strasse, has prompted an emergency working group to convene before the end of July with a mandate to set binding deduplication standards across all cantonal departments.

The timing matters. Zurich has spent the past three years consolidating its digital infrastructure following lessons drawn from the UBS-Credit Suisse merger aftermath, which showed how duplicated and unaudited data holdings can quietly balloon IT costs and create compliance headaches. City authorities have also been under pressure from the cantonal parliament to demonstrate measurable efficiency gains before the next budget cycle, when housing and climate programs are competing hard for every franc.

Where the Problem Is Concentrated

The bulk of the duplicate files, roughly 60 percent by file count, trace back to two sources: the digitisation pipeline at the Zentralbibliothek Zürich on Zähringerplatz, which sharply increased its scanning volumes during the 2022–2024 pandemic recovery grant period, and a separate ingest system used by the city's urban planning office in Stadthaus on Stadthausquai. The two institutions used mutually incompatible metadata schemas, so automated deduplication tools failed to catch matches that a human reviewer would spot immediately.

ETH Zurich's Data Science Lab, which has a formal cooperation agreement with the cantonal IT directorate, has been asked to contribute an automated image-fingerprinting tool it developed for research datasets. The lab's system uses perceptual hashing rather than simple filename matching, so it catches near-identical images that differ only in compression artefacts or minor cropping. That distinction matters a great deal in archive contexts: a 2025 internal review cited in the working group's terms of reference found that purely filename-based deduplication would have eliminated only about 12 percent of actual redundancies.
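
For readers curious how perceptual hashing differs from comparing filenames or raw bytes, the sketch below shows one common variant, the average hash. It is an illustration only and not the ETH lab's tool, whose internals the records do not describe. The function names, the 8x8 grid size, and the synthetic test images are all assumptions made for the example; images are represented as plain lists of grayscale values so the code runs without any imaging library.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def shrink(img: Gray, size: int = 8) -> List[List[float]]:
    """Downsample to size x size by averaging equal blocks of pixels.

    Assumes the image is at least `size` pixels in each dimension.
    """
    h, w = len(img), len(img[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(img: Gray, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    cells = [v for row in shrink(img, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def clamp(v: int) -> int:
    return max(0, min(255, v))


# Synthetic 32x32 test images (hypothetical stand-ins for real scans).
original = [[x * 8 for x in range(32)] for y in range(32)]
# Same picture with small pixel-level noise, mimicking compression artefacts.
recompressed = [[clamp(x * 8 + (x + y) % 3 - 1) for x in range(32)] for y in range(32)]
# A genuinely different picture: the gradient runs top to bottom instead.
different = [[y * 8 for x in range(32)] for y in range(32)]

near = hamming(average_hash(original), average_hash(recompressed))
far = hamming(average_hash(original), average_hash(different))
print("pixel data identical:", original == recompressed)
print("distance to recompressed copy:", near)
print("distance to different image:", far)
```

The two near-identical images differ pixel by pixel, so an exact comparison treats them as unrelated, while their fingerprints stay close. A deduplication tool would typically treat any pair below some small distance threshold as a candidate duplicate for review.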

Storage costs are the immediate driver. Zurich currently pays for roughly 4.2 petabytes of managed archive storage across its primary and backup systems, a figure drawn from the cantonal IT services annual report published in March 2026. Eliminating confirmed duplicates could, by the working group's preliminary estimate, reduce active storage demand by 18 to 22 percent — a saving with direct budget implications at a moment when the city's Wohnungsnot housing program and the Klimaplan 2040 climate initiative are each pressing for additional administrative resources.
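
As a rough check on what that range means in absolute terms, the short calculation below applies the working group's 18 to 22 percent estimate to the 4.2 petabyte total. Applying the percentage to the full figure is an assumption of this sketch; the report may define "active storage demand" more narrowly, in which case the numbers would be smaller.

```python
# Figures quoted in the article. Treating the whole 4.2 PB as the base
# for the percentage reduction is an assumption of this sketch.
TOTAL_PB = 4.2
LOW, HIGH = 0.18, 0.22


def storage_after_dedup(total_pb: float, reduction: float) -> float:
    """Storage remaining after removing the given fraction of the total."""
    return total_pb * (1 - reduction)


saved_low = TOTAL_PB * LOW
saved_high = TOTAL_PB * HIGH
remaining_low = storage_after_dedup(TOTAL_PB, HIGH)
remaining_high = storage_after_dedup(TOTAL_PB, LOW)

print(f"Freed: {saved_low:.2f} to {saved_high:.2f} PB")
print(f"Remaining: {remaining_low:.2f} to {remaining_high:.2f} PB")
```

On that assumption, the estimate corresponds to roughly 0.76 to 0.92 petabytes freed, leaving between about 3.28 and 3.44 petabytes under management.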

What Comes Next for Institutions and the Public

The working group is expected to deliver a draft protocol by 25 July 2026. Under the current proposal, institutions would be required to run perceptual-hash checks on any new image batch before ingest, and to submit quarterly deduplication reports to the cantonal data office. Non-compliance would trigger a cost-recovery mechanism, meaning departments that generate avoidable storage overhead would see that expense reflected in their own operational budgets rather than absorbed centrally.
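
A pre-ingest check of the kind the proposal describes could look something like the sketch below. The draft protocol has not been published, so everything here is hypothetical: the function name, the file names, the 64-bit fingerprints, and the distance threshold of 5 bits are illustrative choices, and the fingerprints are assumed to have been computed beforehand by whatever perceptual-hash tool an institution uses.

```python
from typing import Dict, List, Tuple


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def check_batch(
    batch: Dict[str, int],
    archive: Dict[str, int],
    max_distance: int = 5,  # hypothetical threshold, not taken from the proposal
) -> Tuple[List[str], Dict[str, str]]:
    """Split a new batch into files safe to ingest and likely duplicates.

    Returns (accepted names, {flagged name: name of the file it resembles}).
    Accepted files join the comparison set, so duplicates inside the batch
    itself are caught as well.
    """
    known = dict(archive)
    accepted: List[str] = []
    flagged: Dict[str, str] = {}
    for name, fingerprint in batch.items():
        match = next(
            (k for k, kh in known.items() if hamming(fingerprint, kh) <= max_distance),
            None,
        )
        if match is None:
            accepted.append(name)
            known[name] = fingerprint
        else:
            flagged[name] = match
    return accepted, flagged


# Hypothetical fingerprints for illustration only.
archive = {"limmat_flood.tif": 0x0F0F0F0F0F0F0F0F}
batch = {
    "scan_0001.jpg": 0x0F0F0F0F0F0F0F0E,  # one bit away from the archived file
    "scan_0002.jpg": 0xFFFFFFFF00000000,  # unlike anything already held
    "scan_0003.jpg": 0xFFFFFFFF00000001,  # one bit away from scan_0002
}

accepted, flagged = check_batch(batch, archive)
print("ingest:", accepted)
print("hold for review:", flagged)
```

The comparison here is a simple linear scan, which is fine for a small batch but would be slow against hundreds of thousands of records; a production system would more likely use an index built for nearest-neighbour lookups on fingerprints. The flagged list could also feed the quarterly reports the proposal calls for.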

For members of the public who use the Zentralbibliothek's online portal to access historical photographs of Zurich — images of Bahnhofstrasse in the 1950s, flood records from the Limmat, construction documentation from the Kreis 5 regeneration — the practical effect will be a cleaner, faster search interface. Librarians at the Zähringerplatz reading room confirmed this week that a test batch of 8,000 deduplicated records was already live in the catalogue, with search result noise visibly reduced.

The broader lesson the working group wants to institutionalise is straightforward: deduplication is not a one-time cleanup but a workflow discipline. Departments that want to avoid the cost-recovery penalty should begin auditing their ingest pipelines now, before the draft protocol due on 25 July sets the new standard. The cantonal IT directorate has posted a guidance document on its intranet portal and is holding drop-in sessions at Stadthaus every Thursday through the month.

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.