How Zurich's Digital Archives Got Buried Under Thousands of Duplicate Images — and What Went Wrong
A slow-burning data management crisis inside the city's public institutions reveals years of missed warnings and no clear plan to fix it.

Zurich's municipal digital infrastructure is sitting on a problem that has been years in the making. Across city departments, university libraries, and cultural institutions, duplicate image files have accumulated to a point where archivists, IT managers, and records officers are now being forced to confront the scale of the disorder. The trigger is a quiet but significant shift: public sector digitisation programs funded under the federal eGovernment Switzerland initiative began formal audits in early 2026, and the results have been uncomfortable reading.
The problem did not arrive overnight. It is the product of how Zurich's institutions adopted digital storage in waves — first in the mid-2000s, then again around 2012 when cloud migration became fashionable, and a third time during the 2020 pandemic scramble when paper-based workflows collapsed and staff uploaded anything they could find. Each wave deposited new copies of existing files onto new systems, with few deletion protocols and almost no deduplication software running in the background.
ETH Zurich's library system, one of the most heavily used research repositories in continental Europe, began flagging the issue internally as early as 2021. The Stadtarchiv Zürich on Neumarkt, which holds centuries of civic records, faced its own version of the problem after the 2019 rollout of a new content management system that imported legacy databases without cleaning them first. Neither institution has publicly quantified the backlog, but in publicly available procurement documents from 2024 and 2025, archivists working in both facilities have described the deduplication task as a multi-year project.
The housing shortage, banking sector restructuring after the UBS-Credit Suisse merger of 2023, and pharmaceutical compliance requirements from companies operating out of the Zürich Nord corridor have all put the same pressure on local IT teams: store more, store faster, and store it redundantly. As a result, storage costs for the city's larger departments have grown substantially, and the duplicate image problem sits at the centre of that expense. Digital storage is cheap per gigabyte, hovering around CHF 0.02 per GB per month on standard commercial cloud contracts in Switzerland as of 2025, but at institutional scale, with millions of unexamined image files, even fractional costs add up.
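To see how a fractional rate adds up, the arithmetic can be sketched in a few lines of Python. Only the CHF 0.02 per GB per month rate comes from the figure quoted above. The function names, the file count, the average file size, and the duplicate share are hypothetical inputs chosen for illustration, not numbers reported by any Zurich institution.

```python
# Rough cost arithmetic, assuming a flat per-GB rate and decimal gigabytes.
# Only the rate is taken from the article; every other input is a placeholder.
RATE_CHF_PER_GB_MONTH = 0.02


def monthly_storage_cost_chf(file_count, avg_size_mb, rate=RATE_CHF_PER_GB_MONTH):
    """Monthly cost in CHF of storing file_count files of avg_size_mb each."""
    total_gb = file_count * avg_size_mb / 1000  # 1 GB = 1000 MB (decimal)
    return total_gb * rate


def annual_duplicate_cost_chf(file_count, avg_size_mb, duplicate_fraction,
                              rate=RATE_CHF_PER_GB_MONTH):
    """Yearly cost in CHF attributable to the redundant share of the files."""
    monthly = monthly_storage_cost_chf(file_count, avg_size_mb, rate)
    return monthly * duplicate_fraction * 12
```

Under these assumptions, one million images averaging 5 MB occupy 5,000 GB and cost about CHF 100 a month. If 30 percent of them were redundant, the duplicates alone would account for roughly CHF 360 a year, and the figure scales linearly with file count, file size, and the number of systems holding copies.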
The Swiss Federal Archives in Bern updated its technical recommendations for cantonal bodies in March 2026, specifically calling out image file duplication as a risk category for both cost management and long-term data integrity. Cantonal institutions in Zurich are expected to align with those recommendations by the end of the fourth quarter of 2026. That deadline is functioning as a forcing mechanism that earlier, softer guidance never managed to be.
Institutions that have been through the process elsewhere in Switzerland, including archives in Basel-Stadt and the canton of Vaud, found that automated deduplication tools could identify redundant files quickly, but that human review of flagged images — to confirm nothing historically significant was deleted — was the bottleneck. Basel's cantonal archive completed a comparable project in 14 months. Zurich's institutions, which operate at considerably larger scale, are planning for 18 to 24 months of active remediation work.
For organisations along Zurich's Langstrasse cultural corridor and the Zürich West redevelopment zone, where smaller creative and media institutions have grown up alongside the larger public bodies, the audit pressure is less formal but still present. Many of those organisations rely on the same federated cloud platforms and face the same structural problem: images uploaded in haste, never reviewed, never removed.
Practically, institutions starting this process now are advised to begin with a storage inventory rather than deletion. Knowing what exists — and how many copies of it exist — is the prerequisite for any deduplication strategy. The tools to do that work are widely available and, in several cases, open-source. The harder challenge is institutional: setting clear ownership for the decision of what to keep, and committing the staff time to make those calls before the federal deadline arrives.
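An inventory of that kind can be sketched with nothing beyond the Python standard library. The example below is a minimal illustration, not a description of any tool the institutions named here actually use: it walks a directory tree, fingerprints each file with SHA-256, and reports which files share a fingerprint. The function names and the idea of a single root folder are assumptions made for the example. It only lists copies and deletes nothing, which leaves the keep-or-remove decision to human reviewers.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory_duplicates(root):
    """Map each content digest under root to the paths that share it.

    Only digests with more than one path are returned. Nothing is deleted.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[sha256_of_file(path)].append(path)
    return {
        digest: sorted(paths)
        for digest, paths in by_digest.items()
        if len(paths) > 1
    }
```

A content hash of this kind catches byte-identical copies only. The same photograph saved at a different resolution or in a different format produces a different digest, so near-duplicates need a separate approach and, in practice, the human review described above.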
About this article
Published by The Daily Zurich