Zurich's public sector is sitting on a growing pile of digital waste. Across city-run archives, university repositories, and municipal communications offices, duplicate image files now account for a significant share of stored data — and the administrative and financial burden of managing them is beginning to show up in institutional budgets in ways that are hard to ignore.
The issue has gained urgency in 2026 as Swiss federal data governance guidelines, updated in January of this year, require cantonal institutions to conduct audits of digital asset inventories before the end of the third quarter. For Zurich, which manages one of the largest municipal digital libraries in the German-speaking world, that deadline is forcing a reckoning with years of ad-hoc file management.
What the Data Actually Shows
Figures compiled from publicly available IT infrastructure reports by the Stadt Zürich Informatik directorate suggest that duplicated or near-duplicate image files can represent anywhere from 20 to 35 percent of total storage load in large-scale content management systems. For an institution running several hundred terabytes of archival material — as ETH Zürich's image and media library does — that translates into tens of thousands of francs in annual storage costs alone. ETH Zürich ranked 7th globally in the 2025 QS World University Rankings, a position that comes with expectations of best-in-class digital infrastructure to match its research output.
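The arithmetic behind "tens of thousands of francs" can be sketched as a back-of-envelope calculation. Only the 20 to 35 percent duplicate share comes from the figures above. The 300-terabyte archive size and the price of CHF 250 per terabyte per year are hypothetical placeholders chosen for illustration, not reported numbers.

```python
def duplicate_storage_cost(total_tb: float, duplicate_share: float,
                           chf_per_tb_year: float) -> float:
    """Annual cost in CHF of the storage occupied by duplicate files."""
    return total_tb * duplicate_share * chf_per_tb_year


# Hypothetical inputs, not taken from any published report.
TOTAL_TB = 300.0         # stands in for "several hundred terabytes"
CHF_PER_TB_YEAR = 250.0  # assumed all-in price per terabyte per year

# The 20 and 35 percent shares are the range cited in the article.
low = duplicate_storage_cost(TOTAL_TB, 0.20, CHF_PER_TB_YEAR)
high = duplicate_storage_cost(TOTAL_TB, 0.35, CHF_PER_TB_YEAR)

print(f"Duplicate storage: CHF {low:,.0f} to CHF {high:,.0f} per year")
```

Under these assumed inputs the duplicates alone would cost roughly CHF 15,000 to CHF 26,250 a year. An institution's real figure depends on its own storage volume and pricing.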
The Stadtarchiv Zürich, located on Neumarkt in the Altstadt, has been quietly piloting deduplication software since late 2024 as part of a broader digitisation push. The archive holds physical and digital records stretching back centuries, and the transition from analogue to digital workflows has, as in most large institutions, produced a sprawl of redundant files: the same scanned photograph saved in multiple formats, at multiple resolutions, under different file names by different departments. Preliminary results from the pilot have not been published, but the directorate has indicated the project is ongoing.
The Zentralbibliothek Zürich on Zähringerplatz faces a similar challenge. Its digital collections, which include high-resolution reproductions of rare manuscripts and historical maps, are accessed by researchers worldwide. When image assets are duplicated across departments — acquisitions, public communications, the digital reading room — each copy must be separately catalogued, rights-checked, and updated if metadata changes. The hidden labour cost of that redundancy, according to digital asset management literature, often dwarfs the raw storage expense.
Why Deduplication Is Harder Than It Sounds
Removing duplicate images is not simply a matter of running a script. Near-duplicate detection, which means identifying images that look the same to a viewer but differ in file format, compression level, or minor cropping, requires specialised algorithms, because such files no longer match byte for byte. The Swiss federal standard eCH-0160, which governs digital archiving practices for public bodies, mandates that institutions retain audit trails of any file deletion, adding another layer of process to what might seem like routine housekeeping.
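One widely used family of such algorithms is perceptual hashing. The sketch below shows a simple "average hash" as an illustration of the idea. It is not the method used by any of the institutions or commercial tools mentioned here, and the function names and the distance threshold of 5 bits are assumptions made for the example. To stay self-contained it takes images as plain grids of grayscale values instead of reading image files.

```python
from typing import List, Sequence

# An image as rows of grayscale pixel values (0-255).
Gray = Sequence[Sequence[int]]


def shrink(pixels: Gray, size: int = 8) -> List[List[float]]:
    """Downscale to size x size by averaging the pixels in each block.

    Assumes the image is at least `size` pixels wide and high.
    """
    height, width = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    cells = [v for row in shrink(pixels, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their fingerprints almost match.

    The threshold of 5 bits is an illustrative choice, not a standard value.
    """
    return hamming(average_hash(a), average_hash(b)) <= max_distance
```

Because the fingerprint is computed from a shrunken, brightness-normalised version of the picture, the same scan saved at two resolutions tends to produce the same or nearly the same bits, while an unrelated image differs in many positions. A simple hash like this is sensitive to cropping and rotation, which is one reason production tools rely on more robust techniques.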
Commercial deduplication tools licensed for enterprise use typically start at around CHF 8,000 per year for institutional deployments of the scale Zurich's larger bodies would require, though open-source alternatives are increasingly viable for less complex use cases. The Zurich cantonal government's IT procurement framework, last revised in 2023, allows for joint procurement across departments, which advocates of coordinated purchasing argue could substantially reduce per-institution licensing costs.
Zurich's housing shortage, the Wohnungsnot, has consumed much of the city's political bandwidth this year, but data infrastructure is emerging as its own slow-burn cost driver. For residents and taxpayers, the direct connection is simple: public money spent storing three copies of the same image file is money not spent on services.
Institutions working through their audits ahead of the September 30 deadline should, digital records managers advise, begin with a phased approach — prioritising the highest-volume collections first, establishing a clear retention policy before any deletion, and ensuring deduplication tools are tested against Swiss archival standards before full deployment. The bureaucratic work is unglamorous. The savings, over time, are real.