The Daily Zurich

Zurich news, every day

Zurich's Digital Archives Are Drowning in Duplicate Images — and the Numbers Are Startling

A growing body of data reveals how redundant image files are quietly consuming storage budgets, distorting search results, and complicating the city's push toward leaner, smarter digital infrastructure.

By Zurich News Desk · Published 4 July 2026, 9:00 pm

Photo: Magda Ehlers on Pexels

Zurich's public institutions collectively store tens of millions of digital image files — and a significant share of them are exact or near-exact copies of each other. That is the core finding driving a quiet but urgent conversation inside the city's document management and IT procurement circles this year, as budget cycles tighten and the Canton of Zurich accelerates its e-government digitalisation program.

Image deduplication, the systematic identification, consolidation, and deletion of redundant visual assets, sounds like a niche IT task. The data behind it, however, tells a larger story about how Swiss public bodies manage digital sprawl, and what it costs them when they don't.

The Scale of the Problem in Zurich

Industry benchmarks for large institutional image libraries consistently place duplication rates between 20 and 40 percent of total stored assets. For a city the size of Zurich, which serves roughly 450,000 residents and runs dozens of administrative departments, that range translates into a substantial volume of redundant data sitting on servers across buildings from the Stadthaus on Stadthausquai to the main offices of Stadtentwicklung Zürich near Lindenhof.

Storage costs matter here. Enterprise-grade managed cloud storage in Switzerland runs between CHF 0.02 and CHF 0.05 per gigabyte per month for institutions operating under Swiss data-residency rules, a requirement that rules out many cheaper international providers. A 10-terabyte library with a 30 percent duplication rate carries roughly 3 TB of redundant data that is billed every month. At the rates above, and counting 1 TB as 1,000 GB, that is about CHF 60 to CHF 150 a month, or CHF 720 to CHF 1,800 a year, and the total keeps accumulating over a multi-year contract.
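The storage arithmetic above can be checked with a short Python sketch. It uses only the figures quoted in this article and assumes decimal units (1 TB = 1,000 GB). The function name and the 36-month contract length are illustrative choices, not figures from any Zurich institution.

```python
# Illustrative only: figures come from the article, units are decimal.
GB_PER_TB = 1_000  # assumption: 1 TB = 1,000 GB


def wasted_storage_cost(library_tb, duplication_rate, chf_per_gb_month, months=1):
    """Return the CHF cost of storing duplicate data for `months` months."""
    wasted_gb = library_tb * GB_PER_TB * duplication_rate
    return wasted_gb * chf_per_gb_month * months


# 10 TB library, 30 percent duplicates, price band of CHF 0.02 to 0.05 per GB.
monthly_low = wasted_storage_cost(10, 0.30, 0.02)
monthly_high = wasted_storage_cost(10, 0.30, 0.05)
print(f"Per month: CHF {monthly_low:.0f} to CHF {monthly_high:.0f}")

# Hypothetical 36-month contract, to show how the monthly waste adds up.
contract_low = wasted_storage_cost(10, 0.30, 0.02, months=36)
contract_high = wasted_storage_cost(10, 0.30, 0.05, months=36)
print(f"Over 36 months: CHF {contract_low:.0f} to CHF {contract_high:.0f}")
```

The cost grows linearly with library size, so the same function can be rerun with a larger `library_tb` to estimate the waste in a bigger collection.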

ETH Zurich's IT Services division has issued internal guidance encouraging research groups to audit image repositories before migrating to new storage environments, citing redundancy as one of the top three causes of migration cost overruns. The university's main data centre, located on the Hönggerberg campus, processes research image datasets that run into the petabyte range across active projects.

Stadtarchiv Zürich, the city's official records office operating from its premises near the Rathaus, faces a parallel challenge with digitised historical photograph collections. Scanning programs from the early 2010s sometimes produced multiple file versions of the same original image, with different resolutions, colour profiles, or naming conventions. Deduplication software is only now being deployed to address that legacy systematically.

What the Data Actually Shows — and What Comes Next

Automated duplicate-detection tools generally rely on two methods: exact hash matching, which flags byte-for-byte identical files, and perceptual hashing, which catches near-duplicates such as slightly cropped or recompressed versions of the same image. The latter category is where institutional libraries tend to have the larger problem. Perceptual duplicates can account for twice as many redundant files as exact copies, according to technical documentation published by several European archival standards bodies.
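The difference between the two methods can be illustrated with a minimal Python sketch. It is not the software used by any institution named here. It pairs a SHA-256 file digest with a simple "average hash", one common perceptual technique, applied to toy 4x4 grayscale grids. Real tools first decode and downscale the image, a step omitted here. The helper names and the sample pixel values are hypothetical.

```python
import hashlib


def exact_hash(data: bytes) -> str:
    """SHA-256 digest of the raw bytes; equal digests mean identical bytes."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """One bit per pixel, set when the pixel is brighter than the mean.

    Assumes `pixels` is already a small grayscale grid.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def to_bytes(pixels: list[list[int]]) -> bytes:
    """Flatten a grid into bytes, standing in for a file's contents."""
    return bytes(value for row in pixels for value in row)


# Hypothetical sample data: a two-tone image, a slightly brightened copy
# (simulating recompression), and an unrelated checkerboard.
original = [
    [200, 200, 50, 50],
    [200, 200, 50, 50],
    [50, 50, 200, 200],
    [50, 50, 200, 200],
]
recompressed = [[value + 3 for value in row] for row in original]
checkerboard = [
    [50, 200, 50, 200],
    [200, 50, 200, 50],
    [50, 200, 50, 200],
    [200, 50, 200, 50],
]

# Exact matching misses the brightened copy because its bytes differ.
print(exact_hash(to_bytes(original)) == exact_hash(to_bytes(recompressed)))

# Perceptual hashing treats it as a match (distance 0) and still
# separates the unrelated image (large distance).
print(hamming_distance(average_hash(original), average_hash(recompressed)))
print(hamming_distance(average_hash(original), average_hash(checkerboard)))
```

In practice a tool would flag two files as near-duplicates when the Hamming distance falls below a chosen threshold. Where to set that threshold is a tuning decision that trades missed duplicates against false matches.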

For Zurich's housing and urban planning departments, already stretched by the city's acute housing shortage (Wohnungsnot) and the political pressure to digitise permitting workflows faster, the practical consequence is slower search performance in document management systems. When a planner at Amt für Städtebau searches for imagery associated with a specific development site in Schwamendingen or Altstetten, duplicate entries inflate result sets and create confusion over which version is current, adding time to already tight review cycles.

The financial case for systematic deduplication is straightforward: institutions that have completed major cleanup projects in comparable European cities report storage footprint reductions of 15 to 35 percent, with associated cost savings that typically recover the project's implementation cost within 18 months.

For Zurich institutions weighing when to act, the window is narrowing. The Canton of Zurich's broader e-government infrastructure roadmap anticipates significant migration activity through 2027 and into 2028. IT administrators at cantonal agencies have been advised in internal planning documents to complete data hygiene audits — including image deduplication — before migration, not after. Doing it in the wrong order means paying to move the clutter, then paying again to clean it up on the new system. The arithmetic on that choice is not complicated.
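The ordering argument above can be made concrete with a small Python sketch. Every number in it is hypothetical: the article gives no migration or cleanup prices, so the per-gigabyte migration rate and the flat cleanup cost below are placeholders chosen only to show the shape of the comparison. The model also assumes the cleanup costs the same whether it runs before or after the move.

```python
def migration_total(library_gb, duplication_rate, migrate_chf_per_gb,
                    cleanup_chf, clean_first):
    """Total CHF cost of one migration plus one cleanup.

    If `clean_first` is True, only the deduplicated volume is moved.
    Otherwise the full library, duplicates included, is moved.
    """
    if clean_first:
        volume_moved_gb = library_gb * (1 - duplication_rate)
    else:
        volume_moved_gb = library_gb
    return volume_moved_gb * migrate_chf_per_gb + cleanup_chf


# Hypothetical inputs: 10,000 GB library, 30 percent duplicates,
# CHF 0.10 per GB to migrate, CHF 500 flat cleanup cost.
clean_then_move = migration_total(10_000, 0.30, 0.10, 500, clean_first=True)
move_then_clean = migration_total(10_000, 0.30, 0.10, 500, clean_first=False)

print(f"Clean first:   CHF {clean_then_move:.0f}")
print(f"Migrate first: CHF {move_then_clean:.0f}")
print(f"Extra cost of the wrong order: CHF {move_then_clean - clean_then_move:.0f}")
```

Under these assumptions the penalty for migrating first is simply the duplicated volume multiplied by the migration rate, which is the cost of "moving the clutter".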

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
