Zurich's public institutions are sitting on millions of duplicate image files — and the bill is growing. An analysis of digital asset management practices across several of the city's major research and administrative bodies shows that redundant image storage is consuming a disproportionate share of IT budgets, with some institutions reporting that duplicate files account for between 25 and 40 percent of total image storage load. That figure, cited in a 2025 digital governance review circulated among cantonal IT departments, has prompted fresh urgency around what specialists call "duplicate image replacement" — the systematic process of identifying, consolidating, and replacing redundant visual assets with single, properly catalogued masters.
The timing matters. Switzerland's federal archiving legislation, revised under the Archivierungsgesetz framework, now requires cantonal institutions to demonstrate efficient data stewardship as a condition of receiving federal co-financing for digital infrastructure upgrades. For Zurich, which is mid-way through a CHF 340 million digitalisation programme across cantonal services, failing to clean up bloated image repositories could complicate funding disbursements expected in late 2026. The cantonal IT office on Walchestrasse has been quietly auditing image libraries since the beginning of the year.
Where the Problem Is Sharpest
Two institutions illustrate the scale of the challenge. ETH Zurich, consistently ranked among the world's top ten technical universities, manages research image databases spanning decades of scientific work. Internal documentation shared at a February 2026 data infrastructure symposium at the ETH Hauptgebäude on Rämistrasse indicated that the university's central image repository held approximately 4.2 million files, of which an estimated 900,000 (roughly 21 percent) were flagged as potential duplicates or near-duplicates by automated detection tools. Storing those redundant files on high-availability servers costs an estimated CHF 180,000 annually, money that competes directly with research computing budgets.
The Stadtarchiv Zürich, located on Neumarkt in the Altstadt, faces a different but related problem. Digitisation campaigns over the past decade — many funded through the city's Kulturpauschale grants — produced multiple scans of identical historical photographs and maps, often because different departments commissioned parallel digitisation runs without coordinating. The archive's own records, as of a January 2026 progress report, showed roughly 12 percent of its 2.1 million digitised image assets were exact or near-exact duplicates. At current cloud storage rates negotiated through the cantonal procurement framework, each terabyte of unnecessarily retained data costs the city approximately CHF 420 per year.
What Deduplication Actually Costs — and Saves
The economics of doing nothing are straightforward and unflattering. Across Zurich's major public-sector image holdings (the Stadtarchiv, the Zentralbibliothek on Zähringerplatz, ETH, and the University of Zurich's image collections on Rämistrasse), conservative estimates put total redundant storage at roughly 180 terabytes. At the cantonal rate of about CHF 420 per terabyte per year, that is a recurring annual cost of around CHF 75,600, not counting the staff time spent navigating cluttered asset libraries where the same photograph might appear under six different file names.
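The arithmetic behind that figure can be checked with a short sketch. It assumes the CHF 420 per terabyte rate quoted for the Stadtarchiv applies across all four holdings; the function and constant names are illustrative, not taken from any cantonal system.

```python
# Figures are the article's estimates; names are illustrative assumptions.
REDUNDANT_TB = 180            # estimated redundant storage across the holdings
RATE_CHF_PER_TB_YEAR = 420    # assumed uniform cantonal storage rate


def annual_redundancy_cost(terabytes: float, rate_chf_per_tb: float) -> float:
    """Recurring yearly cost of keeping redundant data online."""
    return terabytes * rate_chf_per_tb


baseline = annual_redundancy_cost(REDUNDANT_TB, RATE_CHF_PER_TB_YEAR)
per_quarter = baseline / 4

print(f"Annual cost of redundant storage: CHF {baseline:,.0f}")
print(f"Cost of each quarter of delay:    CHF {per_quarter:,.0f}")
```

Under these assumptions the annual figure comes to CHF 75,600, or CHF 18,900 for every quarter the clean-up is postponed. Staff time is excluded, as in the estimate above.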
Commercial-sector experience from Switzerland's pharmaceutical corridor, where firms along the Basel-Zurich axis routinely manage product image libraries of 10 million files or more, suggests that a single deduplication and replacement programme using open-source perceptual hashing tools can reduce image storage volume by 20 to 35 percent within six months. The one-time implementation cost for a mid-sized institutional library typically runs between CHF 15,000 and CHF 50,000, depending on how much manual curation is required for historically significant assets that cannot simply be auto-replaced.
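For readers unfamiliar with perceptual hashing, the idea can be illustrated with one of its simplest variants, the average hash. The sketch below is not the tooling used by any institution named here. It assumes each image has already been decoded, converted to grayscale, and downscaled to a small grid of pixel values (a step normally handled by an imaging library), and the threshold of 5 differing bits is an arbitrary illustrative choice.

```python
from typing import List

# Assumption: a small grayscale grid (values 0-255), already downscaled.
Grid = List[List[int]]


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel into an int: 1 if the pixel is >= the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ in few bits."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# A 4x4 checkerboard, a uniformly brighter rescan, and an inverted copy.
original = [[10, 200, 10, 200],
            [200, 10, 200, 10],
            [10, 200, 10, 200],
            [200, 10, 200, 10]]
brighter = [[p + 10 for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

print(is_near_duplicate(original, brighter))  # same pattern, different exposure
print(is_near_duplicate(original, inverted))  # opposite pattern
```

Because each bit records only whether a pixel is brighter than the image's own mean, a rescan with different exposure hashes identically to the original, which is what lets this kind of tool catch parallel digitisation runs that byte-level comparison would miss.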
For Zurich's cantonal IT office, the practical path forward involves three phases: automated detection using hash-matching tools, human review of flagged near-duplicates where metadata suggests cultural or research value, and systematic replacement of confirmed duplicates with canonical master files linked across all referencing systems. Institutions that complete this process before the federal funding review window opens in March 2027 will be better positioned to qualify for the next tranche of infrastructure co-financing. For researchers at ETH Zurich and archivists at Neumarkt alike, the message from the numbers is unambiguous: the cost of delay compounds every quarter.
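The first and third phases can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the cantonal IT office's procedure: files are represented as in-memory bytes rather than read from storage, SHA-256 stands in for whatever hash-matching tool is actually used, the alphabetically first path in each group is arbitrarily treated as the canonical master, and all file paths and catalogue identifiers are invented. The human review phase is not modelled.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def group_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Phase 1: group paths whose contents have the same SHA-256 digest."""
    by_digest = defaultdict(list)
    for path, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


def build_replacement_map(groups: List[List[str]]) -> Dict[str, str]:
    """Map each duplicate to its master (assumption: first sorted path)."""
    mapping = {}
    for group in groups:
        master, *duplicates = group
        for duplicate in duplicates:
            mapping[duplicate] = master
    return mapping


def relink(references: Dict[str, str], mapping: Dict[str, str]) -> Dict[str, str]:
    """Phase 3: point every referencing record at the canonical master."""
    return {record: mapping.get(path, path) for record, path in references.items()}


# Hypothetical data: two parallel scans of one map, plus an unrelated photo.
files = {
    "scans/2014/map_001.tif": b"MAP-A",
    "scans/2019/karte_1.tif": b"MAP-A",
    "scans/2019/photo_7.tif": b"PHOTO-B",
}
references = {
    "catalogue-17": "scans/2019/karte_1.tif",
    "catalogue-18": "scans/2019/photo_7.tif",
}

mapping = build_replacement_map(group_exact_duplicates(files))
relinked = relink(references, mapping)
print(mapping)
print(relinked)
```

In this sketch the catalogue entry that pointed at the 2019 rescan is redirected to the 2014 master, after which the rescan could be retired without breaking any reference. Choosing the master by path order is purely illustrative; an archive would more plausibly pick the highest-quality or best-documented scan, which is where the human review phase comes in.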