Zurich's public institutions are sitting on a problem that has been quietly compounding since the early 2000s: vast digital archives riddled with duplicate images, redundant file versions, and mis-catalogued visual assets that cost money to store, slow down public access, and undermine the credibility of records that researchers and city planners depend on daily.
The issue crystallised this spring when Stadt Zürich's digitisation office, operating out of its Stadtarchiv on Neumarkt, completed an internal audit of the municipal photo collection. The review — the first of its kind in over a decade — found that a significant share of the archive's digital image holdings were either exact duplicates or near-identical versions of the same photograph stored under different file names, across multiple servers. The full findings are expected to be presented to the Stadtrat before the end of the third quarter of 2026.
A Problem That Grew With Every Hard Drive
The roots of the duplication crisis trace back to the transition period between roughly 2001 and 2010, when Swiss institutions rushed to digitise analogue collections without standardising their workflows. Each department — urban planning, tourism, heritage preservation — ran its own digitisation programme, often contracting different vendors and using incompatible metadata schemas. The result was predictable in hindsight: the same photograph of, say, the Lindenhügel or the Bahnhofstrasse at Christmastime ended up stored three or four times across separate institutional drives, each copy tagged differently and none of them linked.
ETH Zurich, whose library system ranks among Europe's most heavily used research archives, encountered the same structural problem at scale. The ETH-Bibliothek, based in the university's main building on Rämistrasse in the Hochschulquartier with further sites on the Hönggerberg campus, manages hundreds of thousands of digitised images across its e-rara and e-pics platforms. Librarians there have been piloting automated deduplication software since at least 2023, testing tools capable of identifying perceptual duplicates (images that look the same to a viewer but differ in resolution, compression, or colour profile) rather than just byte-for-byte matches.
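The difference between the two kinds of match can be shown in a few lines. What follows is a minimal sketch and not the software under test at ETH-Bibliothek, which the available information does not name. It assumes images have already been decoded into grids of grayscale values, and it uses a simple "average hash" as one common perceptual technique; the function names and the synthetic test images are illustrative only.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Byte-level fingerprint: changes if even one byte of the file changes."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels, hash_size=8):
    """Perceptual fingerprint of a grayscale image.

    `pixels` is a list of rows of 0-255 values; both dimensions are
    assumed to be at least `hash_size`. The image is reduced to a
    hash_size x hash_size grid of block averages, and each cell becomes
    one bit: 1 if brighter than the grid mean, else 0.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for row in range(hash_size):
        y0 = row * height // hash_size
        y1 = (row + 1) * height // hash_size
        for col in range(hash_size):
            x0 = col * width // hash_size
            x1 = (col + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic stand-ins for archive files (illustrative data, not real scans).
master = [[(x + y) * 4 for x in range(32)] for y in range(32)]
half_size = [[master[y][x] for x in range(0, 32, 2)] for y in range(0, 32, 2)]
mirrored = [[(31 - x + y) * 4 for x in range(32)] for y in range(32)]


def to_bytes(image) -> bytes:
    """Flatten a pixel grid into raw bytes, standing in for file contents."""
    return bytes(value for row in image for value in row)


print("byte-level match:", sha256_hex(to_bytes(master)) == sha256_hex(to_bytes(half_size)))
print("perceptual distance, master vs half-size copy:",
      hamming_distance(average_hash(master), average_hash(half_size)))
print("perceptual distance, master vs mirrored image:",
      hamming_distance(average_hash(master), average_hash(mirrored)))
```

A byte-level check treats the half-size copy as an unrelated file, while the perceptual hash places it at or near zero distance from the master. In practice an archive would set a distance threshold and send borderline pairs to a human reviewer, since a hash this coarse can also group images that merely share a composition.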
The cost dimension is not trivial. Cloud and on-premises storage prices in Switzerland have fallen sharply over the past decade, but institutional archives are not simply storing JPEGs. They maintain high-resolution TIFF masters, preservation-grade checksums, and layered backup systems that comply with Swiss federal archiving law, specifically the Bundesgesetz über die Archivierung (BGA), which was last revised in 2022. Each redundant file carries its full preservation overhead. Industry benchmarks suggest that deduplication exercises in comparable European municipal archives have reduced active storage loads by between 20 and 40 percent. Applied to Zurich's holdings, a reduction in that range would represent a meaningful budget recapture across multiple departments.
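To see why the overhead matters, the arithmetic can be worked through on a hypothetical archive. The sketch below assumes an active store of 100 TB and three preserved copies of every file (one primary plus two backups). Both figures are placeholders chosen for round numbers, not values from the Zurich audit; only the 20 to 40 percent range comes from the benchmarks cited above.

```python
def freed_storage_tb(active_tb: float, reduction: float, copies_per_file: int = 1) -> float:
    """Storage freed by deduplication, in terabytes.

    `reduction` is the fraction of the active store removed (0.0 to 1.0).
    `copies_per_file` models preservation overhead: every redundant file
    removed also frees its backup copies.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0.0 and 1.0")
    if copies_per_file < 1:
        raise ValueError("copies_per_file must be at least 1")
    return active_tb * reduction * copies_per_file


# Hypothetical figures for illustration only.
HYPOTHETICAL_ACTIVE_TB = 100.0
HYPOTHETICAL_COPIES = 3  # primary plus two backups

for reduction in (0.20, 0.40):  # the benchmark range quoted in the text
    active_only = freed_storage_tb(HYPOTHETICAL_ACTIVE_TB, reduction)
    with_backups = freed_storage_tb(HYPOTHETICAL_ACTIVE_TB, reduction, HYPOTHETICAL_COPIES)
    print(f"{reduction:.0%} reduction: {active_only:.0f} TB active, "
          f"{with_backups:.0f} TB including backups")
```

Under these assumptions the freed capacity scales directly with the number of preserved copies, which is why a duplicate in a preservation archive costs more than the same duplicate on an ordinary file share.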
Standardisation, Finally
The Swiss memory institutions consortium — which includes the Schweizerisches Nationalmuseum on the Museumstrasse, the Schweizerische Nationalbibliothek in Bern, and cantonal partners — has been pushing since at least 2021 for a shared controlled vocabulary and unified file-naming convention. Progress has been slow, partly because each institution operates under different cantonal or federal mandates, and partly because legacy IT contracts have made migration expensive.
What has changed in 2026 is the pressure from two directions at once. The Zurich cantonal government's digitalisation strategy, Digitale Verwaltung Kanton Zürich, set a deadline of end-2027 for all affiliated archives to implement machine-readable metadata standards compatible with the IIIF (International Image Interoperability Framework) specifications. At the same time, the housing and urban planning departments, under acute pressure to process planning documentation quickly given the ongoing Wohnungsnot (housing shortage), have been loudest in complaining that duplicated, poorly indexed image files are slowing down permit research and historical land-use reviews.
For institutions and researchers who rely on these collections, the practical advice is to check whether the archive they are using has implemented a deduplication review before drawing conclusions from image counts or collection completeness figures. The Stadtarchiv on Neumarkt accepts research inquiries directly, and ETH-Bibliothek's digital collections team publishes update notices on the e-pics platform when catalogue corrections are made. The audit cycle, once ignored for years at a stretch, is now on an annual schedule — a small but concrete sign that the backlog, at last, is being treated as a live operational problem rather than an inherited inconvenience.