Zurich's public institutions are sitting on a sprawling mess of duplicate digital images — the same photographs catalogued under different filenames, stored across multiple servers, and in some cases licensed and paid for more than once. The problem did not arrive overnight. It grew steadily over roughly fifteen years as individual departments, universities and cultural organisations each built their own digital infrastructure without a shared standard for image management.
The issue matters now because the city is midway through a broader push to consolidate its data holdings. The Stadtarchiv Zürich, housed at Neumarkt 4 in the Altstadt, has been working since 2023 to audit its photographic collections as part of a wider municipal digitisation programme. Staff there discovered that a significant portion of scanned historical photographs existed in two, three or even four separate catalogue entries — sometimes with slightly different metadata, sometimes with conflicting copyright attributions.
How the Duplication Accumulated
The roots of the problem lie in the early 2000s, when departments began scanning physical collections independently. ETH Zürich's image library, the Zentralbibliothek Zürich on Zähringerplatz, and the Kunsthaus Zürich all developed their own workflows. When a photograph was relevant to more than one institution — a 1960s aerial shot of the Limmatquai, for instance, or construction images from the Hardbrücke — each institution often scanned and stored its own copy rather than linking to a shared asset.
Cloud migration made it worse. Between roughly 2015 and 2020, institutions moved local server archives to hosted platforms. Migration scripts frequently failed to check for pre-existing files, so duplicates were imported wholesale into the new environments. A 2024 internal review at the Zentralbibliothek found that approximately 18 percent of its digital image holdings were redundant copies. If similar rates hold across the city's dozens of archival bodies, the result is a substantial waste of licensed storage capacity.
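The check that the migration scripts skipped can be sketched in a few lines. What follows is a minimal illustration, not any institution's actual tooling. It assumes files are compared by SHA-256 checksum, and the function names `sha256_of` and `plan_import`, along with the `existing_hashes` input, are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_import(source_dir: Path, existing_hashes: set[str]) -> tuple[list[Path], list[Path]]:
    """Split files under source_dir into (to_import, duplicates).

    existing_hashes is a hypothetical input: the checksums of files
    already present in the target archive.
    """
    seen = set(existing_hashes)
    to_import, duplicates = [], []
    for path in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        checksum = sha256_of(path)
        if checksum in seen:
            # Byte-identical to something already stored or already queued.
            duplicates.append(path)
        else:
            seen.add(checksum)
            to_import.append(path)
    return to_import, duplicates
```

A checksum comparison of this kind only catches byte-identical files. Two institutions that each scanned the same print would produce different files, which is one reason duplicates of that sort need other methods, and often a human eye, to detect.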
There is a direct financial consequence. Commercial cloud storage for large image files is not cheap. Uncompressed archival TIFF files can run to several hundred megabytes each. At current enterprise pricing on platforms used by Swiss public institutions, retaining millions of unnecessary files costs real money annually — money drawn from budgets that are already under pressure in a city where the housing shortage and infrastructure demands consume an ever-larger share of public spending.
The Push Toward a Unified Standard
KOST, the Swiss coordination office for the long-term archiving of electronic records, is run jointly by federal, cantonal and municipal archives. It has been pushing cantonal and municipal archives toward the OAIS reference model since at least 2018. Adoption in Zurich has been uneven. The Stadtarchiv and ETH Zürich have made the most progress. Smaller bodies, including several district-level (Stadtkreis) archives, are still working from legacy systems that were never designed to flag duplicate assets on ingestion.
The practical mechanics of deduplication are more complicated than simply deleting obvious copies. Archivists must establish which version of a duplicated image carries the most accurate metadata, the most legally clear rights information, and the highest technical quality. Deleting the wrong copy — even of an apparently identical image — can erase a provenance chain that researchers rely on. The Stadtarchiv completed a pilot deduplication exercise across one photographic sub-collection in late 2025, resolving around 4,200 duplicate pairs. Staff described the process as time-intensive, requiring manual review of a meaningful fraction of the flagged pairs.
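The decision archivists face for each flagged pair can be illustrated with a short sketch. It is an assumption-laden simplification, not the Stadtarchiv's procedure: the `ImageCopy` fields are hypothetical stand-ins for rights clarity, metadata quality and technical quality, and the ranking order (metadata first, then resolution) is chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ImageCopy:
    catalogue_id: str
    rights_holder: str    # hypothetical: recorded copyright attribution
    metadata_fields: int  # hypothetical proxy: number of populated metadata fields
    pixel_count: int      # hypothetical proxy: width * height of the scan


def _rank(copy: ImageCopy) -> Tuple[int, int]:
    # Assumed ordering: richer metadata first, then higher resolution.
    return (copy.metadata_fields, copy.pixel_count)


def resolve_pair(a: ImageCopy, b: ImageCopy) -> Tuple[Optional[ImageCopy], str]:
    """Suggest which copy to keep, or return None to defer to a human."""
    if a.rights_holder != b.rights_holder:
        return None, "manual review: conflicting rights attribution"
    if _rank(a) == _rank(b):
        return None, "manual review: no clear winner"
    keeper = max((a, b), key=_rank)
    return keeper, "keep; record the other catalogue ID in the provenance notes"
```

The sketch never deletes anything. It only suggests a keeper and sends every ambiguous pair to manual review, which reflects the concern above that removing the wrong copy can erase a provenance chain.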
What comes next depends largely on whether the city's various institutions can agree on a shared asset management platform — something that has been discussed at the cantonal level but not yet formalised. A working group under the Präsidialdepartement was due to report its recommendations by the end of the first quarter of 2026; that timeline has slipped. For researchers using the Lesesaal at Neumarkt or accessing the ETH's e-pics platform, the practical advice for now is straightforward: when the same image appears under different catalogue numbers, report it. Archivists say user reports remain one of their most reliable tools for catching duplication that automated checks miss.