Zurich's public institutions are sitting on hundreds of thousands of duplicate digital images. The problem has been building for roughly a decade, and administrators at the Stadtarchiv Zürich on Alfred-Escher-Strasse and curators at the Kunsthaus Zürich are now confronting what happens when rapid digitisation outpaces the governance frameworks meant to manage it.
The timing matters. Switzerland's federal government set a binding open-data deadline for cantonal and municipal archives, requiring structured, de-duplicated digital collections to be accessible through national portals by the end of 2026. That clock is running. Institutions that digitised in siloed bursts — each department uploading its own scans with its own naming conventions, often using incompatible software platforms — are now discovering that their collections contain duplicate images numbering in the tens of thousands, consuming server space, distorting catalogue search results and complicating copyright clearance workflows.
A Decade of Fragmented Digitisation
The roots of the problem stretch back to around 2015, when Swiss institutions began large-scale image digitisation in earnest, driven partly by EU-adjacent funding opportunities and partly by the Swiss National Strategy on Open Government Data, which the Federal Council adopted in 2014. At the time, each institution largely went its own way. The Zentralbibliothek Zürich on Zähringerplatz launched its own scanning programme. ETH-Bibliothek, operating from the ETH Zürich campus on Rämistrasse, built a parallel infrastructure. City museum networks, neighbourhood history societies in districts like Wiedikon and Aussersihl, and the cantonal school archives all contributed material, and frequently it was the same material, scanned two or three times over.
The practical consequence is a catalogue problem that is also a cost problem. Cloud storage is not free. Industry analysts tracking Swiss public-sector IT spending have noted that unstructured image repositories impose ongoing licensing and maintenance costs that compound annually. A 2024 report from the Swiss Federal Archives, cited in national media at the time, identified duplicate file management as one of the top three inefficiencies in cantonal digital preservation budgets, though precise figures for Zurich specifically were not broken down publicly.
Staff at affected institutions describe a workflow where curators searching for, say, a 1960s photograph of the Limmatquai for an exhibition must wade through multiple near-identical scans of the same image, each tagged differently, each sitting in a different folder hierarchy inherited from a different project phase. The issue compounds when images are shared externally — a duplicate with slightly different metadata can create conflicting copyright attribution, which matters acutely for institutions that license images commercially to publishers and media organisations.
What Comes Next for Zurich's Collections
The city is not starting from scratch. Zurich adopted a Digital Strategy in 2021 that formally committed municipal departments to interoperable data standards, and the canton has been running a pilot de-duplication programme through its Amt für Informatik since 2023. The pilot uses hash-based image fingerprinting tools to identify pixel-identical and near-identical files across repositories, a technical approach now standard in large media organisations and national broadcasters.
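To make the distinction between pixel-identical and near-identical files concrete, the following is a minimal Python sketch of the general technique, not a description of the tools the Amt für Informatik actually uses, which have not been detailed publicly. It assumes images have already been decoded into grayscale pixel grids (lists of rows of 0-255 values). The function names exact_fingerprint, average_hash and hamming_distance are illustrative, and the "average hash" shown is one common perceptual-hashing approach among several.

```python
import hashlib


def exact_fingerprint(data: bytes) -> str:
    """SHA-256 of the raw file bytes: equal digests mean byte-identical files."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels, size=8):
    """Perceptual fingerprint of a grayscale image (assumed input format:
    a list of equal-length rows of 0-255 integers).

    The image is reduced to a size x size grid by block averaging, and each
    cell contributes one bit: 1 if it is brighter than the grid's mean.
    """
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the hash grid")
    cells = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two perceptual fingerprints."""
    return bin(a ^ b).count("1")
```

In a workflow of this kind, two files with the same SHA-256 digest are exact copies, while two scans whose average hashes differ by only a few bits are candidates for human review as near-duplicates. The cut-off for "a few bits" is a tuning choice that would have to be set against each collection.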
For institutions on the front line, the practical advice from archivists and digital preservation specialists is consistent: prioritise establishing a single master record for each image before the end-of-2026 federal deadline, rather than attempting a full clean-up at the same time. That means metadata reconciliation first, deletion second. The Kunsthaus, which digitised significant portions of its photographic archive during its 2021 extension project (the Chipperfield building on Heimplatz opened that year), is reported to be among those working through precisely this sequencing challenge.
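The "reconcile first, delete second" sequencing can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not any institution's actual procedure: each scan is assumed to be a dictionary with a fingerprint, a path and a metadata dictionary, and the function name build_master_records and all field names are hypothetical.

```python
from collections import defaultdict


def build_master_records(scans):
    """Group scans by fingerprint and build one master record per group.

    Assumed (hypothetical) input: dicts with "fingerprint", "path" and
    "metadata" keys. Nothing is deleted: duplicates are only listed as
    candidates, and conflicting metadata values are kept for review.
    """
    groups = defaultdict(list)
    for scan in scans:
        groups[scan["fingerprint"]].append(scan)

    masters = []
    for fingerprint, members in groups.items():
        merged, conflicts = {}, {}
        for scan in members:
            for key, value in scan["metadata"].items():
                if key not in merged:
                    merged[key] = value
                elif merged[key] != value:
                    # Record every competing value instead of overwriting.
                    conflicts.setdefault(key, [merged[key]])
                    if value not in conflicts[key]:
                        conflicts[key].append(value)
        masters.append({
            "fingerprint": fingerprint,
            "master_path": members[0]["path"],  # first seen; a simplification
            "metadata": merged,
            "conflicts": conflicts,
            "deletion_candidates": [s["path"] for s in members[1:]],
        })
    return masters
```

The design choice worth noting is that conflicting values, such as two different rights holders recorded for the same photograph, are surfaced rather than silently resolved, which is the step that protects copyright attribution before any file is removed.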
The broader lesson from Zurich's experience is one that other European cities with similar digitisation timelines (Geneva, Hamburg, Vienna) are also working through. Rapid scanning solved an access problem. It created a management problem. The institutions that moved fastest without a shared taxonomy are now spending more time and money on remediation than they saved in the original rush to go digital. The federal deadline is concentrating minds in a way that years of internal policy memos evidently did not.