Zurich's major public institutions are staring down a problem that is unglamorous but consequential: tens of thousands of duplicate images clogging digital archives, stored across fragmented servers, driving up costs and degrading the reliability of public records. The immediate trigger is a cantonal deadline. Under the Canton of Zurich's updated data governance framework, institutions receiving public funding must demonstrate compliance with unified metadata and deduplication standards by 31 March 2027.
This is more than a housekeeping exercise. Duplicate image records distort search results in public databases, create legal ambiguity around usage rights, and tie up server capacity that institutions pay for. For a city that has staked part of its identity on clean, efficient governance (the same instinct that drives Swiss direct democracy), the disorder inside these archives is an embarrassment that administrators can no longer defer.
Who Is Affected and What Must Change
The institutions most exposed are those that digitised large physical collections rapidly during the pandemic years, often without standardised naming conventions or hash-verification tools. The Stadtarchiv Zürich on Neumarkt, which holds centuries of municipal records, acknowledged in its 2025 annual report that its digitisation programme had produced significant file redundancy across image batches from 2020 to 2022. The Zentralbibliothek Zürich on Zähringerplatz faces a similar situation, particularly in its photograph and map collections, where duplicate scans were generated across different departmental workflows.
ETH Zurich's library services, ranked among the top research library networks in Europe, have already begun a phased deduplication project using perceptual hashing software, a technology that derives a compact fingerprint from what an image looks like and so can match visually identical or near-identical images even when file names, formats or resolutions differ. That process, which started in January 2026, is expected to run until at least the end of the year and involves reviewing an estimated 1.4 million digitised image files. The approach ETH has taken is being watched closely by smaller cantonal institutions that lack the same technical capacity.
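For readers unfamiliar with the technique, the following is a minimal sketch of one simple perceptual hash, the average hash. It is not ETH Zurich's software, which the reporting does not name. The function names and the 8x8 grid size are illustrative assumptions, and images are represented as plain grids of grayscale values so that the example needs no imaging library.

```python
from typing import List

# Assumption for illustration: an image is a grid of 0-255 grayscale values.
Gray = List[List[int]]


def downsample(pixels: Gray, size: int = 8) -> Gray:
    """Shrink an image to size x size by averaging rectangular blocks.

    Assumes the image is at least `size` pixels wide and tall.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set if brighter than the mean."""
    small = [v for row in downsample(pixels, size) for v in row]
    mean = sum(small) / len(small)
    bits = 0
    for v in small:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# A 16x16 horizontal gradient, the same picture at 32x32, and a vertical gradient.
original = [[x * 16 for x in range(16)] for y in range(16)]
rescaled = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
vertical = [[y * 16 for x in range(16)] for y in range(16)]

same_picture_distance = hamming(average_hash(original), average_hash(rescaled))
different_picture_distance = hamming(average_hash(original), average_hash(vertical))
```

In this toy case the rescaled copy produces the same fingerprint as the original (distance 0), while the unrelated image differs in half of its 64 bits. Real scans would first be decoded and converted to grayscale, and production tools generally use more robust hashes than this one.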
The financial dimension matters too. Cloud and on-premises storage costs for the city's cultural and administrative bodies have risen sharply since 2022, partly because unchecked duplication inflates storage volumes. Municipalities across the Zürich Oberland have already started sharing deduplication infrastructure costs through a joint procurement arrangement administered via the Gemeindeverband Zürich Unterland, a model that Zurich city planners are now evaluating for potential adoption in the Kreis 4 and Kreis 5 administrative clusters.
The Decisions That Will Define the Outcome
Three choices will determine whether this moment becomes a genuine reform or a compliance formality. First, institutions must decide whether to use automated deduplication tools alone or pair them with human review. Automated systems cannot judge context: two almost-identical photographs of Bellevueplatz taken seconds apart, for instance, may both have archival value even if a hashing algorithm flags them as redundant. Getting that balance wrong risks deleting records that can never be recovered.
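One way to pair automation with human review is to let the software sort candidate pairs rather than delete anything. The sketch below is hypothetical: the category names and both thresholds are assumptions chosen for illustration, not values used by any Zurich institution. It takes the number of differing fingerprint bits between two images and routes the pair to one of three queues.

```python
from enum import Enum


class Decision(Enum):
    AUTO_DUPLICATE = "auto_duplicate"  # fingerprints match; mark as duplicate, do not delete
    HUMAN_REVIEW = "human_review"      # close but not identical; an archivist decides
    DISTINCT = "distinct"              # far apart; treat as separate records


def triage(distance: int, auto_max: int = 0, review_max: int = 10) -> Decision:
    """Route an image pair by the bit distance between its two fingerprints.

    auto_max and review_max are illustrative thresholds (assumptions).
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if distance <= auto_max:
        return Decision.AUTO_DUPLICATE
    if distance <= review_max:
        return Decision.HUMAN_REVIEW
    return Decision.DISTINCT
```

Under these assumed thresholds, two frames taken seconds apart would most likely land in the human review queue, where an archivist can keep both. Widening the review band reduces the risk of wrongly merging records at the cost of a larger manual workload.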
Second, city leaders need to decide whether to centralise duplicate-image management under a single cantonal body or allow institutions to handle it independently under a shared standard. Centralisation is likely to be faster and cheaper. Decentralisation preserves institutional expertise and autonomy, a value that carries real weight in a political culture built on subsidiarity and referendum rights. The canton's Amt für Informatik is reportedly drafting a recommendation on this question, though no decision has been announced publicly.
Third, there is the question of what happens to duplicates once identified. Deletion is not automatic: records that have been cited in legal proceedings, academic publications, or heritage designations may need to be retained even if they are technically redundant. The Stadtarchiv's internal protocols currently require case-by-case review for any image dating from before 1900, a rule that will generate significant backlog pressure as deduplication scales up.
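The retention rules described above could be expressed as a simple decision function. This is a sketch under stated assumptions: the record fields, the outcome labels, and the order in which the rules are checked are all hypothetical, and the Stadtarchiv's actual protocols are not public in this detail.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageRecord:
    identifier: str
    year: int  # year the image was created (assumed to be known)
    citations: List[str] = field(default_factory=list)  # e.g. court filings, publications


def disposition(record: ImageRecord, review_before_year: int = 1900) -> str:
    """Suggest what to do with a record already flagged as a duplicate.

    Assumed rule order: cited records are retained outright; uncited
    images from before the cutoff year go to case-by-case review.
    """
    if record.citations:
        return "retain"
    if record.year < review_before_year:
        return "case_by_case_review"
    return "eligible_for_removal"
```

For example, an uncited duplicate dated 1885 would be sent to case-by-case review, while an uncited duplicate from 1990 would be marked eligible for removal. Even that last outcome is a recommendation rather than a deletion, which leaves the final step to a person.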
The March 2027 deadline is firm. Institutions that miss it risk losing access to cantonal co-funding for digitisation projects — a significant threat given that several major archive expansion programmes depend on that money. The decisions made between now and the end of 2026 will determine whether Zurich's public institutions meet that deadline with genuine reform in place, or arrive at it having ticked boxes while leaving the underlying disorder intact.