Zurich's public digital infrastructure is carrying a problem that has quietly ballooned over the past decade: municipal image databases, cantonal archives and university repositories are riddled with duplicate photographs, scanned documents and visual records that nobody has systematically cleared out. Archivists and computer scientists now say the redundancy is no longer a minor inconvenience — it is costing real storage budget and undermining the reliability of search results across platforms used by journalists, planners and researchers alike.
The issue has gained urgency in 2026 partly because the City of Zurich's five-year digital transformation programme, launched under the Stadtrat's administrative modernisation framework in 2023, is entering its review phase this autumn. Officials responsible for that programme have indicated that deduplication, the automated identification and removal of identical or near-identical image files, is among the technical tasks flagged for resolution before the programme's 2027 target date.
What the Experts Are Saying
At ETH Zurich, researchers in the Data Analytics Lab on Rämistrasse have been developing hash-based and perceptual-similarity tools capable of scanning large image collections and flagging duplicates for human review. The work is not purely academic. The lab has collaborated with the Zentralbibliothek Zürich on a pilot project that, according to the library's published project documentation, tested deduplication methods on a subset of its historical photograph collection housed in the Predigerplatz building. The library holds more than 130,000 digitised photographic items.
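For readers curious how perceptual similarity works in principle, the sketch below shows average hashing, one common perceptual-hash technique. It is not presented as the ETH lab's method, which is not detailed here. The function names and the 8x8 grid size are illustrative assumptions, and images are represented as plain grids of grayscale values (at least 8 pixels on each side) so that no imaging library is needed.

```python
def shrink(pixels, size=8):
    """Downsample a grayscale grid to size x size by averaging boxes.

    Assumes the grid is at least `size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        top, bottom = row * height // size, (row + 1) * height // size
        out_row = []
        for col in range(size):
            left, right = col * width // size, (col + 1) * width // size
            box = [pixels[y][x]
                   for y in range(top, bottom)
                   for x in range(left, right)]
            out_row.append(sum(box) / len(box))
        small.append(out_row)
    return small


def average_hash(pixels, size=8):
    """Return an integer whose bits mark cells brighter than the mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two hashes."""
    return bin(hash_a ^ hash_b).count("1")
```

Two scans of the same picture at different resolutions should produce hashes that differ in few or no bits, while unrelated images usually differ in many. Where to set the threshold between "same" and "different" is a judgment call that this sketch leaves open.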
The core technical debate among specialists is whether fully automated deletion is safe or whether a human archivist must approve every removal. Automated systems can reliably detect exact duplicates, meaning two files that are bit-for-bit identical, by comparing file checksums. Near-duplicates, such as two scans of the same 19th-century print made at slightly different resolutions, are harder: they require judgment calls about which version to keep. Archivists at the Stadtarchiv Zürich on Neumarkt have publicly noted in their annual reporting that any deletion of historical visual records requires documented sign-off under cantonal records law.
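The exact-duplicate case is simple enough to sketch in a few lines. The example below, a minimal illustration using only Python's standard library, groups files whose SHA-256 checksums match. The function names are hypothetical, and the script only reports groups: it deletes nothing, in keeping with the sign-off requirement described above.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under `root` with identical contents."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of_file(path)].append(path)
    # Keep only checksums shared by two or more files.
    return [group for group in by_digest.values() if len(group) > 1]
```

Two scans of the same print at different resolutions have different bytes and therefore different checksums, so this approach will not catch them.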
The Swiss Federal Archives in Bern set a relevant precedent in 2024 when it published deduplication guidelines recommending a tiered approach: automated flagging, archival review, and a minimum 90-day hold period before permanent deletion. Zurich's cantonal IT office, the Amt für Informatik, has been studying whether those federal guidelines can be adapted for municipal use.
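The tiered approach can be pictured as a small state check, sketched below. The class and field names are hypothetical, and the sketch assumes the 90-day hold starts on the day an archivist approves the removal; the summary of the guidelines given here does not say when the clock starts.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# 90-day minimum hold, as described in the text.
HOLD_PERIOD = timedelta(days=90)


@dataclass
class DuplicateCandidate:
    file_id: str
    flagged_on: date                    # tier 1: automated flagging
    approved_on: Optional[date] = None  # tier 2: archival review

    def approve(self, review_date: date) -> None:
        """Record that an archivist has signed off on removal."""
        self.approved_on = review_date

    def may_delete(self, today: date) -> bool:
        """Tier 3: allow deletion only after approval plus the hold."""
        if self.approved_on is None:
            return False
        # Assumption: the hold runs from the approval date.
        return today - self.approved_on >= HOLD_PERIOD
```

Under these assumptions, a file that is flagged but never reviewed can never be deleted, and one approved on 1 January 2026 becomes deletable on 1 April 2026.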
Cost, Housing Data and the Practical Stakes
The stakes are not abstract. Zurich's chronic housing shortage, the Wohnungsnot, means that planning databases, which include aerial photographs, building survey images and architectural drawings, are under constant pressure from new construction and rezoning activity in districts such as Altstetten and Oerlikon. Planners relying on those databases need clean, deduplicated records to make fast decisions. A report presented to the Gemeinderat last year cited storage costs for the city's central document management system as one of the fastest-growing items in the IT budget, though specific figures were not made public at that session.
Private-sector pressure is also shaping the conversation. Zurich-based financial institutions managing their own compliance image archives — a category that grew substantially after the UBS absorption of Credit Suisse in 2023 — have separately been working through deduplication projects on internal document stores. That work has created a small but growing pool of local technical expertise that municipal archivists say they are watching closely.
For ordinary residents and small organisations, the practical advice from digital archiving specialists is consistent: do not wait for a top-down mandate. Institutions managing their own image collections — whether a neighbourhood association in Seefeld or a cultural venue like the Museum Rietberg — can run open-source perceptual hashing tools such as digiKam or ImageDuplicateFinder on their own servers before any centralised standard is finalised. The cost of storage in Zurich's commercial cloud market, currently around CHF 0.02 per gigabyte per month for standard tiers, makes the business case for early action straightforward. The alternative, specialists warn, is inheriting a backlog that only compounds as image collections continue to grow.
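An organisation can estimate its own storage saving with simple arithmetic. The sketch below uses the CHF 0.02 per gigabyte per month rate quoted above; the function name is hypothetical, and the collection size and duplicate share in the usage line are invented inputs for illustration, not figures from any Zurich institution.

```python
# Standard-tier rate quoted in the text, in CHF per gigabyte per month.
RATE_CHF_PER_GB_MONTH = 0.02


def annual_saving_chf(total_gb, duplicate_fraction,
                      rate=RATE_CHF_PER_GB_MONTH):
    """Estimate yearly storage cost avoided by removing duplicates."""
    if not 0 <= duplicate_fraction <= 1:
        raise ValueError("duplicate_fraction must be between 0 and 1")
    duplicate_gb = total_gb * duplicate_fraction
    return duplicate_gb * rate * 12


# Hypothetical example: a 5,000 GB archive in which 20% is duplicated.
print(round(annual_saving_chf(5000, 0.2), 2))  # about 240 CHF a year
```

The estimate covers storage fees only. It leaves out backup copies, staff time and the cost of unreliable search results, which the specialists quoted here also count among the stakes.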