Zurich's city administration is confronting a problem that built up quietly over more than a decade: tens of thousands of duplicate images sitting inside public-sector digital repositories, consuming server capacity, distorting search results, and costing real money to store. The Stadt Zürich Informatik directorate, which oversees digital infrastructure for the city, has been working since early 2025 on a systematic deduplication programme to clear the backlog.
The issue matters now because the city is in the middle of a broader push to modernise its e-government services. The Digital Zurich 2030 strategy, adopted by the Stadtrat in 2023, committed the administration to consolidating its IT systems and cutting redundant data holdings. Duplicate image files (aerial photographs of the Limmattal, planning visuals from the Stadtentwicklung Zürich unit, promotional shots used by Zürich Tourismus) have emerged as one of the more tangible symptoms of what happens when departments digitise independently, without shared standards.
A Problem That Grew in Layers
The duplication crisis did not happen overnight. Through the 1990s and 2000s, individual departments at the Stadthaus and across the city's 12 Kreise scanned documents and photographs using different software, different naming conventions, and different storage locations. When the city migrated to a unified cloud environment between 2018 and 2021, those legacy archives were imported largely as-is. Automated deduplication was not part of the migration brief. The result was repositories where the same image of, say, the Grossmünster or the construction site at Europaallee could exist in four or five versions under different filenames.
ETH Zürich's chair for information management published research in 2024 estimating that Swiss public-sector digital archives typically carry a duplication rate of between 18 and 27 percent by file count — a figure the city has cited internally when making the case for the current clean-up effort. Storage costs inside the cantonal data centre on Neugasse in Zürich West are billed to departments by the terabyte per month, so the financial incentive to deduplicate is direct and measurable.
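The financial logic can be illustrated with a short calculation. The sketch below is not the city's costing model: the function name, the stored volume, and the price per terabyte are hypothetical placeholders, and it assumes that duplicates take up the same share of storage as they do of file count, which need not hold in practice. Only the 18 and 27 percent bounds come from the research cited above.

```python
def monthly_savings_range(stored_tb, price_per_tb, low_rate=0.18, high_rate=0.27):
    """Return the (low, high) monthly saving if duplicates were removed.

    stored_tb and price_per_tb are hypothetical inputs supplied by the caller.
    Assumption: duplicates occupy the same share of bytes as of files.
    """
    low = stored_tb * low_rate * price_per_tb
    high = stored_tb * high_rate * price_per_tb
    return low, high


# Illustrative figures only: 100 TB stored, billed at 20 currency units per TB per month.
low, high = monthly_savings_range(100, 20)
print(f"Estimated monthly saving: {low:.0f} to {high:.0f}")
```

Because billing is per terabyte per month, any saving of this kind recurs every month once the duplicates are gone.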
The programme also intersects with the aftermath of Zurich's housing construction boom and the ongoing Wohnungsnot debate. Stadtentwicklung Zürich generates large volumes of planning imagery — drone surveys, before-and-after construction photographs from areas like Leutschenbach and Altstetten — and that material has been among the worst affected. Staff working on housing density assessments reported spending significant time in 2024 sorting through near-identical images to find the correct, most recent version of a given site photograph.
What the Clean-Up Looks Like in Practice
The deduplication work is being handled in two phases. The first phase, which ran from March to December 2025, focused on static archives — images no longer in active use. Automated hash-comparison tools identified exact duplicates, which were then flagged for deletion after a 60-day review window. Phase two, currently under way and scheduled for completion by the end of 2026, addresses near-duplicate images: visually similar photographs that differ only in resolution, watermark, or minor cropping. That process requires a combination of algorithmic perceptual-hash scanning and human review, and it is being piloted inside the Zürich Stadtarchiv on Neumarkt before rolling out to other departments.
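The two phases rest on different techniques, and a minimal sketch may help separate them. The code below is illustrative and does not describe the tools the city actually uses. It assumes SHA-256 as the exact-match hash and a simple difference hash (dHash) as the perceptual hash. All function names, the sample inputs, and the review threshold are hypothetical, and the perceptual step assumes each image has already been decoded and downscaled to a small grid of grayscale values.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files):
    """Group file names by the SHA-256 digest of their bytes.

    `files` is a hypothetical mapping of file name -> raw bytes.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    # Only digests shared by more than one file indicate exact duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


def difference_hash(pixels):
    """Compute a simple difference hash from a grayscale grid.

    `pixels` is a list of rows of brightness values, assumed to be already
    downscaled. Each bit records whether a pixel is brighter than its
    right-hand neighbour, so uniform brightness shifts leave the hash intact.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Flag a pair for human review; the threshold is an assumed value."""
    distance = hamming_distance(difference_hash(pixels_a), difference_hash(pixels_b))
    return distance <= max_distance


# Phase one: byte-identical files under different names.
files = {"grossmuenster_1.jpg": b"same-bytes", "scan_0042.jpg": b"same-bytes", "other.jpg": b"different"}
print(group_exact_duplicates(files))

# Phase two: the second grid is a slightly brighter copy of the first.
original = [[10, 20, 30], [30, 20, 10]]
brighter = [[12, 22, 33], [31, 21, 11]]
print(is_near_duplicate(original, brighter))
```

An exact hash changes completely when a single byte changes, which is why resized or watermarked copies slip past phase one. A perceptual hash tolerates small visual differences but can also match images that are merely similar, so flagged pairs still go to a human reviewer.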
The Stadtarchiv pilot is significant because it sets the template for how metadata will be standardised going forward. Each surviving image will be assigned a canonical identifier linked to the city's central asset management system, making it far harder for duplicates to accumulate again. Departments will be required to check that system before uploading new visual material.
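The check-before-upload rule can be sketched as a small registry. This is a hypothetical illustration, not the city's asset management system: the class name, the identifier format, and the use of a SHA-256 content digest as the lookup key are all assumptions made for the example.

```python
import hashlib


class AssetRegistry:
    """Hypothetical in-memory registry mapping image content to canonical IDs."""

    def __init__(self):
        self._ids_by_digest = {}

    def lookup(self, data):
        """Return the canonical ID for these bytes, or None if unregistered."""
        return self._ids_by_digest.get(hashlib.sha256(data).hexdigest())

    def register(self, data):
        """Return (canonical_id, created).

        If the content is already known, the existing ID is returned and
        created is False, so the caller can skip the upload.
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._ids_by_digest.get(digest)
        if existing is not None:
            return existing, False
        # The "IMG-000001" format is an invented placeholder.
        canonical_id = f"IMG-{len(self._ids_by_digest) + 1:06d}"
        self._ids_by_digest[digest] = canonical_id
        return canonical_id, True


registry = AssetRegistry()
print(registry.register(b"drone survey bytes"))   # first upload creates an ID
print(registry.register(b"drone survey bytes"))   # second upload reuses it
```

A lookup keyed on an exact digest only catches byte-identical uploads. Catching resized or cropped copies at upload time would need a perceptual comparison as well.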
For residents and organisations that access public imagery — journalists, architects, neighbourhood associations filing planning objections — the practical upshot should be faster, more reliable search results when querying the city's open-data portal at data.stadt-zuerich.ch. The portal currently lists more than 800 datasets, and image-heavy collections have been among the most difficult to navigate. The full deduplication programme, once complete, is expected to reduce the affected repositories by roughly a fifth in total file count, freeing capacity and lowering annual storage costs across the administration.