Zurich's public institutions are sitting on thousands of duplicate digital images buried inside municipal databases, and the people responsible for managing that data say the problem is getting harder to ignore. Stadtarchiv Zürich, the city's official record-keeper on Neumarkt, confirmed earlier this year that a systematic audit of its digital holdings had identified significant redundancy across scanned historical collections — duplicates generated over successive digitisation rounds stretching back to 2014.
The issue matters now because the city is mid-stream in a broader digital infrastructure push. The Canton of Zurich's 2024–2027 ICT strategy committed to consolidating public data holdings across departments by the end of 2026. With that deadline approaching, archivists and IT specialists are warning that duplicate image files are not merely a storage nuisance — they distort search results, inflate cloud licensing costs, and complicate the work of researchers and journalists who rely on the archives for public-interest investigations.
What the Specialists Are Saying
At ETH Zurich, researchers at the Data Archive Services unit on Rämistrasse have been grappling with the same challenge. The institute's digital preservation guidelines, last updated in 2023, flag duplicate detection as a prerequisite for long-term archival integrity. Staff there have described the problem in internal documentation as a structural issue tied to workflow design rather than simple human error: when multiple departments independently scan the same source material without cross-referencing a central registry, duplication is the predictable outcome.
Zentralbibliothek Zürich, whose digitisation lab on Zähringerplatz holds one of the largest public image repositories in the German-speaking world, has been piloting automated deduplication software since February 2026. Librarians there have noted, in published project updates, that initial scans of one historical photograph collection alone returned a duplication rate of roughly 18 percent, meaning nearly one in five image files was a functional copy of another already in the system. Storage in Swiss institutional cloud environments currently costs between CHF 0.03 and CHF 0.07 per gigabyte per month, depending on redundancy tier. Because that charge recurs every month, unchecked duplicates add up to a meaningful sum even for modest collections of high-resolution scans.
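The arithmetic behind that point can be sketched in a few lines. The 50-terabyte collection size below is a hypothetical figure chosen only for illustration, and the sketch assumes that duplicate files are, on average, the same size as the rest of the collection, so that an 18 percent duplication rate by file count corresponds to 18 percent of stored volume. Neither assumption comes from the library's published updates.

```python
def monthly_duplicate_cost(total_gb, duplication_rate, chf_per_gb_month):
    """Monthly storage cost in CHF attributable to duplicate files.

    Assumes duplicates have the same average size as other files,
    so the share of files equals the share of stored gigabytes.
    """
    return total_gb * duplication_rate * chf_per_gb_month


# Hypothetical collection size (50 TB expressed in GB); an assumption.
collection_gb = 50_000
# Duplication rate and price range as quoted in the article.
rate = 0.18
low = monthly_duplicate_cost(collection_gb, rate, 0.03)
high = monthly_duplicate_cost(collection_gb, rate, 0.07)

print(f"Per month: CHF {low:.0f} to CHF {high:.0f}")
print(f"Per year:  CHF {low * 12:.0f} to CHF {high * 12:.0f}")
```

Under those assumptions, the duplicates alone would cost between CHF 270 and CHF 630 a month, or roughly CHF 3,240 to CHF 7,560 a year, for a single collection.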
City councillors have begun asking questions. At a session of the Zurich Stadtrat's departmental committee for digitalisation held in late June 2026, members pressed the Department of Finance and Economic Affairs on whether current procurement rules for archival software require vendors to include deduplication as a baseline function. The answer, according to the published agenda summary, was that existing specifications do not mandate it — a gap that digital governance advocates have since flagged publicly.
What Comes Next
The practical pressure is intensifying. Under the cantonal ICT consolidation timetable, participating institutions must submit data quality assessments by 30 September 2026. For archives and libraries still running manual review processes, that window is tight. Technology specialists familiar with public-sector deployments in Switzerland have pointed to perceptual hashing — a method that identifies visually identical or near-identical images even when file names and metadata differ — as the most cost-effective technical fix available at scale. Several European municipal archives, including those in Vienna and Hamburg, have adopted similar tools in the past three years.
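One simple member of the perceptual hashing family, often called average hashing, gives a sense of how the method works. The sketch below is illustrative only and is not taken from any tool used by the institutions named here. It operates on a grayscale image already decoded into a grid of brightness values; a real deployment would first read and decode image files with an imaging library. The function names and the synthetic test images are assumptions made for the example.

```python
def average_hash(pixels, hash_size=8):
    """Return an integer fingerprint of a grayscale image.

    `pixels` is a list of rows, each a list of brightness values (0-255).
    The image must be at least `hash_size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to hash_size x hash_size by averaging blocks of pixels.
    for by in range(hash_size):
        y0, y1 = by * height // hash_size, (by + 1) * height // hash_size
        for bx in range(hash_size):
            x0, x1 = bx * width // hash_size, (bx + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    # Each bit records whether a cell is brighter than the overall mean.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic 32 x 32 test image: a left-to-right brightness gradient.
original = [[x * 8 for x in range(32)] for _ in range(32)]
# The same picture scanned slightly brighter (values capped at 255).
brighter = [[min(255, v + 10) for v in row] for row in original]
# A different picture: the gradient reversed.
inverted = [[255 - v for v in row] for row in original]

near_copy_distance = hamming_distance(average_hash(original), average_hash(brighter))
different_distance = hamming_distance(average_hash(original), average_hash(inverted))
print(near_copy_distance, different_distance)
```

Two images count as probable duplicates when their fingerprints differ in only a few bits. Where to set that threshold is a tuning decision for each collection, and a cautious workflow would send borderline matches to a human reviewer rather than delete them automatically.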
For Zurich residents and researchers, the most immediate practical concern is the accuracy of search results. A duplicated image that appears multiple times in a public search result is not a trivial inconvenience when historians or journalists are trying to establish the provenance of a document. Stadtarchiv staff have indicated that a revised internal workflow, designed to route new digitisation batches through a central deduplication check before ingestion, could be operational before the end of Q3 2026. Whether that schedule holds will depend partly on budget approvals still pending within the city administration. The Stadtrat is expected to review supplementary ICT credits at its next full session in August.
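How a central check before ingestion might work can be sketched as follows. This is a hypothetical illustration, not a description of the Stadtarchiv's planned system: the class name, method name and item identifiers are invented for the example. It uses an exact content hash (SHA-256), which catches only byte-for-byte copies. Separate rescans of the same source document would produce different bytes and would need a perceptual comparison instead.

```python
import hashlib


class DedupRegistry:
    """Hypothetical central registry consulted before a file is ingested."""

    def __init__(self):
        # Maps a content digest to the identifier of the first item seen.
        self._seen = {}

    def check_and_register(self, item_id, data):
        """Return the ID of an existing identical item, or None if new.

        New items are recorded so that later copies are caught.
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._seen.get(digest)
        if existing is not None:
            return existing
        self._seen[digest] = item_id
        return None


registry = DedupRegistry()
# Illustrative identifiers and placeholder bytes standing in for scan files.
first = registry.check_and_register("batch-A/0001", b"scan bytes of a photograph")
second = registry.check_and_register("batch-B/0042", b"scan bytes of a photograph")
third = registry.check_and_register("batch-B/0043", b"scan bytes of another item")
print(first, second, third)
```

In this sketch the second file is flagged as a copy of the first before it enters the archive, while the third passes through as new. Keeping the registry in one place, shared by every department that scans material, is what closes the gap left by independent digitisation rounds.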