A weeks-long effort to consolidate Zurich's expanding municipal digital image archives has run into a concrete problem: duplicate photographs, running into the tens of thousands, are clogging storage systems across at least three major institutions and delaying public access to newly scanned historical collections. The issue came to a head this week after the Stadt Zürich Stadtarchiv, the Zentralbibliothek Zürich on Zähringerplatz, and ETH Zürich's image database unit each independently flagged overlapping file inventories that have accumulated since a joint scanning initiative launched in January 2026.
The timing matters because the initiative was meant to mark the 175th anniversary of ETH Zürich's founding, with a flagship public portal scheduled to go live in September. Duplicate records now threaten that deadline. When the same photograph exists under four or five separate catalogue entries — each with slightly different metadata — search results become unreliable and storage costs climb without corresponding public benefit.
How the Duplicates Accumulated
The root cause is procedural rather than technical. When the three institutions began exchanging digitised files in January, they used incompatible metadata schemas. The Stadtarchiv tags images by district — Kreis 1 through Kreis 12 — while ETH's system indexes by research department. The Zentralbibliothek uses a legacy Dewey-adjacent classification that neither of the other two institutions employs. Files transferred across systems were re-registered rather than matched against existing entries, and the duplicates multiplied.
By late June, internal counts at the Stadtarchiv placed the number of confirmed duplicate image files at roughly 34,000, according to a document circulated to partner institutions and reviewed by The Daily Zurich. ETH's image unit estimated an additional 18,000 suspect entries in its own holdings. Neither institution has made these figures public.
The financial dimension is not trivial. Cloud storage costs for Swiss public institutions have risen sharply since 2024, and maintaining redundant high-resolution files, many of them 50 to 80 megabytes each, adds up. At the low end of that range, a standard archival TIFF duplicated 34,000 times represents roughly 1.7 terabytes of unnecessary storage; at the high end, about 2.7 terabytes. At enterprise rates from Swiss providers, that volume costs in the range of several hundred francs per month, a small but symbolically awkward waste of public funds during a period when city budget discussions in the Gemeinderat have repeatedly emphasised digital efficiency.
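For readers who want to check the storage figure, the arithmetic can be sketched in a few lines of Python. The file count and the 50 to 80 megabyte range come from the figures above; the use of decimal units (one terabyte as one million megabytes) and the function name are assumptions made for illustration.

```python
# Back-of-envelope estimate of storage consumed by duplicate files.
# The file count and per-file sizes are the figures quoted in the article.
# Assumption: decimal units, i.e. 1 TB = 1,000,000 MB.

DUPLICATE_FILES = 34_000
MB_PER_TB = 1_000_000


def wasted_terabytes(file_count: int, megabytes_per_file: float) -> float:
    """Return the storage taken up by file_count files of the given size, in TB."""
    return file_count * megabytes_per_file / MB_PER_TB


low = wasted_terabytes(DUPLICATE_FILES, 50)   # smallest quoted file size
high = wasted_terabytes(DUPLICATE_FILES, 80)  # largest quoted file size
print(f"{low:.2f} TB to {high:.2f} TB")
```

The 1.7 terabyte figure corresponds to the low end of the range, with every duplicate at 50 megabytes; at 80 megabytes apiece the same files would occupy about 2.7 terabytes.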
What Institutions Are Doing This Week
Staff at the Zentralbibliothek confirmed to The Daily Zurich on Thursday that a working group met on Tuesday, July 1st, at the library's Predigergasse reading room annex to agree on a shared deduplication protocol. The group is piloting an open-source perceptual hashing tool — software that identifies visually identical or near-identical images regardless of filename or metadata — across a test batch of 5,000 files drawn from the Stadtarchiv's Kreis 4 and Kreis 5 collections.
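The working group has not named the tool it is piloting, so the following is only a minimal sketch of how one common perceptual hashing technique, the "average hash", can work in principle. It is not the institutions' software. Images are represented here as plain grids of grayscale values, and the function names and the distance threshold of 5 bits are assumptions chosen for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def shrink(image: Image, size: int = 8) -> List[List[float]]:
    """Downsample to size x size by averaging rectangular blocks of pixels."""
    rows, cols = len(image), len(image[0])
    if rows < size or cols < size:
        raise ValueError("image must be at least size x size pixels")
    small = []
    for r in range(size):
        r0, r1 = r * rows // size, (r + 1) * rows // size
        row = []
        for c in range(size):
            c0, c1 = c * cols // size, (c + 1) * cols // size
            block = [image[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(image: Image, size: int = 8) -> int:
    """Pack one bit per downsampled pixel: 1 if brighter than the mean, else 0."""
    flat = [p for row in shrink(image, size) for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a: Image, b: Image, max_distance: int = 5) -> bool:
    """Treat two images as duplicates if their hashes differ in few bits.
    The threshold of 5 is an illustrative assumption, not a tuned value."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 16x16 test images: a left-to-right gradient, a slightly
# brighter copy of it, and an unrelated top-to-bottom gradient.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[min(255, v + 10) for v in row] for row in original]
different = [[y * 16 for _ in range(16)] for y in range(16)]

print(likely_duplicates(original, brighter))
print(likely_duplicates(original, different))
```

Because the hash records only whether each region is lighter or darker than the image's own average, a rescan with slightly different brightness yields the same bits, while the filename and catalogue metadata play no part in the comparison.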
Results from the pilot are expected by July 18th. If the tool performs well, the three institutions plan to run it across the full combined archive before the end of August, leaving a narrow window to clean the database before the September portal launch. Staff on the working group have also agreed to adopt a unified metadata standard based on the Dublin Core schema, which is already used by the Swiss National Library in Bern and by several cantonal archives in Basel-Stadt.
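How a unified standard might bridge the three catalogues can be sketched as a simple field mapping. The element names `dc:title`, `dc:identifier`, `dc:subject` and `dc:coverage` are genuine Dublin Core elements, but the institutional field names, the choice of which element each one maps to, and the sample record below are hypothetical. The working group has not published its mapping.

```python
from typing import Dict

# Hypothetical field names for each institution's existing catalogue,
# mapped to Dublin Core elements. The actual mapping is not public.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "stadtarchiv": {
        "id": "dc:identifier",
        "title": "dc:title",
        "kreis": "dc:coverage",      # district, e.g. "Kreis 4"
    },
    "eth": {
        "id": "dc:identifier",
        "title": "dc:title",
        "department": "dc:subject",  # research department
    },
    "zentralbibliothek": {
        "id": "dc:identifier",
        "title": "dc:title",
        "class_mark": "dc:subject",  # legacy classification code
    },
}


def to_dublin_core(institution: str, record: Dict[str, str]) -> Dict[str, str]:
    """Rename a record's known fields to Dublin Core elements.
    Fields with no mapping are dropped rather than guessed at."""
    mapping = FIELD_MAPS[institution]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


# Invented sample record, for illustration only.
sample = {"id": "A-1", "title": "Langstrasse", "kreis": "Kreis 4", "scanner": "unit-2"}
print(to_dublin_core("stadtarchiv", sample))
```

Once every record carries the same element names, an entry arriving from a partner institution can be compared against existing entries field by field instead of being registered afresh.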
For Zürich residents hoping to access the new portal — particularly historians, architects, and neighbourhood groups in areas like Aussersihl and Wipkingen who have been promised access to previously unseen street-level photography from the 1950s and 1960s — the advice for now is to wait. The September date has not been officially postponed, but the working group has flagged it as contingent on the pilot's success. Anyone with urgent research needs can contact the Zentralbibliothek's reading room services directly; the physical collections remain accessible by appointment on Zähringerplatz regardless of the digital backlog.