The Daily Zurich

Zurich Leads on Duplicate Image Cleanup in City Archives — but Amsterdam and Vienna Are Catching Up

As municipalities worldwide grapple with bloated digital records, Zurich's Stadtarchiv is running one of Europe's more systematic deduplication programs, though the finish line is still years away.

By Zurich News Desk · Published 4 July 2026, 9:11 pm

Photo: Malte Luk on Pexels

Zurich's city administration confirmed this spring that its digitisation office has flagged more than 340,000 duplicate image files across the municipal photo archive, a sprawling collection that grew rapidly after a 2019 scanning drive pushed tens of thousands of analogue prints online. The deduplication process, handled partly through ETH Zurich's Visual Computing Lab in Hönggerberg, is expected to take until at least late 2027 to complete.

The issue matters now because Zurich is midway through a broader open-data push. The city publishes municipal datasets for public use through the opendata.swiss portal and has been expanding its image holdings there since 2022. Duplicate records inflate storage costs, degrade search results, and undermine the transparency goals of the digital governance charter that the Stadtrat approved in March 2025. With housing data, infrastructure maps, and construction permits increasingly visualised through image-linked records, a cluttered archive creates downstream errors in planning tools used by offices from Altstetten to Witikon.

What Zurich Is Actually Doing

The Stadtarchiv Zürich on Neumarkt is the operational centre of the cleanup. Staff there are working through a three-phase review: automated hash-matching to catch exact duplicates, perceptual hashing to identify near-identical scans with slight colour or resolution differences, and a manual curatorial layer for images of historical significance. The ETH Visual Computing Lab partnership, formalised in a memorandum signed in January 2026, provides the perceptual-matching algorithm at no direct licensing cost — a meaningful saving given the Stadtarchiv's annual digitisation budget, which city documents put at roughly CHF 1.2 million for 2026.
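The first two phases described above can be illustrated with a minimal Python sketch. It is not the Stadtarchiv's or ETH's actual pipeline, whose algorithm the city has not published. It assumes SHA-256 for the exact-match phase and a simple "average hash" over small grayscale thumbnails for the perceptual phase. The function names, the thumbnail format (rows of brightness values), and the `max_distance` threshold are all hypothetical choices made for illustration.

```python
import hashlib
from collections import defaultdict


def exact_duplicate_groups(files):
    """Phase 1 (assumed): group files whose bytes are identical, via SHA-256."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    # Only digests shared by more than one file indicate duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


def average_hash(pixels):
    """Phase 2 (assumed): one bit per pixel, set if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def near_duplicates(thumbnails, max_distance=2):
    """Return name pairs whose average hashes differ by at most max_distance bits."""
    hashes = {name: average_hash(p) for name, p in thumbnails.items()}
    names = sorted(hashes)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if hamming_distance(hashes[first], hashes[second]) <= max_distance:
                pairs.append((first, second))
    return pairs


# Hypothetical sample data: two byte-identical files and one different file.
sample_files = {"a.tif": b"scan-1", "b.tif": b"scan-1", "c.tif": b"scan-2"}

# Hypothetical 4x4 grayscale thumbnails: a scan, a brighter rescan, another image.
base = [[10, 200, 10, 200], [200, 10, 200, 10]] * 2
brighter = [[value + 12 for value in row] for row in base]
other = [[10, 10, 200, 200]] * 2 + [[200, 200, 10, 10]] * 2
sample_thumbs = {"base": base, "brighter": brighter, "other": other}

print(exact_duplicate_groups(sample_files))
print(near_duplicates(sample_thumbs))
```

A cryptographic hash changes completely when a single byte changes, so it only catches exact copies. The average hash depends on each pixel's brightness relative to the image mean, so a uniform brightness shift leaves it unchanged. The third, manual curatorial phase has no code equivalent here.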

The programme also feeds into Zurich's responsibilities under the Swiss Federal Act on Archiving, which sets retention and quality standards for public records. Cantons and municipalities that allow large-scale data redundancy risk compliance complications during federal audits, an incentive that the city's IT department, based at the Stadthaus on Stadthausquai, has cited internally as a driver of the timeline.

How Zurich Compares to Peer Cities

Amsterdam's Stadsarchief began a comparable deduplication exercise in 2023, initially focused on its pre-1950 photographic holdings. The Dutch institution has publicly reported removing approximately 180,000 duplicate files from a collection roughly half the size of Zurich's — suggesting a faster per-file clearance rate, though Amsterdam's archive is more homogeneous in format, which simplifies automated matching. Vienna's Wiener Stadt- und Landesarchiv launched a deduplication project in late 2024 tied to that city's Smart City Wien 2025 framework, but the programme is still in its first phase and has not published comparable throughput figures.

Berlin's Landesarchiv, by contrast, has taken a different approach entirely: it has deprioritised deduplication in favour of raw digitisation volume, reasoning that storage costs are falling fast enough to make cleanup less urgent than access. That reasoning has critics among archival professionals elsewhere in Europe, who say it defers a problem rather than solving it. Zurich's method of running digitisation and deduplication in parallel costs more upfront but is designed to avoid the kind of compounding backlog Berlin now faces.

Storage is not trivial even at Swiss prices. Enterprise-grade archival storage in Switzerland runs at roughly CHF 0.03 to CHF 0.05 per gigabyte per month for municipal contracts, according to published procurement frameworks. A collection carrying 340,000 redundant high-resolution image files can tie up several terabytes of avoidable storage, a monthly expense that compounds over the multi-year lifecycle of a public archive.
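To give a rough sense of scale, the arithmetic can be sketched in Python. The file count and the price range come from the figures above. The average file size of 20 MB is purely an assumption for illustration, since the city has not published one, and the function name is hypothetical.

```python
def monthly_storage_cost_chf(file_count, avg_file_mb, chf_per_gb_month):
    """Monthly cost of storing file_count files of avg_file_mb megabytes each."""
    total_gb = file_count * avg_file_mb / 1000  # decimal units: 1 GB = 1000 MB
    return total_gb * chf_per_gb_month


DUPLICATE_FILES = 340_000   # flagged duplicates, as reported by the city
ASSUMED_AVG_FILE_MB = 20    # ASSUMPTION: not a published figure

total_tb = DUPLICATE_FILES * ASSUMED_AVG_FILE_MB / 1_000_000
low = monthly_storage_cost_chf(DUPLICATE_FILES, ASSUMED_AVG_FILE_MB, 0.03)
high = monthly_storage_cost_chf(DUPLICATE_FILES, ASSUMED_AVG_FILE_MB, 0.05)

print(f"Redundant storage: {total_tb:.1f} TB")
print(f"Monthly cost: CHF {low:.0f} to CHF {high:.0f}")
```

Under that assumed file size, the duplicates would occupy about 6.8 terabytes and cost roughly CHF 204 to CHF 340 per month. Larger master scans would scale both figures proportionally.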

For residents or researchers who use the Stadtarchiv's public reading room on Neumarkt, or who pull images through the opendata.swiss API, the practical improvement should become visible by mid-2027: faster search returns, fewer duplicate hits when querying construction or neighbourhood history records, and cleaner links when the archive feeds into planning visualisations. The city's IT office has said it will publish a progress report each January until the project closes. The next one is due in January 2027.

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.