The Daily Zurich

Zurich news, every day

Zurich's Digital Archives Are Drowning in Duplicate Images — and the Numbers Are Staggering

A quiet crisis in data management is costing Swiss institutions millions of francs and countless storage hours, as redundant image files pile up across public and private databases.

By Zurich News Desk · Published 4 July 2026, 8:28 pm

Photo: Paolo Bici on Pexels

Zurich's major institutions collectively hold hundreds of thousands of duplicate image files across their digital archives, a problem that has quietly ballooned into a measurable financial and operational burden. Internal audits by several Swiss cultural and research bodies over the past 18 months have flagged redundant image data as one of the fastest-growing sources of unnecessary storage expenditure in the city's public sector.

The issue matters right now for a specific reason: Switzerland's federal data retention framework is under revision, with the Federal Archives in Bern expected to publish updated guidelines before the end of 2026. That process has pushed cantonal institutions in Zurich to finally put numbers to a problem they have long acknowledged in theory but rarely quantified in practice.

What the Numbers Actually Show

The Stadt Zürich's digitisation programme, which has run in phases since 2019 under the city administration's Digitale Verwaltung initiative, identified duplicate image rates of between 18 and 34 percent across scanned document collections, depending on the collection type, according to internal project documentation reviewed by technical staff familiar with the programme. Photo archives and scanned building permit records were the worst-affected categories. Each duplicate file occupies between 4 and 120 megabytes of server space depending on resolution, so a collection of 100,000 redundant images at the top of that range consumes about 12 terabytes of active storage.
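The storage arithmetic is easy to check. The following is an illustrative sketch only, not part of any institution's tooling: the function name is hypothetical, and it assumes decimal units (1 terabyte = 1,000,000 megabytes).

```python
MB_PER_TB = 1_000_000  # assumption: decimal units, 1 TB = 1,000,000 MB


def duplicate_storage_tb(n_files: int, mb_per_file: float) -> float:
    """Storage occupied by n_files redundant files of a given size, in terabytes."""
    return n_files * mb_per_file / MB_PER_TB


# Bounds for 100,000 redundant images, using the file sizes quoted in the article.
low = duplicate_storage_tb(100_000, 4)     # smallest quoted file size
high = duplicate_storage_tb(100_000, 120)  # largest quoted file size
print(f"{low:.1f} TB to {high:.1f} TB")
```

For 100,000 redundant files the range runs from 0.4 terabytes at 4 megabytes per file to 12 terabytes at 120 megabytes per file, so the headline figure applies only to collections scanned at the highest resolutions.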

At ETH Zurich, the university's IT Services division has been running a structured deduplication programme since January 2025 across its research data repositories on the Hönggerberg campus. The university stores research imagery from disciplines ranging from materials science to urban mapping, and ETH's own published figures on its research data management portal show that deduplication exercises on pilot collections reduced storage loads by an average of 22 percent in the first six months of the programme.

The cost dimension is concrete. Commercial cloud storage in the Swiss market, predominantly through providers operating under Swiss data residency requirements, runs at roughly CHF 0.022 per gigabyte per month for archival tiers. An institution holding 500 terabytes of image data with a 25 percent duplication rate is paying to store 125 terabytes of redundant files, which translates to approximately CHF 33,000 in avoidable annual storage costs before staff time is counted. Multiply that across Zurich's network of cantonal archives, university libraries, and municipal departments, and the aggregate figure becomes significant.
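The CHF 33,000 figure can be reproduced from the numbers in the paragraph above. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, 1 terabyte is taken as 1,000 gigabytes, and the price is treated as flat across the year.

```python
def avoidable_annual_cost_chf(
    total_tb: float, duplicate_share: float, chf_per_gb_month: float
) -> float:
    """Yearly cost of storing redundant data, in Swiss francs."""
    duplicate_gb = total_tb * 1_000 * duplicate_share  # assumption: 1 TB = 1,000 GB
    return duplicate_gb * chf_per_gb_month * 12


# Figures quoted in the article: 500 TB held, 25 percent duplicated,
# CHF 0.022 per gigabyte per month on an archival tier.
cost = avoidable_annual_cost_chf(500, 0.25, 0.022)
print(f"CHF {cost:,.0f}")
```

Swapping in an institution's own holdings, measured duplication rate, and contracted price gives a comparable estimate; staff time and retrieval fees are not modelled.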

Niederhasli Road to Langstrasse: The Local Pipeline

The practical workflow for addressing duplicates runs through a small number of specialised service providers and in-house digital teams. The Zentralbibliothek Zürich on Zähringerplatz has been among the more proactive institutions, running hash-based image comparison scripts across its digitised newspaper photograph collection since late 2024. The library's digital preservation team has processed roughly 1.2 million image files in that period, flagging around 190,000 as confirmed duplicates or near-duplicates, a duplication rate of just under 16 percent in that specific collection.
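The article does not describe the library's scripts, so the following is only a generic sketch of what hash-based comparison can look like. It assumes byte-identical copies are the target and uses SHA-256 from Python's standard library; the function names and the list of file suffixes are hypothetical. Near-duplicates such as resized or recompressed images would need a perceptual comparison method, which this sketch does not attempt.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumption: which file suffixes count as images in the archive.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's bytes, read in chunks so large scans fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group image files under root by digest; keep groups with 2+ members."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            by_digest[file_digest(path)].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}


def duplication_rate(groups: dict[str, list[Path]], total_files: int) -> float:
    """Share of files that are redundant copies (one file per group is kept)."""
    redundant = sum(len(paths) - 1 for paths in groups.values())
    return redundant / total_files if total_files else 0.0
```

Counted the same way, 190,000 flagged files out of 1.2 million works out to 15.8 percent, consistent with the "just under 16 percent" quoted for the newspaper photograph collection.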

Smaller creative and media companies in the Langstrasse and Kreis 5 districts, many of which maintain their own image asset libraries for advertising and design clients, face the same structural problem at a smaller scale. Digital asset management software licences typically start at CHF 200 to CHF 600 per month for business-tier packages that include automated duplicate detection. For a ten-person studio, that represents a meaningful line item, and many still rely on manual folder-checking instead.

The revised federal guidelines expected later this year are likely to require documented deduplication procedures as part of compliance reporting for publicly funded archives. Institutions that have not yet audited their collections should begin with hash-comparison scans of their highest-volume image directories — a process that costs relatively little in compute time but requires a clear remediation policy before deletion decisions are made. For private companies in Zurich holding client image assets, the practical first step is a storage audit benchmarked against current Swiss cloud pricing, which makes the cost of inaction easy to quantify and hard to ignore.
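One way to keep detection separate from deletion is to turn scan results into a reviewable plan rather than removing anything automatically. The sketch below is illustrative only: the "keep the shortest path" rule is a hypothetical placeholder policy, not a requirement drawn from any guideline, and the function names and sample paths are invented for the example.

```python
import csv
import io


def remediation_plan(groups: dict[str, list[str]]) -> list[tuple[str, str, str]]:
    """Turn duplicate groups (digest -> paths) into rows of (digest, path, action).

    Placeholder policy: keep the shortest path in each group (ties broken
    alphabetically) and mark the rest for human review. Nothing is deleted.
    """
    rows = []
    for digest, paths in sorted(groups.items()):
        keeper, *redundant = sorted(paths, key=lambda p: (len(p), p))
        rows.append((digest, keeper, "keep"))
        rows.extend((digest, path, "review_for_deletion") for path in redundant)
    return rows


def plan_to_csv(rows: list[tuple[str, str, str]]) -> str:
    """Serialise the plan as CSV text so it can be reviewed and signed off."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["digest", "path", "action"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical scan result: one group of two byte-identical files.
example = {"abc123": ["archive/2019/scan_0001.tif", "archive/scan_0001.tif"]}
plan = remediation_plan(example)
print(plan_to_csv(plan))
```

Because the output is only a list of proposed actions, an archivist can change the retention rule or override individual rows before any file is touched.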

About this article

Published by The Daily Zurich

This article was produced by The Daily Zurich editorial desk and covers news in Zurich. See our editorial standards for how we use AI.
