unsubbed.co

Immich Deduper

Released under GPL-3.0, Immich Deduper is an extension toolkit for managing Immich photo libraries on self-hosted infrastructure.

Honestly reviewed for Immich users who want to reclaim storage without deleting memories they’ll regret.

TL;DR

  • What it is: A standalone companion tool that finds and removes duplicate or visually similar photos in your Immich library using deep learning — not filename matching [README].
  • Who it’s for: Existing Immich users with libraries grown large enough that duplicate photos are a real storage and organization problem. Not a general tool — zero value if you don’t run Immich [README].
  • Cost: Free and GPL-3.0 licensed. You run it alongside Immich on your existing server [README].
  • Key strength: Uses ResNet152 visual embedding rather than file-hash comparison, so it catches near-duplicates — the burst shots, lightly edited copies, and redownloaded images that hash-based tools miss [README].
  • Key weakness: Requires direct PostgreSQL access to Immich’s database (not just the API), adds Qdrant as another service you have to keep running, and the most useful feature — Metadata Merge — is explicitly still in beta with the developer’s own warning not to use it on photos you care about [README].

What is Immich Deduper

Immich is a self-hosted photo backup platform — a Google Photos replacement that stores your library on your own hardware rather than Google’s [1][4]. As libraries grow (especially after years of phone backups, cloud migrations, and household merges), duplicates accumulate: burst shots of the same moment, RAW+JPEG pairs, the same image downloaded twice from different devices.

Immich Deduper (previously called Immich MediaKit) is a separate web application that plugs into your running Immich setup to find and surface these duplicates. Instead of comparing file hashes or filenames, it runs each photo through a ResNet152 neural network to extract a feature vector representing what the image looks like, stores those vectors in a Qdrant vector database, and then uses vector similarity search to group photos that look alike — regardless of filename, file size, or exact pixel data [README].
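To make "vector similarity" concrete, here is a minimal sketch of one common way to compare two feature vectors: cosine similarity. The README does not say which distance metric Deduper configures in Qdrant, so treating it as cosine is an assumption, and the four-value vectors below are toy stand-ins for real ResNet152 features, which are far longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction to compare")
    return dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical values, for illustration only).
original = [0.9, 0.1, 0.4, 0.2]
light_edit = [0.88, 0.12, 0.41, 0.19]   # almost the same picture
different = [0.1, 0.9, 0.05, 0.6]       # an unrelated picture

near_score = cosine_similarity(original, light_edit)
far_score = cosine_similarity(original, different)
```

In this toy data the lightly edited copy scores above 0.97 against the original while the unrelated picture scores well below 0.60, which is the kind of separation a similarity threshold relies on.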

The deletion flow respects Immich’s own trash system: when you mark a duplicate for removal, it goes to Immich’s trash rather than being permanently deleted immediately [README]. That’s a meaningful safety net given the irreversibility of photo deletion.

The project sits at 481 GitHub stars, which is small by open-source standards. It’s a single-developer project by RazgrizHsu. There are no independent published reviews of Immich Deduper specifically — the tool is niche enough that it lives and dies by GitHub issues and the Immich community forums. The observations in this review draw primarily from the README, the repository structure, and contextual knowledge of the Immich ecosystem it depends on [1][3][4].


Why people choose it

Immich has grown quickly as the de facto self-hosted Google Photos alternative [5]. Recent versions include built-in duplicate detection, but that detection works on exact file hashes, so it only catches truly identical files. It does not catch the burst of ten nearly-identical sunset shots, the JPEG converted from a RAW you already have, or the same image re-uploaded at slightly different compression.

People reach for Immich Deduper when they’ve already cleaned up the obvious duplicates and still have tens of thousands of photos that are “the same” in any practical sense. The visual similarity approach — adjustable from a 0.97 threshold (near-identical) down to 0.60 (similar composition or subject) — lets you tune how aggressively you want to collapse similar shots [README].

The cross-user detection feature is a specific pain point it solves: couples or families sharing an Immich server often have both users backing up their phones, which creates duplicates across user accounts rather than within them. Immich’s native tools don’t bridge that gap; Immich Deduper does [README].

The metadata merge feature — still in beta — addresses a downstream problem that most duplicate tools ignore: when you delete one copy of a photo, you might lose metadata that only existed on that copy. Favorites, album assignments, tags, descriptions, ratings. The tool attempts to consolidate all of that onto the kept copy before deleting the others [README].


Features: what it actually does

Based on the README:

Core detection:

  • Visual similarity detection via ResNet152 feature vectors [README]
  • Adjustable similarity threshold (0.60–0.97+) — tune between “similar shots” and “exact duplicate” [README]
  • Related Tree: follows the graph of similarity to surface photos connected to connected photos [README]
  • Search from Photo: start with a specific image and find everything visually similar to it [README]
  • Cross-user duplicate detection (works across multiple Immich accounts on the same server) [README]
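The Related Tree idea, photos connected to connected photos, can be pictured as connected components in a graph whose edges are pairs scoring at or above the threshold. The sketch below is one plausible reading of that description, not Deduper's actual implementation; the function name, photo IDs, and scores are all hypothetical.

```python
from collections import defaultdict, deque


def related_groups(scores, threshold):
    """Group photo IDs into connected components of the similarity graph.

    scores: dict mapping (id_a, id_b) -> similarity in [0, 1].
    Two photos are linked when their score is >= threshold.
    """
    neighbors = defaultdict(set)
    nodes = set()
    for (a, b), score in scores.items():
        nodes.update((a, b))
        if score >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    groups, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:  # breadth-first walk over linked photos
            node = queue.popleft()
            component.append(node)
            for nxt in sorted(neighbors[node] - seen):
                seen.add(nxt)
                queue.append(nxt)
        groups.append(sorted(component))
    return groups


# Hypothetical pairwise scores: "a" and "c" are not directly similar
# enough at 0.60, but both are linked to "b".
scores = {
    ("a", "b"): 0.98,
    ("b", "c"): 0.91,
    ("a", "c"): 0.58,
    ("c", "d"): 0.40,
}

strict = related_groups(scores, 0.97)   # [['a', 'b'], ['c'], ['d']]
loose = related_groups(scores, 0.60)    # [['a', 'b', 'c'], ['d']]
```

Lowering the threshold pulls "c" into the group through "b" even though "a" and "c" never cross the threshold directly. That chaining is why a loose threshold can produce surprisingly large groups.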

Review and selection UI:

  • Multi Mode: process up to 50 duplicate groups in a single session [README]
  • Auto-selection logic: picks the “best” copy by configurable criteria — date, file size, EXIF data [README]
  • Exclude filters: skip specific file extensions (.dng, .png or custom patterns) to avoid incorrect duplicate matches on RAW+JPEG pairs [README]
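As a rough illustration of how exclude filters and auto-selection could interact, the sketch below drops excluded extensions from a group and then ranks what remains. The ranking order (richest EXIF, then largest file, then earliest date) and every field name are assumptions made for the example; the README only says the criteria are configurable.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Candidate:
    # Hypothetical fields, not Immich's or Deduper's actual schema.
    asset_id: str
    filename: str
    taken_at: datetime
    size_bytes: int
    exif_field_count: int


def apply_exclude_filter(group, excluded_extensions):
    """Drop candidates whose filename ends with an excluded extension."""
    suffixes = tuple(ext.lower() for ext in excluded_extensions)
    return [c for c in group if not c.filename.lower().endswith(suffixes)]


def pick_keeper(group):
    """Pick one copy to keep: most EXIF fields, then largest, then earliest."""
    if not group:
        return None
    return min(
        group,
        key=lambda c: (-c.exif_field_count, -c.size_bytes, c.taken_at),
    )


group = [
    Candidate("raw", "IMG_0001.DNG", datetime(2023, 5, 1, 10, 0), 25_000_000, 40),
    Candidate("jpeg_full", "IMG_0001.jpg", datetime(2023, 5, 1, 10, 0), 4_200_000, 38),
    Candidate("reshare", "received_0001.jpg", datetime(2023, 6, 2, 8, 30), 900_000, 3),
]

filtered = apply_exclude_filter(group, [".dng"])
keeper = pick_keeper(filtered)
```

With the .dng file excluded, the RAW original never enters the comparison, and the full-size JPEG wins over the compressed reshare.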

Deletion:

  • Sends removed photos to Immich trash — permanently deletable or restorable from there [README]
  • Follows Immich’s own deletion logic for database consistency [README]
  • Requires a manual Fetch sync after trashing assets directly in Immich to keep Qdrant in sync [README]
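Conceptually, a Fetch sync has to reconcile two sets of IDs: what the vector index knows about and what Immich still considers active. The sketch below shows that reconciliation as plain set arithmetic. It is a simplified model with hypothetical names and does not reflect how Deduper actually queries Immich or Qdrant.

```python
def plan_sync(indexed_ids, active_asset_ids):
    """Work out which vectors are stale and which assets still need indexing."""
    indexed = set(indexed_ids)
    active = set(active_asset_ids)
    return {
        # Vectors for assets that were trashed or removed outside the tool.
        "remove_from_index": sorted(indexed - active),
        # Assets added to the library since the last index run.
        "add_to_index": sorted(active - indexed),
    }


# Hypothetical state: "p2" was trashed in Immich, "p4" was uploaded later.
plan = plan_sync(indexed_ids=["p1", "p2", "p3"], active_asset_ids=["p1", "p3", "p4"])
```

Until the stale entries are removed, a similarity search can still return "p2" as a candidate, which is the symptom the sync step exists to clear.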

Metadata Merge (BETA — treat with caution):

  • Albums: adds kept photo to every album any duplicate was in [README]
  • Favorites, Tags, Rating: merges across the group (highest rating wins, favorites propagate) [README]
  • Description: concatenates, deduplicated line by line [README]
  • Location: applies most common GPS coordinates across the group [README]
  • Visibility: applies the strictest setting (locked > hidden > archived > timeline) [README]
  • Creates XMP sidecar files to prevent Immich’s “Refresh Metadata” from overwriting merged values [README]
  • Atomic: if any step fails, all changes (DB + XMP) roll back [README]
  • Requires exiftool installed and write permission to the photo directories [README]
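The merge rules listed above are easier to reason about as code. This is a minimal sketch of the rules as the README describes them, using a hypothetical PhotoMeta record. It deliberately leaves out the XMP sidecar writing and the atomic rollback, and it skips blank description lines, which is a choice made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional, Tuple

# Higher number = stricter, following locked > hidden > archived > timeline.
VISIBILITY_STRICTNESS = {"timeline": 0, "archived": 1, "hidden": 2, "locked": 3}


@dataclass
class PhotoMeta:
    # Hypothetical record, not Immich's actual schema.
    albums: set = field(default_factory=set)
    tags: set = field(default_factory=set)
    favorite: bool = False
    rating: int = 0
    description: str = ""
    gps: Optional[Tuple[float, float]] = None
    visibility: str = "timeline"


def merge_metadata(group):
    """Combine the metadata of a duplicate group into one record."""
    merged = PhotoMeta()
    lines = []
    gps_votes = Counter()
    for photo in group:
        merged.albums |= photo.albums                    # union of albums
        merged.tags |= photo.tags                        # union of tags
        merged.favorite = merged.favorite or photo.favorite
        merged.rating = max(merged.rating, photo.rating)  # highest rating wins
        for line in photo.description.splitlines():
            if line.strip() and line not in lines:       # dedupe line by line
                lines.append(line)
        if photo.gps is not None:
            gps_votes[photo.gps] += 1
        if VISIBILITY_STRICTNESS[photo.visibility] > VISIBILITY_STRICTNESS[merged.visibility]:
            merged.visibility = photo.visibility         # strictest wins
    merged.description = "\n".join(lines)
    if gps_votes:
        merged.gps = gps_votes.most_common(1)[0][0]      # most common coordinates
    return merged


merged = merge_metadata([
    PhotoMeta(albums={"Trip"}, tags={"beach"}, rating=3,
              description="Sunset\nDay 2", gps=(35.0, 139.0)),
    PhotoMeta(albums={"Best of"}, favorite=True, rating=5,
              description="Sunset", gps=(35.0, 139.0), visibility="archived"),
    PhotoMeta(gps=(35.1, 139.1)),
])
```

Even in this simplified form the rules involve judgment calls, such as how to break a tie between GPS coordinates, which is part of why a beta label on the real feature deserves to be taken seriously.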

Pricing: SaaS vs self-hosted math

Immich Deduper is free software under GPL-3.0. There is no paid tier, no cloud version, no subscription [README].

The cost comparison is therefore: Immich Deduper on your existing Immich server versus dedicated duplicate-cleaning software for large photo libraries.

Common alternatives and what they cost:

  • Gemini 2 (macOS, one-time): $19.99. Works across any local folder, no server required, well-reviewed for Mac users — but operates only on local copies, not a self-hosted server library.
  • Duplicate Cleaner Pro (Windows): ~$34.99 one-time. Similar scope — local files only.
  • Google Photos duplicate detection: built into Google Photos but only available if you’re paying for Google One storage. At 100GB: $2.99/mo (~$36/yr). Loses all relevance the moment you’re self-hosting.
  • digiKam (open source): free, handles duplicates among its many features, but is a full desktop DAM application, not a companion to a running server.

If you’re already running Immich [1][4], Immich Deduper costs you only the compute overhead — one additional Docker container plus Qdrant — on infrastructure you already own. The first-time indexing of a large library (tens of thousands of photos) requires meaningful CPU time to run ResNet152 inference. This is a one-time cost per index build.

The honest math: if you’re already self-hosting Immich, the software cost is zero. If you’re not, no amount of Immich Deduper’s features justifies setting up Immich just for deduplication.


Deployment reality check

This is where Immich Deduper earns its complexity flag.

What you need before starting:

  • A running Immich installation [1][4]. This alone requires Docker, at minimum 6GB RAM, 2 CPU cores, a PostgreSQL instance, and a reverse proxy for HTTPS [1].
  • Direct access to Immich’s PostgreSQL database, not just the API. The tool reads user and asset data directly from the DB [README]. This means you need to expose or share the DB connection string with Immich Deduper.
  • A running Qdrant instance. Qdrant is a separate vector database that ships with Immich Deduper’s Docker setup, but it’s an additional service that needs to stay running and healthy. If Qdrant loses its data (volume misconfiguration, container restart without persistence), you re-index from scratch.
  • For Metadata Merge: write permission to your photo directories, and either exiftool installed in the container (handled if using Docker) or installed on the host [README].

Install path: The README provides Docker Compose as the primary path. The project includes a Dockerfile and docker directory. There’s no Helm chart or turnkey one-click installer.

What can go sideways:

  • The direct PostgreSQL dependency means Immich Deduper needs to be on the same Docker network as Immich’s DB, or the DB needs to be accessible from the Deduper container. In non-trivial network setups (Immich behind Traefik, DB on a different host), this requires explicit configuration.
  • Qdrant persistence: if you don’t correctly mount a Qdrant data volume, a container restart wipes all indexed vectors and you re-index from zero. On a 50K-photo library this takes significant time.
  • The Metadata Merge beta caveat from the developer is not subtle: “If you are not willing to participate in testing and accept potential risks, please do not enable this feature. Do not test on photos you care about.” [README]. The feature touches the Immich PostgreSQL database directly and writes XMP files to your library. The rollback logic is atomic in theory; in practice, beta means the edge cases aren’t fully mapped.
  • Syncing state: if you trash photos directly in Immich (not through Deduper), you must run a Fetch sync to update Qdrant, otherwise Deduper still shows those photos as candidates [README].
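Most of the network problems above show up as "the container cannot reach the database". A quick reachability check run from inside the Deduper container's network can narrow that down before you dig into credentials. The sketch below uses only the Python standard library. The hostnames in the usage note are placeholders, and the ports are the PostgreSQL and Qdrant defaults (5432 and 6333), which your setup may override.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


def preflight(targets):
    """Map each service name to whether its TCP port is reachable.

    targets: dict of name -> (host, port)
    """
    return {name: port_open(host, port) for name, (host, port) in targets.items()}
```

A call such as preflight({"postgres": ("immich_postgres", 5432), "qdrant": ("qdrant", 6333)}), with the hostnames replaced by your own service names, returns a dict of booleans. A False for postgres points at Docker networking or firewall rules rather than at Deduper itself. A True only proves the port is open, not that the credentials work.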

Realistic time estimate for someone already running Immich:

  • Docker Compose setup: 30–60 minutes.
  • First index on a 20K-photo library: 1–3 hours of CPU time depending on hardware.
  • First review session: another hour learning the UI and tuning thresholds.

For someone new to self-hosting, the prerequisite of running Immich [1][4] is the real barrier — Immich Deduper is the last mile, not the starting point.


Pros and cons

Pros

  • Visual similarity, not hash matching. Catches burst shots, lightly-edited copies, redownloaded images — the whole category of duplicates that hash tools miss [README].
  • Threshold control. Tunable from near-identical (0.97) to broadly similar (0.60), so you decide how aggressively to merge similar shots without being forced into an all-or-nothing decision [README].
  • Cross-user detection. Solves the family-shared-library problem where duplicates live in separate accounts, not just within one [README].
  • Safe deletion path. Photos go to Immich trash, not permanent deletion. Recoverable until you explicitly empty the trash [README].
  • Exclude filters. Skip .dng files (or any extension) to avoid falsely flagging RAW+JPEG pairs as duplicates — a specific pain point for photographers [README].
  • Metadata Merge concept is genuinely thoughtful. The idea of consolidating albums, tags, favorites, and ratings before deleting is the right approach. Most duplicate tools delete blindly [README].
  • GPL-3.0, no cost. No subscription, no per-scan pricing, no vendor lock-in [README].

Cons

  • Hard dependency on Immich. Not a general-purpose tool. Zero value outside of the Immich ecosystem [README].
  • Requires direct PostgreSQL access. Not an API integration — it reads and writes to Immich’s database directly. A misconfigured connection can corrupt your Immich data [README].
  • Qdrant is another service to maintain. Another container, another volume to back up, another thing that can drift out of sync with reality [README].
  • Metadata Merge is beta with developer-flagged risk. The developer explicitly warns against using it on photos you care about. For a tool whose entire job is touching irreplaceable data, “don’t use this on photos you care about” is a significant caveat [README].
  • 481 GitHub stars, single developer. If RazgrizHsu moves on, this project could stall. There’s no organization or corporate backing behind it [README].
  • GPU acceleration is not explicitly mentioned. The requirements file structure (requirements-cuda.txt alongside requirements-cpu.txt) suggests CUDA support exists [README], but initial indexing on CPU-only hardware with a large library will be slow.
  • No independent reviews or community benchmarks. There’s no published data on accuracy rates, false positive rates, or performance on large libraries beyond what you can test yourself.

Who should use this / who shouldn’t

Use Immich Deduper if:

  • You’re already running Immich and have a library with a meaningful duplicate problem — particularly burst shots, near-duplicates from multiple devices, or a shared library across multiple users.
  • You’ve already cleaned up exact duplicates and still have thousands of “essentially the same photo” cluttering your library.
  • You’re comfortable adding one more Docker service (Qdrant) to your stack and maintaining it.
  • You’re willing to test Metadata Merge carefully with fake duplicates before running it on real photos, or you’re content using the tool only for detection and basic deletion.

Skip it if:

  • You don’t run Immich. Nothing here applies to you.
  • You’re running Immich but only have a small library (a few thousand photos). Immich’s built-in duplicate detection handles exact duplicates, and the setup overhead isn’t worth it for marginal benefit.
  • You want a turnkey, tested, stable tool for irreplaceable family photos. A single-developer GPL project with a beta metadata engine is not that — yet.
  • You can’t tolerate another service in your stack or don’t have the patience to re-index after configuration issues.

Alternatives worth considering

Within the Immich ecosystem:

  • Immich’s built-in duplicate detection: available in recent Immich versions [5]. Handles exact file duplicates only. First thing to try before adding another service.

For general photo deduplication (not Immich-specific):

  • digiKam (open source, cross-platform): full-featured photo management with duplicate detection built in. Works on local folders. Steep learning curve, heavy application, but well-maintained and genuinely powerful for photographers.
  • Gemini 2 (macOS, $19.99): polished duplicate cleaner for Mac users with a local photo library. No server, no containers — just works. Not useful for server-based libraries.
  • Duplicate Photo Cleaner (Windows): similar scope to Gemini but Windows-native.
  • rmlint (Linux CLI): finds duplicate files by hash and content, free, blazing fast. Doesn’t understand visual similarity, but handles the “exact same file, different path” problem efficiently.

If you’re still evaluating whether to run Immich at all:

  • PhotoPrism: another self-hosted Google Photos alternative. Has its own duplicate handling and doesn’t need a separate deduper tool. Different community, different trade-offs [1].
  • Nextcloud + Photos plugin: less photo-centric but broader functionality.

Bottom line

Immich Deduper solves a real problem that Immich’s built-in tools don’t: the “burst shots and near-duplicates” category of library bloat that accumulates over years of phone backups. The ResNet152 visual similarity approach is technically sound and meaningfully better than hash-based detection for real-world photo libraries.

The honest ceiling on this recommendation is that it’s a single-developer GPL tool at 481 stars with a beta metadata engine that the author explicitly tells you not to use on photos you care about. If you’re running a family photo archive, that caveat deserves weight. If you’re a hobbyist who wants to clean up a messy library and you’re comfortable with the setup and rollback paths, Immich Deduper is the only tool doing this job for Immich users, and it does it more thoughtfully than most.

If the setup is the blocker — getting Immich running in the first place, or adding Qdrant and configuring database access — that’s the kind of one-time deployment upready.dev handles for clients. The ongoing maintenance is light once it’s running.


Sources

  1. Immich Quick Start Guide — docs.immich.app. https://docs.immich.app/overview/quick-start/
  2. Immich Mobile App Documentation — docs.immich.app. https://docs.immich.app/features/mobile-app/
  3. Immich Upgrading Guide — docs.immich.app. https://docs.immich.app/install/upgrading/
  4. Immich Docker Compose Installation — docs.immich.app. https://docs.immich.app/install/docker-compose/
  5. Immich Roadmap — immich.app. https://immich.app/roadmap
