Paperless-ngx
A community-supported supercharged version of paperless. Scan, index, and archive all your physical documents.
Self-hosted document management, honestly reviewed. No marketing fluff, just what you get when you stop paying Evernote to store your tax returns.
TL;DR
- What it is: Open-source (GPL-3.0) document management system — scan, OCR, auto-tag, and full-text search every document you own, running entirely on your own server [4].
- Who it’s for: Anyone drowning in physical paperwork or scattered PDFs who wants a private, searchable archive. Particularly valuable for freelancers, small business owners, and home users with ongoing document intake (invoices, contracts, tax documents) [1][4].
- Cost savings: Evernote Personal runs ~$14.99/mo; Evernote Professional ~$17.99/mo. Paperless-ngx self-hosted runs on a $5–10/mo VPS with no document count limits, no upload limits, and no subscription that doubles when your trial ends [2].
- Key strength: The consume-folder automation is genuinely magical — drop a scan in, walk away, come back to a tagged and indexed document. Full-text search works across everything, including scanned images, via OCR [1][4].
- Key weakness: GPL-3.0 license (not MIT) and meaningful setup complexity. The standard setup needs Docker, a database (PostgreSQL or MariaDB), Redis, and ideally a reverse proxy. Not a tool you hand to a non-technical founder and expect them to manage alone [3][5].
What is Paperless-ngx
Paperless-ngx is a self-hosted document management system. The pitch is in the name: scan your physical documents, hand them to Paperless-ngx, and keep less paper. The GitHub README describes it as a system that “transforms your physical documents into a searchable online archive” [README]. That undersells it.
The actual workflow is: you point a “consume” folder at Paperless-ngx. Anything dropped into that folder — a scanner output, a downloaded PDF invoice, a photographed receipt — gets processed automatically. The system runs OCR using the open-source Tesseract engine (supporting 100+ languages), extracts all the text, then applies machine learning to suggest or auto-assign tags, correspondents (the sender/recipient), and document types [4]. The result sits in a searchable web interface with full-text search, relevance ranking, and a “more like this” feature that surfaces similar documents [4].
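The "drop a file in the consume folder" step can be scripted. The following is a minimal sketch, not something taken from the Paperless-ngx docs: the function name `drop_into_consume` and the staging-directory approach are assumptions of this example. It copies the file to a staging directory next to the consume folder and then renames it into place, so the file shows up in one step. It assumes the staging directory and the consume folder are on the same filesystem.

```python
import os
import shutil
import tempfile
from pathlib import Path


def drop_into_consume(source: Path, consume_dir: Path) -> Path:
    """Copy `source` into `consume_dir` so it appears there in a single rename.

    Hypothetical helper: stages the copy in a sibling directory first, then
    moves it into place. Requires both directories on the same filesystem.
    """
    consume_dir.mkdir(parents=True, exist_ok=True)
    # Stage next to (not inside) the consume folder, so nothing watching the
    # consume folder sees a partially written file.
    staging = Path(tempfile.mkdtemp(prefix=".staging-", dir=consume_dir.parent))
    try:
        staged = staging / source.name
        shutil.copy2(source, staged)
        target = consume_dir / source.name
        os.replace(staged, target)  # atomic when source and target share a filesystem
    finally:
        shutil.rmtree(staging, ignore_errors=True)
    return target
```

A scanner post-processing hook could call something like `drop_into_consume(Path("scan.pdf"), Path("/srv/paperless/consume"))`, where the consume path is whatever directory your deployment maps into the container.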
Paperless-ngx is the official successor to the original Paperless project and its fork Paperless-ng, now maintained by a community team rather than a single developer [README][4]. As of this review it sits at 37,438 GitHub stars, which puts it solidly in the top tier of self-hosted productivity tools by community interest.
The software is written in Python and TypeScript. Documents are stored as PDF/A on disk — a format designed for long-term archival — alongside the unaltered originals [4]. Nothing goes to a cloud. Nothing is transmitted anywhere. Your data lives where you put it.
Why people choose it
The people who land on Paperless-ngx are mostly solving one of two problems: either they have a physical paper problem (filing cabinets, stacks of documents they can never find when needed), or they have a cloud privacy problem (documents scattered across Google Drive, Dropbox, or Evernote, on someone else’s servers).
The physical paper problem. Yash Patel, writing for XDA Developers [1], describes the moment that made the tool non-negotiable: “If my office caught fire tomorrow, my biggest stress wouldn’t be the hardware; it would be the mountain of tax returns, client contracts, important files, and property deeds gathering dust in my filing cabinet.” The consume-folder automation is what closes that gap — he describes dropping scans in and having them automatically tagged, labeled, and filed without any manual categorization step. He pairs it with Obsidian to embed PDFs in notes, creating a complete knowledge workflow [1].
The cloud privacy problem. The Packetswitch writeup [3] opens with the same framing: documents were in Google Drive, accessible and searchable, but “with all the concerns around privacy and data usage, I’d prefer to keep my documents locally.” The trade-off being evaluated is: convenience of someone else’s cloud versus control of your own. Paperless-ngx makes that trade-off acceptable because it replicates the functionality (upload, organize, search) without the data leaving your network.
Versus Evernote. Evernote is the obvious comparison for document organization and note storage. The practical case against Evernote for this use case is three things: per-seat pricing that compounds over time, upload limits that penalize heavy document intake, and the recurring anxiety of a company that has restructured twice and raised prices on grandfathered plans. Paperless-ngx doesn’t have upload limits. It doesn’t have seats. It doesn’t have pricing tiers. The only recurring cost is the server it runs on [2][4].
Versus Google Drive / Dropbox. These win on raw convenience — no setup, accessible everywhere, mobile apps that work. Paperless-ngx wins on OCR depth and auto-classification. Google Drive does OCR on PDFs, but it doesn’t apply machine learning to tag documents automatically or route them into a structured archive. If you want to type “water bill July 2024” and have it appear instantly, Paperless-ngx is more reliable than folder-browsing a Drive [1][4].
Features
Based on the README, LinuxLinks feature writeup, and Elestio’s documentation:
Core document processing:
- Consume folder with automatic intake — drop a file, it’s processed [1][4]
- OCR via Tesseract engine, 100+ languages, adds selectable text to image-only documents [4]
- Machine learning auto-tagging: assigns tags, correspondents, and document types based on learned patterns [1][4]
- Documents stored as PDF/A (archival format) plus original [4]
- Supports PDFs, images, plain text, Office documents (Word, Excel, PowerPoint, LibreOffice equivalents) [4]
- Configurable filename and folder structure on disk [4]
Search and navigation:
- Full-text search across all documents, including scanned content [1][4]
- Auto-completion from document contents [4]
- Results sorted by relevance; matched text is highlighted [4]
- “More like this” — search for documents similar to a given document [4]
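The same search features are reachable from scripts through the REST API. The sketch below only builds the request and sends nothing. The `/api/documents/` endpoint with the `query` and `more_like_id` parameters and the `Token` authorization header are how I understand the project's API documentation, so check them against docs.paperless-ngx.com for your version. The host name and token are placeholders.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def search_request(base_url: str, token: str,
                   query: Optional[str] = None,
                   more_like_id: Optional[int] = None) -> Request:
    """Build (but do not send) a document search request.

    `query` is a full-text search string; `more_like_id` asks for documents
    similar to the one with that ID. Both parameter names are assumptions
    based on the project's REST API documentation.
    """
    params = {}
    if query is not None:
        params["query"] = query
    if more_like_id is not None:
        params["more_like_id"] = more_like_id
    url = f"{base_url.rstrip('/')}/api/documents/?{urlencode(params)}"
    headers = {"Authorization": f"Token {token}", "Accept": "application/json"}
    return Request(url, headers=headers)


# Placeholder host and token; pass the result to urllib.request.urlopen to run it.
req = search_request("https://paperless.example.com", "YOUR_API_TOKEN",
                     query="water bill July 2024")
```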
Web interface:
- Single-page application with a customizable dashboard [2]
- Statistics overview: document counts, inbox size, tag distribution [2]
- Filtering by tags, correspondents, types, custom fields [4]
- Bulk editing — reassign tags, types, or correspondents across many documents at once [4]
- Drag-and-drop uploading throughout the interface [4]
- Shareable public links with optional expiration dates [4]
- Custom fields of various data types [4]
- Customizable saved views that appear on the dashboard and sidebar [4]
Email processing:
- Import documents directly from email accounts [4]
- Configure multiple accounts and rules per account [4]
- Post-processing actions: mark as read, delete, move messages [4]
Multi-user and permissions:
- Built-in multi-user system with global permissions and per-document/per-object permissions [4]
- Workflow system for automated document routing and processing rules [4]
Infrastructure:
- Docker Compose deployment (default path) [3][5]
- PostgreSQL or MariaDB as database backend, Redis as message broker [3][5]
- Optimized for multi-core systems — consumes multiple documents in parallel [4]
- Integrated sanity checker to verify archive health [4]
- REST API for external integrations [README]
Pricing: SaaS vs self-hosted math
Evernote (the SaaS comparison):
- Free tier: 1 notebook, 1 device, 60MB monthly uploads — practically unusable for ongoing document intake
- Personal: ~$14.99/month, 10GB monthly uploads, unlimited devices
- Professional: ~$17.99/month, 20GB uploads, AI features
- Teams: ~$24.99/user/month
Elestio managed Paperless-ngx (if you don’t want to self-host):
- Starting at $14/month — fully managed, includes automated backups, SSL, monitoring, updates [2]
- That’s comparable to Evernote Personal, except you get unlimited document storage and no per-upload limits
Self-hosted (the real value proposition):
- Software: $0 (GPL-3.0) [README]
- VPS to run it on: $5–10/month on Hetzner, Contabo, or DigitalOcean (PostgreSQL + Redis + Paperless-ngx fits on a 2GB RAM instance; 4GB leaves more headroom)
- Your time to set it up
Concrete math for a small business owner:
Say you’re scanning 50 documents a month: invoices, contracts, receipts. On Evernote Personal at $14.99/mo, you’re paying about $180/year ($14.99 × 12 = $179.88) with upload limits, one-vendor dependency, and no guarantee your pricing won’t change. On a $6 Hetzner VPS, you’re paying $72/year with no limits and no vendor dependency. That’s roughly $108/year saved, which doesn’t sound dramatic until you also factor in the 10 years of tax documents and contracts sitting in your filing cabinet that you’d like to digitize. Evernote’s upload limits make bulk ingestion painful; Paperless-ngx doesn’t care how much you throw at it [4][5].
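To rerun this comparison with your own numbers, the arithmetic fits in a few lines. The prices are the approximate figures quoted in this review, not authoritative list prices, and the function name is just for this sketch.

```python
from decimal import Decimal


def annual_cost(monthly_price: str) -> Decimal:
    """Yearly cost of a flat monthly price, using exact decimal arithmetic."""
    return Decimal(monthly_price) * 12


evernote_personal = annual_cost("14.99")  # approximate price quoted in this review
hetzner_vps = annual_cost("6.00")         # the $6/month VPS assumed in this review
savings = evernote_personal - hetzner_vps

print(f"Evernote Personal: ${evernote_personal}/year")
print(f"Self-hosted VPS:   ${hetzner_vps}/year")
print(f"Difference:        ${savings}/year")
```

The difference comes to a little under $108 a year, which is why the data-ownership argument carries more weight here than the savings do.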
The financial case is less extreme than something like Zapier vs. self-hosted automation (where the savings can hit four figures annually). The stronger argument here is data sovereignty and long-term reliability — your document archive isn’t going anywhere because a company restructured.
Deployment reality check
This is where the honest review diverges from the enthusiastic hobbyist writeup. Paperless-ngx is not a one-click install. It requires Docker, a database (PostgreSQL or MariaDB), Redis as a message broker, and ideally a reverse proxy with HTTPS for any internet-facing deployment [3][5].
What the standard setup looks like:
- A Linux VPS with 2GB RAM minimum (PostgreSQL + Redis + Paperless under load; 4GB is more comfortable)
- Docker and Docker Compose installed [3][5]
- A docker-compose.yml with three services: the database, Redis, and the Paperless webserver [3]
- A reverse proxy (Caddy or nginx) if you want HTTPS and a real domain [3][5]
- Configuration via environment variables in a docker-compose.env file: timezone, secret key, URL, database credentials [3]
The Sliplane guide [5] and Packetswitch guide [3] both walk through this in detail. The Packetswitch guide uses PostgreSQL; Sliplane uses MariaDB — either works. Both recommend bind mounts over Docker volumes for easier migration between hosts [3]. The install script in the README can bootstrap a working instance: bash -c "$(curl -L https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)" [README].
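As an illustration of the environment-file step, here is a sketch that renders a docker-compose.env with a freshly generated secret key. The variable names (`PAPERLESS_URL`, `PAPERLESS_SECRET_KEY`, `PAPERLESS_TIME_ZONE`, `PAPERLESS_DBPASS`, `USERMAP_UID`, `USERMAP_GID`) are the ones I understand the project's configuration docs to use, so confirm them for your version. The values and the `render_env` helper are hypothetical.

```python
import secrets


def render_env(url: str, timezone: str, db_password: str,
               uid: int = 1000, gid: int = 1000) -> str:
    """Return the text of a docker-compose.env file (hypothetical helper)."""
    settings = {
        "PAPERLESS_URL": url,                               # public URL behind the proxy
        "PAPERLESS_SECRET_KEY": secrets.token_urlsafe(48),  # random per install
        "PAPERLESS_TIME_ZONE": timezone,
        "PAPERLESS_DBPASS": db_password,                    # must match the database service
        "USERMAP_UID": str(uid),                            # host user that owns the bind mounts
        "USERMAP_GID": str(gid),
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Example values only; write the result next to your docker-compose.yml.
env_text = render_env("https://docs.example.com", "Europe/Berlin", "change-me")
```

Generating the secret key rather than copying one from a guide means no two installs share it.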
What can go sideways:
- The consume folder requires correct UID/GID mapping between the host and container — misconfigure this and documents silently fail to import [3]
- The PAPERLESS_URL setting must be set correctly if you’re running behind a reverse proxy, or redirects break [3]
- PostgreSQL tuning matters at scale: a large archive (tens of thousands of documents) will stress a low-RAM instance
- Email import configuration requires IMAP access and rule setup that isn’t trivial for non-technical users [4]
- No official mobile app exists — you use the web interface on mobile, which is functional but not native
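The UID/GID pitfall in the list above is easy to check before you start the containers. This is a small host-side sketch, assuming you know which UID and GID the container is configured to run as. The function name and the example path are made up for illustration.

```python
import stat
from pathlib import Path


def ownership_problems(path: Path, uid: int, gid: int) -> list[str]:
    """Describe mismatches between a directory's owner and the expected UID/GID."""
    if not path.is_dir():
        return [f"{path} does not exist or is not a directory"]
    info = path.stat()
    problems = []
    if info.st_uid != uid:
        problems.append(f"{path} is owned by UID {info.st_uid}, expected {uid}")
    if info.st_gid != gid:
        problems.append(f"{path} has GID {info.st_gid}, expected {gid}")
    if not info.st_mode & stat.S_IWUSR:
        problems.append(f"{path} is not writable by its owner")
    return problems


# Example: compare the host consume folder with the IDs the container maps to.
for problem in ownership_problems(Path("/srv/paperless/consume"), 1000, 1000):
    print(problem)
```

An empty result does not prove the import will work, but a non-empty one explains most "nothing happens when I drop a file" reports.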
Realistic time estimates:
- Technical user who’s deployed Docker before: 30–60 minutes to a working HTTPS instance
- Technical user new to Docker: 2–4 hours including reading documentation
- Non-technical user following a guide carefully: half a day — possibly more if DNS propagation or SMTP configuration adds friction
- Non-technical user without any Linux server experience: not recommended without assistance
The XDA Developers article [1] notes setup was “much quicker than expected” with a one-afternoon timeline, which tracks for someone who knows what Docker Compose is. That caveat matters.
Pros and cons
Pros
- OCR that actually works. Tesseract on 100+ languages means scanned documents from any country become searchable. Image-only PDFs get selectable text added. This is the core value proposition and it delivers [4].
- ML auto-tagging eliminates manual classification. After a short training period (or with rules configured), incoming documents are tagged and categorized without user intervention. The consume folder plus ML equals a genuinely automated archive [1][4].
- Full-text search is fast and relevant. Highlighting and relevance ranking make finding documents feel more like a search engine than a file browser [2][4].
- 37,438 GitHub stars, community-maintained. Not a single-developer project that disappears when life happens — it’s a team-maintained successor to two previous projects with a strong community [README][4].
- No limits on documents, storage, or uploads. Unlike cloud services that gate features behind tiers, self-hosted Paperless-ngx is limited only by your disk space [5].
- Data never leaves your network. Particularly relevant for business documents, legal paperwork, financial records — the kind of documents you probably shouldn’t be uploading to someone else’s server [3].
- Supports email as a document source. Invoices landing in your inbox can be automatically imported and archived without manual download-and-upload steps [4].
- Multi-user with per-document permissions. A household or small team can share the instance with appropriate access controls [4].
- PDF/A archival format preserves documents for the long term alongside originals [4].
Cons
- GPL-3.0, not MIT. If you want to embed this in a commercial product or redistribute a modified version, GPL-3.0 has implications. Fine for personal and business use; potentially a complication for developers building on top of it.
- Not a beginner-friendly install. A database (PostgreSQL or MariaDB), Redis, Docker Compose, and reverse proxy configuration are baseline requirements [3][5]. A non-technical founder needs help with the initial setup.
- No native mobile app. The web UI works on mobile browsers but there’s no dedicated iOS or Android application. Third-party apps exist (like Paperless Mobile on F-Droid/Play Store) but they’re not official.
- Multi-user permissions are present but not fine-grained RBAC. Suitable for a household or tiny team; probably undersized for a company with document access control requirements [4].
- Setup requires multiple moving parts. Three containers minimum (database, Redis, webserver) plus reverse proxy means more surface area for configuration issues compared to simpler self-hosted tools [3][5].
- No built-in cloud sync. Your documents live on the server. You need to manage your own backup strategy — the Elestio managed service includes automated backups [2]; self-hosting doesn’t give you that for free.
- UI is functional, not polished. The web interface is clean and modern but doesn’t have the refinement of a commercial product. Fine for daily use, noticeable if you’re used to Notion or Evernote’s interfaces.
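On the backup point in the list above, one common starting place is the project's built-in exporter, run on a schedule. The sketch below builds that command. The `document_exporter` command, the `webserver` service name, and the `../export` target follow my reading of the official Docker setup, so treat them as assumptions and check your own compose file. The helper names are hypothetical.

```python
import subprocess


def exporter_command(service: str = "webserver", target: str = "../export") -> list[str]:
    """Command line that asks the running container to export all documents."""
    # -T disables TTY allocation so the command also works from cron.
    return ["docker", "compose", "exec", "-T", service, "document_exporter", target]


def run_export(compose_dir: str) -> None:
    """Run the export from the directory that holds docker-compose.yml."""
    subprocess.run(exporter_command(), cwd=compose_dir, check=True)
```

An export that stays on the same server is not a backup yet. Copy the export directory to another machine or to object storage afterwards.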
Who should use this / who shouldn’t
Use Paperless-ngx if:
- You have ongoing document intake — invoices, contracts, tax documents, receipts — and you want them automatically organized without manual filing.
- Privacy or data sovereignty matters to you: you don’t want your legal and financial documents on Evernote’s or Google’s servers.
- You’re comfortable with (or willing to learn) basic Docker deployment, or you’ll hire someone to set it up once.
- You want a searchable archive that doesn’t impose upload limits or subscription tiers.
- You’re a freelancer or small business owner drowning in digital documents across email, downloads, and scanner outputs.
Skip it (use Evernote) if:
- You need a polished mobile app and native sync across iPhone/Mac/PC with zero setup.
- You want to mix notes and documents in the same tool — Paperless-ngx is specifically a document archive, not a note-taking app.
- You have no server infrastructure and no interest in managing one.
Skip it (use Google Drive or Dropbox) if:
- You need real-time collaboration on documents with external parties.
- You want automatic sync from desktop folders with no server setup.
- Your document volume is small enough that manual organization is tolerable.
Use Elestio’s managed Paperless-ngx [2] if:
- You want Paperless-ngx specifically but won’t manage a server yourself — at $14/mo you get fully managed hosting with backups and monitoring, roughly equivalent to Evernote Personal pricing but without the upload limits.
Alternatives worth considering
- Mayan EDMS — the other major open-source document management system. More enterprise-feature-rich (workflows, cabinet organization), significantly more complex to deploy. Choose Mayan if you need institutional-grade document management. Choose Paperless-ngx if you want something that works for a household or small business.
- Teedy — simpler self-hosted DMS, lighter footprint, less powerful OCR. Good if Paperless-ngx feels like overkill.
- Evernote — the incumbent. Best mobile experience, largest integration ecosystem, closed source, subscription pricing. The right choice if you want zero infrastructure management.
- Notion — not a document archive but often used as one. Collaborative, polished, no OCR on scanned documents, closed source.
- Nextcloud with Files + Full Text Search — if you’re already running Nextcloud, its document management capabilities partially overlap with Paperless-ngx. Less specialized for OCR and auto-tagging; better if you want a unified file storage solution.
- DEVONthink (macOS only) — powerful local document management with good OCR, Apple ecosystem only, one-time license. Worth considering if you’re Mac-only and want a native app instead of a web interface.
Bottom line
Paperless-ngx solves a specific problem well: it turns a disorganized pile of paper and PDFs into a searchable, auto-tagged archive that lives on your own server. The OCR pipeline, the consume folder automation, and the machine-learning tagging are genuinely mature. This isn’t a hobbyist experiment; it’s a project with more than 37,000 GitHub stars that has been in active development across three generations (Paperless → Paperless-ng → Paperless-ngx). The trade-off is real: GPL-3.0 instead of MIT, meaningful deployment complexity, and no official mobile app. For a non-technical founder, this is a tool where you either invest in learning Docker Compose for an afternoon [5], use a managed host like Elestio [2], or find someone to deploy it once. Once running, it largely takes care of itself. If you have a filing cabinet full of documents you’re terrified to lose and currently pay Evernote for the privilege of searching them, the math and the control argument both point the same direction.
If the deployment is the blocker, that’s exactly what unsubbed.co’s parent studio upready.dev deploys for clients. One-time fee, done, you own the infrastructure.
Sources
- [1] Yash Patel, XDA Developers: “My non-negotiable self-hosted productivity stack for 2026” (Jan 4, 2026). https://www.xda-developers.com/non-negotiable-self-hosted-productivity-stack-for-2026/
- [2] Elestio: “Managed Paperless-ngx as a Service”. https://elest.io/open-source/paperless-ngx
- [3] Suresh Vinasiththamby, Packetswitch: “Paperless-ngx - Self-Hosted Document Manager” (Feb 3, 2025). https://www.packetswitch.co.uk/paperless-ngx-self-hosted-document-manager/
- [4] LinuxLinks: “Paperless-ngx - document management system”. https://www.linuxlinks.com/paperless-ngx-document-management-system/
- [5] Jonas Scholz, Sliplane: “Self-hosting paperless-ngx with MariaDB on an Ubuntu Server”. https://sliplane.io/blog/self-hosting-paperless-ngx-with-mariadb-on-ubuntu-server
Primary sources:
- GitHub repository and README: https://github.com/paperless-ngx/paperless-ngx (37,438 stars, GPL-3.0 license)
- Official documentation: https://docs.paperless-ngx.com
- Live demo: https://demo.paperless-ngx.com (login: demo / demo)