Tutorials · Jun 02, 2026 · 24 min read

Self-Host Paperless-ngx on a VPS to Digitize Your Documents

Everyone has the drawer. Tax returns, insurance policies, warranty cards, a lease from three apartments ago, the receipt for a laptop that's still technically under warranty if you could only find it. Paper piles up, and the digital version isn't much better - a Scans folder with 400 files named scan_0042.pdf is just a different kind of drawer.

Paperless-ngx turns that mess into a searchable archive. You feed it a PDF or a photo of a document, it runs OCR so every word is full-text searchable, then it auto-tags and files the thing based on rules you teach it. Six months later you type "vet invoice 2025" and the right document is on screen in under a second. It's open source, actively maintained, and runs comfortably on a small VPS.

This guide gets you from a fresh server to a working, HTTPS-secured Paperless instance: Docker Compose, PostgreSQL, Redis, Caddy for automatic certificates, OCR in your language, and a backup script that actually covers everything that matters.

Save your `PAPERLESS_SECRET_KEY` somewhere permanent before you start - a password manager is ideal. It's used to sign sessions and tokens. If you lose it and have to regenerate it, every logged-in session breaks and some stored data becomes unreadable. Treat it like a database password you can never rotate casually.

TL;DR

  • Install Docker and Docker Compose on a fresh VPS
  • Point a subdomain like docs.example.com at your server
  • Run Paperless-ngx + PostgreSQL + Redis + Gotenberg + Tika behind Caddy with one docker-compose.yml
  • Set PAPERLESS_URL and PAPERLESS_CSRF_TRUSTED_ORIGINS so the reverse proxy doesn't trigger CSRF errors
  • Create the admin user from env vars on first boot
  • Drop files into the consume folder (or email them in) and let OCR + tagging do the rest
  • Back up the database, the media archive, and your env file daily

Total time: about 30 minutes.

What You Need

  • A VPS with at least 2 GB RAM running Ubuntu 22.04 or 24.04. OCR is CPU- and memory-hungry; with 1 GB the server will swap and crawl
  • A domain or subdomain you can point at the server
  • Ports 80 and 443 open to the internet for Let's Encrypt
  • Root or sudo access

Paperless idles around 400-500 MB once everything is warm. The spikes come during OCR: a dense multi-page scan can briefly use a full core and a few hundred extra megabytes. On a 2 GB / 1 vCPU VPS that's fine for a household; if you plan to bulk-import thousands of documents at once, give yourself 2 vCPUs so the initial backlog doesn't take all night.

Step 1: Point a Subdomain at Your VPS

In your DNS provider, add an A record:

docs.example.com → YOUR_VPS_IPV4

Add an AAAA record too if your server has IPv6. Confirm it resolves before going further:

dig +short docs.example.com

DNS has to resolve to your server before Caddy can fetch a Let's Encrypt certificate.
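If you want to script that check, here is a minimal sketch. The helper name `dns_points_here`, the IP 203.0.113.10, and the hostname are placeholders for illustration, not part of any tool.

```shell
# Hypothetical helper: succeeds if the expected IP is among the resolved addresses.
# $1 = expected IP, $2 = newline-separated output of `dig +short`
dns_points_here() {
  local expected_ip="$1" resolved="$2"
  # -F: literal string, -x: whole line must match, -q: quiet
  printf '%s\n' "$resolved" | grep -Fxq -- "$expected_ip"
}

# Usage on your own machine (placeholder IP and hostname):
#   dns_points_here "203.0.113.10" "$(dig +short docs.example.com)" \
#     && echo "DNS is ready" || echo "DNS is not pointing here yet"
```

The whole-line match matters: without it, 203.0.113.1 would wrongly match 203.0.113.10.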

Step 2: Install Docker and Docker Compose

On a fresh Ubuntu server:

sudo apt update
sudo apt install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
  -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
  https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Confirm the daemon is running:

sudo docker run --rm hello-world

Step 3: Create the Project Layout

Paperless keeps its data on persistent volumes. Make a home for the stack:

sudo mkdir -p /opt/paperless
cd /opt/paperless

Everything below lives in /opt/paperless. The consume and export folders are bind-mounted from this directory and Docker creates them on first run; data and media live in named Docker volumes. You can pre-create the consume folder so you can drop files into it immediately:

sudo mkdir -p /opt/paperless/consume

Step 4: Generate Your Secrets

Paperless needs a long random secret key and a database password. Generate both now and keep them somewhere safe:

# Secret key - copy this into PAPERLESS_SECRET_KEY
openssl rand -base64 48

# Database password - copy this into POSTGRES_PASSWORD and PAPERLESS_DBPASS
openssl rand -base64 24

Step 5: Write the Environment File

Create /opt/paperless/paperless.env. This holds all the app configuration so it stays out of the compose file:

sudo nano /opt/paperless/paperless.env

# --- Core ---
PAPERLESS_SECRET_KEY=PASTE_THE_SECRET_KEY_HERE
PAPERLESS_URL=https://docs.example.com
PAPERLESS_CSRF_TRUSTED_ORIGINS=https://docs.example.com
PAPERLESS_TIME_ZONE=Europe/Berlin

# --- Database ---
PAPERLESS_DBHOST=db
PAPERLESS_DBNAME=paperless
PAPERLESS_DBUSER=paperless
PAPERLESS_DBPASS=PASTE_THE_DB_PASSWORD_HERE

# --- Redis broker ---
PAPERLESS_REDIS=redis://broker:6379

# --- OCR ---
PAPERLESS_OCR_LANGUAGE=eng
PAPERLESS_OCR_LANGUAGES=eng deu

# --- Office docs (Gotenberg + Tika) ---
PAPERLESS_TIKA_ENABLED=1
PAPERLESS_TIKA_GOTENBERG_ENDPOINT=http://gotenberg:3000
PAPERLESS_TIKA_ENDPOINT=http://tika:9998

# --- First-run admin account ---
PAPERLESS_ADMIN_USER=admin
PAPERLESS_ADMIN_PASSWORD=choose-a-strong-password-here

A few notes on the values:

  • PAPERLESS_OCR_LANGUAGE is the default OCR language (use three-letter ISO 639-2 codes: eng, deu, fra, spa, nld, etc.). To OCR with several languages at once, join them with a plus sign, as in eng+deu. PAPERLESS_OCR_LANGUAGES is the space-separated list of language packs to install in the container so those languages are available.
  • PAPERLESS_TIME_ZONE controls how dates are displayed - set it to yours.
  • The PAPERLESS_ADMIN_* vars only create the account on the very first boot with an empty database. After that they're ignored, so you can leave them or remove them.
`PAPERLESS_URL` and `PAPERLESS_CSRF_TRUSTED_ORIGINS` are the two settings people forget, and skipping them is the number one reason logins fail behind a reverse proxy. Without them you'll log in fine over plain `http://IP:8000` but get a "CSRF verification failed" or "Forbidden" error the moment you go through HTTPS on your domain. Set both to your real `https://` URL.
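A quick pre-flight check catches a forgotten setting before the first boot. This is a sketch only: `check_paperless_env` is a hypothetical helper, and the list of keys is simply the ones this guide treats as mandatory.

```shell
# Hypothetical helper: report required keys that are missing or empty in an env file.
# Returns 0 when everything looks fine, 1 otherwise.
check_paperless_env() {
  local env_file="$1" problems=0 key
  for key in PAPERLESS_SECRET_KEY PAPERLESS_URL PAPERLESS_CSRF_TRUSTED_ORIGINS PAPERLESS_DBPASS; do
    # A key counts as set only if the line reads KEY=<at least one character>
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: ${key}"
      problems=1
    fi
  done
  # Behind Caddy, both URL settings should use the https:// scheme
  if grep -Eq '^PAPERLESS_(URL|CSRF_TRUSTED_ORIGINS)=http://' "$env_file"; then
    echo "URL settings should start with https://"
    problems=1
  fi
  return "$problems"
}

# Usage:
#   check_paperless_env /opt/paperless/paperless.env && echo "env file looks complete"
```

Since the file holds your secret key and database password, it is also reasonable to restrict it with sudo chmod 600 /opt/paperless/paperless.env.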

Step 6: Write the Compose File

Create /opt/paperless/docker-compose.yml:

services:
  broker:
    image: docker.io/library/redis:7
    restart: unless-stopped
    volumes:
      - redisdata:/data

  db:
    image: docker.io/library/postgres:16
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: PASTE_THE_DB_PASSWORD_HERE

  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
      - gotenberg
      - tika
    ports:
      - "127.0.0.1:8000:8000"
    volumes:
      - data:/usr/src/paperless/data
      - media:/usr/src/paperless/media
      - ./export:/usr/src/paperless/export
      - ./consume:/usr/src/paperless/consume
    env_file: paperless.env
    healthcheck:
      test: ["CMD", "curl", "-fs", "-S", "--max-time", "2", "http://localhost:8000"]
      interval: 30s
      timeout: 10s
      retries: 5

  gotenberg:
    image: docker.io/gotenberg/gotenberg:8.7
    restart: unless-stopped
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=true"
      - "--chromium-allow-list=file:///tmp/.*"

  tika:
    image: docker.io/apache/tika:latest
    restart: unless-stopped

volumes:
  redisdata:
  pgdata:
  data:
  media:

Two things worth calling out:

  • The webserver is published on 127.0.0.1:8000, not 0.0.0.0. That means it's only reachable from the server itself - Caddy will proxy to it. Paperless never gets exposed to the raw internet.
  • db reads its password from the compose file, while the app reads the same value from paperless.env. Make sure POSTGRES_PASSWORD and PAPERLESS_DBPASS match exactly, or the app can't connect.
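Because the password lives in two files, a mismatch is easy to introduce with a stray character. The sketch below compares the two values. `db_passwords_match` is a hypothetical helper, and it assumes the password is written unquoted on a single POSTGRES_PASSWORD line, as in the compose file above.

```shell
# Hypothetical helper: succeed only if the DB password is identical in both files.
# $1 = compose file, $2 = env file
db_passwords_match() {
  local compose_file="$1" env_file="$2" compose_pw env_pw
  # Everything after "POSTGRES_PASSWORD:", with trailing whitespace trimmed
  compose_pw="$(sed -n 's/^[[:space:]]*POSTGRES_PASSWORD:[[:space:]]*//p' "$compose_file" \
    | sed 's/[[:space:]]*$//')"
  # Everything after "PAPERLESS_DBPASS=", taken verbatim
  env_pw="$(sed -n 's/^PAPERLESS_DBPASS=//p' "$env_file")"
  [ -n "$compose_pw" ] && [ "$compose_pw" = "$env_pw" ]
}

# Usage:
#   db_passwords_match /opt/paperless/docker-compose.yml /opt/paperless/paperless.env \
#     && echo "passwords match" || echo "passwords differ"
```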

Step 7: Start the Stack

Bring it up:

cd /opt/paperless
sudo docker compose pull
sudo docker compose up -d

The first start does real work: it initializes the database, runs migrations, and downloads the OCR language packs you listed. Give it a couple of minutes. Watch the progress:

sudo docker compose logs -f webserver

When you see something like Paperless-ngx ... is ready, it's up. Confirm it answers locally:

curl -I http://127.0.0.1:8000

A 200 or a redirect means the app is alive. Now let's put HTTPS in front of it.
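If you are scripting the install, you may prefer to poll instead of watching logs. Here is a minimal retry loop; `wait_until` is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary choices.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# $1 = max attempts, $2 = seconds between attempts, remaining args = command to run
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Success on any attempt ends the loop immediately
    "$@" >/dev/null 2>&1 && return 0
    # Pause between attempts, but not after the final one
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage: give the app up to roughly two minutes to answer locally
#   wait_until 24 5 curl -fsS -o /dev/null http://127.0.0.1:8000 && echo "Paperless is up"
```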

Step 8: Install Caddy as the Reverse Proxy

Caddy fetches and renews Let's Encrypt certificates automatically, which makes it the least painful option for a single-service host. Install it on the host (not in Docker):

sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | \
  sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | \
  sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update
sudo apt install -y caddy

Replace /etc/caddy/Caddyfile with:

docs.example.com {
    reverse_proxy 127.0.0.1:8000
    encode zstd gzip
    request_body {
        max_size 100MB
    }
}

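The request_body block sets an explicit ceiling on upload size. Caddy does not cap request bodies on its own, so this is a guard rail rather than a requirement. Scanned PDFs and high-resolution photos get large, so keep the limit generous: anything above it is rejected with a 413 error. Reload Caddy: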

sudo systemctl reload caddy

Visit https://docs.example.com. You should get a valid certificate and the Paperless login page. Sign in with the PAPERLESS_ADMIN_USER and PAPERLESS_ADMIN_PASSWORD you set earlier.

If you already use Nginx or another reverse proxy, the same idea applies - proxy to 127.0.0.1:8000, raise the max upload size, and forward the standard X-Forwarded-* headers. Our Caddy reverse proxy guide covers the general pattern in more depth.

Step 9: Add Your First Documents

There are three ways to get documents in, and you'll probably use all three:

Drag and drop in the web UI. The fastest way to test. Click the upload area on the dashboard and drop a PDF. Within a few seconds it appears in the inbox, OCR'd and searchable.

The consume folder. Anything you drop into /opt/paperless/consume on the server gets ingested automatically. This is the killer feature - point a network share, a scanner's "scan to folder" target, or an rsync job at it:

cp ~/some-invoice.pdf /opt/paperless/consume/

Watch it get picked up:

sudo docker compose logs -f webserver | grep -i consum

Email ingestion. In Settings → Mail, connect a dedicated mailbox (a documents@ alias works well) and Paperless will pull attachments from incoming mail on a schedule. Forward a receipt from your phone and it lands in your archive.

Step 10: Teach It to File Things Automatically

Paperless gets dramatically more useful once it tags documents for you. The building blocks live under Settings:

  • Correspondents - who the document is from (your bank, landlord, the tax office).
  • Document types - what it is (invoice, contract, payslip, warranty).
  • Tags - free-form labels (taxes-2025, car, important).

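Then set up matching rules so new documents file themselves. Every tag, correspondent, and document type has a matching algorithm. For example, a tag called Utilities with the "Any word" algorithm and the match text electricity gas water is applied to any document whose OCR text contains at least one of those words. The "Auto" algorithm takes no match text at all; it learns from the assignments you have already made, and after a dozen or so documents it becomes genuinely good at guessing.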
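To make the idea of word-based matching concrete, here is a rough shell illustration: a document matches if its text contains any one of the listed words. This is only an approximation for intuition. `matches_any_word` is a hypothetical helper, and Paperless does its own matching internally.

```shell
# Illustration only: succeed if the text contains any of the given words
# (case-insensitive, whole words only).
# $1 = document text, remaining args = words to look for
matches_any_word() {
  local text="$1" word
  shift
  for word in "$@"; do
    # -i: ignore case, -w: whole word, -F: literal string, -q: quiet
    if printf '%s\n' "$text" | grep -qiwF -- "$word"; then
      return 0
    fi
  done
  return 1
}

# Usage:
#   matches_any_word "Your Electricity bill for March" electricity gas water \
#     && echo "would be tagged Utilities"
```

Whole-word matching is why a flight receipt mentioning "Vegas" would not trip the word gas.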

Mark one tag as the inbox tag (Paperless applies inbox tags to every newly consumed document) and save a view filtered on it, so unprocessed documents are easy to find. Remove the inbox tag from each document once you've reviewed it.

Step 11: Lock Down the Server

Paperless is now only reachable through Caddy, which is good. Tighten the rest:

sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable

Port 8000 is bound to localhost and never needs to be open. If this is a brand-new server, it's worth doing a proper SSH hardening pass too - see our SSH hardening and fail2ban guide for a checklist. Because a Paperless archive is exactly the kind of data you don't want anyone else reading, consider putting it behind a VPN instead of the public internet - our Tailscale guide pairs perfectly with this setup.
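It is worth confirming the loopback binding rather than trusting the firewall alone: Docker manages its own iptables rules, so ports it publishes can bypass ufw. The sketch below reads `ss -tln` output and fails if port 8000 is listening on anything other than loopback. `port_8000_is_local_only` is a hypothetical helper.

```shell
# Hypothetical helper: fail if any listener on port 8000 is bound to a non-loopback address.
# Reads `ss -tln`-style lines on stdin; the local address:port is the 4th column.
port_8000_is_local_only() {
  awk '
    BEGIN { bad = 0 }
    $4 ~ /:8000$/ && $4 !~ /^(127\.0\.0\.1|\[::1\]):8000$/ { bad = 1 }
    END { exit bad }
  '
}

# Usage:
#   sudo ss -tln | port_8000_is_local_only && echo "port 8000 is loopback-only"
```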

Step 12: Back Up Everything That Matters

Paperless ships a built-in exporter that writes a portable, restorable snapshot of your entire archive - documents, metadata, tags, and settings - to the export folder:

sudo docker compose exec webserver document_exporter ../export

Wrap that in a script at /opt/paperless/backup.sh:

#!/usr/bin/env bash
set -euo pipefail

cd /opt/paperless

# Portable export of all documents + metadata
docker compose exec -T webserver document_exporter ../export

# Tarball the export plus the env file and compose file
STAMP=$(date +%Y-%m-%d)
tar -czf "/opt/paperless/backups/paperless-${STAMP}.tar.gz" \
  -C /opt/paperless export paperless.env docker-compose.yml

# Keep the last 14 daily archives
find /opt/paperless/backups -name 'paperless-*.tar.gz' -mtime +14 -delete

Make it runnable and schedule it:

sudo mkdir -p /opt/paperless/backups
sudo chmod +x /opt/paperless/backup.sh
sudo crontab -e

Add a nightly run:

30 3 * * * /opt/paperless/backup.sh >> /var/log/paperless-backup.log 2>&1
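A backup you have never opened is a guess. As a cheap sanity check, the sketch below confirms that an archive is readable and contains both the export manifest and the env file. `backup_looks_valid` is a hypothetical helper, and it assumes the exporter wrote a manifest.json at the top of the export folder.

```shell
# Hypothetical helper: succeed if the archive lists cleanly and contains
# export/manifest.json plus paperless.env.
# $1 = path to a paperless-YYYY-MM-DD.tar.gz archive
backup_looks_valid() {
  local archive="$1" listing
  # A corrupt or truncated archive makes tar fail here
  listing="$(tar -tzf "$archive" 2>/dev/null)" || return 1
  printf '%s\n' "$listing" | grep -Fxq 'export/manifest.json' || return 1
  printf '%s\n' "$listing" | grep -Fxq 'paperless.env'
}

# Usage (the date in the file name is a placeholder):
#   backup_looks_valid /opt/paperless/backups/paperless-2026-06-01.tar.gz \
#     && echo "backup OK" || echo "backup is incomplete or unreadable"
```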

For off-site safety, push those tarballs to object storage. Our restic to S3 guide shows how to encrypt and ship /opt/paperless/backups to any S3-compatible bucket so a dead disk can't take your archive with it.

Restoring is the reverse: spin up a fresh stack, copy the export back into the export folder, and run document_importer ../export.
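In practice that looks roughly like the sketch below. `unpack_export` is a hypothetical helper that pulls only the export folder out of a backup archive, so it cannot overwrite the env and compose files on the new server. The archive name in the usage lines is a placeholder.

```shell
# Hypothetical helper: extract only the export/ folder from a backup archive.
# $1 = backup archive, $2 = stack directory (for this guide, /opt/paperless)
unpack_export() {
  local archive="$1" stack_dir="$2"
  # Naming "export" as a member restricts extraction to that folder
  tar -xzf "$archive" -C "$stack_dir" export
}

# Usage on a freshly started stack with an empty database (run as root):
#   unpack_export /opt/paperless/backups/paperless-2026-06-01.tar.gz /opt/paperless
#   cd /opt/paperless && docker compose exec -T webserver document_importer ../export
```

Restore paperless.env by hand before starting the new stack so the original secret key and database password carry over.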

Step 13: Upgrade Paperless Safely

Paperless-ngx releases regularly, and most upgrades are painless:

cd /opt/paperless
sudo docker compose pull
sudo docker compose up -d

Always take a fresh export first. Major version bumps occasionally include database migrations, and the project documents any breaking changes in its release notes - skim them before jumping versions. Pinning to a specific tag instead of latest (for example ghcr.io/paperless-ngx/paperless-ngx:2.14) gives you control over exactly when you move.
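A small guard can remind you when the stack is still tracking latest. `paperless_image_tag` is a hypothetical helper that reads the tag from the image line in the compose file; it assumes the image is referenced as ghcr.io/paperless-ngx/paperless-ngx:TAG, as in this guide.

```shell
# Hypothetical helper: print the tag of the paperless-ngx image in a compose file.
# $1 = path to docker-compose.yml
paperless_image_tag() {
  sed -n 's|^[[:space:]]*image:[[:space:]]*ghcr\.io/paperless-ngx/paperless-ngx:\([^[:space:]]*\).*|\1|p' "$1"
}

# Usage:
#   tag="$(paperless_image_tag /opt/paperless/docker-compose.yml)"
#   [ "$tag" = "latest" ] && echo "Unpinned: every pull follows the newest release"
```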

Troubleshooting

"CSRF verification failed" when logging in over HTTPS. You missed PAPERLESS_URL and PAPERLESS_CSRF_TRUSTED_ORIGINS, or they don't match your real URL. Set both to https://docs.example.com, then sudo docker compose up -d to apply.

The webserver container restarts in a loop. Almost always a database connection problem. Check that POSTGRES_PASSWORD in the compose file matches PAPERLESS_DBPASS in paperless.env exactly, then look at sudo docker compose logs db for auth errors.

OCR is slow or the consume backlog never clears. OCR is CPU-bound. On a 1 vCPU VPS, large scans take a while. Check progress with sudo docker compose logs -f webserver. If you're importing in bulk, do it overnight or temporarily resize to more cores.

Uploads fail with a 413 or "request entity too large". The reverse proxy is rejecting big files. With the Caddyfile from this guide, that means the upload is larger than the request_body max_size value: raise it above 100MB and reload Caddy. Behind Nginx, raise client_max_body_size instead.

Office documents (.docx, .xlsx) won't import. Those go through Gotenberg and Tika. Make sure both containers are running (sudo docker compose ps) and that PAPERLESS_TIKA_ENABLED=1 plus the two endpoint URLs are set in your env file.

Going Further

  • Mobile capture. The community Paperless Mobile app for Android and Swift Paperless for iOS let you scan documents with your phone camera and upload straight into your archive.
  • Bulk import your old scans. Drop your entire existing Scans folder into the consume directory and let Paperless OCR and ingest the lot. Set up your tags and matching rules first so they file themselves on the way in.
  • Two-factor authentication. Paperless supports TOTP MFA out of the box. Enable it in your user settings - this archive holds your most sensitive documents, so it's worth the extra step.
  • Workflows. Newer Paperless versions have a Workflows engine that can run actions on documents as they're consumed (assign owners, set permissions, trigger webhooks). Great once you're hosting for a whole household with separate accounts.

A self-hosted Paperless instance changes your relationship with paper. The drawer stays empty because every statement, receipt, and contract goes through the consume folder once and is then findable forever. Six months in, "where did I put that" stops being a question you ask.

