Everyone has the drawer. Tax returns, insurance policies, warranty cards, a lease from three apartments ago, the receipt for a laptop that's still technically under warranty if you could only find it. Paper piles up, and the digital version isn't much better - a Scans folder with 400 files named scan_0042.pdf is just a different kind of drawer.
Paperless-ngx turns that mess into a searchable archive. You feed it a PDF or a photo of a document, it runs OCR so every word is full-text searchable, then it auto-tags and files the thing based on rules you teach it. Six months later you type "vet invoice 2025" and the right document is on screen in under a second. It's open source, actively maintained, and runs comfortably on a small VPS.
This guide gets you from a fresh server to a working, HTTPS-secured Paperless instance: Docker Compose, PostgreSQL, Redis, Caddy for automatic certificates, OCR in your language, and a backup script that actually covers everything that matters.
A DNS record pointing docs.example.com at your server
The stack defined in docker-compose.yml
PAPERLESS_URL and PAPERLESS_CSRF_TRUSTED_ORIGINS set so the reverse proxy doesn't trigger CSRF errors
Total time: about 30 minutes.
Ports 80 and 443 open to the internet for Let's Encrypt
Paperless idles around 400-500 MB of RAM once everything is warm. The spikes come during OCR: a dense multi-page scan can briefly use a full core and a few hundred extra megabytes. On a 2 GB / 1 vCPU VPS that's fine for a household; if you plan to bulk-import thousands of documents at once, give yourself 2 vCPUs so the initial backlog doesn't take all night.
In your DNS provider, add an A record:
docs.example.com → YOUR_VPS_IPV4
Add an AAAA record too if your server has IPv6. Confirm it resolves before going further:
dig +short docs.example.com
DNS has to resolve to your server before Caddy can fetch a Let's Encrypt certificate.
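If you'd like to script that check instead of eyeballing the dig output, here is a minimal sketch. The function name dns_points_here is made up for this example, and 203.0.113.10 is a placeholder for your server's real IPv4 address.

```shell
#!/bin/sh
# Succeeds when the expected address is one of the resolved addresses.
#   $1: the address you expect (your VPS IP)
#   $2: newline-separated addresses, e.g. the output of `dig +short`
dns_points_here() {
    # -F: treat the address as a literal string, -x: match whole lines only
    printf '%s\n' "$2" | grep -Fxq -- "$1"
}

# Example (203.0.113.10 is a placeholder, not a real server):
#   if dns_points_here "203.0.113.10" "$(dig +short docs.example.com A)"; then
#       echo "DNS is ready"
#   else
#       echo "not pointing here yet"
#   fi
```

Matching whole lines matters here: a plain substring match would let 203.0.113.1 pass against a record that actually says 203.0.113.10.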
On a fresh Ubuntu server:
sudo apt update
sudo apt install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
-o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Confirm the daemon is running:
sudo docker run --rm hello-world
Paperless keeps its data on persistent volumes. Make a home for the stack:
sudo mkdir -p /opt/paperless
cd /opt/paperless
Everything below lives in /opt/paperless. The bind-mounted folders (consume and export) will be created automatically by Docker on first run, while data and media live in named Docker volumes. You can pre-create the consume folder so you can drop files into it immediately:
sudo mkdir -p /opt/paperless/consume
Paperless needs a long random secret key and a database password. Generate both now and keep them somewhere safe:
# Secret key - copy this into PAPERLESS_SECRET_KEY
openssl rand -base64 48
# Database password - copy this into POSTGRES_PASSWORD and PAPERLESS_DBPASS
openssl rand -base64 24
Create /opt/paperless/paperless.env. This holds all the app configuration so it stays out of the compose file:
sudo nano /opt/paperless/paperless.env
# --- Core ---
PAPERLESS_SECRET_KEY=PASTE_THE_SECRET_KEY_HERE
PAPERLESS_URL=https://docs.example.com
PAPERLESS_CSRF_TRUSTED_ORIGINS=https://docs.example.com
PAPERLESS_TIME_ZONE=Europe/Berlin
# --- Database ---
PAPERLESS_DBHOST=db
PAPERLESS_DBNAME=paperless
PAPERLESS_DBUSER=paperless
PAPERLESS_DBPASS=PASTE_THE_DB_PASSWORD_HERE
# --- Redis broker ---
PAPERLESS_REDIS=redis://broker:6379
# --- OCR ---
PAPERLESS_OCR_LANGUAGE=eng
PAPERLESS_OCR_LANGUAGES=eng deu
# --- Office docs (Gotenberg + Tika) ---
PAPERLESS_TIKA_ENABLED=1
PAPERLESS_TIKA_GOTENBERG_ENDPOINT=http://gotenberg:3000
PAPERLESS_TIKA_ENDPOINT=http://tika:9998
# --- First-run admin account ---
PAPERLESS_ADMIN_USER=admin
PAPERLESS_ADMIN_PASSWORD=choose-a-strong-password-here
A few notes on the values:
PAPERLESS_OCR_LANGUAGE is the default OCR language (use ISO 639-2 codes: eng, deu, fra, spa, nld, etc.). PAPERLESS_OCR_LANGUAGES is the space-separated list of language packs to install in the container so you can OCR documents in more than one language.
PAPERLESS_TIME_ZONE controls how dates are displayed - set it to yours.
The PAPERLESS_ADMIN_* vars only create the account on the very first boot with an empty database. After that they're ignored, so you can leave them or remove them.
Create /opt/paperless/docker-compose.yml:
services:
  broker:
    image: docker.io/library/redis:7
    restart: unless-stopped
    volumes:
      - redisdata:/data

  db:
    image: docker.io/library/postgres:16
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: PASTE_THE_DB_PASSWORD_HERE

  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
      - gotenberg
      - tika
    ports:
      - "127.0.0.1:8000:8000"
    volumes:
      - data:/usr/src/paperless/data
      - media:/usr/src/paperless/media
      - ./export:/usr/src/paperless/export
      - ./consume:/usr/src/paperless/consume
    env_file: paperless.env
    healthcheck:
      test: ["CMD", "curl", "-fs", "-S", "--max-time", "2", "http://localhost:8000"]
      interval: 30s
      timeout: 10s
      retries: 5

  gotenberg:
    image: docker.io/gotenberg/gotenberg:8.7
    restart: unless-stopped
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=true"
      - "--chromium-allow-list=file:///tmp/.*"

  tika:
    image: docker.io/apache/tika:latest
    restart: unless-stopped

volumes:
  redisdata:
  pgdata:
  data:
  media:
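Before starting the stack, a quick sanity check of the two files can catch leftover placeholders and a database password that differs between them. The sketch below is one way to do it. The helper names (env_value, compose_value, preflight) are invented for this example, and the parsing is deliberately naive: it assumes one KEY=value or KEY: value per line, as in the files above.

```shell
#!/bin/sh
# Print the value of KEY=value from an env file (first match only).
env_value() {
    sed -n "s/^$1=//p" "$2" | head -n 1
}

# Print the value of "KEY: value" from a compose file, ignoring indentation
# and stripping one pair of surrounding quotes if present.
compose_value() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" "$2" | head -n 1 |
        sed -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/"
}

# Returns 0 when no placeholders remain and both passwords are identical.
#   $1: path to paperless.env, $2: path to docker-compose.yml
preflight() {
    env_file="$1"
    compose_file="$2"
    status=0
    if grep -Eq 'PASTE_THE_|choose-a-strong-password-here' "$env_file" "$compose_file"; then
        echo "placeholder values are still present" >&2
        status=1
    fi
    if [ "$(env_value PAPERLESS_DBPASS "$env_file")" != \
         "$(compose_value POSTGRES_PASSWORD "$compose_file")" ]; then
        echo "PAPERLESS_DBPASS and POSTGRES_PASSWORD differ" >&2
        status=1
    fi
    return "$status"
}

# Example:
#   preflight /opt/paperless/paperless.env /opt/paperless/docker-compose.yml \
#       && echo "looks consistent"
```

This is not a YAML parser, so treat a clean result as a hint rather than a guarantee.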
Two things worth calling out:
The web port is published on 127.0.0.1:8000, not 0.0.0.0. That means it's only reachable from the server itself - Caddy will proxy to it. Paperless never gets exposed to the raw internet.
The db container reads its password from the compose file, while the app reads the same value from paperless.env. Make sure POSTGRES_PASSWORD and PAPERLESS_DBPASS match exactly, or the app can't connect.
Bring it up:
cd /opt/paperless
sudo docker compose pull
sudo docker compose up -d
The first start does real work: it initializes the database, runs migrations, and downloads the OCR language packs you listed. Give it a couple of minutes. Watch the progress:
sudo docker compose logs -f webserver
When you see something like Paperless-ngx ... is ready, it's up. Confirm it answers locally:
curl -I http://127.0.0.1:8000
A 200 or a redirect means the app is alive. Now let's put HTTPS in front of it.
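If you'd rather script the wait for the first start than watch the logs, a small polling loop around the same curl check does the job. This is a sketch: wait_for_http is a made-up helper name, and the defaults (30 attempts, 5 seconds apart) are arbitrary choices rather than anything Paperless prescribes.

```shell
#!/bin/sh
# Poll a URL until it answers or the attempts run out.
#   $1: URL
#   $2: maximum number of attempts (default 30)
#   $3: seconds to sleep between attempts (default 5)
wait_for_http() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-5}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl fail on HTTP 4xx/5xx; a redirect still counts as success.
        if curl -fsS -o /dev/null --max-time 5 "$url" 2>/dev/null; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example:
#   wait_for_http http://127.0.0.1:8000 && echo "Paperless is answering"
```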
Caddy fetches and renews Let's Encrypt certificates automatically, which makes it the least painful option for a single-service host. Install it on the host (not in Docker):
sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | \
sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | \
sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update
sudo apt install -y caddy
Replace /etc/caddy/Caddyfile with:
docs.example.com {
    reverse_proxy 127.0.0.1:8000
    encode zstd gzip
    request_body {
        max_size 100MB
    }
}
The request_body block caps uploads at 100 MB. That is generous enough for large scanned PDFs and high-resolution photos, while anything bigger is rejected with a 413 error; raise the value if your scans are larger. Reload Caddy:
sudo systemctl reload caddy
Visit https://docs.example.com. You should get a valid certificate and the Paperless login page. Sign in with the PAPERLESS_ADMIN_USER and PAPERLESS_ADMIN_PASSWORD you set earlier.
If you already use Nginx or another reverse proxy, the same idea applies - proxy to 127.0.0.1:8000, raise the max upload size, and forward the standard X-Forwarded-* headers. Our Caddy reverse proxy guide covers the general pattern in more depth.
There are three ways to get documents in, and you'll probably use all three:
Drag and drop in the web UI. The fastest way to test. Click the upload area on the dashboard and drop a PDF. Within a few seconds it appears in the inbox, OCR'd and searchable.
The consume folder. Anything you drop into /opt/paperless/consume on the server gets ingested automatically. This is the killer feature - point a network share, a scanner's "scan to folder" target, or an rsync job at it:
cp ~/some-invoice.pdf /opt/paperless/consume/
Watch it get picked up:
sudo docker compose logs -f webserver | grep -i consum
Email ingestion. In Settings → Mail, connect a dedicated mailbox (a documents@ alias works well) and Paperless will pull attachments from incoming mail on a schedule. Forward a receipt from your phone and it lands in your archive.
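For the consume folder route, here is a rough sketch of feeding an existing folder of scans into it. The helper name stage_scans is invented for this example, and the extension list is an assumption covering common scan formats; adjust it to what you actually have.

```shell
#!/bin/sh
# Copy documents from a source tree into the consume folder.
#   $1: source directory (searched recursively)
#   $2: consume directory
# Files are copied flat into the target, so two files with the same name in
# different subfolders would overwrite each other.
stage_scans() {
    src="$1"
    dest="$2"
    find "$src" -type f \( -iname '*.pdf' -o -iname '*.png' -o -iname '*.jpg' \
        -o -iname '*.jpeg' -o -iname '*.tif' -o -iname '*.tiff' \) \
        -exec cp -- {} "$dest"/ \;
}

# Example (the source path is a placeholder):
#   stage_scans "$HOME/Scans" /opt/paperless/consume
```

Copying rather than moving leaves the originals untouched until you've confirmed everything arrived in the archive.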
Paperless gets dramatically more useful once it tags documents for you. The building blocks live under Settings:
Tags (for example taxes-2025, car, important).
Then create matching rules so new documents file themselves. For example, a tag called Utilities with the "Any" matching algorithm and the match text electricity gas water will tag any document whose OCR text contains at least one of those words. A tag set to the "Auto" algorithm ignores the match text and instead learns from documents you've already filed; after a dozen or so examples, the auto-matcher becomes genuinely good at guessing.
Create a tag such as Inbox and mark it as an inbox tag, which Paperless applies to everything new. Save a view filtered on it so unprocessed documents are easy to find, and remove the tag once you've reviewed each one.
Paperless is now only reachable through Caddy, which is good. Tighten the rest:
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable
Port 8000 is bound to localhost and never needs to be open. If this is a brand-new server, it's worth doing a proper SSH hardening pass too - see our SSH hardening and fail2ban guide for a checklist. Because a Paperless archive is exactly the kind of data you don't want anyone else reading, consider putting it behind a VPN instead of the public internet - our Tailscale guide pairs perfectly with this setup.
Paperless ships a built-in exporter that writes a portable, restorable snapshot of your entire archive - documents, metadata, tags, and settings - to the export folder:
sudo docker compose exec webserver document_exporter ../export
Wrap that in a script at /opt/paperless/backup.sh:
#!/usr/bin/env bash
set -euo pipefail
# The tarball contains paperless.env (secrets), so keep new files owner-only
umask 077
cd /opt/paperless
mkdir -p /opt/paperless/backups
# Portable export of all documents + metadata
docker compose exec -T webserver document_exporter ../export
# Tarball the export plus the env file and compose file
STAMP=$(date +%Y-%m-%d)
tar -czf "/opt/paperless/backups/paperless-${STAMP}.tar.gz" \
  -C /opt/paperless export paperless.env docker-compose.yml
# Delete archives older than 14 days
find /opt/paperless/backups -name 'paperless-*.tar.gz' -mtime +14 -delete
Make it runnable and schedule it:
sudo mkdir -p /opt/paperless/backups
sudo chmod +x /opt/paperless/backup.sh
sudo crontab -e
Add a nightly run:
30 3 * * * /opt/paperless/backup.sh >> /var/log/paperless-backup.log 2>&1
For off-site safety, push those tarballs to object storage. Our restic to S3 guide shows how to encrypt and ship /opt/paperless/backups to any S3-compatible bucket so a dead disk can't take your archive with it.
Restoring is the reverse: spin up a fresh stack, copy the export back into the export folder, and run document_importer ../export.
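A backup is only useful if it contains what a restore needs, so it's worth checking an archive now and then. The sketch below lists a tarball and looks for the three pieces the backup script puts in it. verify_backup is a made-up helper name, and the check assumes the exporter wrote its manifest.json at the top of the export folder.

```shell
#!/bin/sh
# Check that a backup tarball contains the pieces a restore needs.
#   $1: path to a paperless-YYYY-MM-DD.tar.gz archive
verify_backup() {
    # Fail early if the archive is unreadable or corrupt.
    listing=$(tar -tzf "$1") || return 1
    for wanted in export/manifest.json paperless.env docker-compose.yml; do
        if ! printf '%s\n' "$listing" | grep -Fxq -- "$wanted"; then
            echo "missing from archive: $wanted" >&2
            return 1
        fi
    done
}

# Example (the date in the file name is a placeholder):
#   verify_backup /opt/paperless/backups/paperless-2025-01-31.tar.gz && echo "archive looks complete"
```

This only confirms the files are present. The real test of a backup is still a trial restore on a scratch server.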
Paperless-ngx releases regularly, and most upgrades are painless:
cd /opt/paperless
sudo docker compose pull
sudo docker compose up -d
Always take a fresh export first. Major version bumps occasionally include database migrations, and the project documents any breaking changes in its release notes - skim them before jumping versions. Pinning to a specific tag instead of latest (for example ghcr.io/paperless-ngx/paperless-ngx:2.14) gives you control over exactly when you move.
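To see at a glance which images in the compose file are still floating on latest, a one-function sketch like this is enough. unpinned_images is a hypothetical helper name, and it relies on each image living on its own "image:" line, as in the compose file above.

```shell
#!/bin/sh
# Print every image reference in a compose file that ends in :latest.
# Exit status is 0 when at least one is found, 1 when everything is pinned.
#   $1: path to docker-compose.yml
unpinned_images() {
    sed -n 's/^[[:space:]]*image:[[:space:]]*//p' "$1" | grep ':latest$'
}

# Example:
#   unpinned_images /opt/paperless/docker-compose.yml
```

Against the compose file from this guide it would list the paperless-ngx and tika images, which are the two to pin if you want updates to happen only when you choose.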
"CSRF verification failed" when logging in over HTTPS. You missed PAPERLESS_URL and PAPERLESS_CSRF_TRUSTED_ORIGINS, or they don't match your real URL. Set both to https://docs.example.com, then sudo docker compose up -d to apply.
The webserver container restarts in a loop. Almost always a database connection problem. Check that POSTGRES_PASSWORD in the compose file matches PAPERLESS_DBPASS in paperless.env exactly, then look at sudo docker compose logs db for auth errors.
OCR is slow or the consume backlog never clears. OCR is CPU-bound. On a 1 vCPU VPS, large scans take a while. Check progress with sudo docker compose logs -f webserver. If you're importing in bulk, do it overnight or temporarily resize to more cores.
Uploads fail with a 413 or "request entity too large". The reverse proxy is rejecting big files. With Caddy, the file is larger than the max_size set in the request_body block of your Caddyfile: raise the value and reload Caddy. With Nginx, raise client_max_body_size.
Office documents (.docx, .xlsx) won't import. Those go through Gotenberg and Tika. Make sure both containers are running (sudo docker compose ps) and that PAPERLESS_TIKA_ENABLED=1 plus the two endpoint URLs are set in your env file.
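For the slow-backlog case, a rough way to tell whether the queue is shrinking is to count what is still sitting in the consume folder, since files normally leave it once they've been ingested. A minimal sketch, with consume_backlog as an invented helper name:

```shell
#!/bin/sh
# Count regular files still waiting in the consume folder (recursively).
#   $1: consume directory
consume_backlog() {
    # tr strips the padding some wc implementations add around the number
    find "$1" -type f | wc -l | tr -d ' '
}

# Example: print the backlog once a minute until it reaches zero
#   while [ "$(consume_backlog /opt/paperless/consume)" -gt 0 ]; do
#       echo "$(date +%H:%M) remaining: $(consume_backlog /opt/paperless/consume)"
#       sleep 60
#   done
```

If the number stops falling while the logs show errors, the stuck file is usually the one to pull out and inspect.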
Copy your old Scans folder into the consume directory and let Paperless OCR and ingest the lot. Set up your tags and matching rules first so they file themselves on the way in.
A self-hosted Paperless instance changes your relationship with paper. The drawer stays empty because every statement, receipt, and contract goes through the consume folder once and is then findable forever. Six months in, "where did I put that" stops being a question you ask.