Tutorials · Jun 15, 2026 · 20 min read

Self-Host Stirling-PDF on a VPS for a Private PDF Toolkit

Every time you drag a PDF onto a "free PDF merger" website, you are handing a document - often a contract, a payslip, a scanned passport - to a stranger's server. Those sites pay their bills with your files: ads, trackers, and in the worst cases a quiet copy kept around. For a one-off it feels harmless, but the moment you are merging tax documents or signing an NDA, you really do not want that upload happening.

Stirling-PDF is the fix. It is a self-hosted web app that does just about everything those scattered websites do - merge, split, rotate, compress, convert, OCR, watermark, sign, redact, fill forms - all from one clean interface, all on your own server. Nothing leaves the box. This guide walks through a production-ready install on a VPS using Docker, with Caddy in front for automatic HTTPS and a login so it is not open to the world.

Stirling-PDF processes files in memory and temporary storage, then discards them - it is not a document archive. Download your result and keep your originals elsewhere. Do not treat the server as the only copy of anything important.

TL;DR

  • Install Docker and Docker Compose on a fresh VPS
  • Point a subdomain like pdf.example.com at your server
  • Run Stirling-PDF and Caddy from a single docker-compose.yml
  • Mount volumes for OCR language data and config
  • Caddy handles HTTPS automatically
  • Turn on login so the toolkit is not public, then change the default password
  • Add Tesseract language packs to unlock OCR on scanned documents

Total time: about 15 minutes.

What You Need

  • A VPS with at least 1 GB RAM (2 GB if you plan to OCR or convert large files) running Ubuntu 22.04 or 24.04
  • A domain you can add DNS records to
  • Ports 80 and 443 open to the internet (Let's Encrypt needs them)
  • Root or sudo access

Stirling-PDF is a Java application, so it idles a little heavier than a Go app - figure on 200-400 MB of RAM at rest. The CPU-hungry operations are OCR and compression, and those run only while you actively use them, so a small VPS handles a single user comfortably.
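
If you are not sure how much memory your plan really gives you, a quick check before installing anything can save a surprise later. The sketch below reads MemTotal from /proc/meminfo; the function names are made up for this example, and it assumes a Linux host.

```shell
#!/bin/sh
# Print total RAM in whole MB, parsed from a meminfo-style file.
# The path argument is optional and defaults to /proc/meminfo (Linux only).
mem_total_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Succeed if the machine has at least $1 MB of RAM.
# MemTotal excludes memory the kernel reserves for itself, so a "1 GB" VPS
# usually reports a little under 1024. Leave some margin in the threshold.
has_min_ram_mb() {
  [ "$(mem_total_mb "${2:-/proc/meminfo}")" -ge "$1" ]
}

# Example:
#   has_min_ram_mb 900 && echo "enough RAM for light use"
```

A threshold of around 900 is a reasonable stand-in for "1 GB" here, and around 1900 for "2 GB".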

Step 1: Point a Subdomain at Your VPS

In your DNS provider, create an A record:

pdf.example.com → YOUR_VPS_IPV4

Add an AAAA record too if your server has IPv6. Confirm it resolves:

dig +short pdf.example.com

The output should be your VPS IP. Caddy cannot issue a certificate until DNS points at the box.
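
If you would rather have a script confirm the match than compare addresses by eye, something like the following works. It is a minimal sketch: dns_points_at is a hypothetical helper name, and 203.0.113.10 is a placeholder for your real VPS address.

```shell
#!/bin/sh
# Succeed if the expected IP appears as a whole line in the resolver output.
# $1 = expected address
# $2 = newline-separated addresses, for example the output of dig +short
dns_points_at() {
  printf '%s\n' "$2" | grep -Fxq -- "$1"
}

# Example (replace the placeholder address with your own):
#   dns_points_at 203.0.113.10 "$(dig +short pdf.example.com)" && echo "DNS is ready"
```

Matching whole lines matters here: a plain substring match would accept 203.0.113.1 when the record actually says 203.0.113.10.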

Step 2: Install Docker and Docker Compose

On a fresh Ubuntu 22.04 or 24.04 server:

sudo apt update
sudo apt install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
  -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
  https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Check it works:

docker --version
docker compose version

Step 3: Open the Firewall

sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable

Caddy needs 80 for the ACME HTTP challenge and 443 for HTTPS. Do not expose Stirling-PDF's internal port 8080 to the internet - Caddy is the only thing that should answer publicly.

Step 4: Create the Project Directory

sudo mkdir -p /opt/stirling-pdf
cd /opt/stirling-pdf
sudo mkdir -p tessdata configs customFiles logs caddy-data caddy-config

These directories map to how Stirling-PDF keeps state:

  • tessdata/ holds Tesseract OCR language files - the data that lets it read text out of scanned images
  • configs/ stores the app settings and, with login enabled, the user database
  • customFiles/ is for optional branding (custom logos, footer text, static overrides)
  • logs/ is exactly what it sounds like

Everything important to back up lives under /opt/stirling-pdf.

Step 5: Write the Compose File

Create /opt/stirling-pdf/docker-compose.yml:

services:
  stirling-pdf:
    image: stirlingtools/stirling-pdf:latest
    container_name: stirling-pdf
    restart: unless-stopped
    environment:
      DOCKER_ENABLE_SECURITY: "true"
      SECURITY_ENABLELOGIN: "true"
      SYSTEM_DEFAULTLOCALE: "en-US"
      UI_APPNAME: "PDF Tools"
      UI_HOMEDESCRIPTION: "My private, self-hosted PDF toolkit"
      UI_APPNAMENAVBAR: "PDF Tools"
      LANGS: "en_GB"
    volumes:
      - ./tessdata:/usr/share/tessdata
      - ./configs:/configs
      - ./customFiles:/customFiles
      - ./logs:/logs
    networks:
      - pdfnet

  caddy:
    image: caddy:2
    container_name: stirling-caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - ./caddy-data:/data
      - ./caddy-config:/config
    networks:
      - pdfnet

networks:
  pdfnet:

Notes:

  • DOCKER_ENABLE_SECURITY: "true" tells the container to load the security/login module when it starts. It must be set for SECURITY_ENABLELOGIN to have any effect.
  • SECURITY_ENABLELOGIN: "true" puts a login wall in front of every tool. Leave this on - an open PDF toolkit on the public internet will get found and abused as free compute.
  • LANGS controls which OCR language packs the image downloads at startup. en_GB is a sensible default; add more like LANGS: "en_GB,de_DE,fr_FR" and the container fetches them on boot.
  • The internal port 8080 stays on the pdfnet Docker network. Caddy reaches it by container name, so there is no ports: mapping on the Stirling service.
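
YAML is unforgiving about indentation, so it is worth letting Compose parse the file before you start anything. A minimal sketch, assuming the project lives in /opt/stirling-pdf as above; check_compose is a made-up helper name, while docker compose config --quiet is the real subcommand that validates the file and prints only errors.

```shell
#!/bin/sh
# Validate the Compose file in a project directory without starting anything.
# $1 = project directory (defaults to /opt/stirling-pdf)
check_compose() {
  ( cd "${1:-/opt/stirling-pdf}" && docker compose config --quiet ) \
    && echo "compose file OK"
}

# Example:
#   check_compose /opt/stirling-pdf
```

If the file has a mistake, Compose reports what it could not parse and the function returns a non-zero status without printing the OK line.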

Step 6: Write the Caddyfile

Create /opt/stirling-pdf/Caddyfile:

pdf.example.com {
    encode zstd gzip
    reverse_proxy stirling-pdf:8080
}

Caddy automatically requests a Let's Encrypt certificate for pdf.example.com on first boot and renews it forever, with the HTTP-to-HTTPS redirect handled for you. Unlike nginx, Caddy does not impose a small default upload limit, so big scanned PDFs pass through without extra tuning.
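
You can check the Caddyfile for syntax errors before the stack ever starts by running Caddy's own validate subcommand in a throwaway container. This is a sketch under the assumption that the file sits in the project directory; check_caddyfile is a hypothetical helper name.

```shell
#!/bin/sh
# Validate a Caddyfile using the same caddy:2 image the stack runs.
# $1 = directory containing the Caddyfile (defaults to /opt/stirling-pdf)
check_caddyfile() {
  dir="${1:-/opt/stirling-pdf}"
  # --rm removes the temporary container once validation has finished.
  docker run --rm \
    -v "$dir/Caddyfile:/etc/caddy/Caddyfile:ro" \
    caddy:2 caddy validate --config /etc/caddy/Caddyfile --adapter caddyfile
}

# Example:
#   check_caddyfile /opt/stirling-pdf
```

Validation only parses and loads the configuration. It does not request a certificate, so it is safe to run before DNS is in place.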

Step 7: Start the Stack

cd /opt/stirling-pdf
sudo docker compose up -d
sudo docker compose logs -f

The first boot takes a minute or two: the container downloads the OCR language packs listed in LANGS and warms up the Java runtime. Wait until the Caddy logs show the certificate was issued and Stirling-PDF reports it is listening, then open https://pdf.example.com.
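
Instead of watching the logs, you can poll the site until it answers. The sketch below is one way to do that with curl; wait_for_url and its default of 30 attempts five seconds apart are arbitrary choices for this example, not anything Stirling-PDF or Caddy prescribes.

```shell
#!/bin/sh
# Poll a URL until it answers with a successful HTTP status, or give up.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 5)
wait_for_url() {
  url="$1"; tries="${2:-30}"; delay="${3:-5}"
  i=1
  while [ "$i" -le "$tries" ]; do
    # -f makes curl fail on HTTP errors, such as the 502 Caddy returns while
    # the backend is still starting; -L follows the redirect to the login page.
    if curl -fsSL -o /dev/null "$url"; then
      echo "up after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up after $tries attempts" >&2
  return 1
}

# Example:
#   wait_for_url https://pdf.example.com
```

Because curl also verifies the TLS certificate, a success here tells you both that Caddy obtained a valid certificate and that Stirling-PDF is answering behind it.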

Step 8: Log In and Change the Default Password

With login enabled, Stirling-PDF ships with one account:

  • Username: admin
  • Password: stirling

Log in immediately and change it. Go to the account menu (top right) -> Settings -> Change username/password and set a strong password from your password manager. While you are there, create separate accounts for anyone else who needs access rather than sharing the admin login.

Changing the default admin / stirling credentials is the single most important step. Bots scan for self-hosted apps and try known defaults first. Do this before you walk away from the server, not later.
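
If you do not have a password manager to hand on the server, you can generate a strong password on the command line. A minimal sketch: gen_password is a made-up name, and the default length of 24 characters is an arbitrary choice.

```shell
#!/bin/sh
# Print a random alphanumeric password of the requested length (default 24).
# Characters come from the kernel's random source; tr drops every byte that
# is not a letter or digit, and head stops once enough have been collected.
gen_password() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "${1:-24}"
}

# Example:
#   gen_password 32; echo
```

The output has no trailing newline, so add an echo after it when printing to a terminal, and store the result in your password manager straight away.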

Step 9: Take the Tools for a Spin

The home page is a grid of every operation, grouped by category. The ones you will reach for most:

  • Merge - combine several PDFs into one, drag to reorder.
  • Split - break a document on specific pages, by size, or by chapter bookmarks.
  • Compress - shrink a bloated PDF to a target size; great for email attachment limits.
  • Convert - PDF to image, image to PDF, PDF to Word, PDF to PDF/A, and more.
  • Sign - draw, type, or upload a signature and stamp it onto a page.
  • Add Password / Remove Password - encrypt a PDF or strip a known password.
  • Redact - black out text permanently, not just visually.

Every operation runs on the server and hands you a file to download. Nothing is uploaded to a third party, which is the entire point.

Step 10: Enable OCR on Scanned Documents

A scanned PDF is just a stack of images - you cannot select or search the text. OCR (optical character recognition) fixes that by recognizing the characters and layering searchable text behind the image. Stirling-PDF uses Tesseract, and Tesseract needs a language data file for each language you want to read.

If you set LANGS in the compose file, the matching packs were already downloaded into tessdata/ on first boot. To add a language by hand, drop its .traineddata file into the folder:

cd /opt/stirling-pdf/tessdata
sudo curl -fsSLO https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata
sudo docker compose restart stirling-pdf
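
For more than one or two languages, a small helper keeps the downloads repeatable. The sketch below uses the same repository URL pattern as the command above; the function names are invented for this example, and it has to run as a user who can write to the tessdata directory.

```shell
#!/bin/sh
# Build the download URL for a Tesseract language file.
# $1 = three-letter Tesseract code such as eng, deu or fra
tessdata_url() {
  echo "https://github.com/tesseract-ocr/tessdata/raw/main/$1.traineddata"
}

# List the language codes already present in a tessdata directory.
installed_langs() {
  for f in "${1:-/opt/stirling-pdf/tessdata}"/*.traineddata; do
    [ -e "$f" ] && basename "$f" .traineddata
  done
  return 0
}

# Download any requested languages that are missing.
# $1 = tessdata directory, remaining arguments = language codes
fetch_langs() {
  dir="$1"; shift
  for code in "$@"; do
    if [ -e "$dir/$code.traineddata" ]; then
      echo "skip $code (already present)"
    else
      curl -fsSL -o "$dir/$code.traineddata" "$(tessdata_url "$code")" \
        && echo "fetched $code"
    fi
  done
}

# Example:
#   fetch_langs /opt/stirling-pdf/tessdata deu fra
#   installed_langs /opt/stirling-pdf/tessdata
```

Restart the stirling-pdf container afterwards, as in the manual steps, so the new languages show up in the OCR tool.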

Now open the OCR / Cleanup tool, upload a scanned document, pick the language, and choose whether to embed a searchable text layer (the usual choice) or convert to fully selectable text. The result is a PDF you can search and copy from.

OCR is the most CPU-intensive thing Stirling-PDF does. A large multi-page scan can peg a core for a while - that is normal. On a small shared VPS, process big batches one at a time.

Step 11: Automate Pipelines (Optional)

Stirling-PDF has a Pipeline feature that chains operations into a saved workflow - for example "OCR, then compress, then add a watermark" applied to every file you drop in. You build the pipeline once in the UI, export it as a JSON file, and it lives in your configs/ directory. For a small office that always processes incoming scans the same way, this turns a five-click chore into a single action.

For true hands-off automation you can also call the REST API directly. Every tool in the UI has a matching endpoint, documented at https://pdf.example.com/swagger-ui/index.html. A nightly cron job that flattens and compresses a folder of invoices is a few lines of curl.
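
As a rough idea of what that looks like, here is a sketch that compresses every PDF in a folder. Treat the details as assumptions to confirm in your own Swagger UI: the endpoint path /api/v1/misc/compress-pdf, the form fields fileInput and optimizeLevel, and the X-API-KEY header used when login is enabled. STIRLING_URL and STIRLING_API_KEY are environment variable names invented for this example, and the sketch covers compression only, not flattening.

```shell
#!/bin/sh
# Compress one PDF through the REST API.
# ASSUMPTIONS to verify in Swagger: endpoint path, form field names
# (fileInput, optimizeLevel) and the X-API-KEY header.
# $1 = input file, $2 = output file, $3 = optimize level (default 3, arbitrary)
compress_pdf() {
  src_file="$1"; dst_file="$2"
  curl -fsS -X POST \
    "${STIRLING_URL:-https://pdf.example.com}/api/v1/misc/compress-pdf" \
    -H "X-API-KEY: ${STIRLING_API_KEY:-}" \
    -F "fileInput=@$src_file" \
    -F "optimizeLevel=${3:-3}" \
    -o "$dst_file"
}

# Compress every PDF in a folder into a "compressed" subdirectory.
# $1 = folder containing the PDFs
compress_folder() {
  src="$1"
  mkdir -p "$src/compressed"
  for f in "$src"/*.pdf; do
    [ -e "$f" ] || continue
    compress_pdf "$f" "$src/compressed/$(basename "$f")" \
      || echo "failed: $f" >&2
  done
}

# Example:
#   STIRLING_API_KEY=your-key compress_folder /srv/invoices
```

The originals are left untouched and the results land in a subfolder, so a failed run never overwrites a source file.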

Step 12: Back Up Your Config

The files themselves are transient, but your settings, user accounts, custom branding, and saved pipelines are not. They live in configs/ and customFiles/. Pair them with a nightly restic job to object storage:

restic -r s3:s3.amazonaws.com/my-backup-bucket backup \
  /opt/stirling-pdf/configs /opt/stirling-pdf/customFiles

The data is tiny - usually a few megabytes - so backups are instant. Restoring is just dropping the folders back and starting the container.
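
To make the backup truly nightly, wrap it in a small script and call it from cron. The sketch below assumes the repository location and password are supplied through restic's own RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE environment variables; backup_stirling is a made-up name, and the retention of 7 daily and 4 weekly snapshots is only an example.

```shell
#!/bin/sh
# Nightly backup sketch: snapshot the two state directories, then trim
# old snapshots so the repository does not grow without bound.
# $1 = project directory (defaults to /opt/stirling-pdf)
backup_stirling() {
  base="${1:-/opt/stirling-pdf}"
  restic backup "$base/configs" "$base/customFiles" \
    && restic forget --keep-daily 7 --keep-weekly 4 --prune
}

# Example:
#   backup_stirling /opt/stirling-pdf
```

A crontab line such as 15 3 * * * /usr/local/bin/stirling-backup.sh (a hypothetical path) would then run it at 03:15 every night.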

Optional: Keep It Private with a VPN

If you are the only person using the toolkit, it does not need to face the public internet at all. Putting Stirling-PDF behind Tailscale removes every bot scan and login attempt in one move - skip the firewall openings for 80 and 443 and reach pdf.example.com over your tailnet instead. Bear in mind that with those ports closed Caddy can no longer complete the HTTP challenge described in Step 3, so the certificate has to come from a DNS challenge or another source. A WireGuard VPN on your VPS does the job just as well if you already run one. For a shared instance with a real login, the public Caddy setup above is fine - just keep the password strong.

Troubleshooting

Caddy returns a 502 right after docker compose up. Stirling-PDF takes a while to start its Java runtime and download language packs on first boot, and Caddy can reach it first. Give it a minute or two and reload. If it sticks, check docker compose logs stirling-pdf.

Login is enabled but the login page never appears. Both variables are required together: DOCKER_ENABLE_SECURITY must be true for the login module to exist, and SECURITY_ENABLELOGIN must be true to switch it on. If you set only the second one, the toggle does nothing. Fix both and run docker compose up -d to recreate the container.

OCR says no languages are available. The tessdata/ folder is empty or the file name is wrong. Confirm a .traineddata file is present (ls /opt/stirling-pdf/tessdata) and restart the container. Remember that Tesseract file names use three-letter codes - eng, deu, fra - not the locale-style codes (such as en_GB) used in LANGS.

Large file uploads fail or time out. This is almost always a reverse-proxy limit upstream of Caddy, not Stirling-PDF. If you put Cloudflare in front, free plans cap uploads at 100 MB. Connect directly to your server's hostname for big files, or split the document first.

The app feels sluggish under load. Java plus OCR plus compression is genuinely memory-hungry. If the container is being killed (docker compose logs shows it restarting), bump the VPS to 2 GB RAM or add swap.
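
One quick way to confirm a restart loop is to ask Docker how many times it has restarted the container. A minimal sketch, assuming the container name stirling-pdf from the compose file; the function names are invented here, while RestartCount is a field in docker inspect output.

```shell
#!/bin/sh
# Print how many times Docker has automatically restarted a container.
# $1 = container name (defaults to stirling-pdf)
restart_count() {
  docker inspect --format '{{.RestartCount}}' "${1:-stirling-pdf}"
}

# Succeed if the container has been restarted more than $2 times (default 0).
is_restart_looping() {
  [ "$(restart_count "${1:-stirling-pdf}")" -gt "${2:-0}" ]
}

# Example:
#   is_restart_looping stirling-pdf && echo "container keeps restarting"
```

A count that climbs while you are running OCR or compression jobs is a strong hint that the box is running out of memory.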

Going Further

  • Lock it down with a second factor. Front the whole site with Authelia to add SSO and 2FA in front of Stirling-PDF's own login.
  • Add more OCR languages. Grab any .traineddata file from the Tesseract data repo and drop it into tessdata/ - there are over 100 languages, including vertical scripts.
  • Wire up the API. The Swagger UI at /swagger-ui/index.html documents every endpoint, so you can script repetitive jobs from another machine or a cron task.
  • Run it next to your other tools. Stirling-PDF sits happily behind the same Caddy instance as the rest of your self-hosted stack - just add another site block and a second service.

That's it. A self-hosted PDF toolkit gives you every common document operation in one place, fast, and without uploading a single sensitive file to a website you have never heard of.

