ArchiveBox — Runbook
Quick Reference
| Item | Value |
|---|---|
| LXC | 128 @ 192.168.1.128 |
| URL | https://archive.eva-00.network |
| Version | v0.7.3 (stable) |
| Health (web) | curl http://192.168.1.128:8000 (200 or 302) |
| Health (API) | curl http://192.168.1.128:8001/health (200) |
| Vault | secret/data/archivebox |
| Deploy | Forgejo Actions -> Deploy ArchiveBox |
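The two health checks in the table can be wrapped in a small script. This is a minimal sketch, assuming `curl` is available where it runs; `status_ok` and `check_health` are hypothetical helper names, not ArchiveBox commands.

```shell
#!/bin/sh
# Sketch: apply the status codes from the Quick Reference table.
# status_ok and check_health are illustrative names, not ArchiveBox commands.

# status_ok KIND CODE -> exit 0 if CODE is acceptable for KIND (web|api)
status_ok() {
  case "$1:$2" in
    web:200|web:302) return 0 ;;   # web UI: 200 or 302
    api:200)         return 0 ;;   # API wrapper: 200 only
    *)               return 1 ;;
  esac
}

# check_health KIND URL -> prints a one-line result, returns status_ok's verdict
check_health() {
  # -s silent, -o /dev/null discards the body, -w prints only the status code,
  # --max-time keeps a hung service from blocking the check
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$2")
  if status_ok "$1" "$code"; then
    echo "OK   $1 ($code)"
  else
    echo "FAIL $1 ($code)"
    return 1
  fi
}

# Usage (not run here):
#   check_health web http://192.168.1.128:8000
#   check_health api http://192.168.1.128:8001/health
```

Keeping the code classification separate from the network call makes the accepted codes easy to adjust if the table changes.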
Check Service Status
ssh [email protected] docker compose -f /opt/archivebox/docker-compose.yml ps
Both containers should be running:
- archivebox: Web UI on port 8000
- archivebox-api: API wrapper on port 8001
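To check for both containers without reading the `ps` output by eye, a sketch like the following reports whichever expected name is missing. It assumes the container names are exactly `archivebox` and `archivebox-api` as listed above; `missing_containers` and `$LXC_SSH` are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch: print each expected container that is absent from a list of running
# container names read on stdin (one name per line).
# missing_containers is an illustrative helper name.
missing_containers() {
  running=$(cat)
  for name in archivebox archivebox-api; do
    # -x matches the whole line, so "archivebox" does not match
    # "archivebox-api"; -F treats the name as a literal string
    printf '%s\n' "$running" | grep -qxF "$name" || echo "$name"
  done
}

# Usage (not run here). docker ps lists only running containers by default.
# LXC_SSH is a hypothetical variable holding the runbook's SSH target.
#   ssh "$LXC_SSH" "docker ps --format '{{.Names}}'" | missing_containers
```

Empty output means both containers are running.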
Restart Services
ssh [email protected] docker compose -f /opt/archivebox/docker-compose.yml restart
View Logs
Via Loki (preferred)
{container_name="archivebox"}
{container_name="archivebox-api"}
{container_name=~"archivebox.*"} |= "error"
{container_name="archivebox-api"} |= "POST /add"
Via SSH (fallback)
ssh [email protected] docker compose -f /opt/archivebox/docker-compose.yml logs -f --tail 100
ssh [email protected] docker compose -f /opt/archivebox/docker-compose.yml logs -f --tail 100 api-wrapper
Add URLs to Archive
Via API wrapper (preferred)
curl -X POST http://192.168.1.128:8001/add \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"url": "https://old.reddit.com/r/...", "tag": "subreddit_name"}'
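When the URL or tag comes from a variable, hand-written JSON is easy to break with a stray quote. The sketch below builds the same request body with basic escaping. `json_escape`, `add_payload`, and `$TOKEN` are hypothetical names; only the `url` and `tag` fields shown above are assumed.

```shell
#!/bin/sh
# Sketch: build the JSON body for POST /add.
# json_escape and add_payload are illustrative names.

# Escape backslashes and double quotes so the value is safe inside a JSON
# string. Control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# add_payload URL TAG -> prints {"url": "...", "tag": "..."}
add_payload() {
  printf '{"url": "%s", "tag": "%s"}\n' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Usage (not run here); TOKEN is a hypothetical variable holding the API key:
#   curl -X POST http://192.168.1.128:8001/add \
#     -H "Authorization: Bearer $TOKEN" \
#     -H "Content-Type: application/json" \
#     -d "$(add_payload 'https://example.com' 'demo')"
```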
Via CLI (direct)
ssh [email protected] docker exec archivebox archivebox add 'https://example.com'
With tags
ssh [email protected] docker exec archivebox archivebox add --tag 'reddit,pics' 'https://old.reddit.com/r/pics/...'
From a file (one URL per line)
scp urls.txt [email protected]:/tmp/urls.txt
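# Quote the remote command so the redirect reads /tmp/urls.txt on the LXC, not locally;
# -i keeps stdin open so docker exec passes the file to archivebox
ssh [email protected] 'docker exec -i archivebox archivebox add < /tmp/urls.txt'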
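Exported bookmark lists often carry blank lines, comments, or duplicates. A sketch for tidying the file before copying it over, using only standard tools (`clean_urls` is a hypothetical helper name):

```shell
#!/bin/sh
# Sketch: filter a URL list before import. Reads stdin, writes stdout.
# clean_urls is an illustrative name.
clean_urls() {
  # 1. strip carriage returns (files saved with Windows line endings)
  # 2. trim leading and trailing whitespace
  # 3. keep only lines starting with http:// or https://
  # 4. drop exact duplicates while preserving order
  tr -d '\r' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -E '^https?://' \
    | awk '!seen[$0]++'
}

# Usage (not run here):
#   clean_urls < urls.txt > urls.clean.txt
```

Comment lines and blank lines fall out at step 3 because they do not start with a URL scheme.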
Re-run missing extractors on existing snapshots
ssh [email protected] docker exec archivebox archivebox update
This only runs extractors that haven't succeeded yet (smart incremental).
Trigger Backups Manually
PBS — LXC snapshot
Via Proxmox UI or CLI:
ssh [email protected] vzdump 128 --storage cajita-elite --compress zstd --mode snapshot --notes "manual backup"
Databasement — SQLite dump
Trigger via the Databasement web UI at http://192.168.1.196:2226. Navigate to the ArchiveBox database entry and click "Backup Now".
Backrest — Archive files
ssh [email protected] curl -s -X POST 'http://localhost:9898/v1.Backrest/Backup' \
-H 'Content-Type: application/json' \
-d '{"value":"archivebox-archives"}'
Fresh Redeploy
Trigger via Forgejo Actions with force_clean=true, or manually:
ssh [email protected]
cd /opt/archivebox
docker compose down
rm -rf /opt/archivebox/data/*
# Note: archive files on urahara are preserved
Then re-run the workflow. It will re-init and re-create the admin user.
Memory / Performance
LXC resource allocation
| Resource | Value | Notes |
|---|---|---|
| RAM | 3072 MB | Headless Chrome is the main consumer (~300-500MB per tab) |
| Cores | 2 | |
| Disk | 16 GB rootfs | Archives on urahara, so rootfs stays small |
RAM guidance
| Archiving activity | Expected RAM |
|---|---|
| Idle (no archiving) | ~200 MB |
| Screenshot (1 URL) | ~500 MB - 1 GB |
| DOM + wget + WARC (1 URL) | ~300-500 MB |
| Media download (yt-dlp) | ~300-500 MB |
| Peak (Chrome + wget + yt-dlp) | ~1.5-2 GB |
ArchiveBox processes URLs sequentially by default. If the LXC hits out-of-memory (OOM) kills, increase its RAM from the Proxmox host:
pct set 128 --memory 4096
Disk usage on urahara
Reddit post archives average 5-10 MB. General web pages 1-5 MB. Media varies widely.
| Bookmark count | Estimated archive size |
|---|---|
| 1,000 | 5-15 GB |
| 5,000 | 35-50 GB |
| 10,000 | 70-100 GB |
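For counts not in the table, a rough figure follows from multiplying the bookmark count by an assumed average snapshot size. This sketch is plain arithmetic, not the method behind the table's ranges, and `estimate_gb` is a hypothetical name.

```shell
#!/bin/sh
# Sketch: rough archive size from a bookmark count and an assumed average
# snapshot size in MB. estimate_gb is an illustrative name.
# Uses 1 GB = 1000 MB and integer division, so results round down.
estimate_gb() {
  echo $(( $1 * $2 / 1000 ))
}

# Usage:
#   estimate_gb 10000 10   # 10,000 bookmarks at 10 MB each
#   estimate_gb 2500 7     # 2,500 bookmarks at 7 MB each
```

Run it with both ends of the per-snapshot range to get a low and a high bound.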
Troubleshooting
API wrapper returns 401
Check the Bearer token matches the one in Vault at secret/data/archivebox -> api_key. Also verify the token matches what's in n8n's .env as ARCHIVEBOX_API_KEY.
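Comparing the two copies of the key by script avoids pasting secrets into a terminal. The sketch below only extracts a value from `.env`-style text. The Vault CLI path and the n8n `.env` location in the usage comment are assumptions, and `env_value` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: pull one KEY=value entry out of .env-style text on stdin.
# env_value is an illustrative name; it handles plain KEY=value lines and
# strips one pair of surrounding double quotes, nothing more elaborate.
env_value() {
  sed -n "s/^$1=//p" | head -n 1 | sed 's/^"\(.*\)"$/\1/'
}

# Usage (not run here). Assumptions: a KV v2 mount at secret/, so the CLI path
# is secret/archivebox; /path/to/n8n/.env is a placeholder, not a known path.
#   vault_key=$(vault kv get -field=api_key secret/archivebox)
#   n8n_key=$(env_value ARCHIVEBOX_API_KEY < /path/to/n8n/.env)
#   [ "$vault_key" = "$n8n_key" ] && echo match || echo MISMATCH
```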
API wrapper returns 500 or timeout
Check the archivebox container is healthy and the /data volume is shared:
ssh [email protected] docker exec archivebox-api curl -s http://localhost:8001/health
ssh [email protected] docker exec archivebox-api ls /data/index.sqlite3
ArchiveBox returns 403 when accessing via browser
oauth2-proxy is blocking. Check:
- PocketID OIDC client exists with correct callback URL
- oauth2-proxy container is running on LXC 119: {container_name="oauth2-proxy-archivebox"}
- Caddy is routing archive.eva-00.network to 192.168.1.119:8592
wget/WARC not running despite being enabled
In v0.7.3, wget produces the WARC, so WARC generation requires both SAVE_WGET=True and SAVE_WARC=True. Check:
ssh [email protected] docker exec archivebox archivebox config --get SAVE_WGET
ssh [email protected] docker exec archivebox archivebox config --get SAVE_WARC
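Both checks can be folded into a single pass/fail result. This sketch assumes `archivebox config --get KEY` prints lines of the form `KEY=True`; verify that against your output before relying on it. `all_true` and `$LXC_SSH` are hypothetical names.

```shell
#!/bin/sh
# Sketch: exit 0 only if every KEY=value line on stdin has the value True.
# Assumes config output looks like SAVE_WGET=True (an assumption, see above).
# all_true is an illustrative name.
all_true() {
  awk -F= '
    NF >= 2 { seen++; if ($2 != "True") bad++ }
    END     { if (seen > 0 && bad == 0) exit 0; exit 1 }
  '
}

# Usage (not run here); LXC_SSH is a hypothetical variable for the SSH target:
#   {
#     ssh "$LXC_SSH" docker exec archivebox archivebox config --get SAVE_WGET
#     ssh "$LXC_SSH" docker exec archivebox archivebox config --get SAVE_WARC
#   } | all_true && echo "wget + WARC enabled" || echo "check config"
```

Empty input counts as a failure, so a broken SSH connection is not reported as a pass.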
"output.html does not exist" in web UI
This happens when --overwrite creates new snapshot directories with different timestamps while the DB still points to the original directories. Fix:
# Re-run missing extractors on original snapshot
ssh [email protected] docker exec archivebox archivebox update
yt-dlp not downloading media
Check yt-dlp is installed and up to date:
ssh [email protected] docker exec archivebox yt-dlp --version
Archives not appearing in /mnt/archivebox/archive
Verify the bind mount is active:
ssh [email protected] df -h /mnt/archivebox/archive
ssh [email protected] ls -la /mnt/archivebox/archive/
If empty, check Proxmox bind mount config for LXC 128.
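One way to tell a missing bind mount from an empty one is to see which filesystem the path resolves to: if `df` reports `/`, the directory sits on the rootfs and the mount is not active. A minimal sketch that parses `df -P` output (`on_rootfs` and `$LXC_SSH` are hypothetical names):

```shell
#!/bin/sh
# Sketch: read `df -P PATH` output on stdin and exit 0 if PATH resolves to the
# root filesystem. on_rootfs is an illustrative name.
on_rootfs() {
  # POSIX df -P prints one header line, then one line per filesystem whose
  # sixth field is the mount point (mount points with spaces are not handled).
  awk 'NR == 2 { mp = $6 } END { if (mp == "/") exit 0; exit 1 }'
}

# Usage (not run here); LXC_SSH is a hypothetical variable for the SSH target:
#   ssh "$LXC_SSH" df -P /mnt/archivebox/archive | on_rootfs \
#     && echo "bind mount missing: path is on rootfs"
```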
OAuth2 cookie issues (login loop)
Clear the _oauth2_archivebox cookie in your browser, or try incognito.