Ollama — Runbook

Routine Tasks

List available models

ssh [email protected] "ollama list"

Pull a model manually

ssh [email protected] "ollama pull <model>"

Restart Ollama

ssh [email protected] "systemctl restart ollama"

Check disk usage

ssh [email protected] "df -h /"
The rootfs is a 450 GiB sparse image on /dev/sde (ollama-disk). Check host-side storage usage on chizuru:
ssh [email protected] "df -h /mnt/pve/ollama-disk"
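
The two df checks above can be turned into a simple threshold alert. The following is a minimal sketch, not part of the existing tooling: check_usage is a hypothetical helper name and the 85% default threshold is an assumption chosen for illustration. It reads `df -P` output on stdin, so it can be fed from either SSH command.

```shell
#!/bin/sh
# check_usage: read `df -P` output on stdin and print a warning for every
# filesystem whose use percentage is at or above the threshold.
# The helper name and the 85% default are illustrative assumptions.
check_usage() {
    threshold="${1:-85}"
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the df header line
            pct = $5                  # fifth column is the use percentage
            sub(/%/, "", pct)
            if (pct + 0 >= limit) {
                printf "WARN %s at %s%% (mounted on %s)\n", $1, pct, $6
                found = 1
            }
        }
        END { exit found ? 1 : 0 }    # non-zero exit when anything is over the limit
    '
}
```

Usage: pipe the output of the rootfs check (with -P instead of -h) into the helper, for example `... "df -P /" | check_usage 90`. The function exits non-zero when any filesystem is at or above the limit, so it can also gate a cron job or CI step.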


Logs

Log: Ollama service
Contents: Model loading, inference requests, GPU allocation, errors
Location: LXC 107 journald
Loki query: {job="ollama", unit="ollama.service"}
Format: Plain text

Notes:

  • Prometheus metrics are also scraped from 192.168.1.107:11434/api/metrics (inference stats, model loading)
  • SSH fallback: ssh [email protected] "journalctl -u ollama -f"


Troubleshooting

Model pull fails

High memory usage / OOM

  • Ollama loads models into memory. Unload unused ones: ssh [email protected] "ollama stop <model>"
  • LXC 107 has 32 GiB RAM — enough for one 32B model (~19 GiB) or two smaller models simultaneously
  • If OOM persists, set Environment="OLLAMA_MAX_LOADED_MODELS=1" under [Service] in /etc/systemd/system/ollama.service.d/override.conf, then run systemctl daemon-reload and restart Ollama
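
As a rough sketch of the last bullet, the drop-in can be generated rather than typed by hand. render_override is a hypothetical helper name; the [Service] / Environment= layout is standard systemd drop-in syntax, and the file path is the one given above.

```shell
#!/bin/sh
# render_override: print a systemd drop-in that caps how many models Ollama
# keeps loaded at once. The helper name is a hypothetical, for illustration.
render_override() {
    max_models="${1:-1}"
    printf '[Service]\nEnvironment="OLLAMA_MAX_LOADED_MODELS=%s"\n' "$max_models"
}

# On LXC 107 (as root), the output could be installed roughly like this:
#   mkdir -p /etc/systemd/system/ollama.service.d
#   render_override 1 > /etc/systemd/system/ollama.service.d/override.conf
#   systemctl daemon-reload && systemctl restart ollama
# Caution: the redirect overwrites an existing override.conf. If the file
# already holds other settings, merge the Environment= line by hand instead.
```

After the restart, `ollama ps` on the LXC lists the models currently held in memory, which is a quick way to confirm that only one stays loaded.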

LXC not reachable after Proxmox reboot

Reprovisioning the LXC from scratch

Re-run the Ollama workflow via Forgejo (workflow_dispatch). The playbook will:

1. Skip storage setup (already exists)
2. Destroy and recreate LXC 107, but only if the rootfs is not already on ollama-disk. On an existing install the rootfs is already there, so the playbook skips destruction. To force a reprovision, first run pct destroy 107 --purge manually on chizuru
3. Reinstall Ollama and re-pull all models
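
Because pct destroy is irreversible, the manual step can be wrapped in a guard that only prints the commands unless explicitly confirmed. This is a sketch under assumptions: force_destroy_lxc and the CONFIRM variable are hypothetical and not part of the playbook, and stopping the container first is a precaution rather than something the runbook prescribes.

```shell
#!/bin/sh
# force_destroy_lxc: remove a container on the Proxmox host (chizuru) so the
# playbook recreates it on the next run.
# Hypothetical helper: prints the commands by default and only executes them
# when CONFIRM=yes is set in the environment.
force_destroy_lxc() {
    vmid="$1"
    run="echo"
    if [ "${CONFIRM:-no}" = "yes" ]; then
        run=""
    fi
    # Stop first; ignore the error if the container is already stopped.
    $run pct stop "$vmid" || true
    $run pct destroy "$vmid" --purge
}
```

Running `force_destroy_lxc 107` on chizuru prints the two pct commands for review; `CONFIRM=yes force_destroy_lxc 107` executes them. Afterwards, trigger the Forgejo workflow as described above.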