ADR-021 — Local Fork of Filebrowser Quantum for Progress + Tasks Page

Date: 2026-05-14
Status: Accepted (in implementation)
Context: Filebrowser Quantum, our chosen web file manager (LXC 126, gtstef/filebrowser:beta), shows an indeterminate spinner for all cross-filesystem copy/move operations. For a 100 GB move between /srv/unohana and /srv/filedump the UI is opaque for ~40 minutes. There is no way to see in-flight transfers from a second device, to keep visibility across a page reload, or to tell whether a server restart left a half-finished copy. Cloud Commander has the same defect, plus a broken percentage bar that stuck at 1% for hours. Upstream issue #1019 raised this in 2024 and was closed by shipping the spinner that exists today.


Decision

Maintain a local fork of github.com/gtsteffaniak/filebrowser and ship a patched image (filebrowser-qa:progress for QA, eventually a versioned tag for production). The patch adds:

  1. Real-time copy/move progress via a counting io.Copy wrapper that emits SSE events through the project's existing event bus.
  2. A new backend/jobs/ package — in-memory job registry with JSON-file persistence — that lets any client query in-flight and recently-finished transfers.
  3. A /tasks route on the frontend that renders the registry with progress bars, transfer rate (5-second moving avg), ETA, elapsed time, and a Cancel button.
  4. Context-cancellation plumbed through the copy loop so DELETE /api/jobs/{id} actually aborts the byte transfer mid-stream.

The fork is rebased against upstream main on a cadence to be determined (initial guess: monthly).


Why fork instead of upstream-only

| | Upstream PR first | Local fork now |
|---|---|---|
| Time to value | Weeks to months; subject to maintainer review and direction | Hours; ships when we're done |
| Design influence | Maintainer may reshape it | We control trade-offs (e.g., 30-min retention, JSON persistence) |
| Risk to homelab stability | None (only our fork is touched) | Our fork has to be re-validated after each upstream rebase |
| Reusability for others | Yes, eventually | Only if we submit later |

The /tasks page in particular is a substantive UX addition that may not align with the upstream maintainer's design preferences. Asking for review of a 700+ line patch as a first contact is unlikely to land cleanly. Build first, validate locally, then propose upstream if the design holds up.


Why this scope (progress + tasks together)

The progress dialog alone is fragile: closing the browser or pressing reload during a multi-GB copy throws away all visibility, even though the byte copy continues on the server (Go's request goroutine doesn't exit on client disconnect, and io.Copy doesn't check r.Context()). A tasks page backed by a server-side registry removes the page-reload risk entirely — the UI becomes a view onto persistent state rather than an ephemeral subscription. Doing both together is also cleaner: the same copyProgress SSE events feed both the dialog (per-operation) and the tasks page (cross-cutting).
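
To make the "view onto persistent state" idea concrete, here is a minimal sketch of an in-memory registry that the copy loop writes to and that any client can read from. The type, field, and method names (Job, Registry, Start, Update, Finish, List) are assumptions for illustration and are not taken from the actual backend/jobs/ package.

```go
package main

import (
	"sort"
	"sync"
)

// Job is a hypothetical record for one transfer; field names are assumptions.
type Job struct {
	ID     string
	Source string
	Dest   string
	State  string // "running", "done", "cancelled", or "interrupted"
	Total  int64  // bytes expected
	Copied int64  // bytes copied so far
}

// Registry holds jobs in memory and is safe for concurrent use.
type Registry struct {
	mu   sync.RWMutex
	jobs map[string]*Job
}

func NewRegistry() *Registry {
	return &Registry{jobs: make(map[string]*Job)}
}

// Start registers a new job in the "running" state.
func (r *Registry) Start(id, source, dest string, total int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.jobs[id] = &Job{ID: id, Source: source, Dest: dest, State: "running", Total: total}
}

// Update records cumulative progress; unknown IDs are ignored.
func (r *Registry) Update(id string, copied int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if j, ok := r.jobs[id]; ok {
		j.Copied = copied
	}
}

// Finish moves a job to a terminal state such as "done" or "cancelled".
func (r *Registry) Finish(id, state string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if j, ok := r.jobs[id]; ok {
		j.State = state
	}
}

// List returns a snapshot of all jobs, sorted by ID. The entries are
// copies, so callers cannot mutate registry state through them.
func (r *Registry) List() []Job {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]Job, 0, len(r.jobs))
	for _, j := range r.jobs {
		out = append(out, *j)
	}
	sort.Slice(out, func(a, b int) bool { return out[a].ID < out[b].ID })
	return out
}
```

In this sketch the progress dialog and the tasks page would both be readers: a freshly loaded page calls List to get current state, then follows the copyProgress events for live updates. Retention (dropping finished jobs after a while) is omitted.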


Alternatives considered

  • Upstream-only PR (no local patches). Rejected for now: too slow, design uncertain. To be reconsidered as a follow-up once the patch has been validated in QA.
  • Switch file manager. Rejected: Cloud Commander (the dual-pane companion) has the same problem, and FileRise advertises progress for uploads but its server-side copy progress is unclear. Replacing Filebrowser Quantum would also mean re-doing OIDC, search indexing, and Caddy routing, a much larger change for a narrower benefit.
  • Build a separate "transfer manager" service alongside Filebrowser. Rejected: would require duplicating filesystem access patterns and OIDC integration. The copy is happening inside Filebrowser regardless; instrumenting the existing path is much smaller scope.
  • Persist jobs to BoltDB (the same DB Quantum uses for users). Rejected for v1: would require a storm model and migration. A flat JSON file at data/jobs.json covers the requirement (interrupted transfers visible after restart) with one file, atomic writes, and no schema dependency. Can promote to BoltDB later if scope grows.
  • Persist progress state across restarts. Rejected: the actual byte counter and io.Copy state are in-memory only. Surviving a restart would require checkpointing partial files and resumable file copies — large complexity for a rare event. Instead: interrupted jobs show as state=interrupted and the user re-initiates.

Consequences

Positive:

  • Real progress visibility for moves between ZFS pools (which is every move in our setup since unohana / urahara / filedump / dlbox are all separate mounts).
  • Reload-safe: laptop sleep, accidental tab close, and second-device access all work.
  • Server restart no longer silently loses transfers; they surface as interrupted in the tasks page.
  • Cancel button: actually useful for the inevitable "wait, wrong folder" moment.
  • SSE bus reused; no new transport.

Negative:

  • Maintenance burden: every time upstream ships a release we want, we rebase. The patch touches backend/adapters/fs/fileutils/file.go, backend/adapters/fs/files/files.go, backend/http/resource.go, plus the new backend/jobs/ package, plus two Vue components and one new view. Rebase conflicts are likely on file.go if upstream refactors copy semantics.
  • Custom Docker image to build and store (~85 MB content). Builds take ~5-10 minutes on LXC 131 (dev). Will need a workflow to ship versioned tags.
  • No automatic upstream security updates until rebase.

Mitigations:

  • Pin to a specific upstream commit, not a moving branch.
  • Rebase quarterly or whenever a CVE lands in upstream.
  • Keep the patch minimal: add public functions (CopyFileWithProgress, MoveFileWithProgress) alongside the originals rather than changing signatures of code shared with other call sites.
  • After three months of stable use, write up the design and open an upstream issue / draft PR. Even if it doesn't merge, the conversation surfaces whether anyone else wants this.


Rollback plan

Production filebrowser still runs gtstef/filebrowser:beta on LXC 126. The fork runs as a second container (filebrowser-qa) on the same LXC, port 8081, no Caddy entry, no OIDC. If the patch causes problems we just stop the QA container. If we promote to production we change the image tag in services/filebrowser/docker-compose.yml; reverting is a one-line git revert + workflow rerun.