
FreeBSD: force-remove the jail on exit to reap detached daemons #210

Open
mtelvers wants to merge 2 commits into master from freebsd-jail-reap-detached-daemons

Conversation

@mtelvers

Member

Builds run in a non-persistent jail, which only auto-removes once its last process exits. A build step that detaches a process (e.g. a test running `git daemon --detach`, which reparents to PID 1) leaves a straggler that keeps the jail, its devfs mount and its ZFS snapshot alive indefinitely. The mounts stay busy, so the manual umounts fail as well. These leaks accumulate until the worker grinds to a halt.

Replace the manual umounts with `jail -r` on every exit path: it SIGKILLs any straggler so that the jail's release actions can unmount everything. On a clean exit the jail is already gone, so it is a no-op.

Builds run in a non-persistent jail, which the kernel only auto-removes
once its *last* process exits. A build step that detaches a process
(e.g. a test running `git daemon --detach`, which reparents to PID 1)
leaves a straggler that keeps the jail -- and its devfs mount and ZFS
snapshot -- alive indefinitely. The mounts are then held busy, so the
manual umounts on the success path fail with EBUSY and are ignored,
and the jail leaks. On a long-lived worker these pile up until the host
grinds to a halt.

Replace the manual umounts with `jail -r`, run on every exit path. It
SIGKILLs any straggler so the jail's release actions can unmount devfs
and the cache mounts. On a clean exit the jail has already auto-removed,
so this is a harmless no-op.
@mtelvers force-pushed the freebsd-jail-reap-detached-daemons branch from 6d7ed17 to a863b86 on June 24, 2026 13:37
A command jail auto-removes the instant its command exits, but that
auto-removal does not run the unmounts for its mount.fstab (nullfs
caches) and mount.devfs. The previous code unmounted them explicitly on
the clean-exit (Ok 0) path; pointing that path at "jail -r" dropped the
unmount (jail -r is a no-op once the jail has already auto-removed), so
every clean build leaked its nullfs cache and devfs mounts. The leaked
nullfs bind keeps the cache dataset busy, which breaks obuilder's cache
rotation ("zfs rename ... failed: pool or dataset is busy") and leaves
orphaned mounts behind.

Restore the explicit umount of the fstab mounts and devfs, now after
teardown and on every exit path. jail -r still SIGKILLs any detached
straggler first so the mounts are no longer busy, and doing it
unconditionally also closes the pre-existing leak on the error paths.
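
The ordering described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: it is written in Python for brevity, the function names (`run_ignoring_failure`, `teardown_commands`, `with_jail_teardown`) and their parameters are hypothetical, and the exact commands (`umount -a -F <fstab>` for the nullfs caches, a plain `umount` for devfs) are assumptions based on FreeBSD's standard `jail` and `umount` flags. The real implementation may differ, for example by using `sudo` or other paths.

```python
import subprocess
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
Runner = Callable[[Sequence[str]], int]


def run_ignoring_failure(argv: Sequence[str]) -> int:
    """Run a command and return its exit status without raising on failure."""
    return subprocess.run(list(argv), check=False).returncode


def teardown_commands(jail_name: str, fstab: str, devfs_dir: str) -> List[List[str]]:
    """Return the teardown commands in the order they must run."""
    return [
        # 1. Remove the jail. This kills any detached straggler; if the jail
        #    has already auto-removed, the command fails and is ignored.
        ["jail", "-r", jail_name],
        # 2. Unmount everything listed in the jail's fstab (assumed to hold
        #    the nullfs cache mounts).
        ["umount", "-a", "-F", fstab],
        # 3. Unmount the jail's devfs.
        ["umount", devfs_dir],
    ]


def with_jail_teardown(
    build: Callable[[], T],
    jail_name: str,
    fstab: str,
    devfs_dir: str,
    run: Runner = run_ignoring_failure,
) -> T:
    """Run `build`, then tear down on every exit path (success or exception)."""
    try:
        return build()
    finally:
        for argv in teardown_commands(jail_name, fstab, devfs_dir):
            run(argv)  # non-zero exit statuses are deliberately ignored
```

Putting the teardown in a `finally` block is what makes it unconditional: the same three commands run whether the build returns normally or raises, and each one tolerates the "nothing left to do" case by ignoring its exit status.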