
run GL teardown on the refresh thread that owns the context (fixes segfault on power-off / window close) #44

Merged: techomancer merged 2 commits into techomancer:main from tenox7:fix-gl-teardown-thread on Jun 22, 2026

Conversation

@tenox7 tenox7 commented Jun 22, 2026

Contributor

Problem

iris segfaults on macOS when the guest powers off (IRIX halt), and on the same code path when the window is closed. The crashing thread is machine-events, in glDeleteTextures:

machine-events  glDeleteTextures + 16                       libGL.dylib
                GlCompositor::destroy                        gl_compositor.rs:545
                GlRenderer::stop                             ui.rs:502
                Rex3::stop                                   rex3.rs
                Machine::stop                                machine.rs:620

EXC_BAD_ACCESS (SIGSEGV), KERN_INVALID_ADDRESS at 0x1e0 — i.e. a dereference of a null current-context plus a small field offset, not a freed/garbage pointer.

Root cause

The GL context is created and made current lazily by GlRenderer::init_gl, which only ever runs from present() — and present() is only called from Rex3::refresh_loop (the REX3-Refresh thread). That thread is the sole owner of the context; the unsafe impl Send for GlRenderer even documents this ("sent to the refresh thread where it owns and uses the GL context. No other thread touches these fields").

Rex3::stop() joined (terminated) the refresh thread and then called renderer.stop() → Compositor::destroy(&state.gl) → glDeleteTextures from whatever thread invoked stop(): machine-events on guest power-off, the main thread on window close (main.rs after the winit loop returns). On macOS, GL dispatches through the calling thread's current context; with none current, the call faults.
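The thread-affinity problem can be illustrated without a real GL driver. The sketch below is a simulation only: `CURRENT_CONTEXT`, `make_current`, and `delete_textures` are hypothetical stand-ins (not iris or OpenGL APIs) that model a per-thread "current context" slot. Where the real driver dereferences a null context and faults, this model returns an error instead.

```rust
use std::cell::Cell;
use std::thread;

// Stand-in for the driver's per-thread "current context" slot (assumption:
// modelled as a thread-local holding an opaque context id).
thread_local! {
    static CURRENT_CONTEXT: Cell<Option<u32>> = Cell::new(None);
}

/// Hypothetical stand-in for making a GL context current on the calling thread.
fn make_current(ctx: u32) {
    CURRENT_CONTEXT.with(|c| c.set(Some(ctx)));
}

/// Hypothetical stand-in for glDeleteTextures: it only works if the calling
/// thread has a current context. The real call faults; this one returns Err.
fn delete_textures() -> Result<u32, &'static str> {
    CURRENT_CONTEXT
        .with(|c| c.get())
        .ok_or("no current context on this thread")
}

/// Runs the same teardown call on the owning thread and on a foreign thread.
fn owner_and_foreign_results() -> (Result<u32, &'static str>, Result<u32, &'static str>) {
    // Owning thread: makes the context current, then tears down.
    let owner = thread::spawn(|| {
        make_current(7);
        delete_textures()
    })
    .join()
    .unwrap();

    // Foreign thread: never made a context current, so teardown cannot work.
    let foreign = thread::spawn(delete_textures).join().unwrap();

    (owner, foreign)
}
```

In this model the owning thread's call succeeds and the foreign thread's call fails, which mirrors the machine-events and main-thread teardown paths described above.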

This was a broad latent bug, not specific to power-off. Headless/CI paths escaped only because state == None makes the teardown a silent no-op.

Fix

Tear the renderer down at the end of refresh_loop (on the thread that owns the context, while it is still current) and remove the renderer.stop() call from Rex3::stop(). Ordering is preserved: Rex3::stop() still joins the refresh thread, so teardown completes before stop() returns. Because Machine::stop stops the CPU first and the processor thread is joined before the refresh thread, the GFIFO is fully drained, the fence-wait spin falls through, and the loop observes running == false, exits, and runs teardown. After reset, restart_peripherals() → rex3.start() respawns the refresh thread and present() re-inits GL lazily, so there is no black screen.
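The shape of this fix can be sketched as follows. This is not the code in src/rex3.rs: `Renderer` and `Refresher` are hypothetical stand-ins, and the renderer only records thread ids instead of touching GL. It shows teardown running at the end of the loop on the thread that did the lazy init, with stop() only signalling and joining.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle, ThreadId};

/// Hypothetical renderer: remembers which thread lazily initialised it.
#[derive(Default)]
struct Renderer {
    init_thread: Option<ThreadId>,
}

impl Renderer {
    /// Lazy init on first present, as the description says init_gl does.
    fn present(&mut self) {
        if self.init_thread.is_none() {
            self.init_thread = Some(thread::current().id());
        }
    }

    /// Teardown; returns true if it ran on the thread that did the init.
    fn stop(&mut self) -> bool {
        let on_owner = self.init_thread == Some(thread::current().id());
        self.init_thread = None;
        on_owner
    }
}

/// Hypothetical owner of the refresh thread.
struct Refresher {
    running: Arc<AtomicBool>,
    handle: Option<JoinHandle<bool>>,
}

impl Refresher {
    fn start() -> Self {
        let running = Arc::new(AtomicBool::new(true));
        let flag = Arc::clone(&running);
        let handle = thread::spawn(move || {
            let mut renderer = Renderer::default();
            loop {
                renderer.present();
                if !flag.load(Ordering::Acquire) {
                    break;
                }
                thread::yield_now();
            }
            // Teardown happens here, on the thread that owns the context.
            renderer.stop()
        });
        Self { running, handle: Some(handle) }
    }

    /// Signals the loop to exit and joins; teardown is done once this returns.
    fn stop(&mut self) -> Option<bool> {
        self.running.store(false, Ordering::Release);
        self.handle
            .take()
            .map(|h| h.join().expect("refresh thread panicked"))
    }
}
```

Because stop() joins the thread, the caller still sees teardown as complete when stop() returns, which is the ordering guarantee the description relies on. A second stop() finds no handle and does nothing.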

Known related issue (not addressed here)

disp compositor gl|sw → GlRenderer::switch_compositor destroys GL resources directly on the monitor thread (ui.rs): same foreign-thread fault, rarely triggered. A proper fix routes the compositor swap through the refresh thread; left out of scope to keep this change focused.
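One possible way to route the swap through the refresh thread is a command channel. The sketch below is an assumption about how that could look, not a design taken from this PR: `Command`, `Compositor`, and `spawn_refresh_thread` are hypothetical names, and the thread blocks on the channel where a real refresh loop would presumably poll it between frames.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

#[derive(Debug, Clone, Copy, PartialEq)]
enum Compositor {
    Gl,
    Sw,
}

/// Hypothetical requests sent from other threads (e.g. the monitor thread).
enum Command {
    Switch(Compositor),
    Shutdown,
}

/// Spawns a stand-in refresh thread that applies compositor switches itself.
/// Returns the sender for requests and a handle yielding the switch history.
fn spawn_refresh_thread() -> (Sender<Command>, JoinHandle<Vec<Compositor>>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut active = Compositor::Gl;
        let mut history = vec![active];
        // Blocking receive for simplicity; a frame loop would use try_recv.
        for cmd in rx {
            match cmd {
                Command::Switch(next) if next != active => {
                    // The old compositor's GL resources would be destroyed
                    // here, on the thread that owns the context.
                    active = next;
                    history.push(active);
                }
                Command::Switch(_) => {} // already active, nothing to do
                Command::Shutdown => break,
            }
        }
        history
    });
    (tx, handle)
}
```

With this shape the monitor thread only sends `Command::Switch(...)` and never touches GL state directly.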

Notes / testing

  • Builds clean with --features lightning,rex-jit,tlbvmap (only pre-existing warnings).
  • Reasoned fix; not runtime-verified on my end. To confirm: boot IRIX and halt (the soft power-off that crashed), and separately close the window — both hit rex3.stop() and both are fixed.
  • Adds a short rules/gui/ note documenting the "all GL calls belong to the refresh thread" invariant, per the repo's rules/ convention.
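The "all GL calls belong to the refresh thread" invariant could also be checked at runtime. The sketch below is a suggestion only and is not part of this change: `GlThreadGuard` is a hypothetical helper that records the owning thread's id and reports any call made from a different thread.

```rust
use std::thread::{self, ThreadId};

/// Hypothetical guard that remembers which thread owns the GL context.
struct GlThreadGuard {
    owner: ThreadId,
}

impl GlThreadGuard {
    /// Call on the thread that creates and makes the context current.
    fn bind_to_current_thread() -> Self {
        Self { owner: thread::current().id() }
    }

    /// True when the calling thread is the one that owns the context.
    fn is_owner(&self) -> bool {
        thread::current().id() == self.owner
    }

    /// Intended to run before any GL call; reports a foreign-thread call
    /// as an error instead of letting it reach the driver.
    fn check(&self) -> Result<(), String> {
        if self.is_owner() {
            Ok(())
        } else {
            Err(format!(
                "GL call from {:?}, but the context is owned by {:?}",
                thread::current().id(),
                self.owner
            ))
        }
    }
}
```

In a debug build, a `debug_assert!(guard.is_owner())` at the top of the teardown path would turn a foreign-thread call into an immediate, readable panic rather than a fault inside the GL library.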

Copilot AI review requested due to automatic review settings June 22, 2026 01:03

Copilot AI left a comment

Contributor


Pull request overview

This PR fixes a macOS crash during guest power-off (halt) and window close by ensuring OpenGL teardown (e.g., glDeleteTextures via compositor destruction) runs on the REX3-Refresh thread that owns the GL context, rather than on whichever thread calls Rex3::stop().

Changes:

  • Move renderer.stop() into Rex3::refresh_loop() so GL resource destruction happens on the context-owning refresh thread.
  • Remove the foreign-thread renderer.stop() call from Rex3::stop().
  • Add a rules/gui/ document codifying the “all GL calls belong to the refresh thread” invariant.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/rex3.rs | Relocates renderer teardown to the refresh thread after the loop exits; removes teardown from stop() to avoid foreign-thread GL calls. |
| rules/gui/gl-teardown-must-run-on-the-refresh-thread.md | Documents the thread-affinity rule for GL calls and explains the original crash mechanism and fix. |


Comment thread rules/gui/gl-teardown-must-run-on-the-refresh-thread.md Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@techomancer

Owner

yup makes sense. gl context is bound to thread.

@techomancer techomancer merged commit 5db94c0 into techomancer:main Jun 22, 2026
1 check passed

3 participants