Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 54 additions & 0 deletions rules/gui/gl-teardown-must-run-on-the-refresh-thread.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
# GL teardown must run on the REX3-Refresh thread

**Keywords:** opengl,gl,glow,glutin,glDeleteTextures,thread,context,make_current,refresh,poweroff,shutdown,crash,segfault,compositor,destroy,rex3,renderer
**Category:** gui

## All OpenGL calls — including teardown — belong to the thread that owns the context

The GL context is created and made current lazily by `GlRenderer::init_gl`, which
runs from `present()` — and `present()` is only ever called from `Rex3::refresh_loop`
(the **REX3-Refresh** thread). That thread is the sole owner of the context; this is
documented by the `unsafe impl Send for GlRenderer` comment ("sent to the refresh
thread where it owns and uses the GL context. No other thread touches these fields").

On macOS, GL functions dispatch through the **calling thread's** current context.
A GL call from a thread with no current context dereferences a null current-context
struct and faults at a small offset (the observed crash was `glDeleteTextures` with
`KERN_INVALID_ADDRESS at 0x1e0` — a null + field offset, not a freed pointer).

## The bug this rule encodes

`Rex3::stop()` used to:
1. set `running = false`, join the REX3-Processor thread,
2. **join the REX3-Refresh thread** (the context owner is now gone), then
3. call `renderer.stop()` → `Compositor::destroy(&state.gl)` → `glDeleteTextures`
**on whatever thread called `Rex3::stop()`** — never the refresh thread.

`Rex3::stop()` is reached from many non-owning threads: the `machine-events` thread
(guest soft power-off → `MachineEvent::PowerOff` → `Machine::stop`), the **main**
thread (`main.rs` after the winit loop returns on window close), and the monitor/CI
thread (`reset`, snapshot load/save). The GUI power-off and window-close paths have a
live context and crash; the headless/CI paths have `state == None` so teardown is a
silent no-op, which is why it went unnoticed.

## The fix

Tear the renderer down at the **end of `refresh_loop`**, right after the
`while self.running` loop exits — i.e. on the owning thread, while the context is
still current — and drop the `renderer.stop()` call from `Rex3::stop()`. Ordering is
preserved because `Rex3::stop()` joins the refresh thread, so teardown still finishes
before `stop()` returns. After a `reset`, `restart_peripherals()` calls
`rex3.start()`, which respawns the refresh thread; `present()` then re-inits GL lazily.

## Still open (same root cause, different trigger)

`disp compositor <gl|sw>` → `GlRenderer::switch_compositor` (ui.rs) calls
`compositor.destroy(&state.gl)` directly on the **monitor** thread. Same foreign-thread
GL teardown, same fault; rarely triggered. A proper fix routes the compositor swap
through the refresh thread (e.g. a pending-request flag the loop services).

## Rule of thumb

Never call any `gl.*` / `glow` method (create, upload, draw, or delete) from outside
`refresh_loop`/`present`. If another thread needs GL work done, signal the refresh
thread and let it do it.
8 changes: 4 additions & 4 deletions src/rex3.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3625,6 +3625,10 @@ impl Rex3 {
thread::sleep(frame_duration - elapsed);
}
}

if let Some(renderer) = self.renderer.lock().as_mut() {
renderer.stop();
}
}

/// Returns `true` if the register offset was recognized and updated state.
Expand Down Expand Up @@ -3897,10 +3901,6 @@ impl Device for Rex3 {
#[cfg(feature = "developer")]
eprintln!("REX3: GFIFO high-watermark = {} / {} entries",
self.gfifo_hwm.load(Ordering::Relaxed), GFIFO_DEPTH);

if let Some(renderer) = self.renderer.lock().as_mut() {
renderer.stop();
}
}

fn start(&self) {
Expand Down
Loading