Move tables 1.6 crash safe resume#1708
Open
zacharysierakowski wants to merge 24 commits into
Open
Conversation
| if !migrationContext.UseGTIDs { | ||
| log.Infof("--move-tables requires GTID coordinates for cutover drain checks; enabling --gtid") | ||
| migrationContext.UseGTIDs = true | ||
| } |
Author
There was a problem hiding this comment.
I chose to not fail the request without these, and just enable them. Could also change it to a Fatal and force the operator to rerun with the required flags. But that feels kinda silly if they are always required.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR makes --move-tables crash-resume safe across the cooperative cutover by persisting additional cutover state to the checkpoint table (cutover-started + drain GTID), adding logic to resume mid-cutover, and ensuring move-tables runs with checkpointing + GTID coordinates enabled.
Changes:
- Extend checkpoint schema + insert query builder to include move-tables cutover resume fields, and read/write them from the target-side checkpoint table in move-tables mode.
- Add migrator logic to persist cutover checkpoints, drain to a captured GTID with improved “non-DML tail” handling, and resume cutover directly from checkpoint state.
- Update local move-tables scripts/docs to exercise checkpointing and provide better local test flows.
Show a summary per file
| File | Description |
|---|---|
script/move-tables/teardown |
Adds additional Docker cleanup steps after removing containers. |
script/move-tables/README.md |
Updates local testing steps/command to include checkpointing and validation workflow. |
script/move-tables/insert-source-primary-loop |
Adds batched insert support for higher write rates during local testing. |
go/sql/builder.go |
Extends checkpoint insert builder to optionally include move-tables cutover columns. |
go/sql/builder_test.go |
Adds coverage for move-tables checkpoint insert query shape. |
go/logic/migrator.go |
Implements move-tables cutover checkpoint persistence, drain logic, and resume-from-cutover-checkpoint flow. |
go/logic/migrator_move_tables_cutover_test.go |
Adds unit test for resuming move-tables cutover when already drained. |
go/logic/checkpoint.go |
Extends Checkpoint struct with move-tables cutover resume fields. |
go/logic/applier.go |
Routes checkpoint storage to the target DB in move-tables mode and reads/writes new move-tables cutover fields. |
go/logic/applier_test.go |
Adds integration tests for move-tables checkpoint persistence and cutover-checkpoint selection. |
go/cmd/gh-ost/main.go |
Automatically enables --checkpoint and --gtid when running --move-tables. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 11/11 changed files
- Comments generated: 6
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
danieljoos
approved these changes
Jun 19, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
closes https://github.com/github/database-infrastructure/issues/8210
A Pull Request should be associated with an Issue.
Related issue (internal): https://github.com/github/database-infrastructure/issues/8210
Related issue (public): #1681
Description
This PR updates move-tables mode to be crash-resume safe. It manages the
checkpointtable life, and a backgrounded checkpoint process as well as introduces new checkpoint data specifically for the move-tables cutover. Per the docs on crash safe resume across cutover, we now store that the move-tables cutover has started (a bool) and the move-tables cutover drain GTID. We use these two attributes for--move-tables --resumeto pickup if the process failed mid-cutover. It's safe for failure between T1-T3 (the rename and the drain) and also safe for failure post drain, pre hook call (between T3-T5). I simple--resumewill pick back up in the cutover protocol and ensure the rename isn't re-run, the drain is complete, and the complete hook is fired.script/cibuildreturns with no formatting errors, build errors or unit test errors.Demo
ghost-crash-resume-cutover.mp4