chore: automate releases#3833
Conversation
Overhauls the release pipeline into a fully automated, robust, and modular architecture: - Created utils.py, git.py, and gh.py to encapsulate shell executions, git helpers, and GitHub CLI wrappers under clean, prefix-free module namespaces. - Overhauled release.py to use these modules, return clean exit codes, and support dynamic RC tagging and checklist gating. - Overhauled cmd_prepare to make the preparation pipeline fully unconditional (cuts branch, commits, pushes, opens PR, and creates/updates tracking issue). - Integrated news processing directly into cmd_process_backports to merge and amend changelogs per backport. - Created GitHub Actions workflows to automate all release phases (prepare, branch cut, backports, RC tagging, and promotion). - Added comprehensive unit test suites in release_test.py mocking the new module-level namespace helpers.
There was a problem hiding this comment.
Code Review
This pull request introduces a code-centric, reactive release automation architecture for rules_python by splitting the release logic into dedicated helper modules (gh.py, git.py, utils.py) and expanding release.py with several subcommands to handle preparation, branch cutting, backport processing, and release candidate tagging. Feedback on the changes highlights several critical issues, including the need to use git status --porcelain and .splitlines() to properly parse git output line-by-line rather than character-by-character. Additionally, the reviewer noted a Python compatibility issue with Exception.add_note on older Python versions, unsafe dictionary access on PR merge commits, potential failures when staging a non-existent news/ directory, loose regex patterns for parsing versions from issue titles, and potential regex failures due to Windows-style line endings.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| def status(): | ||
| """Returns the output of git status.""" | ||
| return run_cmd("git", "status") |
There was a problem hiding this comment.
Running git status returns a verbose, human-readable string even when the workspace is completely clean (e.g., "nothing to commit, working tree clean"). This causes the dirty check in cmd_prepare to always fail and report that local edits were detected. Using git status --porcelain is machine-readable and returns an empty string when the workspace is clean.
| def status(): | |
| """Returns the output of git status.""" | |
| return run_cmd("git", "status") | |
| def status(): | |
| """Returns the output of git status --porcelain.""" | |
| return run_cmd("git", "status", "--porcelain") |
| status = git.status() | ||
| if status: | ||
| print( | ||
| "Error: Local edits detected. Workspace must be completely clean" | ||
| " before running release preparation." | ||
| ) | ||
| for line in status: | ||
| print(f" {line}") | ||
| return 1 |
There was a problem hiding this comment.
Since git.status() returns a single string containing the command output, iterating over status directly will iterate character-by-character rather than line-by-line. This causes the error message to print single characters on separate lines. Use .splitlines() to iterate over the lines of the status output.
| status = git.status() | |
| if status: | |
| print( | |
| "Error: Local edits detected. Workspace must be completely clean" | |
| " before running release preparation." | |
| ) | |
| for line in status: | |
| print(f" {line}") | |
| return 1 | |
| status = git.status() | |
| if status: | |
| print( | |
| "Error: Local edits detected. Workspace must be completely clean" | |
| " before running release preparation." | |
| ) | |
| for line in status.splitlines(): | |
| print(f" {line}") | |
| return 1 |
| for line in modified_files: | ||
| file_path = line.strip().split()[-1] | ||
| git.add(file_path) |
There was a problem hiding this comment.
Since modified_files is a single string containing the command output, iterating over it directly will iterate character-by-character. This causes line.strip().split()[-1] to raise an IndexError or return single characters, resulting in incorrect git.add calls. Use .splitlines() to iterate over the lines of the status output.
| for line in modified_files: | |
| file_path = line.strip().split()[-1] | |
| git.add(file_path) | |
| for line in modified_files.splitlines(): | |
| file_path = line.strip().split()[-1] | |
| git.add(file_path) |
| except subprocess.CalledProcessError as e: | ||
| note = f"Error running command: {' '.join(cmd)}" | ||
| if capture_output: | ||
| note += f"\nStdout: {e.stdout}\nStderr: {e.stderr}" | ||
| e.add_note(note) | ||
| raise |
There was a problem hiding this comment.
The add_note method on exceptions was introduced in Python 3.11 (PEP 678). Since rules_python supports older Python versions (such as 3.8, 3.9, and 3.10), calling e.add_note will raise an AttributeError and crash the release tool. To maintain compatibility with older Python versions, check if the exception has the add_note attribute before calling it, or print the note to sys.stderr as a fallback.
| except subprocess.CalledProcessError as e: | |
| note = f"Error running command: {' '.join(cmd)}" | |
| if capture_output: | |
| note += f"\nStdout: {e.stdout}\nStderr: {e.stderr}" | |
| e.add_note(note) | |
| raise | |
| except subprocess.CalledProcessError as e: | |
| note = f"Error running command: {' '.join(cmd)}" | |
| if capture_output: | |
| note += f"\nStdout: {e.stdout}\nStderr: {e.stderr}" | |
| if hasattr(e, "add_note"): | |
| e.add_note(note) | |
| else: | |
| import sys | |
| print(note, file=sys.stderr) | |
| raise |
| else: | ||
| item["commit"] = pr_info["mergeCommit"]["oid"] |
There was a problem hiding this comment.
Directly accessing pr_info["mergeCommit"]["oid"] is unsafe because mergeCommit can be null or missing from the JSON response in certain edge cases. This would raise a TypeError and crash the script. Use defensive dictionary access to handle missing or null mergeCommit fields gracefully.
| else: | |
| item["commit"] = pr_info["mergeCommit"]["oid"] | |
| else: | |
| merge_commit = pr_info.get("mergeCommit") | |
| if merge_commit and "oid" in merge_commit: | |
| item["commit"] = merge_commit["oid"] | |
| else: | |
| print(f"PR #{pr_num} has no merge commit SHA. Gating.") | |
| item["status"] = "unmerged-pr" |
| # Stage changelog changes and news/ deletions | ||
| git.add("CHANGELOG.md", "news/") |
There was a problem hiding this comment.
If a backported PR does not contain any news files, the news/ directory might be empty or non-existent on disk. In such cases, running git add news/ will fail with fatal: pathspec 'news/' did not match any files, which aborts the entire backport process. To prevent this, only stage the news/ directory if it exists on disk.
| # Stage changelog changes and news/ deletions | |
| git.add("CHANGELOG.md", "news/") | |
| # Stage changelog changes and news/ deletions | |
| git.add("CHANGELOG.md") | |
| if os.path.exists("news"): | |
| git.add("news") |
| issue_title = gh.get_issue_title(args.issue) | ||
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | ||
| if not version_match: | ||
| print(f"Error: Could not parse version from issue title: {issue_title}") | ||
| return 1 | ||
|
|
||
| version = version_match.group(0).replace("v", "") |
There was a problem hiding this comment.
The current regex [0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)? is loose and can match any random numbers in the issue title. It is safer and more robust to specifically match the Release <version> pattern and extract the version using a capture group, which also avoids the redundant .replace("v", "") call.
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(0).replace("v", "") | |
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"Release\s+v?(\d+\.\d+\.\d+(?:-rc\d+)?)", issue_title, re.IGNORECASE) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(1) |
| issue_title = gh.get_issue_title(args.issue) | ||
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | ||
| if not version_match: | ||
| print(f"Error: Could not parse version from issue title: {issue_title}") | ||
| return 1 | ||
|
|
||
| version = version_match.group(0).replace("v", "") |
There was a problem hiding this comment.
The current regex [0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)? is loose and can match any random numbers in the issue title. It is safer and more robust to specifically match the Release <version> pattern and extract the version using a capture group, which also avoids the redundant .replace("v", "") call.
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(0).replace("v", "") | |
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"Release\s+v?(\d+\.\d+\.\d+(?:-rc\d+)?)", issue_title, re.IGNORECASE) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(1) |
| issue_title = gh.get_issue_title(args.issue) | ||
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | ||
| if not version_match: | ||
| print(f"Error: Could not parse version from issue title: {issue_title}") | ||
| return 1 | ||
|
|
||
| version = version_match.group(0).replace("v", "") |
There was a problem hiding this comment.
The current regex [0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)? is loose and can match any random numbers in the issue title. It is safer and more robust to specifically match the Release <version> pattern and extract the version using a capture group, which also avoids the redundant .replace("v", "") call.
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?", issue_title) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(0).replace("v", "") | |
| issue_title = gh.get_issue_title(args.issue) | |
| version_match = re.search(r"Release\s+v?(\d+\.\d+\.\d+(?:-rc\d+)?)", issue_title, re.IGNORECASE) | |
| if not version_match: | |
| print(f"Error: Could not parse version from issue title: {issue_title}") | |
| return 1 | |
| version = version_match.group(1) |
| def parse_backports(body): | ||
| """Parses the ## Backports checklist section.""" | ||
| match = re.search( | ||
| r"## Backports\n(.*?)(?=\n##|\n---|\Z)", body, re.DOTALL | re.IGNORECASE | ||
| ) |
There was a problem hiding this comment.
If the issue body contains Windows-style line endings (\r\n), the regex pattern ## Backports\n will fail to match because of the carriage return (\r). Normalizing the newlines to \n at the beginning of the function ensures robust parsing across all platforms and editors.
def parse_backports(body):
"""Parses the ## Backports checklist section."""
body = body.replace("\r\n", "\n")
match = re.search(
r"## Backports\n(.*?)(?=\n##|\n---|\Z)", body, re.DOTALL | re.IGNORECASE
)
This introduces workflows and tools to automate releases. The basic
idea it to have an issue with tasks, and workflows invoke tools that
parse and update the tasks in the issue.
The basic release steps are:
Each step has a workflow. For now, most of the workflows require a
manual trigger, to help prevent workflows from running wild as we
flush out bugs and edge cases.
The workflows almost entirely delegate to the
releasetool. There'sbasically one command per workflow. This makes it easy to encode
specialized logic and run it locally for testing or reproduction.