Add GitHub-issue-driven test framework (+ src/ library carve-out) by gaurav · Pull Request #67 · TranslatorSRI/babel-validation

gaurav · 2026-01-08T23:59:16Z

Introduces a GitHub-issue-driven test framework for babel-validation, alongside a library carve-out and supporting tooling.

Major pieces

Library carve-out — shared code moves from tests/common/ to an installable src/babel_validation/ package (core, services, sources, assertions, tools). The project now builds as a hatchling wheel; existing from src.babel_validation.X import Y imports keep working.
Assertion framework — 8 AssertionHandler types (Resolves, DoesNotResolve, ResolvesWith, DoesNotResolveWith, HasLabel, ResolvesWithType, SearchByName, Needed) with an auto-generated assertions/README.md and a drift test that fails CI if it goes stale.
GitHub integration (PyGitHub) — parses {{BabelTest|...}} wiki lines and ```yaml babel_tests: blocks from issue bodies. Open issues are expected to fail (strict xfail, so XPASS flags a closeable issue); closed issues are expected to pass.
CSV→YAML CLI (csv-to-babeltests) for bulk-authoring assertion blocks, reusing the same handlers and YAML schema as the issue parser.
CI workflow running pytest -m unit on PRs.
xdist parallelism with FileLock-coordinated CSV/issue caches; the controller refreshes caches, workers share them.

Review fixes (this PR)

Addressing the code review on the original revision:

pytest -m unit now runs fully offline. Both the GitHub issue tests and the Google Sheet / blocklist tests previously fetched from the network at collection time, because their parametrization is built before pytest applies -m deselection. The marker expression is now evaluated up front (shared tests/_pytest_helpers.deselected_by_markexpr), and the GitHub fetch and Google Sheet downloads are skipped when the test would be deselected. Net effect: pytest -m unit drops from ~2.5s + network to ~0.3s offline. Row IDs (row=N) and xfail marks are unchanged, so -k 'row=42' still works.
ResolvesWith / DoesNotResolveWith now enforce their documented 2+ CURIE arity. A single-CURIE param_set previously passed (or failed) vacuously; it now fails loudly with a clear message, matching HasLabel/SearchByName. Adds unit coverage.
SearchByName top-N is read from targets.ini (NameResXFailIfInTop) instead of being hardcoded to 5, so it's configurable per environment.
CachedNodeNorm.normalize_curie uses .get() to avoid a KeyError if NodeNorm ever omits a requested CURIE, consistent with the rest of the code.

Unrelated changes riding along in this PR

These are intentional and correct, but not part of the GitHub-issue test framework itself:

pytest bumped 8.4.2 → 9.0.2 (plus new deps: pygithub, pytest-xdist[psutil], pytest-subtests, click, pyyaml, tqdm, filelock, python-dotenv).
tests/pytest.ini removed; pytest config (testpaths, timeout, markers) moves into pyproject.toml [tool.pytest.ini_options].
tests/targets.ini [ci]/[ci-es] restructure: both CI targets now point NodeNorm at nodenorm-es.ci.transltr.io (the old biothings.ci.transltr.io/nodenorm/ URL is gone); [ci] pairs the ES NodeNorm with the non-ES NameRes, [ci-es] with the ES NameRes. A Repositories list (scanned for embedded assertions) is added to [DEFAULT].
tests/nameres/test_nameres_from_gsheet.py: the biolink_type request parameter is now sent as a list ([biolink_class]) instead of a bare string, matching the current NameRes API.

Follow-up: library carve-out (stacked PRs)

This PR is being kept as the base for two smaller, stacked follow-up PRs that make the GitHub-issue test framework usable as an installable library (so the Babel pipeline and Babel Explorer can consume it). They branch off this one rather than main, since the library code only exists here:

Library packaging: import as babel_validation + split runtime/dev deps #95 — Library packaging (base: add-github-issue-tests): rename the import package src.babel_validation → babel_validation, ship src/babel_validation as the top-level wheel package, and split runtime vs dev dependencies (PyGitHub stays core). No behavior change.
Library API: Assertion object model + run_issue_tests reporting API #96 — Library API (base: library-packaging, stacked on Library packaging: import as babel_validation + split runtime/dev deps #95): turn GitHubIssueTest into a plain Assertion object, and add a pytest-independent run_issue_tests() reporting API (IssueReport/ResultRecord) with the pytest suite rewired as a thin adapter.

Recommended review/merge order: this PR (#67) → #95 → #96. Out-of-scope library-usability improvements surfaced along the way are tracked in #97, #98, and #99.

Each assertion type (Resolves, DoesNotResolve, ResolvesWith, ResolvesWithType, SearchByName, Needed) is now a self-contained class in src/babel_validation/assertions/, grouped by which service it targets (nodenorm.py, nameres.py, common.py). A central ASSERTION_HANDLERS registry in __init__.py maps lowercase assertion names to handler instances, and NodeNormAssertion / NameResAssertion marker base classes allow isinstance() checks for applicability. GitHubIssueTest.test_with_nodenorm() and test_with_nameres() are now 3-line dispatchers. A README.md documents all supported assertion types with examples in both wiki and YAML syntax. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…pendently. Bumps pytest to >=9.0 (which includes built-in subtests support). Each TestResult is now evaluated in its own subtest block, so a failure no longer short-circuits the rest. Adds post-loop state-consistency subtests: a closed issue with failing tests fails with a "consider reopening" message, and an open issue where all tests pass emits an xfail "consider closing" hint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Previously, get_github_issue_id() and GitHubIssueTest.__str__() both resolved the org/repo name via github_issue.repository.organization.name, which triggers lazy PyGitHub API calls. Parse org/repo from html_url instead (always present in the issue JSON, no extra round-trip needed). Also resolves the TODO comment in get_test_issues_from_issue(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The GitHub issue tests always read Repositories from [DEFAULT]; per-environment overrides are not implemented. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 28 out of 41 changed files in this pull request and generated 3 comments.

+def pytest_generate_tests(metafunc):
+    if "github_issue_id" not in metafunc.fixturenames:
+        return
+    issue_id_filter = metafunc.config.getoption("issue", default=[])
+    if issue_id_filter:
+        try:
+            issues = _get_github_issues_test_cases().get_issues_by_ids(issue_id_filter)
+        except GithubException as e:
+            if e.status == 401:
+                _record_auth_error(e)
+                metafunc.parametrize("github_issue_id", [_AUTH_ERROR_ID], ids=[_AUTH_ERROR_ID])
+                return
+            raise
+        ids = [_issue_id(i) for i in issues]
+        for issue, id_ in zip(issues, ids):
+            _fetched_issues_cache[id_] = issue
+    else:
+        ids = _get_all_test_issue_ids()
+    metafunc.parametrize("github_issue_id", ids, ids=ids)


+                # Parse string as YAML.
+                yaml_dict = yaml.safe_load(match.removeprefix("```yaml").removesuffix("```"))
+
+                babel_tests = yaml_dict.get('babel_tests') if isinstance(yaml_dict, dict) else None
+                if babel_tests is None:
+                    raise ValueError(
+                        f"YAML block in issue {github_issue_id} matched the detection pattern "
+                        f"but contains no 'babel_tests' top-level key: {match!r}"
+                    )


+    def test_with_nodenorm(self, param_sets, nodenorm, label=""):
+        yield self.failed("Test needed for issue")


pytest_generate_tests for test_github_issue fetches issues from GitHub at collection time. pytest applies -m marker deselection only *after* generation, so running 'pytest -m unit' (as CI does) would still hit the GitHub search API across every configured repo before deselecting the test. Evaluate the marker expression ourselves against the test's own markers and, when it would deselect this test, parametrize an empty set instead of fetching. This keeps unit-only runs offline. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Both assertions document 'Two or more CURIEs per param_set' but never checked. A single-CURIE ResolvesWith passed vacuously (canonical id == itself) while a single-CURIE DoesNotResolveWith always failed (one canonical id) — both silent authoring traps. Fail loudly with a clear message instead, matching HasLabel and SearchByName. Adds unit coverage for the too-few-params case. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

normalize_curie indexed the result dict directly while every other code path uses .get(). NodeNorm normally echoes every requested CURIE (null when unresolvable), but switch to .get() so an omitted key returns None rather than raising. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The NameRes 'found in top N' threshold was hardcoded to the handler default of 5, ignoring NameResXFailIfInTop in targets.ini. Read it from target_info so the threshold is configurable per environment. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Move the marker-expression check out of the github_issues conftest into tests/_pytest_helpers.py so the Google Sheet test modules can reuse it to defer their own network-backed parametrization. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The gsheet and blocklist test modules instantiated GoogleSheetTestCases / load_blocklist_from_gsheet at import time and parametrized via a module-level @pytest.mark.parametrize, so every collection — including 'pytest -m unit' — downloaded the sheets even though those tests aren't unit tests. Move parametrization into pytest_generate_tests with a lazy, cached fetch, and skip the fetch entirely when a -m filter would deselect the test. Row IDs (row=N) and xfail marks are unchanged, so -k 'row=42' still works. With this, 'pytest -m unit' runs fully offline: ~0.3s and no network, versus ~2.5s and a GitHub + Google Sheet round trip before. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

A `babel_tests:` value that is a list/scalar (not a mapping) previously crashed with an opaque AttributeError on `.items()`, and a non-string assertion key (e.g. `123:`) blew up later when `.lower()` was called. Both now raise a clear ValueError up front, with unit coverage. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Resolved-node message formatting indexed result['type'][0] in five places, which would raise IndexError if NodeNorm ever returned a node with an empty type list. Route all sites through a NodeNormTest.first_type() helper and use .get('label', '') for the same robustness. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Importing the tool registered the _FlowList representer on the global yaml.SafeDumper, changing YAML output process-wide. Move it onto a local _BabelTestDumper subclass; emitter output is unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gaurav · 2026-06-24T20:32:43Z

Recommended merge order: #67 → #95 → #96.

These are stacked PRs, each based on the previous branch (not main):

Add GitHub-issue-driven test framework (+ src/ library carve-out) #67 (add-github-issue-tests → main) — this PR.
Library packaging: import as babel_validation + split runtime/dev deps #95 (library-packaging → add-github-issue-tests) — library packaging / rename.
Library API: Assertion object model + run_issue_tests reporting API #96 (library-runner → library-packaging) — Assertion object model + run_issue_tests() API.

Merging in this order keeps each diff small and reviewable. Once #67 merges, GitHub will retarget #95 onto main automatically (and #96 onto library-packaging); review/merge #95 next, then #96. Out-of-scope follow-ups are tracked in #97, #98, #99, and #100.

gaurav · 2026-06-26T05:05:23Z

To make this reviewable, I've split it into a stack of smaller PRs (review/merge top-down):

Reorganize Babel Validation for more extensibility #101 — Library reorg: carve tests/common into src/babel_validation (base main)
Assertions framework for BabelTest expectations (2/4) #102 — Assertions framework (base Reorganize Babel Validation for more extensibility #101)
GitHub-issue test parsing + harness (3/4) #103 — GitHub-issue parsing + harness (base Assertions framework for BabelTest expectations (2/4) #102)
csv-to-babeltests CLI for generating BabelTests YAML (4/4) #104 — csv-to-babeltests CLI (base Assertions framework for BabelTest expectations (2/4) #102, independent of GitHub-issue test parsing + harness (3/4) #103)

Each was reconstructed from this branch's final tree (path-based for 1–3; #104 cherry-picks the CSV commits with authorship intact) and verified offline. Plan is to merge these into main, then rebase this branch — it should reduce to little/nothing. Keeping this PR open until then.

Record the non-obvious things: `-m unit` is the offline CI suite and network-backed modules defer fetches via deselected_by_markexpr; src/ is importable thanks to the hatch wheel packaging; black isn't strictly enforced; and commits sign through 1Password's SSH agent. Also fix the `-k "row=42"` example, which pytest's expression parser rejects. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The repo's `.gitignore` predates the Python rewrite — it only covered Scala/Giter8/IntelliJ artifacts, so `__pycache__/` and `*.pyc` showed up as untracked noise throughout the tree and were easy to `git add` by accident. Appends the standard [GitHub `Python.gitignore`](https://github.com/github/gitignore/blob/main/Python.gitignore) template (bytecode, `build/`/`dist/`, `.pytest_cache/`, `.venv`/`.env`, mypy/coverage caches, …) below the existing entries, plus a `.DS_Store` line for macOS. Existing entries are kept at the top, so this is purely additive. Independent of the #67 split stack (#101–#104) and based on `main`, so it can merge immediately. No files are currently tracked that the new patterns would retroactively ignore — `git ls-tree` shows no `.pyc`/`__pycache__` in the tree — so nothing needs `git rm --cached`. ### Verify ``` git check-ignore -v src/__pycache__/x.pyc .venv .env # all matched ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code)

Pure-ish refactor, no service-behavior change. Moves the code under `tests/common/` into an importable `src/babel_validation` package and updates the Google Sheet test modules to match. ### What changes - Split the monolithic `tests/common/google_sheet_test_cases.py` into: - `core/testrow.py` — `TestRow` / `TestStatus` / `TestResult` - `services/{nodenorm,nameres}.py` — `CachedNodeNorm` / `CachedNameRes` - `sources/google_sheets/{google_sheet_test_cases,blocklist}.py` - Gsheet test modules now import from the new locations and parametrize **lazily** in `pytest_generate_tests` via `tests/_pytest_helpers.deselected_by_markexpr`, so marker-deselected runs never hit the network. - Migrate `tests/pytest.ini` into `[tool.pytest.ini_options]`; add a hatchling build that packages `src/`; add `filelock` (gsheet disk cache). - Move `test_env.py` → `tests/test_environment/`. Two incidental one-liners ride along in the final test files (NameRes `biolink_type` passed as a list; description identifiers as lists not sets) — kept as-is so #67 rebases cleanly. ### Verify - `uv run pytest -m unit --collect-only -q` — full suite imports/collects offline (no network). - `uv run pytest tests/test_environment/test_env.py` — live Google Sheet download+parse through the moved code. 🤖 Generated with [Claude Code](https://claude.com/claude-code)

gaurav and others added 16 commits January 8, 2026 16:26

Moved TestRow into its own file.

06857d5

Added /.env to .gitignore.

a0325b6

Implemented basic GitHub issue testing (but it doesn't work at all).

0009dc0

Check for a GitHub Authentication Token.

f297735

Add support for --issue to specify one or more GitHub issues to check.

9b96675

Cleaned up some code.

97bb7a6

Sped up tests, made things work better.

ed8f1db

Fixed some issues.

f0342ce

It works! But it shouldn't. Cross that bridge later etc.

ebac7ab

Various improvements.

05b767e

Various improvements.

15b28d1

Got rid of xfail system (too much bother).

a6845a3

Incorporated issue state into outputs.

54001bb

Better?

150f4a6

Nope, still too messy.

92821bf

Added some NameRes tests.

9a4a22c

gaurav mentioned this pull request Jan 22, 2026

Add tests from GitHub issues TranslatorSRI/babel-explorer#2

Closed

gaurav and others added 10 commits February 15, 2026 02:39

Merge branch 'main' into add-github-issue-tests

2a610a4

Reorganized test files.

0848de4

Don't need __init__.py in top-level src directory.

127cca1

Merge branch 'main' into add-github-issue-tests

b7d2786

Moved NodeNorm/NameRes into own files.

b91609f

Moved sources into separate directories.

8a08e9a

Updated CLAUDE.md in line with updated code.

1f84aea

Claude suggested moving blocklist code into src/ so we did.

31b4436

Added blocklist.py where a lot of the blocklist code has moved.

6b60325

gaurav mentioned this pull request Feb 19, 2026

Add GitHub issue tests with per test reporting #71

Closed

gaurav and others added 2 commits February 19, 2026 19:03

Correct Repositories comment in targets.ini

af19a1e

The GitHub issue tests always read Repositories from [DEFAULT]; per-environment overrides are not implemented. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

gaurav requested a review from Copilot June 10, 2026 18:37

Copilot started reviewing on behalf of gaurav June 10, 2026 18:37 View session

Copilot AI reviewed Jun 10, 2026

View reviewed changes

gaurav and others added 4 commits June 10, 2026 17:14

gaurav changed the title ~~Add GitHub issue tests~~ Add GitHub-issue-driven test framework (+ src/ library carve-out) Jun 10, 2026

gaurav and others added 5 commits June 10, 2026 17:57

This was referenced Jun 26, 2026

Reorganize Babel Validation for more extensibility #101

Merged

Assertions framework for BabelTest expectations (2/4) #102

Open

GitHub-issue test parsing + harness (3/4) #103

Open

csv-to-babeltests CLI for generating BabelTests YAML (4/4) #104

Open

gaurav mentioned this pull request Jun 26, 2026

Add standard Python .gitignore entries #105

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add GitHub-issue-driven test framework (+ src/ library carve-out)#67

Add GitHub-issue-driven test framework (+ src/ library carve-out)#67
gaurav wants to merge 132 commits into
mainfrom
add-github-issue-tests

gaurav commented Jan 8, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

gaurav commented Jun 24, 2026

Uh oh!

gaurav commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		def test_with_nodenorm(self, param_sets, nodenorm, label=""):
		yield self.failed("Test needed for issue")

Uh oh!

Conversation

gaurav commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Major pieces

Review fixes (this PR)

Unrelated changes riding along in this PR

Follow-up: library carve-out (stacked PRs)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

gaurav commented Jun 24, 2026

Uh oh!

gaurav commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

gaurav commented Jan 8, 2026 •

edited

Loading