
fix: clone parquet-testing into parquet in verify-release-candidate.sh#1617

Open: adriangb wants to merge 1 commit into main from fix-verify-parquet-clone-path


adriangb commented Jun 28, 2026


Which issue does this PR close?

No separate issue. This is a one-line fix to the release verification script, found while manually running the release verification process for 54.0.0-rc2.

Rationale for this change

dev/release/verify-release-candidate.sh clones the parquet-testing repository into a directory named parquet-testing:

git clone https://github.com/apache/parquet-testing.git parquet-testing

However, the git submodule path declared in .gitmodules is parquet:

[submodule "parquet"]
path = parquet
url = https://github.com/apache/parquet-testing.git

and the parquet-based tests resolve their data relative to parquet/data/..., for example:

  • python/tests/test_io.py: read_parquet(path="parquet/data/alltypes_plain.parquet")
  • python/tests/test_store.py: file://{Path.cwd()}/parquet/data/alltypes_plain.parquet
  • python/tests/test_context.py: parquet/data/alltypes_plain.parquet

The arrow-testing clone on the line immediately above already correctly uses its submodule path (testing), so the parquet line is an inconsistency.
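
The mismatch described above can be checked mechanically. The following is a minimal Python sketch, not part of this PR: it parses .gitmodules-style text and reports any git clone line whose target directory differs from the declared submodule path. The function names, the inline sample data, and the arrow-testing URL and its .gitmodules entry are assumptions made for illustration; only the parquet entry and the parquet-testing clone line are taken from the description.

```python
import configparser
import shlex

# Sample inputs. The "testing" entry and the arrow-testing clone line are
# assumed for illustration; the "parquet" entry and the parquet-testing
# clone line are quoted from the PR description.
GITMODULES = """\
[submodule "testing"]
path = testing
url = https://github.com/apache/arrow-testing.git
[submodule "parquet"]
path = parquet
url = https://github.com/apache/parquet-testing.git
"""

CLONE_LINES = [
    "git clone https://github.com/apache/arrow-testing.git testing",
    "git clone https://github.com/apache/parquet-testing.git parquet-testing",
]


def submodule_paths(gitmodules_text):
    """Map each submodule URL to the path declared for it."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    # Real .gitmodules files usually indent keys; strip each line so that
    # every key parses as its own option.
    cleaned = "\n".join(line.strip() for line in gitmodules_text.splitlines())
    parser.read_string(cleaned)
    return {parser[name]["url"]: parser[name]["path"] for name in parser.sections()}


def mismatched_clones(clone_lines, gitmodules_text):
    """Return (url, clone_dir, submodule_path) for each clone that disagrees."""
    expected = submodule_paths(gitmodules_text)
    problems = []
    for line in clone_lines:
        parts = shlex.split(line)
        # Only handle the simple form: git clone <url> <dir>
        if len(parts) != 4 or parts[:2] != ["git", "clone"]:
            continue
        url, target = parts[2], parts[3]
        if url in expected and expected[url] != target:
            problems.append((url, target, expected[url]))
    return problems
```

With the sample data, only the parquet-testing clone is reported; changing its target directory to parquet makes the result empty.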

This is currently latent: the script's python3 -m pytest invocation is commented out (#TODO: we should really run tests here as well), so the wrong directory is never exercised today. Once the test run is enabled during release verification, the parquet-reading tests would fail with "No files found ... Cannot infer schema from an empty location", because nothing would exist under parquet/data/.... Fixing the clone path is therefore a prerequisite for enabling those tests.
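
If the test run is enabled later, a small pre-flight check could surface a wrong clone directory before pytest reports a less direct schema-inference error. This is a hedged sketch only: the helper name and the EXPECTED_DATA list are hypothetical, and the single path in it is the one quoted from the tests above.

```python
from pathlib import Path

# Relative data path quoted in the PR description; extend as needed.
EXPECTED_DATA = ["parquet/data/alltypes_plain.parquet"]


def missing_test_data(root, relative_paths=EXPECTED_DATA):
    """Return the expected data files that do not exist under root."""
    root = Path(root)
    return [rel for rel in relative_paths if not (root / rel).is_file()]
```

Called with the directory the tests run from (they resolve paths relative to the current working directory), a non-empty result would mean the parquet-testing checkout is not where the tests look for it.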

What changes are included in this PR?

  • Clone parquet-testing into parquet (instead of parquet-testing) so it matches the .gitmodules submodule path and the paths the test suite expects.

Are there any user-facing changes?

No. This only affects the release verification tooling.

🤖 Generated with Claude Code

https://claude.ai/code/session_01Pj5DVU7MaammM2nfHh1ZRG

…verify script

`verify-release-candidate.sh` clones the parquet-testing repository into a
`parquet-testing` directory, but the git submodule path declared in
`.gitmodules` is `parquet`, and every parquet-based test reads its data from
`parquet/data/...` (e.g. `python/tests/test_io.py`, `test_store.py`,
`test_context.py`). The arrow-testing clone one line above already correctly
uses the submodule path (`testing`).

This is currently latent because the script's `python3 -m pytest` invocation is
commented out, so the wrong directory is never exercised. Fixing the path is a
prerequisite for enabling the test run during release verification.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Pj5DVU7MaammM2nfHh1ZRG