ci: stabilize TMPDIR for Go test cache #21277
🚀 Build Images Ready
Images are ready for commit 3f08c9a. To use with deploy scripts: export MAIN_IMAGE_TAG=4.12.x-239-g3f08c9a93c
Set TMPDIR=/tmp in the unit-tests workflow env block.

Go's test cache (computeTestInputsID in cmd/go/internal/test/test.go) hashes the current value of every environment variable the test binary reads via os.Getenv(). Many packages call os.TempDir(), which reads TMPDIR. On GHA, TMPDIR is unique per job, causing testInputsID to differ between cache-save (master push) and cache-restore (PR) runs.

This caused ~25% of test packages to miss Phase 2 of the cache lookup on every PR run (75% cache hit rate instead of 93%+).

Evidence:
- GODEBUG=gocachehash=1 on CI showed 817 testInputs lines referencing TMPDIR out of 128,527 total
- With TMPDIR=/tmp: 693/762 (91%) on first run from master cache, stabilizing at 712/762 (93%) across subsequent commits
- Without fix (master baseline): 571/762 (75%)

Partially generated by AI.
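The effect described in the commit message can be illustrated with a minimal sketch. This is not the go command's actual code: `inputsID` is a hypothetical, simplified stand-in that hashes only the names and values of the environment variables a test read, and the `/tmp/job-a` and `/tmp/job-b` paths are made up to represent per-job `TMPDIR` values.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// inputsID is a hypothetical, simplified stand-in for the test-input hash.
// It hashes the name and value of every environment variable the test read.
// The real computeTestInputsID covers more input kinds and uses the go
// command's own hashing helpers.
func inputsID(readEnv map[string]string) string {
	names := make([]string, 0, len(readEnv))
	for name := range readEnv {
		names = append(names, name)
	}
	sort.Strings(names) // fixed order, independent of map iteration

	h := sha256.New()
	for _, name := range names {
		fmt.Fprintf(h, "env %s %s\n", name, readEnv[name])
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	// Per-job TMPDIR: the ID computed at cache save differs from the one
	// computed at cache restore, so the cached result is not reused.
	save := inputsID(map[string]string{"TMPDIR": "/tmp/job-a"})
	restore := inputsID(map[string]string{"TMPDIR": "/tmp/job-b"})
	fmt.Println("per-job TMPDIR matches:", save == restore)

	// Pinned TMPDIR: both runs hash the same value and the IDs agree.
	pinnedSave := inputsID(map[string]string{"TMPDIR": "/tmp"})
	pinnedRestore := inputsID(map[string]string{"TMPDIR": "/tmp"})
	fmt.Println("pinned TMPDIR matches:", pinnedSave == pinnedRestore)
}
```

With a per-job value the two IDs differ, and with the pinned value they agree, which is the behavior the fix relies on.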
What is a "Phase 2 cache" in this context?
Description
Set `TMPDIR=/tmp` in the unit-tests workflow env block.

Go's test cache (`computeTestInputsID`) hashes every environment variable that the test binary reads via `os.Getenv()`. These reads are recorded in the test binary's internal log (via `testing/internal/testdeps`) and replayed by `computeTestInputsID` on subsequent runs to verify that inputs haven't changed.

Approximately half of our test packages call `os.TempDir()` (directly or through imported libraries), which reads `TMPDIR`. On GHA, each job gets a unique `TMPDIR`, so `testInputsID` differs between the master push (cache save) and PR run (cache restore), causing Phase 2 cache misses for every affected package.

Cache hit rates (figures from the commit message):
- `TMPDIR=/tmp` (from master cache): 693/762 (91%)
- `TMPDIR=/tmp` (own cache, stable): 712/762 (93%)

`GODEBUG=gocachehash=1` on CI (PR #21276) captured 128,527 `testInputs` hash lines across 762 packages. 817 of those reference `TMPDIR`.

User-facing documentation
Testing and quality
Automated testing
No test changes. This fixes the CI test caching infrastructure.
How I validated my change
`GODEBUG=gocachehash=1` on CI captured hash inputs confirming `TMPDIR` in `testInputsID` (PR "debug(ci): testInputsID hash capture for cache investigation" #21276, run 27789436790)