Skip to content

ci: stabilize TMPDIR for Go test cache#21277

Draft
davdhacs wants to merge 1 commit into
masterfrom
davdhacs/tmpdir-fix
Draft

ci: stabilize TMPDIR for Go test cache#21277
davdhacs wants to merge 1 commit into
masterfrom
davdhacs/tmpdir-fix

Conversation

@davdhacs

@davdhacs davdhacs commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Description

Set TMPDIR=/tmp in the unit-tests workflow env block.

Go's test cache (computeTestInputsID) hashes every environment variable that the test binary reads via os.Getenv(). These reads are recorded in the test binary's internal log (via testing/internal/testdeps) and replayed by computeTestInputsID on subsequent runs to verify that inputs haven't changed.

Approximately half of our test packages call os.TempDir() (directly or through imported libraries), which reads TMPDIR. On GHA, each job gets a unique TMPDIR — so testInputsID differs between the master push (cache save) and PR run (cache restore), causing Phase 2 cache misses for every affected package.

Config Cache hits
Master baseline (no fix) 571/762 (75%)
With TMPDIR=/tmp (from master cache) 693/762 (91%)
With TMPDIR=/tmp (own cache, stable) 712/762 (93%)

GODEBUG=gocachehash=1 on CI (PR #21276) captured 128,527 testInputs hash lines across 762 packages. 817 of those reference TMPDIR.

Testing and quality

  • CI results are inspected

Automated testing

No test changes. This fixes the CI test caching infrastructure.

How I validated my change

@openshift-ci

openshift-ci Bot commented Jun 18, 2026

Copy link
Copy Markdown

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: a7248273-a048-4b76-9fee-f2bb12634712

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch davdhacs/tmpdir-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

🚀 Build Images Ready

Images are ready for commit 3f08c9a. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.12.x-239-g3f08c9a93c

Set TMPDIR=/tmp in the unit-tests workflow env block.

Go's test cache (computeTestInputsID in cmd/go/internal/test/test.go)
hashes the current value of every environment variable the test binary
reads via os.Getenv(). Many packages call os.TempDir() which reads
TMPDIR. On GHA, TMPDIR is unique per job, causing testInputsID to
differ between cache-save (master push) and cache-restore (PR) runs.

This caused ~25% of test packages to miss Phase 2 of the cache lookup
on every PR run (75% cache hit rate instead of 93%+).

Evidence:
- GODEBUG=gocachehash=1 on CI showed 817 testInputs lines referencing
  TMPDIR out of 128,527 total
- With TMPDIR=/tmp: 693/762 (91%) on first run from master cache,
  stabilizing at 712/762 (93%) across subsequent commits
- Without fix (master baseline): 571/762 (75%)

Partially generated by AI.
@davdhacs davdhacs force-pushed the davdhacs/tmpdir-fix branch from 25c0a84 to 3f08c9a Compare June 19, 2026 04:36
@davdhacs davdhacs changed the title ci: fix test cache by stabilizing TMPDIR across CI runs ci: stabilize TMPDIR for Go test cache Jun 19, 2026
@davdhacs davdhacs requested a review from a team June 19, 2026 04:47
@davdhacs davdhacs marked this pull request as ready for review June 19, 2026 04:47
@davdhacs davdhacs requested a review from a team as a code owner June 19, 2026 04:47
@tommartensen

tommartensen commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

What is a "Phase 2 cache" in this context?

@davdhacs

Copy link
Copy Markdown
Contributor Author

What is a "Phase 2 cache" in this context?

Claude named the key creation steps "Phase 1" and "Phase 2" while I was working with it. The first phase is a content hash key lookup, and the second includes the env vars and causes our miss because the TMPDIR changes every run in GHA's runner (and some tests read the TMPDIR variable):

https://go.googlesource.com/go/+/refs/heads/master/src/cmd/go/internal/test/test.go#1882

// The test cache result fetch is a two-level lookup.
--
//
// First, we use the content hash of the test binary
// and its command-line arguments to find the
// list of environment variables and files consulted
// the last time the test was run with those arguments.
// (To avoid unnecessary links, we store this entry
// under two hashes: id1 uses the linker inputs as a
// proxy for the test binary, and id2 uses the actual
// test binary. If the linker inputs are unchanged,
// this way we avoid the link step, even though we
// do not cache link outputs.)
//
// Second, we compute a hash of the values of the
// environment variables and the content of the files
// listed in the log from the previous run.
// Then we look up test output using a combination of
// the hash from the first part (testID) and the hash of the
// test inputs (testInputsID).

@davdhacs davdhacs requested a review from tommartensen June 19, 2026 13:24
@davdhacs davdhacs added the auto-retest PRs with this label will be automatically retested if prow checks fails label Jun 19, 2026
@rhacs-bot

Copy link
Copy Markdown
Contributor

/retest

@davdhacs davdhacs marked this pull request as draft June 19, 2026 14:13
@davdhacs

Copy link
Copy Markdown
Contributor Author

I am seeing 93% go unittest cache hits on recent master commits without this change. I need to investigate more to verify if this change is needed.

@rhacs-bot

Copy link
Copy Markdown
Contributor

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants