Robustness + review fixes for nightly bench sysroot#677
Conversation
Follow-up review fixes for the nightly-bench-sysroot change. - prepare-sysroot/prepare-test-data: a failed `curl` was `;`-separated with no `set -e`, so it fell through to `tar` and could extract a stale tarball from the fixed /tmp path and exit 0 (reproduced). Add `set -e`, remove any pre-existing tarball before downloading, and make the download explicitly fatal. Apply the same hardening to prepare-test-data (same download-fall-through class). - Hardening: `curl --proto '=https'` (no http downgrade on redirect) and `tar --no-same-owner` on extraction (matters in the sudo/root branch). - `rm` -> `rm -f` for the success-path tarball cleanup; drop the now-stale "rm only runs on success" note. - Workflow comment: the sysroot is not `actions/cache`-cached; it persists because the runner filesystem is persistent. Reword to avoid implying a cache mechanism. - Use ASCII `-` instead of an em-dash in the `::error::` output. - README: lead with the recommended user-writable (no-sudo) sysroot path.
Codex Code ReviewNo issues found in the PR diff. The changes are limited to Makefile provisioning hardening, CI text/comment updates, and README command ordering. I didn’t see introduced security vulnerabilities, behavioral bugs, or significant performance concerns. |
Code ReviewOverall: Correct fix for the stale-tarball silent-success bug. Logic is sound, hardening additions are appropriate. What's good
FindingsLow — no tarball integrity check (pre-existing, acknowledged in PR body)
Informational —
Informational — Two concurrent No blocking issues. The fix is correct and the hardening is proportionate. |
Addresses review feedback on #677: document that the fixed /tmp tarball path assumes one writer at a time, so a future CI topology change that shares a runner across concurrent jobs doesn't reintroduce a race silently.
|
Thanks for the review. Addressing the three points:
|
Follow-up review fixes layered on top of #675 (targets
fix/nightly-bench-sysroot, notmain). Found via a multi-agent review of #675; the headline bug was reproduced against the real recipe.Fixed
1. Failed download was not fatal — could silently extract a stale sysroot (Medium, reproduced)
prepare-sysroot's recipe had noset -e, and thecurl -fL … -o "$(SYSROOT_TARBALL)"line ended with;instead of being chained into the extraction. So whencurlfailed, control fell through to thetarstep.tarchokes on the missing/partial tarball and the|| { … exit 1; }handler aborts — but with a misleading "Extracting…" message.SYSROOT_TARBALLis a fixed/tmppath. If a tarball is already there (left by an interrupted prior run, since the success-pathrmnever executed), a failedcurlfalls through,tarextracts the stale tarball, andmakeexits 0. The next run then seesinclude/stdlib.hand treats the sysroot as complete.Repro before the fix (connection-failing URL + pre-existing tarball):
makereturned 0 and provisioned a stale sysroot. After the fix:makeaborts (exit 2), no sysroot is created, and the stale tarball is removed.Fix: add
set -e,rm -f "$(SYSROOT_TARBALL)"before downloading (kills the stale-tarball case and the predictable-path reuse), and make the download explicitly fatal (curl … || { rm -f …; exit 1; }). Same fix applied toprepare-test-data, which had the identical download-fall-through pattern.2. Download/extraction hardening (defense-in-depth)
curl --proto '=https'—-Lfollows redirects; this prevents a redirect from downgrading to plainhttp.tar … --no-same-owner— relevant in thesudo(root) extraction branch so archive-specified ownership isn't restored.3. Comment / doc fixes
bench-vs-nightly.yml: the comment claimed the sysroot "is cached in$HOMEacross nightly runs," but there is noactions/cachestep — it persists only because the self-hosted runner's filesystem is persistent. Reworded so it doesn't imply a cache key/restore mechanism.# Note: the tarball rm above only runs on success…comment (no longer accurate after adding the pre-downloadrm -fand switching the finalrmtorm -f).rm→rm -ffor the success-path tarball cleanup (consistent with the error handlers; won't error if already gone).-instead of an em-dash in the::error::output (greppability / repo consistency).SYSROOT_DIR=$HOME/...invocation instead of the/opt+ sudo one.Verified
curl+ pre-existing tarball →makeexits non-zero, no sysroot created, stale tarball removed (was exit 0 before).SYSROOT_URL→ exit 0,include/stdlib.h+lib/present, tarball cleaned up.Considered but intentionally not changed
rm -rf+ extract: safe within a singlemake -j(Make builds the sharedprepare-sysrootprereq once); only at risk if two separatemakeprocesses share oneSYSROOT_DIR, which the current CI does not do. Not worth the complexity.lib/*.a): skipped to avoid hard-coding a library filename that could be wrong and cause endless re-provisioning. Theinclude/stdlib.hcheck already fixes the reported failure.infra/provision.shprovisions a complete/optsysroot, which appears to contradict the Fix/nightly bench sysroot #675 root-cause note ("never fully provisioned"). Left for the author to reconcile — changing the provisioner needs knowledge of how the bench runner is actually set up.