
ci: add Atheris fuzz targets and ClusterFuzzLite#11482

Open: julian-risch wants to merge 6 commits into main from ci/add-fuzzing

@julian-risch julian-risch commented Jun 2, 2026

Member

Related Issues

  • Inspired by the OpenSSF Scorecard Fuzzing check. Fuzzing is the practice of feeding unexpected or random data into a program to expose bugs. It can help detect issues, including but not limited to security issues.

Proposed Changes:

Adds fuzzing of a selection of Haystack's untrusted-input entry points to our CI. There could be more (for example pipeline breakpoint snapshots) but I limited it to the following for now:

Fuzz harnesses

| Harness | Target | Why |
| --- | --- | --- |
| fuzz_pipeline_loads.py | Pipeline.loads | Deserializing a serialized pipeline (YAML) is a documented attack surface. |
| fuzz_document_from_dict.py | Document.from_dict | Reconstructing a Document from an untrusted dict. |
| fuzz_filters.py | document_matches_filter | Evaluating an untrusted filter expression. |

Each harness catches the exceptions that are a normal reaction to malformed input (DeserializationError, FilterError, ValueError, …) so only genuine crashes, unbounded recursion, hangs, or unexpected exception types are reported. The "expected" lists can be tightened over time to surface subtler bugs.

ClusterFuzzLite (.clusterfuzzlite/): Dockerfile + build.sh + project.yaml build the harnesses with the OSS-Fuzz Python toolchain (compile_python_fuzzer).

CI (.github/workflows/cflite_pr.yml): a short, code-change-scoped run (180s) on PRs that touch fuzzed code or the fuzzing setup. Least-privilege contents: read token, SHA-pinned ClusterFuzzLite actions, SARIF upload disabled (so no security-events: write needed). Crashes fail the job and upload as artifacts.

licenserc.toml: excludes .clusterfuzzlite from the license-header check (consistent with docker/.github).

How did you test it?

  • End-to-end CI validation only works after merge: the cflite_pr workflow needs to exist on the default branch before it runs on PRs, so its first real exercise will be the next PR after this lands.

Notes for the reviewer

If this creates more problems than it helps to solve, or if it slows down the CI too much, I am happy to reconsider, but I'd like us to at least try this out.
I can already tell that the fuzzer will be red when it runs for the first time. One expected finding is that fuzz_pipeline_loads fails with an AttributeError. In base.py or from_dict, we should validate that deserialized_data is a dict and raise DeserializationError otherwise. This is the kind of bug fuzzing should surface.
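The validation suggested above could look roughly like this. It is a sketch under assumptions: `DeserializationError` is defined locally as a stand-in for the class in haystack.core.errors, and `load_unguarded` and `ensure_mapping` are hypothetical helper names, not functions from Haystack.

```python
class DeserializationError(Exception):
    """Local stand-in for haystack.core.errors.DeserializationError."""


def load_unguarded(deserialized_data):
    # Without a type check, non-dict input (a string, list, or None) fails
    # with AttributeError because it has no .get method.
    return deserialized_data.get("components", {})


def ensure_mapping(deserialized_data):
    """Reject anything that is not a dict before it is used as one."""
    if not isinstance(deserialized_data, dict):
        raise DeserializationError(
            f"Expected a dict, got {type(deserialized_data).__name__}"
        )
    return deserialized_data


def load_guarded(deserialized_data):
    # Same lookup as above, but malformed input now raises the expected error type.
    return ensure_mapping(deserialized_data).get("components", {})
```

With a guard like this in place, a harness that already treats DeserializationError as expected would stop reporting the input as a crash.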

Checklist

🤖 Generated with Claude Code

Address the OpenSSF Scorecard Fuzzing check (0/10) and add real fuzzing of the
project's untrusted-input entry points.

- test/fuzz/: three Atheris harnesses — Pipeline.loads (serialized pipeline
  deserialization), Document.from_dict, and document_matches_filter (filter
  expressions). Each catches the exceptions that are a normal reaction to
  malformed input so only genuine crashes/hangs/unexpected errors are reported.
- .clusterfuzzlite/: Dockerfile + build.sh + project.yaml to build the harnesses
  with the OSS-Fuzz Python toolchain.
- .github/workflows/cflite_pr.yml: short, code-change-scoped ClusterFuzzLite run
  on PRs that touch fuzzed code, least-privilege token, SHA-pinned actions.
- licenserc.toml: exclude .clusterfuzzlite from the license-header check.

Scorecard detects this via both the `import atheris` harnesses and the
.clusterfuzzlite deployment. pytest does not collect fuzz_*.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Comment thread test/fuzz/fuzz_filters.py Fixed
Comment thread test/fuzz/fuzz_pipeline_loads.py Fixed
Comment thread test/fuzz/fuzz_document_from_dict.py Fixed
Pin gcr.io/oss-fuzz-base/base-builder-python to its current digest instead of
the rolling latest tag, for supply-chain integrity. The OSS-Fuzz base-builder
is updated frequently, so the comment documents how to refresh the digest.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 2, 2026

Contributor

Coverage report

This PR does not seem to contain any modification to coverable code.

julian-risch and others added 2 commits June 2, 2026 13:02
Add a docker ecosystem entry for /.clusterfuzzlite so Dependabot keeps the
digest-pinned gcr.io/oss-fuzz-base/base-builder-python in Dockerfile up to date.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- fuzz_pipeline_loads: import DeserializationError from haystack.core.errors
  (it is not exported by haystack.errors; the old import raised ImportError
  at harness load, so the fuzzer failed to build).
- fuzz_document_from_dict / fuzz_filters: parse raw fuzzer bytes directly with
  json.loads (accepts bytes) instead of FuzzedDataProvider, so the input domain
  is JSON text and the seed corpus is usable 1:1.
- Add a seed corpus of valid inputs (test/fuzz/corpus/<harness>/) and zip it
  into <harness>_seed_corpus.zip in build.sh to bootstrap coverage past the
  JSON parse.
- cflite_pr: pass github-token to build_fuzzers so it can check out the PR
  base that mode: code-change diffs against.
- README: document the seed corpus and updated local-run commands.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
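The seed-corpus packaging described in this commit is done in build.sh. As a rough Python equivalent, assuming one subdirectory per harness as in test/fuzz/corpus/<harness>/, the step could be sketched like this. The function name `build_seed_corpus` and its arguments are hypothetical.

```python
import zipfile
from pathlib import Path


def build_seed_corpus(corpus_dir: Path, out_dir: Path) -> list[Path]:
    """Zip each <harness>/ directory into <harness>_seed_corpus.zip in out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for harness_dir in sorted(p for p in corpus_dir.iterdir() if p.is_dir()):
        archive = out_dir / f"{harness_dir.name}_seed_corpus.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for seed in sorted(harness_dir.iterdir()):
                if seed.is_file():
                    # Store seeds flat so each archive entry is one corpus input.
                    zf.write(seed, arcname=seed.name)
        archives.append(archive)
    return archives
```

Because the harnesses parse the raw fuzzer bytes as JSON, each seed file can be a plain JSON document and is used unchanged as a starting input.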
Comment thread test/fuzz/fuzz_document_from_dict.py Dismissed
Comment thread test/fuzz/fuzz_filters.py Dismissed
Comment thread test/fuzz/fuzz_pipeline_loads.py Dismissed
julian-risch and others added 2 commits June 15, 2026 21:17
Scorecard detection is static (Dockerfile + import atheris) and unaffected by
fuzz-seconds; 120s keeps a meaningful per-harness regression pass while trimming
CI time on PRs that touch the fuzzed entry points.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`haystack/**` triggered the build-heavy fuzzing job on nearly every library PR,
paying the Docker build + install cost before code-change mode found nothing to
fuzz. Narrow the path filter to the modules the harnesses actually exercise
(core pipeline/serialization/marshal, dataclasses, utils/filters). Deep
transitively-reached regressions are left to a future scheduled batch run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@julian-risch julian-risch marked this pull request as ready for review June 15, 2026 19:28
@julian-risch julian-risch requested review from a team as code owners June 15, 2026 19:28
@julian-risch julian-risch requested review from davidsbatista and removed request for a team June 15, 2026 19:28
@claude

claude Bot commented Jun 15, 2026


Code review

No issues found. Checked for bugs and CLAUDE.md compliance.
