CI regression harness: write→read round-trip + committed pixel-hash goldens #2

@cornish

Why

Almost every bug this season lived in the write↔read seam — wsitools writes,
opentile-go reads, and the corruption only surfaced when something exercised the
full loop on a diverse input:

  • wsitools#1: convert --to {cog-wsi,svs,tiff,ome-tiff} dropped the associated
    images' Predictor(317)/JPEGTables(347) tags, so they decoded to garbage.
    Hidden because nothing read the emitted associated images back.
  • VR OB-vs-OW (DICOM native PixelData) — emitted OW for 8-bit; only an
    independent reader exposed the RGB→grayscale collapse (the spike round-trip
    through the same lib missed it).
  • The faithful-copy fix was only validated because I ran convert → re-read by
    hand, per format/codec.

These are exactly what a standing round-trip + golden harness catches on day one,
instead of by luck.

Proposal

  1. Round-trip matrix (CI): for each fixture × writable target
    (svs|tiff|ome-tiff|cog-wsi|dicom), convert it, then re-read the output via
    opentile-go and assert the decoded pixels of L0 (+ each associated image)
    equal the source decode. This is the loop that exposed wsitools#1 / opentile-go#21
    / #23 — make it automatic. (Gate slow fixtures behind the integration tag /
    WSI_TOOLS_TESTDIR, like the existing integration suite.)
  2. Committed pixel-hash goldens: hash --mode pixel already exists. Commit a
    per-fixture/per-format pixel-hash manifest and have CI assert stability, so a
    silent pixel regression fails loudly. (Pixel-hash, not file-SHA — the writer is
    nondeterministic at the byte level; see the deterministic-write issue.)
  3. (Stretch) independent cross-reader check for the formats where opentile is
    both writer-validator and reader (TIFF family): re-decode emitted tiles with an
    independent decoder (tifffile / libjpeg / golang.org/x/image/tiff) — that's
    what definitively proved the VR-OB and predictor bugs were real vs reader
    artifacts.
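A minimal sketch of the round-trip matrix in item 1, to make the shape of the loop concrete. The `Images`, `Convert`, and `Decode` names are hypothetical hooks, not the wsitools or opentile-go API: in a real test they would wrap whatever actually performs the conversion and the opentile-go re-read. Only the target list and the WSI_TOOLS_TESTDIR variable come from the proposal above.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// Images maps an image name ("L0", "label", "macro", ...) to its decoded pixels.
// Hypothetical type: the real harness would fill it from an opentile-go read.
type Images map[string][]byte

// Convert and Decode are hypothetical hooks standing in for the wsitools
// conversion and the opentile-go re-read; neither is a real library signature.
type Convert func(src, target, dst string) error
type Decode func(path string) (Images, error)

// Targets is the writable-target list from the proposal.
var Targets = []string{"svs", "tiff", "ome-tiff", "cog-wsi", "dicom"}

// FixtureDir returns the corpus directory from WSI_TOOLS_TESTDIR.
// ok is false when the variable is unset, in which case a test would skip.
func FixtureDir() (dir string, ok bool) {
	dir = os.Getenv("WSI_TOOLS_TESTDIR")
	return dir, dir != ""
}

// DiffImages returns the sorted names of source images that are missing from
// the output or whose decoded pixels differ.
func DiffImages(want, got Images) []string {
	var bad []string
	for name, w := range want {
		if g, ok := got[name]; !ok || !bytes.Equal(w, g) {
			bad = append(bad, name)
		}
	}
	sort.Strings(bad)
	return bad
}

// RoundTripMatrix converts every fixture to every target, re-reads the output,
// and returns one message per failed cell of the matrix.
func RoundTripMatrix(fixtures []string, outDir string, conv Convert, dec Decode) []string {
	var failures []string
	for _, src := range fixtures {
		want, err := dec(src)
		if err != nil {
			failures = append(failures, fmt.Sprintf("%s: decode source: %v", src, err))
			continue
		}
		for _, target := range Targets {
			dst := filepath.Join(outDir, filepath.Base(src)+"."+target)
			if err := conv(src, target, dst); err != nil {
				failures = append(failures, fmt.Sprintf("%s -> %s: convert: %v", src, target, err))
				continue
			}
			got, err := dec(dst)
			if err != nil {
				failures = append(failures, fmt.Sprintf("%s -> %s: re-read: %v", src, target, err))
				continue
			}
			if bad := DiffImages(want, got); len(bad) > 0 {
				failures = append(failures, fmt.Sprintf("%s -> %s: pixel mismatch in %v", src, target, bad))
			}
		}
	}
	return failures
}
```

In an actual test file this would sit behind the integration build tag, call t.Skip when FixtureDir reports the variable unset, and call t.Errorf once per returned failure so a single run reports every broken cell.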

Acceptance

  • A convert of any corpus fixture to any writable target round-trips to
    pixel-identical associated images + L0 region (asserted in CI under the
    integration gate).
  • Pixel-hash goldens committed; CI fails on drift.
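The "CI fails on drift" check could be sketched as below. The manifest layout (a JSON object keyed by "fixture/format" with a hex pixel hash as the value) is an assumption made for illustration; the actual output format of hash --mode pixel is not specified here, so `Manifest`, `ParseManifest`, and `Drift` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Manifest maps "fixture/format" to a pixel hash.
// Assumed layout, not the format wsitools emits.
type Manifest map[string]string

// ParseManifest decodes a committed golden manifest from JSON bytes.
func ParseManifest(data []byte) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, fmt.Errorf("parse manifest: %w", err)
	}
	return m, nil
}

// Drift compares a fresh run against the goldens and returns, in sorted order,
// one message per entry that changed, disappeared, or has no golden yet.
func Drift(golden, current Manifest) []string {
	var out []string
	for key, want := range golden {
		got, ok := current[key]
		switch {
		case !ok:
			out = append(out, key+": missing from current run")
		case got != want:
			out = append(out, fmt.Sprintf("%s: golden %s, got %s", key, want, got))
		}
	}
	for key := range current {
		if _, ok := golden[key]; !ok {
			out = append(out, key+": not in golden manifest")
		}
	}
	sort.Strings(out)
	return out
}
```

Treating an entry with no golden as drift is a design choice in this sketch: it forces a new fixture or format to arrive together with its committed hash rather than running unchecked.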

Notes

  • Pairs with the wsi-fixtures shopping list (wsi-fixtures#1) — more formats in the
    corpus = more of this matrix actually runs.
  • The new tests/integration/convert_codec_test.go is a small step in this
    direction (novel-codec output re-read); this generalizes it to pixel-equality
    across all targets.
