Add GrandQC slide quality control notebook for IDC DICOM WSI#105

Draft
fedorov wants to merge 9 commits into ImagingDataCommons:master from fedorov:grand-qc
@fedorov (Member) commented Jun 11, 2026

Demonstrates end-to-end use of GrandQC with NCI Imaging Data Commons:

  • Discover H&E slides via idc-index sm_index metadata
  • Patch GrandQC scripts for native DICOM directory support via OpenSlide 4.0
  • Download TCGA-BRCA DICOM series and run tissue detection + artifact segmentation
  • Visualize results using GrandQC's own wsi_colors.py palette
  • Validate the DICOM-based pipeline against GrandQC's pre-computed TCGA masks (zenodo.org/records/14041578) by comparing per-class artifact fractions
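
The per-class comparison in the last bullet could be sketched roughly as below. This is a minimal illustration, not the notebook's code: the function names, the integer class ids, and the choice to normalise by non-background pixels are all assumptions, and the real label values should be taken from GrandQC's own scripts.

```python
import numpy as np


def class_fractions(mask, class_ids, background_id):
    """Fraction of non-background pixels assigned to each class id.

    `class_ids` and `background_id` are hypothetical inputs; the actual
    label scheme must come from GrandQC itself.
    """
    mask = np.asarray(mask)
    foreground = int((mask != background_id).sum())
    if foreground == 0:
        return {c: 0.0 for c in class_ids}
    return {c: float((mask == c).sum()) / foreground for c in class_ids}


def fraction_differences(reference, local, class_ids, background_id):
    """Local minus reference fraction, per class."""
    ref = class_fractions(reference, class_ids, background_id)
    loc = class_fractions(local, class_ids, background_id)
    return {c: loc[c] - ref[c] for c in class_ids}


# Toy masks: 7 stands in for background, 1 and 2 for two foreground classes.
reference_mask = np.array([[1, 1, 2, 7], [1, 2, 2, 7]])
local_mask = np.array([[1, 1, 1, 7], [1, 2, 2, 7]])
diffs = fraction_differences(reference_mask, local_mask, [1, 2], background_id=7)
```

Because the fractions are normalised per mask, the two masks do not need identical pixel dimensions for this comparison.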

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@review-notebook-app

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.


fedorov and others added 8 commits June 15, 2026 14:25
main.py loads a fully-pickled QC model via torch.load(), which fails with
exit code 1 under PyTorch >=2.6 (weights_only=True default). Patch the load
to weights_only=False so the trusted Zenodo weights unpickle. Also add
[i/N] progress reporting and a failure guard that records per-slide errors
and exits non-zero, so the notebook surfaces failures the loop used to swallow.
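
A fail-loud loop of the kind described here might look like the sketch below. It is not the code from main.py: `run_all`, `exit_code`, and the `process_slide` callback are hypothetical names used only to show the pattern of progress reporting, per-slide error capture, and a non-zero exit status.

```python
import sys


def run_all(slides, process_slide):
    """Process every slide, printing [i/N] progress and collecting failures.

    `process_slide` is a hypothetical callable standing in for the real
    per-slide QC step.
    """
    failures = []
    total = len(slides)
    for i, slide in enumerate(slides, start=1):
        print(f"[{i}/{total}] {slide}", flush=True)
        try:
            process_slide(slide)
        except Exception as exc:  # record the error and keep going
            failures.append((slide, repr(exc)))
            print(f"  FAILED: {exc!r}", file=sys.stderr, flush=True)
    return failures


def exit_code(failures):
    """Return 1 if any slide failed, else 0, after listing the failures."""
    for slide, err in failures:
        print(f"error: {slide}: {err}", file=sys.stderr)
    return 1 if failures else 0
```

A script would end with `sys.exit(exit_code(run_all(slides, process_slide)))`, so a calling notebook cell sees the failure in the return code rather than in scrolled-away output.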

Add grandqc_slide_quality_resolution_notes.md documenting the model working
resolution (MPP 1.5) vs. the fixed 512x512 input tile, how MPP is read from
openslide.mpp-x (DICOM PixelSpacing), and an empirical confirmation on an IDC
slide; link it from the Part 4 intro.
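
The relationship between the working resolution and the fixed tile is plain arithmetic, sketched below. The constants come from this commit message; the helper names and the example native MPP values are assumptions for illustration, and `props` is any mapping shaped like OpenSlide's slide properties.

```python
MODEL_MPP = 1.5   # working resolution, microns per pixel
TILE_SIZE = 512   # fixed model input tile edge, in pixels


def tile_footprint_um(tile_size=TILE_SIZE, mpp=MODEL_MPP):
    """Physical edge length (in microns) covered by one model tile."""
    return tile_size * mpp


def native_pixels_per_tile(native_mpp, tile_size=TILE_SIZE, model_mpp=MODEL_MPP):
    """Edge length, in native-resolution pixels, of the region one tile covers."""
    return round(tile_size * model_mpp / native_mpp)


def mpp_from_properties(props):
    """Read the x-direction MPP from an OpenSlide-style properties mapping."""
    return float(props["openslide.mpp-x"])


# Hypothetical native resolutions: 0.25 (about 40x) and 0.5 (about 20x).
footprint = tile_footprint_um()             # 768.0 microns
at_40x = native_pixels_per_tile(0.25)       # 3072 native pixels
at_20x = native_pixels_per_tile(0.5)        # 1536 native pixels
```

So one 512x512 tile always spans 768 microns of tissue; only the number of native pixels that get downsampled into it changes with the scanner resolution.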

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Clone the fedorov/grandqc idc-dicom-fixes branch, which carries the
DICOM-directory, device-detection, PyTorch >=2.6 load, small-slide
overlay, and fail-loud fixes as proper commits. Replace the in-notebook
patch_grandqc_for_dicom() cell with a verification cell that asserts the
fixes are present in the cloned scripts, and update Part 2 prose plus the
resolution notes accordingly.
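
A verification cell along these lines could be as simple as a substring check over the cloned files. The sketch below is an assumption about the approach, not the notebook's cell: `missing_markers` is a hypothetical helper, and the marker strings to look for would need to match what the branch actually contains.

```python
from pathlib import Path


def missing_markers(repo_dir, expected):
    """Return (relative_path, marker) pairs that are absent from the clone.

    `expected` maps a file path relative to `repo_dir` to the substrings
    that should appear in that file. A missing file counts as missing
    every one of its markers.
    """
    missing = []
    for rel_path, markers in expected.items():
        path = Path(repo_dir) / rel_path
        text = path.read_text(encoding="utf-8") if path.is_file() else ""
        for marker in markers:
            if marker not in text:
                missing.append((rel_path, marker))
    return missing
```

In a notebook this would be followed by something like `assert not missing_markers(repo_dir, {"main.py": ["weights_only=False"]})`, where the repository path and the marker list are placeholders to be adapted.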

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The demo selection picked the smallest TCGA-BRCA slides, which are
frozen-section (TS/BS) slides. GrandQC's Zenodo reference archive covers
DX diagnostic slides only, so Part 6 validation matched nothing. Switch
to five verified DX barcodes (variety by magnification, not tissue type,
since all DX slides are primary tumor).

Part 6 now downloads just those 5 reference masks (~700 KB, re-hosted)
instead of the full 2 GB BRCA.tar, and fixes the mask-name matching:
reference masks are {barcode}.{UUID}.svs_mask.png while local masks are
keyed by the bare ContainerIdentifier, so a barcode->filename map is
built. Drop the cohort-wide statistics that required the full archive.
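
The name matching can be sketched as follows, assuming the barcode itself contains no dots so that everything before the first dot is the barcode. `build_barcode_map` and the placeholder filename are hypothetical and only illustrate the naming pattern given above.

```python
from pathlib import Path

REFERENCE_SUFFIX = ".svs_mask.png"


def build_barcode_map(reference_names):
    """Map slide barcode -> reference mask filename.

    Reference masks are named {barcode}.{UUID}.svs_mask.png, so the barcode
    is the part before the first dot (assumes barcodes contain no dots).
    Files that do not end with the expected suffix are skipped.
    """
    mapping = {}
    for name in reference_names:
        base = Path(name).name
        if not base.endswith(REFERENCE_SUFFIX):
            continue
        barcode = base.split(".", 1)[0]
        mapping.setdefault(barcode, base)  # keep the first match per barcode
    return mapping


# Placeholder name following the pattern; not a real slide or UUID.
example = "TCGA-XX-XXXX-01Z-00-DX1.00000000-0000-0000-0000-000000000000.svs_mask.png"
barcode_to_file = build_barcode_map([example, "README.txt"])
```

A local mask keyed by its ContainerIdentifier can then be paired with `barcode_to_file.get(container_identifier)`, with `None` signalling that no reference mask exists for that slide.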

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

New section 6.4 overlays per-class artifact region boundaries for the
reference (Zenodo) and local DICOM runs on a shared grid: solid outline
for the reference, dashed for the local run, in the matching artifact
colour. This exposes boundary-level localisation differences that the
aggregate per-class fractions in 6.3 can hide. The reference mask is
resized onto the local grid (nearest-neighbour) since the two can differ
slightly in pixel dimensions. Existing cohort-scaling note renumbered to 6.5.
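
Nearest-neighbour resampling matters here because label masks must not be interpolated into class ids that do not exist. A NumPy-only sketch is shown below; `resize_nearest` is a hypothetical helper, and the notebook may well use an imaging library's resize call instead.

```python
import numpy as np


def resize_nearest(mask, out_shape):
    """Resample a 2-D label mask to `out_shape` (rows, cols) by index lookup.

    Each output pixel copies one input pixel, so the set of label values
    is never extended by interpolation.
    """
    mask = np.asarray(mask)
    in_h, in_w = mask.shape[:2]
    out_h, out_w = out_shape
    rows = (np.arange(out_h) * in_h) // out_h
    cols = (np.arange(out_w) * in_w) // out_w
    return mask[rows[:, None], cols[None, :]]


small = np.array([[1, 2], [3, 4]])
upscaled = resize_nearest(small, (4, 4))
```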

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

If the download cell is re-run after a rename, both the UID directory
and the renamed barcode directory exist, and os.rename(src, dst) onto a
non-empty dst raised OSError. Handle the both-present case by dropping
the redundant UID copy and keeping the barcode directory, so the cell is
safe to re-run in any order.
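
The re-run-safe rename could be sketched like this. `settle_series_dir` is a hypothetical name, and the sketch assumes that when both directories exist the barcode directory already holds a complete copy, which is what this commit message describes.

```python
import os
import shutil


def settle_series_dir(uid_dir, barcode_dir):
    """Leave exactly one directory, named after the barcode.

    - only the UID directory exists: rename it to the barcode name
    - both exist (cell re-run after a rename): drop the redundant UID copy
    - only the barcode directory exists: nothing to do
    """
    uid_exists = os.path.isdir(uid_dir)
    barcode_exists = os.path.isdir(barcode_dir)
    if uid_exists and barcode_exists:
        shutil.rmtree(uid_dir)
    elif uid_exists:
        os.rename(uid_dir, barcode_dir)
    return barcode_dir
```

Calling it twice in a row is harmless, since the second call finds only the barcode directory and does nothing.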

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

With five demo slides the single-row figures made each panel tiny.
Switch the tissue-detection overlays, artifact overlays, and the
artifact-boundary comparison to a 2-column grid with larger per-panel
sizes, hiding any unused trailing panel.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The overlaid solid/dashed contours were hard to read. Instead show the
reference (Zenodo) and local DICOM artifact maps side by side, one slide
per row, each colourised with the same GrandQC palette and a shared
class legend. No resizing needed since the panels are not overlaid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>