Export mapped variants and add README for public data dump #728

Open
bencap wants to merge 4 commits into release-2026.2.5 from feature/bencap/664/include-mapped-variants-in-dump

@bencap bencap (Collaborator) commented May 6, 2026

Reopens the erroneously closed #711.

@bencap bencap changed the base branch from main to release-2026.2.0 May 6, 2026 18:12
@bencap bencap linked an issue May 6, 2026 that may be closed by this pull request
@bencap bencap force-pushed the release-2026.2.0 branch 2 times, most recently from 4b3c99a to a66674d May 8, 2026 18:34
@bencap bencap changed the base branch from release-2026.2.0 to release-2026.2.1 May 8, 2026 19:59
@bencap bencap force-pushed the release-2026.2.1 branch from 4cef247 to f579c2e May 11, 2026 20:11
@bencap bencap changed the base branch from release-2026.2.1 to release-2026.2.2 May 11, 2026 23:15
@bencap bencap marked this pull request as ready for review May 11, 2026 23:15
@bencap bencap changed the base branch from release-2026.2.2 to release-2026.2.3 May 12, 2026 21:35
@bencap bencap changed the base branch from release-2026.2.3 to release-2026.2.4 May 21, 2026 18:11
@bencap bencap changed the base branch from release-2026.2.4 to release-2026.2.5 June 10, 2026 23:06
bencap added 4 commits June 16, 2026 09:46
Emit a va/{urn}.va.ndjson file per mapped score set in the public data
export, one record per current mapped variant carrying its highest
materialized VA-Spec layer.

- add variant_highest_level_annotation to resolve the highest available
  layer (pathogenicity > functional statement > study result), returning
  None for variants without a post-mapped allele
- extract get_current_mapped_variants_for_annotation as the shared
  eager-load source of truth for the annotated-variant endpoints and the
  export, and route the three streaming routers through it
- document the va/ output, layer ladder, and the functional-evidence-only
  caveat in the dump README
- cover the resolver with unit tests across the uncalibrated, functional,
  pathogenicity, and unmapped cases
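
The layer ladder described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `MappedVariantStub`, its field names, and `resolve_highest_layer` are hypothetical stand-ins, and the real `variant_highest_level_annotation` may take different arguments and build each layer differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MappedVariantStub:
    """Hypothetical stand-in for a mapped variant; not the MaveDB model."""

    post_mapped: Optional[dict] = None
    study_result: Optional[dict] = None
    functional_statement: Optional[dict] = None
    pathogenicity_statement: Optional[dict] = None


def resolve_highest_layer(variant: MappedVariantStub) -> Optional[dict]:
    """Return the highest available layer, or None if nothing can be annotated."""
    # Without a post-mapped allele there is nothing to annotate.
    if variant.post_mapped is None:
        return None

    # Walk the ladder from the highest layer down:
    # pathogenicity > functional statement > study result.
    ladder = (
        variant.pathogenicity_statement,
        variant.functional_statement,
        variant.study_result,
    )
    for layer in ladder:
        if layer is not None:
            return layer
    return None
```

Under these assumptions, an uncalibrated variant falls through to its study result, while a variant with every layer materialized resolves to the pathogenicity statement.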
- Omit the score-calibration "Baseline score" extension when no baseline
  score exists. Extension.value is required, so a null value was stripped
  by model_dump(exclude_none=True) and the object no longer re-parsed
  through the VA-Spec models. This also corrects the API's VA-Spec
  streaming endpoints, which share the builder.
- Gate dump annotation files on the presence of current mapped variants,
  so score sets whose mappings are all superseded no longer emit empty or
  stale annotation files.
- Newline-terminate every NDJSON record to match the API streams and keep
  line-based consumers happy.
- Add regression tests covering the baseline-score extension round-trip.
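
The three behaviors listed above (omitting the valueless extension, gating the file on current mapped variants, and newline-terminating each record) can be sketched with the standard library alone. The function names and the dict shape of the extension are assumptions made for illustration; the PR's actual builder works with the VA-Spec models and is not reproduced here.

```python
import io
import json
from typing import Dict, Iterable, List, Optional, TextIO


def build_extensions(baseline_score: Optional[float]) -> List[Dict]:
    """Hypothetical builder: emit the extension only when a value exists."""
    # An extension with a null value would lose its required "value" key once
    # None fields are stripped, so it is left out entirely instead.
    if baseline_score is None:
        return []
    return [{"name": "Baseline score", "value": baseline_score}]


def write_ndjson(records: Iterable[Dict], out: TextIO) -> int:
    """Write one JSON object per line, newline-terminating every record."""
    count = 0
    for record in records:
        out.write(json.dumps(record) + "\n")
        count += 1
    return count


def dump_annotations(records: List[Dict]) -> Optional[str]:
    """Return NDJSON text, or None when there are no current mapped variants."""
    # Gate: produce no file at all rather than an empty one.
    if not records:
        return None
    buffer = io.StringIO()
    write_ndjson(records, buffer)
    return buffer.getvalue()
```

Note that the check is `is None` rather than truthiness, so a baseline score of `0.0` still produces an extension in this sketch.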
@bencap bencap force-pushed the feature/bencap/664/include-mapped-variants-in-dump branch from 5d6e150 to 5c155f4 June 16, 2026 22:40
@bencap bencap linked an issue Jun 16, 2026 that may be closed by this pull request
7 tasks
@coveralls

Coverage Report for CI Build 27652871621

Coverage decreased (-0.2%) to 88.96%

Details

  • Coverage decreased (-0.2%) from the base build.
  • Patch coverage: 1 uncovered change across 1 file (16 of 17 lines covered, 94.12%).
  • 25 coverage regressions across 4 files.

Uncovered Changes

| File | Changed | Covered | % |
|---|---|---|---|
| src/mavedb/lib/score_sets.py | 3 | 2 | 66.67% |
| Total (3 files) | 17 | 16 | 94.12% |

Coverage Regressions

25 previously-covered lines in 4 files lost coverage.

| File | Lines Losing Coverage | Coverage |
|---|---|---|
| src/mavedb/lib/utils.py | 11 | 28.13% |
| src/mavedb/lib/clinvar/utils.py | 6 | 90.59% |
| src/mavedb/worker/jobs/external_services/clinvar.py | 5 | 94.95% |
| src/mavedb/lib/vep.py | 3 | 87.76% |

Coverage Stats

Relevant Lines: 14285
Covered Lines: 12708
Line Coverage: 88.96%
Coverage Strength: 0.89 hits per line

💛 - Coveralls

Development

Successfully merging this pull request may close these issues.

Add VA-Spec Annotation Dumps to Public Data Export
Include mapped variant data and README in public data dump