[SYSTEMDS-3949] Add native Delta Lake frame read/write via Delta Kernel #2515

Open
Baunsgaard wants to merge 1 commit into apache:main from Baunsgaard:delta-frame-io

@Baunsgaard

Extend the native Delta Lake support (#2511) from matrices to frames, reading and writing Delta Lake tables through the Spark-free Delta Kernel library on the single-node CP path. DML read/write with format="delta" now works for frames, discovering schema, column names, and dimensions directly from the table.

Stacked on #2511 and should be merged after it. Append/overwrite semantics, distributed execution, and time travel remain out of scope.

@codecov

codecov Bot commented Jun 25, 2026

Codecov Report

❌ Patch coverage is 86.74699% with 55 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.65%. Comparing base (384a8dc) to head (8feb495).
⚠️ Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...che/sysds/runtime/io/FrameReaderDeltaParallel.java | 81.60% | 18 Missing and 14 partials ⚠️ |
| .../org/apache/sysds/runtime/io/FrameReaderDelta.java | 90.62% | 0 Missing and 15 partials ⚠️ |
| .../org/apache/sysds/runtime/io/FrameWriterDelta.java | 89.18% | 4 Missing and 4 partials ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2515      +/-   ##
============================================
+ Coverage     71.56%   71.65%   +0.09%     
- Complexity    49110    49370     +260     
============================================
  Files          1575     1582       +7     
  Lines        189793   190700     +907     
  Branches      37235    37411     +176     
============================================
+ Hits         135816   136641     +825     
- Misses        43480    43519      +39     
- Partials      10497    10540      +43     
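
The figures in this report can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above; the class and method names (`CoverageCheck`, `percent`, `uncoveredPatchLines`) are made up for illustration and are not part of Codecov or SystemDS. It assumes project coverage is hits divided by lines, which matches the reported percentages.

```java
public class CoverageCheck {
    // Coverage as a percentage: covered lines (hits) over total tracked lines.
    static double percent(long hits, long lines) {
        return 100.0 * hits / lines;
    }

    // Sum of "Missing" and "partials" over the three files listed in the report.
    static int uncoveredPatchLines() {
        return (18 + 14)   // FrameReaderDeltaParallel.java
             + (0 + 15)    // FrameReaderDelta.java
             + (4 + 4);    // FrameWriterDelta.java
    }

    public static void main(String[] args) {
        // Should agree with the "55 lines ... missing coverage" headline.
        System.out.println("uncovered patch lines: " + uncoveredPatchLines());
        // Hits and Lines rows of the diff, base (main) versus head (#2515).
        System.out.println("base project coverage: " + percent(135816, 189793));
        System.out.println("head project coverage: " + percent(136641, 190700));
        // The Lines delta should equal the Hits, Misses and Partials deltas combined.
        System.out.println("line delta check: " + (825 + 39 + 43));
    }
}
```

Rounded to two decimals, the two ratios reproduce the reported 71.56% and 71.65%, and the per-file counts add up to the 55 uncovered lines in the headline.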

Extend the native Delta Lake support from matrices to frames, reading and
writing Delta Lake tables through the Spark-free Delta Kernel library on the
single-node CP path. DML read/write with format="delta" now works for
frames, discovering schema, column names, and dimensions directly from the
table.

- Add FrameReaderDelta, FrameReaderDeltaParallel and FrameWriterDelta
- Wire DELTA into the frame reader and writer factories
- Refresh cached frame metadata and schema after a Delta read
- Broaden Delta frame component IO coverage
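
The factory wiring in the list above can be pictured with a small, self-contained sketch. Everything here is a simplified stand-in, not the SystemDS code: the `Format` enum, the `FrameReader` interface, and the `parallel` flag are assumptions made for illustration. The two stub classes only borrow the names `FrameReaderDelta` and `FrameReaderDeltaParallel` from this PR. How the real factory chooses between them is not stated here.

```java
public class DeltaFactorySketch {
    // Hypothetical format enum; only DELTA is named by this PR.
    enum Format { CSV, BINARY, DELTA }

    // Hypothetical minimal reader contract; returns a label instead of frame data.
    interface FrameReader {
        String describe();
    }

    // Stub that borrows a class name from the PR; the body is a placeholder.
    static class FrameReaderDelta implements FrameReader {
        public String describe() { return "delta/single-threaded"; }
    }

    // Stub that borrows a class name from the PR; the body is a placeholder.
    static class FrameReaderDeltaParallel implements FrameReader {
        public String describe() { return "delta/parallel"; }
    }

    // Assumed dispatch: DELTA maps to one of the two readers based on a flag.
    static FrameReader createFrameReader(Format fmt, boolean parallel) {
        switch (fmt) {
            case DELTA:
                return parallel ? new FrameReaderDeltaParallel() : new FrameReaderDelta();
            default:
                // Other formats are deliberately left out of this sketch.
                throw new UnsupportedOperationException("not modeled here: " + fmt);
        }
    }

    public static void main(String[] args) {
        System.out.println(createFrameReader(Format.DELTA, false).describe());
        System.out.println(createFrameReader(Format.DELTA, true).describe());
    }
}
```

A writer factory would follow the same shape, with a single `DELTA` branch returning a `FrameWriterDelta` stand-in.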

Stacked on the matrix Delta support; append/overwrite semantics,
distributed execution, and time travel remain out of scope.