[fix](load) fix empty statistics for forwarded INSERT #64439
Merged
Contributor
Author
/review
Contributor
Requesting changes for missing regression coverage of the exact bug path.
Critical checkpoint conclusions:
- Goal: the current four-file diff preserves `jobId` when `EnvFactory`/`CloudEnvFactory` fall back to legacy `Coordinator`/`CloudCoordinator`, so BE progress reports can update the existing `InsertLoadJob` instead of job `-1`.
- Scope: the code change is small and focused; both shared-nothing and cloud fallback paths were updated.
- Concurrency/lifecycle/config/compatibility: no new concurrency, lifecycle, configuration, protocol, or storage-format behavior was introduced.
- Parallel paths: the parallel cloud and non-cloud coordinator factory paths are both covered.
- Data writes/persistence: the fix uses the existing `LoadManager.initJobProgress`/`updateJobProgress` and finished-job snapshot flow; preserving the id is the right state handoff for `SHOW LOAD`.`JobDetails`.
- Performance/observability: no meaningful hot-path cost was added; existing load-progress reporting and warnings remain the observability path.
- Tests: missing. Existing `test_insert_statistic` covers normal `INSERT ... SELECT` statistics, but it does not exercise the follower-FE proxy path this PR fixes. The existing docker suite `test_insert_from_follower` already creates a 3-FE cluster and should be extended to assert the labeled `INSERT ... SELECT` `SHOW LOAD`.`JobDetails` values.
- User focus: no additional user-provided focus points were supplied.
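As a rough illustration of the assertion requested under "Tests", the sketch below checks that a `JobDetails` string carries non-zero `ScannedRows`, `LoadBytes`, and `TaskNumber`. It is not the Doris regression framework: the class `JobDetailsCheckSketch`, its helper methods, and the two sample strings are hypothetical, and the JSON-like shape of `JobDetails` is an assumption made only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Doris: checks the counters named in the review.
class JobDetailsCheckSketch {

    // Returns the numeric value of "field" in a JSON-like string, or -1 if it is absent.
    static long numericField(String jobDetails, String field) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*(\\d+)");
        Matcher m = p.matcher(jobDetails);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    // True only when all three counters are present and greater than zero.
    static boolean hasLoadStatistics(String jobDetails) {
        return numericField(jobDetails, "ScannedRows") > 0
                && numericField(jobDetails, "LoadBytes") > 0
                && numericField(jobDetails, "TaskNumber") > 0;
    }

    public static void main(String[] args) {
        // Made-up samples: the first mimics the empty statistics described in this PR.
        String empty = "{\"ScannedRows\":0,\"LoadBytes\":0,\"TaskNumber\":0}";
        String filled = "{\"ScannedRows\":100,\"LoadBytes\":2048,\"TaskNumber\":1}";
        System.out.println("empty sample passes:  " + hasLoadStatistics(empty));
        System.out.println("filled sample passes: " + hasLoadStatistics(filled));
    }
}
```

In an extended `test_insert_from_follower` suite, an equivalent check would presumably run against the `JobDetails` column returned by `SHOW LOAD` for the labeled insert issued through a follower FE.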
Contributor
Author
run buildall
Contributor
TPC-H: Total hot run time: 28665 ms
Contributor
TPC-DS: Total hot run time: 168656 ms
Contributor
FE UT Coverage Report: Increment line coverage
Contributor
PR approved by at least one committer and no changes requested.
Contributor
FE Regression Coverage Report: Increment line coverage
Contributor
PR approved by anyone and no changes requested.
gavinchou
approved these changes
Jun 12, 2026
github-actions Bot
pushed a commit
that referenced
this pull request
Jun 12, 2026
### What problem does this PR solve? When `INSERT INTO ... SELECT` is forwarded from a follower FE to the master FE, `SHOW LOAD` could show an empty `JobDetails`, such as `ScannedRows=0`, `LoadBytes=0`, `TaskNumber=0`, and empty backend lists. The root cause is that the insert load job is registered with a real `jobId`, but when coordinator creation falls back to the regular `Coordinator` / `CloudCoordinator` path, that `jobId` was not passed into the coordinator. Therefore, the coordinator kept the default `jobId=-1` and did not initialize or update the corresponding `LoadManager` progress. The load job was still recorded as `FINISHED`, but its `LoadStatistic` remained empty when `SHOW LOAD` rendered `JobDetails`. This PR preserves the insert `jobId` in the regular `Coordinator` and `CloudCoordinator` fallback paths, so `initJobProgress()` and `updateJobProgress()` update the same `InsertLoadJob` that is later recorded and displayed by `SHOW LOAD`.
### What problem does this PR solve?

When `INSERT INTO ... SELECT` is forwarded from a follower FE to the master FE, `SHOW LOAD` could show an empty `JobDetails`, such as `ScannedRows=0`, `LoadBytes=0`, `TaskNumber=0`, and empty backend lists.

The root cause is that the insert load job is registered with a real `jobId`, but when coordinator creation falls back to the regular `Coordinator` / `CloudCoordinator` path, that `jobId` was not passed into the coordinator. Therefore, the coordinator kept the default `jobId=-1` and did not initialize or update the corresponding `LoadManager` progress. The load job was still recorded as `FINISHED`, but its `LoadStatistic` remained empty when `SHOW LOAD` rendered `JobDetails`.

This PR preserves the insert `jobId` in the regular `Coordinator` and `CloudCoordinator` fallback paths, so `initJobProgress()` and `updateJobProgress()` update the same `InsertLoadJob` that is later recorded and displayed by `SHOW LOAD`.
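The failure mode described above can be reproduced in a minimal, self-contained model. This is only a sketch under stated assumptions and not the Doris implementation: `ProgressRegistry`, `SketchCoordinator`, `createFallbackCoordinator`, and the job id `1001` are hypothetical stand-ins for `LoadManager`, `Coordinator`, the factory fallback path, and a registered insert job.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the jobId handoff; none of these types exist in Doris.
class JobIdHandoffSketch {
    static final long UNSET_JOB_ID = -1L;

    // Stand-in for per-job progress tracking (the role LoadManager plays in the PR text).
    static class ProgressRegistry {
        private final Map<Long, Long> scannedRowsByJob = new HashMap<>();

        void register(long jobId) {
            scannedRowsByJob.put(jobId, 0L);
        }

        // Reports for an id that was never registered (such as -1) are silently dropped.
        void update(long jobId, long rows) {
            scannedRowsByJob.computeIfPresent(jobId, (id, old) -> old + rows);
        }

        long scannedRows(long jobId) {
            return scannedRowsByJob.getOrDefault(jobId, 0L);
        }
    }

    // Stand-in for a coordinator that forwards backend reports under its own jobId.
    static class SketchCoordinator {
        private final long jobId;
        private final ProgressRegistry registry;

        SketchCoordinator(long jobId, ProgressRegistry registry) {
            this.jobId = jobId;
            this.registry = registry;
        }

        void onBackendReport(long rows) {
            registry.update(jobId, rows);
        }
    }

    // preserveJobId=false models the old fallback path; true models the fixed one.
    static SketchCoordinator createFallbackCoordinator(
            long jobId, ProgressRegistry registry, boolean preserveJobId) {
        return new SketchCoordinator(preserveJobId ? jobId : UNSET_JOB_ID, registry);
    }

    // Runs one simulated insert and returns the rows recorded for the registered job.
    static long scannedRowsAfterInsert(boolean preserveJobId) {
        long jobId = 1001L;
        ProgressRegistry registry = new ProgressRegistry();
        registry.register(jobId);
        SketchCoordinator coordinator = createFallbackCoordinator(jobId, registry, preserveJobId);
        coordinator.onBackendReport(60);
        coordinator.onBackendReport(40);
        return registry.scannedRows(jobId);
    }

    public static void main(String[] args) {
        System.out.println("jobId dropped:   " + scannedRowsAfterInsert(false)); // prints 0
        System.out.println("jobId preserved: " + scannedRowsAfterInsert(true));  // prints 100
    }
}
```

In this model the job still exists in the registry in both cases, which mirrors why the real job could be recorded as `FINISHED` while its statistics stayed empty: only the reports were addressed to the wrong id.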