Skip to content

fix(ncp): collect full-access paths for array columns unreferenced in lambda body#64436

Draft
englefly wants to merge 1 commit into
apache:masterfrom
englefly:is-null-lambda
Draft

fix(ncp): collect full-access paths for array columns unreferenced in lambda body#64436
englefly wants to merge 1 commit into
apache:masterfrom
englefly:is-null-lambda

Conversation

@englefly

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

When a lambda body is a constant expression (e.g. x -> true) that never references the array item variable, visitArrayItemSlot is not called, so the bound array column's full-access path is never registered in slotToAccessPaths. This gap is exposed when IS NULL or cardinality() has already registered a data-skipping path ([col.NULL] or [col.OFFSET]) for the same slot — NestedColumnPruning then incorrectly prunes the complex column to null-only or offset-only.

Fix: in collectArrayPathInLambda, after visiting the body, check whether each bound array's underlying Slot appears in the body's input slots (via getInputSlots()). If not, manually visit the bound array with an ACCESS_ALL prefix to register the full-access path.

Regression test lambda_null_pruning covers:

  • Constant lambda body + IS NULL (the original bug)
  • Multi-variable lambda where body references some but not all vars
  • Array functions: array_count, array_map, array_filter
  • Data-skipping triggers: IS NULL, IS NOT NULL, cardinality

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

… lambda body (apache#26178)

When a lambda body is a constant expression (e.g. x -> true) that never
references the array item variable, visitArrayItemSlot is not called,
so the bound array column's full-access path is never registered in
slotToAccessPaths. This gap is exposed when IS NULL or cardinality()
has already registered a data-skipping path ([col.NULL] or [col.OFFSET])
for the same slot — NestedColumnPruning then incorrectly prunes the
complex column to null-only or offset-only.

Fix: in collectArrayPathInLambda, after visiting the body, check whether
each bound array's underlying Slot appears in the body's input slots
(via getInputSlots()). If not, manually visit the bound array with an
ACCESS_ALL prefix to register the full-access path.

Regression test lambda_null_pruning covers:
- Constant lambda body + IS NULL (the original bug)
- Multi-variable lambda where body references some but not all vars
- Array functions: array_count, array_map, array_filter
- Data-skipping triggers: IS NULL, IS NOT NULL, cardinality

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@hello-stephen

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly

Copy link
Copy Markdown
Contributor Author

/run buildall

@englefly

Copy link
Copy Markdown
Contributor Author

/review

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am requesting changes for the actual PR diff, which contains the Nereids collector change and the new regression suite.

Critical checkpoint conclusions:

  • Goal and behavior: the fix is not complete because it uses the lambda body's base-column input slots as a proxy for whether each lambda variable was referenced. That is not equivalent to lambda-variable usage and can both miss unreferenced variables and broaden referenced nested paths to full array access.
  • Test coverage: the new suite uses order_qt_* checks but does not add the generated .out file, so the regression test is not runnable in normal comparison mode. It also does not cover nested array element precision or the outer-array-reference case described inline.
  • Scope and compatibility: the PR scope is narrow and touches only Nereids column-pruning collection plus one regression suite; no API or storage compatibility issue was found in those files.
  • Security: no auth, authorization, tenancy, or network-boundary security behavior is changed by the PR diff.
  • Concurrency/resource lifecycle: not applicable to the two PR files.
  • User focus: the provided focus file did not add extra focus points, and existing inline review threads were empty, so these comments are not duplicates.

I did not run the full Doris build or regression suite; this review is based on static inspection of the PR diff and the surrounding Nereids expression code.

for (Expression argument : arguments) {
if (argument instanceof ArrayItemReference) {
Expression boundArray = argument.child(0);
if (!arguments.get(0).getInputSlots().containsAll(boundArray.getInputSlots())) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is not a reliable way to decide whether the current lambda variable was referenced. arguments.get(0) is the lambda body, and Expression.getInputSlots() deliberately excludes ArrayItemSlot, so a referenced body such as x -> element_at(x, 'f') still does not contain the bound array slot and this fallback broadens the precise a.*.f path to full a.*. The opposite direction is also possible: x -> a IS NULL references the same outer array slot but not x, so this condition is false and the fallback does not add the full-access path for the unreferenced lambda variable, leaving the original null-only pruning hole. Please track references to the lambda argument itself, for example by marking ArrayItemSlot visits by expr id/name, and only add the full-access path for arguments that were truly not used.

notContains "a.NULL"
}

order_qt_case1 """

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suite adds several order_qt_* checks but the matching generated output file is not added (regression-test/data/nereids_rules_p0/column_pruning/lambda_null_pruning.out is missing). In normal regression mode the first order_qt_case1 comparison has no expected section to match, so the new test will fail instead of covering the NCP fix. Please generate and commit the .out file for this suite.

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-H: Total hot run time: 28698 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit eb79b1025eb44b8a65a69d04c84ba09d14ec2cfd, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17678	3986	3951	3951
q2	q3	10768	1373	804	804
q4	4680	476	342	342
q5	7536	866	574	574
q6	182	167	134	134
q7	815	840	642	642
q8	9543	1554	1527	1527
q9	6222	4455	4481	4455
q10	6830	1780	1524	1524
q11	443	261	241	241
q12	642	418	301	301
q13	18238	3358	2756	2756
q14	264	258	251	251
q15	q16	821	774	707	707
q17	964	903	889	889
q18	6847	5655	5550	5550
q19	1293	1292	1084	1084
q20	524	389	259	259
q21	5842	2609	2411	2411
q22	437	359	296	296
Total cold run time: 100569 ms
Total hot run time: 28698 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4302	4225	4235	4225
q2	q3	4575	4966	4311	4311
q4	2049	2150	1393	1393
q5	4475	4266	4311	4266
q6	232	174	127	127
q7	1726	1605	1435	1435
q8	2767	2193	2126	2126
q9	7959	7851	7978	7851
q10	4780	4717	4319	4319
q11	569	413	391	391
q12	742	754	555	555
q13	3251	3758	2945	2945
q14	292	314	303	303
q15	q16	716	767	652	652
q17	1359	1308	1328	1308
q18	7868	7349	7154	7154
q19	1119	1084	1118	1084
q20	2207	2221	1949	1949
q21	5218	4532	4416	4416
q22	511	450	418	418
Total cold run time: 56717 ms
Total hot run time: 51228 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-DS: Total hot run time: 168451 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit eb79b1025eb44b8a65a69d04c84ba09d14ec2cfd, data reload: false

query5	4354	645	493	493
query6	453	217	168	168
query7	4840	558	287	287
query8	352	210	198	198
query9	8772	3995	4049	3995
query10	440	316	277	277
query11	5930	2351	2134	2134
query12	153	103	98	98
query13	1245	618	412	412
query14	6729	5390	5010	5010
query14_1	4357	4317	4348	4317
query15	200	193	170	170
query16	1012	435	443	435
query17	1095	688	571	571
query18	2622	453	336	336
query19	193	183	138	138
query20	110	102	104	102
query21	220	138	113	113
query22	13695	13607	13373	13373
query23	17231	16517	16184	16184
query23_1	16164	16186	16211	16186
query24	7538	1808	1319	1319
query24_1	1308	1310	1287	1287
query25	584	466	398	398
query26	1306	329	176	176
query27	2648	582	351	351
query28	4463	2030	2032	2030
query29	1076	627	490	490
query30	323	239	202	202
query31	1118	1078	955	955
query32	110	63	61	61
query33	528	322	261	261
query34	1167	1113	659	659
query35	744	797	676	676
query36	1368	1369	1235	1235
query37	150	106	97	97
query38	3205	3126	3020	3020
query39	945	922	899	899
query39_1	875	862	893	862
query40	227	126	105	105
query41	69	71	69	69
query42	100	96	96	96
query43	320	326	276	276
query44	
query45	200	197	189	189
query46	1069	1259	764	764
query47	2419	2370	2292	2292
query48	389	405	291	291
query49	665	487	368	368
query50	1039	368	261	261
query51	4665	4406	4259	4259
query52	92	95	80	80
query53	251	285	212	212
query54	285	235	215	215
query55	88	78	71	71
query56	251	253	238	238
query57	1542	1457	1358	1358
query58	263	214	209	209
query59	1612	1740	1528	1528
query60	276	270	236	236
query61	156	149	150	149
query62	713	668	572	572
query63	235	193	188	188
query64	2492	744	604	604
query65	
query66	1749	459	350	350
query67	29656	29682	29508	29508
query68	
query69	414	287	258	258
query70	947	950	959	950
query71	288	217	210	210
query72	2936	2647	2309	2309
query73	876	771	425	425
query74	5110	4931	4741	4741
query75	2660	2553	2254	2254
query76	2317	1153	801	801
query77	343	378	295	295
query78	12441	12373	11892	11892
query79	1466	1043	756	756
query80	1286	460	400	400
query81	519	278	241	241
query82	648	155	117	117
query83	331	271	248	248
query84	
query85	942	505	409	409
query86	422	307	299	299
query87	3367	3379	3187	3187
query88	3615	2753	2716	2716
query89	424	380	326	326
query90	1905	179	194	179
query91	174	163	135	135
query92	64	61	57	57
query93	1627	1580	837	837
query94	712	349	314	314
query95	685	373	347	347
query96	1085	768	343	343
query97	2711	2719	2567	2567
query98	206	205	201	201
query99	1149	1181	1018	1018
Total cold run time: 252545 ms
Total hot run time: 168451 ms

@hello-stephen

Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 80.00% (8/10) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants