
#1390 - configurable storage for aggregating functions #2473

Open
MrHDOLEK wants to merge 1 commit into flow-php:1.x from MrHDOLEK:#1390

@MrHDOLEK

Contributor

GroupBy aggregation can move state out of the PHP heap instead of holding every result in memory. A new AggregationStorage abstraction offers three backends: memory (default, unchanged behavior), filesystem spill+merge, and PSR-16 key-value (APCu/Redis).

  • MergeableAggregatingFunction + merge() on all built-in aggregators
  • AggregationStorage: Memory / External (filesystem spill+merge) / Kv (PSR-16)
  • KV storage fails loudly on eviction / write failure instead of returning a wrong aggregate
  • config: AggregationStorageStrategy + FLOW_AGGREGATION_MAX_MEMORY wired into ConfigBuilder
  • APCu enabled in the nix dev shell (for the PSR-16/APCu storage test)
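
To illustrate why spilled state needs a merge() step, here is a minimal sketch of a mergeable average. The names `MergeableSketch` and `AverageSketch` are hypothetical and are not the interface or signatures introduced by this PR; the point is only that partial state (sum and count) has to be merged, not the already-finished averages.

```php
<?php

declare(strict_types=1);

// Hypothetical interface for illustration only; not the PR's MergeableAggregatingFunction.
interface MergeableSketch
{
    public function add(int|float $value): void;

    public function merge(MergeableSketch $other): void;

    public function result(): int|float|null;
}

final class AverageSketch implements MergeableSketch
{
    private int|float $sum = 0;

    private int $count = 0;

    public function add(int|float $value): void
    {
        $this->sum += $value;
        $this->count++;
    }

    // Combine another partial aggregate into this one by adding its raw state.
    public function merge(MergeableSketch $other): void
    {
        if (!$other instanceof self) {
            throw new InvalidArgumentException('AverageSketch can only merge another AverageSketch');
        }

        $this->sum += $other->sum;
        $this->count += $other->count;
    }

    // Null when no value was ever added, otherwise sum divided by count.
    public function result(): int|float|null
    {
        return $this->count === 0 ? null : $this->sum / $this->count;
    }
}

$first = new AverageSketch();
$first->add(2);
$first->add(4);

$second = new AverageSketch();
$second->add(9);

$first->merge($second); // state is now sum = 15, count = 3
```

Averaging the two partial results directly (3 and 9) would give 6, while merging the underlying state gives 5, which is the average of all three values.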

Note: the filesystem (external) backend bounds peak memory only during ingestion. The spill is re-materialized at result time, so fully bounded peak memory (streaming k-way merge plus a streaming result) is tracked as a follow-up in #2472.
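
The limitation in the note above can be sketched roughly as follows. This is a hypothetical sum-by-key function, not the External storage from this PR; `spillingSumByKey`, the `$maxGroups` threshold, and the JSON temp-file format are all assumptions made for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical spill-and-merge sum by group key (illustration only).
 *
 * @param iterable<array{0: string, 1: int}> $rows      (group key, value) pairs
 * @param int                                $maxGroups spill once this many groups are held in memory
 *
 * @return array<string, int>
 */
function spillingSumByKey(iterable $rows, int $maxGroups): array
{
    $partial = [];
    $spills = [];

    foreach ($rows as [$key, $value]) {
        $partial[$key] = ($partial[$key] ?? 0) + $value;

        // Ingestion stays bounded: flush partial sums to a temp file and start over.
        if (\count($partial) >= $maxGroups) {
            $path = \tempnam(\sys_get_temp_dir(), 'agg_');

            if ($path === false) {
                throw new RuntimeException('Unable to create a spill file');
            }

            \file_put_contents($path, \json_encode($partial, JSON_THROW_ON_ERROR));
            $spills[] = $path;
            $partial = [];
        }
    }

    // Result time: every spill is read back and merged into a single array,
    // so the final result still holds all groups in memory at once.
    $result = $partial;

    foreach ($spills as $path) {
        $spilled = \json_decode((string) \file_get_contents($path), true, 512, JSON_THROW_ON_ERROR);

        foreach ($spilled as $key => $sum) {
            $result[$key] = ($result[$key] ?? 0) + $sum;
        }

        \unlink($path);
    }

    return $result;
}

$totals = spillingSumByKey([['a', 1], ['b', 2], ['a', 3], ['c', 4], ['b', 5]], 2);
```

In this sketch the ingestion loop never holds more than `$maxGroups` groups, but `$result` grows to the full number of groups, which is the gap a streaming k-way merge would close.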

Resolves: #1390

Change Log


Added

  • Configurable storage for aggregating functions via a new AggregationStorage abstraction with three backends: in-memory (default), filesystem spill+merge, and PSR-16 key-value (APCu/Redis).
  • ConfigBuilder aggregation settings: aggregationStorage(), aggregationMemoryLimit(), aggregationFilesystem(), aggregationStore(), the AggregationStorageStrategy enum and the FLOW_AGGREGATION_MAX_MEMORY environment variable.
  • MergeableAggregatingFunction interface with merge() implemented by all built-in aggregating functions, allowing spilled partial aggregates to be merged.
  • Key-value aggregation storage raises a clear exception on cache eviction / write failure instead of producing an incorrect aggregate.
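
The fail-loud behavior in the last entry could look roughly like the sketch below. `TinyStore` is a stand-in for the `get()`/`set()` pair of a PSR-16 cache so the example stays self-contained; `ArrayStore` and `StrictSumStore` are hypothetical, and the way the PR actually detects eviction is not shown in this description.

```php
<?php

declare(strict_types=1);

// Stand-in for the two PSR-16 methods used here (assumption, kept local to stay self-contained).
interface TinyStore
{
    public function get(string $key, mixed $default = null): mixed;

    public function set(string $key, mixed $value): bool;
}

// In-memory backend used only to drive the example.
final class ArrayStore implements TinyStore
{
    /** @var array<string, mixed> */
    public array $data = [];

    public function get(string $key, mixed $default = null): mixed
    {
        return \array_key_exists($key, $this->data) ? $this->data[$key] : $default;
    }

    public function set(string $key, mixed $value): bool
    {
        $this->data[$key] = $value;

        return true;
    }
}

// Hypothetical running sum per key that refuses to continue on lost state.
final class StrictSumStore
{
    /** @var array<string, true> keys this process has already written */
    private array $written = [];

    public function __construct(private TinyStore $store)
    {
    }

    public function add(string $key, int $value): void
    {
        // A key written earlier must still be readable; total() throws otherwise.
        $current = isset($this->written[$key]) ? $this->total($key) : 0;

        if (!$this->store->set($key, $current + $value)) {
            throw new RuntimeException("Failed to write aggregate state for '{$key}'");
        }

        $this->written[$key] = true;
    }

    public function total(string $key): int
    {
        $value = $this->store->get($key);

        if (!\is_int($value)) {
            throw new RuntimeException("Aggregate state for '{$key}' is missing, possibly evicted");
        }

        return $value;
    }
}

$backend = new ArrayStore();
$sums = new StrictSumStore($backend);
$sums->add('a', 2);
$sums->add('a', 3);
```

Without the check in `total()`, a silently evicted key would restart from zero and the final aggregate would be wrong without any error. The sketch keeps the set of written keys in process memory, which a real implementation may well handle differently.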

Fixed

Changed

Removed

Deprecated

Security

@codecov

codecov Bot commented Jun 21, 2026


Codecov Report

❌ Patch coverage is 0% with 242 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.48%. Comparing base (6c5e57e) to head (bbdf2e9).
⚠️ Report is 9 commits behind head on 1.x.

Additional details and impacted files
@@             Coverage Diff              @@
##                1.x    #2473      +/-   ##
============================================
- Coverage     85.12%   83.48%   -1.65%     
+ Complexity    21300      658   -20642     
============================================
  Files          1608     1627      +19     
  Lines         65717    67042    +1325     
============================================
+ Hits          55944    55969      +25     
- Misses         9773    11073    +1300     
Components Coverage Δ
etl 86.37% <0.00%> (-2.06%) ⬇️
cli 89.40% <ø> (ø)
lib-array-dot 81.44% <ø> (ø)
lib-azure-sdk 64.44% <ø> (ø)
lib-doctrine-dbal-bulk 93.61% <ø> (ø)
lib-filesystem 85.03% <ø> (ø)
lib-types 90.06% <ø> (ø)
lib-parquet 70.10% <ø> (ø)
lib-parquet-viewer 82.26% <ø> (ø)
lib-snappy 89.82% <ø> (ø)
lib-dremel 0.00% <ø> (ø)
lib-postgresql 88.59% <ø> (ø)
lib-telemetry 83.94% <ø> (-2.02%) ⬇️
bridge-filesystem-async-aws 92.74% <ø> (ø)
bridge-filesystem-azure 90.45% <ø> (ø)
bridge-monolog-http 96.82% <ø> (ø)
bridge-monolog-telemetry 94.11% <ø> (ø)
bridge-openapi-specification 92.07% <ø> (ø)
symfony-http-foundation 78.57% <ø> (ø)
bridge-psr18-telemetry 100.00% <ø> (ø)
bridge-psr3-telemetry 97.84% <ø> (ø)
bridge-psr7-telemetry 100.00% <ø> (ø)
bridge-telemetry-otlp 84.94% <ø> (-4.96%) ⬇️
bridge-symfony-http-foundation-telemetry 89.47% <ø> (ø)
bridge-symfony-filesystem-bundle 90.66% <ø> (ø)
bridge-symfony-filesystem-cache 98.14% <ø> (ø)
bridge-symfony-postgresql-bundle 93.83% <ø> (ø)
bridge-symfony-postgresql-cache 94.41% <ø> (ø)
bridge-symfony-postgresql-messenger 98.80% <ø> (ø)
bridge-symfony-postgresql-session 93.65% <ø> (ø)
bridge-symfony-telemetry-bundle 65.39% <ø> (-15.44%) ⬇️
adapter-chartjs 84.05% <ø> (ø)
adapter-csv 91.16% <ø> (ø)
adapter-doctrine 90.79% <ø> (ø)
adapter-google-sheet 99.18% <ø> (ø)
adapter-http 72.34% <ø> (ø)
adapter-json 88.63% <ø> (ø)
adapter-logger 50.00% <ø> (ø)
adapter-parquet 77.70% <ø> (ø)
adapter-text 74.13% <ø> (ø)
adapter-xml 83.40% <ø> (ø)
adapter-avro 0.00% <ø> (ø)
adapter-excel 94.21% <ø> (ø)
adapter-postgresql 91.06% <ø> (ø)
adapter-seal 85.42% <ø> (ø)
bridge-phpunit-postgresql 75.30% <ø> (ø)
bridge-phpunit-telemetry 80.08% <ø> (ø)
bridge-phpstan-types 0.00% <ø> (ø)
bridge-postgresql-valinor 100.00% <ø> (ø)

@datadog-official

datadog-official Bot commented Jun 21, 2026


Pipelines

⚠️ Warnings

🚦 3 Pipeline jobs failed

Test Suite | tests / tests (locked, 8.3, ubuntu-latest)

Test Suite | tests / tests (locked, 8.4, ubuntu-latest)

Test Suite | tests / tests (locked, 8.5, ubuntu-latest)

🔗 Commit SHA: bbdf2e9


Development

Successfully merging this pull request may close these issues.

Storage for Aggregating Functions