Add flag for parquet queries concurrency in parquet bucket store #7613
SungJin1212 wants to merge 1 commit into
702a172 to 516c509
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
516c509 to b3c42c7
// ParquetQueryConcurrency controls the maximum number of concurrent goroutines
// per query at each level of parquet processing: shard querying, row group
// processing, and column materialization.
The flag gates three nested errgroups, where each level holds its slot while waiting on the level below, so the per-request concurrency compounds: effective concurrency is value^3, not value. With the default 4, a single request fans out to up to 64 goroutines (4 × 4 × 4). With 16 it's 4096.
The three levels:
- `Series` over shards (parquet_bucket_store.go:113)
- `parquetBlock.Query` over row groups (parquet_bucket_stores.go:388)
- `search.NewMaterializer` over columns (parquet_bucket_stores.go:330)
This applies per request (no shared limiter even within a tenant), so total goroutines = value^3 × in-flight requests.
Either we change the flag help text, or we split into three flags so each level can be tuned independently.
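The compounding can be reproduced with a minimal, standalone sketch. It is not the Cortex code: `limitedGroup` and `peakLeafGoroutines` are hypothetical names, and a buffered channel stands in for an errgroup with a limit (assumed here to block in `Go` until a slot is free). Each level holds its slot while it waits on the level below, so the number of leaf goroutines alive at once is bounded by limit^3 rather than limit.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// limitedGroup is a hypothetical stand-in for a limited errgroup:
// Go blocks until one of `limit` slots is free.
type limitedGroup struct {
	wg    sync.WaitGroup
	slots chan struct{}
}

func newLimitedGroup(limit int) *limitedGroup {
	return &limitedGroup{slots: make(chan struct{}, limit)}
}

func (g *limitedGroup) Go(f func()) {
	g.slots <- struct{}{} // acquire a slot; blocks when the limit is reached
	g.wg.Add(1)
	go func() {
		defer func() {
			<-g.slots // release the slot only after f (and its children) finish
			g.wg.Done()
		}()
		f()
	}()
}

func (g *limitedGroup) Wait() { g.wg.Wait() }

// peakLeafGoroutines runs three nested limited groups (shards, row groups,
// columns), each with the same limit and `fanout` work items, and returns
// the highest number of leaf tasks observed running at the same time.
func peakLeafGoroutines(limit, fanout int) int64 {
	var active, peak atomic.Int64

	shards := newLimitedGroup(limit)
	for s := 0; s < fanout; s++ {
		shards.Go(func() {
			rowGroups := newLimitedGroup(limit)
			for r := 0; r < fanout; r++ {
				rowGroups.Go(func() {
					columns := newLimitedGroup(limit)
					for c := 0; c < fanout; c++ {
						columns.Go(func() {
							n := active.Add(1)
							for { // record a new peak if n exceeds it
								p := peak.Load()
								if n <= p || peak.CompareAndSwap(p, n) {
									break
								}
							}
							time.Sleep(20 * time.Millisecond) // simulated column read
							active.Add(-1)
						})
					}
					columns.Wait()
				})
			}
			rowGroups.Wait()
		})
	}
	shards.Wait()
	return peak.Load()
}

func main() {
	limit := 4
	fmt.Printf("limit=%d bound=%d observed peak=%d\n",
		limit, limit*limit*limit, peakLeafGoroutines(limit, 2*limit))
}
```

The observed peak depends on scheduling, so the sketch only demonstrates the upper bound. The intermediate shard and row-group goroutines also stay alive while they wait, so the total goroutine count per request is somewhat higher than the leaf count.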
It is a single concurrency flag that gates concurrency in multiple places, the same as in the upstream library. We can document that it does not mean 4 goroutines for the whole query, just to be clear. Tuning each level independently would require an upstream change and would introduce more complexity.
This PR introduces a new configuration flag for setting the concurrency of parquet queries in the parquet bucket store.
Which issue(s) this PR fixes:
Fixes #
Checklist
- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`
- `docs/configuration/v1-guarantees.md` updated if this PR introduces experimental flags