Add flag for parquet queries concurrency in parquet bucket store#7613

Open
SungJin1212 wants to merge 1 commit into
cortexproject:masterfrom
SungJin1212:Add-shard-concurrency

@SungJin1212 SungJin1212 commented Jun 10, 2026

Member

This PR introduces a new configuration flag for setting the concurrency of parquet queries in the parquet bucket store.
Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
  • docs/configuration/v1-guarantees.md updated if this PR introduces experimental flags

@dosubot dosubot Bot added component/store-gateway storage/blocks Blocks storage engine labels Jun 10, 2026
@SungJin1212 SungJin1212 force-pushed the Add-shard-concurrency branch 2 times, most recently from 702a172 to 516c509 Compare June 10, 2026 04:45
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
@SungJin1212 SungJin1212 force-pushed the Add-shard-concurrency branch from 516c509 to b3c42c7 Compare June 10, 2026 04:47
@SungJin1212 SungJin1212 changed the title Add flag and validation for parquet shard concurrency Add flag for parquet queries concurrency in parquet bucket store Jun 10, 2026
Comment on lines +341 to +343
// ParquetQueryConcurrency controls the maximum number of concurrent goroutines
// per query at each level of parquet processing: shard querying, row group
// processing, and column materialization.

Member

The flag gates three nested errgroups, where each level holds its slot while waiting on the level below, so the per-request concurrency compounds: effective concurrency is value^3, not value. With the default 4, a single request fans out to up to 64 goroutines (4 × 4 × 4). With 16 it's 4096.

The three levels:

  1. Series over shards (parquet_bucket_store.go:113)
  2. parquetBlock.Query over row groups (parquet_bucket_stores.go:388)
  3. search.NewMaterializer over columns (parquet_bucket_stores.go:330)

This applies per request (no shared limiter even within a tenant), so total goroutines = value^3 × in-flight requests.

Either we change the flag help text, or we split into three flags so each level can be tuned independently.

Contributor

This is a single concurrency flag that gates concurrency in several places, the same as in the upstream library. We can document that it does not mean 4 goroutines for the whole query, just to be clear. Tuning each level independently would require an upstream change and introduce more complexity.

3 participants