refactor(models): reusable latest-execution ranking with pre/post-rank success filters#786
refactor(models): reusable latest-execution ranking with pre/post-rank success filters#786lewisjared wants to merge 2 commits into
Conversation
…k success filters Extract the latest-execution-per-group ranking into a reusable row_number() primitive parameterised by an optional pre-rank population filter (among_executions). This lets callers choose which execution population "latest" is computed over, rather than only filtering the winner after the fact. Add a `latest_successful` knob to get_execution_group_and_latest_filtered (pre-rank: "each group's latest successful run") alongside the existing `successful` knob (post-rank: "is the latest run successful?"). The two answer different questions and are documented as such. Also fixes a latent tie-break bug: the previous max(created_at) equijoin duplicated a group row when two executions shared a created_at. The row_number() window (ordered created_at DESC, id DESC) guarantees exactly one winner. Switching reingest to the pre-rank filter is a behaviour change tracked in #782.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughRefactors latest-execution-per-group selection in ChangesLatest execution ranking refactor
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/climate-ref/src/climate_ref/models/execution.py (1)
440-461: 🚀 Performance & Scalability | 🔵 TrivialRanking primitive looks correct; consider a supporting composite index.
The two-level subquery (window rank inner,
rn == 1outer) is the right pattern since window functions can't be filtered inWHERE, andcreated_at DESC, id DESCgives a deterministic single winner per group.For larger
executiontables, a composite index on(execution_group_id, created_at, id)would let the partitionedrow_number()sort resolve from the index rather than a per-partition sort. The existing standaloneexecution_group_id/created_atindexes only partially cover this.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 4d688014-8486-411d-972a-61098ab7ea95
📒 Files selected for processing (2)
packages/climate-ref/src/climate_ref/models/execution.pypackages/climate-ref/tests/unit/test_execution_latest_ranking.py
Codecov Report✅ All modified and coverable lines are covered by tests.
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 9 files with indirect coverage changes 🚀 New features to boost your workflow:
|
Description
Extracts the latest-execution-per-group ranking into a reusable
row_number()primitive parameterised by an optional pre-rank population filter (among_executions), so callers can choose which execution population "latest" is computed over rather than only filtering the winner after the fact.Adds a
latest_successfulknob toget_execution_group_and_latest_filtered(pre-rank: "each group's latest successful run") alongside the existingsuccessfulknob (post-rank: "is the latest run successful?"). The two answer different questions and are documented as such:successful=Truedrops a group whose newest run failed, whereaslatest_successful=Truechanges which run is chosen as newest so an earlier success is still surfaced.Also fixes a latent tie-break bug: the previous
max(created_at)equijoin duplicated a group row when two executions shared acreated_at. Therow_number()window (orderedcreated_at DESC, id DESC) guarantees exactly one winner per group.This is a foundation-only change. The new
latest_successfulparameter has no callers yet; switchingreingestto the pre-rank filter is a behaviour change tracked in #782.Checklist
Please confirm that this pull request has done the following:
changelog/Summary by CodeRabbit