feat(results): add the climate_ref.results read layer#780
Conversation
Introduce climate_ref.results as the single source of truth for filtering and querying metric-value results, so notebooks, the CLI and (later) the ref-app API share one query layer instead of re-implementing it. - _query.py: MetricValueFilter plus select_scalar_values/select_series_values Select builders, count_values and latest_execution_for_group. Exact vs substring diagnostic/provider matching and a promoted-only default close the slug/name and promotion drift between the CLI and the API. - outliers.py: source-id-aware IQR detection as read-model logic. - frames.py: DataFrame and facet-collection helpers. - values.py: typed ScalarValue/SeriesValue DTOs and collections with to_pandas(), plus the Reader facade. Scalar reads SQL-paginate when outlier detection is off and materialise the full set only when it is on. - cli/executions.py: new "ref executions values <group-id>" command that reads through Reader, with scalar/series output, dimension filters, optional outlier detection, pagination and table/json formats.
Move the scalar/series read methods onto a ValuesReader sub-reader exposed as Reader.values, making Reader a thin entry point that will hold further per-domain sub-readers as the layer grows. Curate the top-level package surface to export only what a caller names to make a call: the Reader entry point, the filters you pass in, and the value objects you pass in. DTOs, collections, and sub-reader classes now live in their domain submodule (imported by path when a name is needed), and the Select builders stay private in _query. This keeps the namespace small as new domains are added.
Add the executions domain to the results read layer: an ExecutionsReader sub-reader (reached via reader.executions) exposing groups(), statistics() and latest_execution(), with an ExecutionGroupFilter, detached ExecutionGroupView/ExecutionView/ExecutionStats DTOs, an ExecutionGroupCollection, and a select_execution_statistics() builder. groups() wraps the existing get_execution_group_and_latest_filtered helper rather than reimplementing it, since reingest and the versioning tests also depend on that helper; reimplementing would reintroduce the query drift this layer exists to remove. The stats aggregate, previously inlined only in the CLI, moves into the read layer as a reusable Select builder. Migrate the read-only CLI commands list-groups, stats and values onto reader.executions. delete-groups, reingest and fail-running stay on the ORM helper by design: they mutate rows or need the original (oldest) execution, which the detached read layer does not serve. Export only ExecutionGroupFilter at the package top level, keeping the public surface to the entry point plus constructible filters; views, collection and sub-reader are imported from the domain submodule.
Adds the files/outputs domain to the climate_ref.results read layer (RFC domain D). ArtifactsReader resolves execution output fragments (output_fragment, Execution.path, ExecutionOutput.filename) into filesystem paths, guarded against path escape via os.path.commonpath containment checks. ExecutionsReader gains outputs(execution_id) to read registered ExecutionOutput rows as detached OutputView DTOs. Reader takes an optional results root directly (Reader(database, results=...)) rather than through a wrapper value object -- a single Path field doesn't need one, and the decoupling from Config comes from not taking Config, not from wrapping the path. ref executions inspect now shows registered outputs and resolves its output directory through reader.artifacts instead of joining config.paths.results by hand.
Add a first-class ORM query path for datasets: DatasetFilter and select_datasets in models/dataset_query.py, covering facet, finalised, and execution/diagnostic relationship filters, plus DatasetsReader exposing detached DatasetView and DatasetFileView DTOs via reader.datasets. Latest-version selection reuses the adapter's filter_latest_versions so there is a single definition of that policy.
DatasetAdapter.load_catalog and solve_helpers.load_local_catalogs coerced a catalog's start_time/end_time strings to cftime with identical inline code. Move it to a shared datasets.utils.coerce_catalog_times helper.
Route load_catalog's dataset and file queries through select_datasets so load_catalog and reader.datasets share one definition of the datasets query. Apply the row limit after deduplicating to the latest version. Previously the limit was pushed into the query before deduplication, so it could be spent on superseded versions that were then dropped, returning fewer datasets than requested and occasionally omitting datasets entirely.
Persist a numeric version_key on the base dataset table, computed by version_sort_key, so the latest version of a dataset can be selected in SQL (a following change adds the window-function dedup that orders by it). A before_insert/before_update mapper event on Dataset keeps version_key in sync for every write path — including direct-ORM inserts — rather than relying on register_dataset, and reads the final attribute value so a stripped-metadata upsert cannot leave a stale key. version_key is BigInteger to hold any version string's digits without overflow on PostgreSQL. The migration backfills existing rows in Python via version_sort_key over the four subclass tables; version strings are not parsed in SQL because that mapping has no portable expression across SQLite and PostgreSQL.
Add an optional latest-version dedup to select_datasets: when latest_only is set and the caller supplies the partition columns (the adapter's dataset_id_metadata), keep only the rows at RANK 1 of a RANK() OVER (PARTITION BY <group> ORDER BY version_key DESC) window. RANK (not ROW_NUMBER) preserves ties, matching the pandas select_latest_version it mirrors. The dedup uses an id-membership subquery rather than an aliased entity, so callers' class-bound loader options (selectinload) keep working. reader.datasets and load_catalog now dedup in SQL and, for the dataset-level result, push the limit into the same statement instead of fetching the whole table and trimming in pandas. include_files still bounds files, so its limit is applied after exploding rows. The pandas filter_latest_versions path stays for parquet catalogs, which have no database rows to window over.
Add a diagnostics read domain reachable via reader.diagnostics, with a list() returning provider/slug/name/promoted_version, an all-versions execution-group count, and promoted-version successful/inflight/total counts merged in from the existing execution-statistics builder so the status-count logic stays defined once. stats() delegates to that same builder rather than duplicating the aggregate SQL. Expose the domain through a new ref diagnostics list command that renders the collection and documents every column in its help text. Extract the shared bare-str-rejecting _as_str_tuple converter into results/_converters so the filter DTOs no longer copy it per module.
Document each field on the execution, dataset, diagnostic, outlier, and metric-value filter DTOs as griffe/mkdocstrings attribute docstrings so they render per-attribute, replacing the stray inline comments that the docs tooling would not pick up, and unify wording across parallel fields.
kind is a registered CV dimension, so it has its own column and was surfacing both as the dedicated ScalarValue/SeriesValue.kind field and inside the dimensions mapping. Strip it from dimensions on read so it is exposed once via the typed field; the to_pandas kind column and the kind facet are unaffected.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (14)
🚧 Files skipped from review as they are similar to previous changes (12)
📝 WalkthroughWalkthroughThe PR adds a typed ChangesTyped results readers and CLI integration
SQL-level latest-version dataset deduplication
Possibly related PRs
Sequence Diagram(s)sequenceDiagram
participant CLI as diagnostics list
participant Reader
participant DiagnosticsReader
participant Database
CLI->>Reader: Reader(database)
CLI->>DiagnosticsReader: list(DiagnosticFilter, limit, offset)
DiagnosticsReader->>Database: select_diagnostics()
DiagnosticsReader->>Database: select_execution_statistics()
DiagnosticsReader-->>CLI: DiagnosticCollection
CLI-->>CLI: render_dataframe()
sequenceDiagram
participant CLI as executions values
participant Reader
participant ValuesReader
participant OutlierPolicy
CLI->>Reader: Reader(database)
CLI->>ValuesReader: scalar_values(...) or series_values(...)
ValuesReader->>OutlierPolicy: detect_scalar_outliers() when scalar outliers enabled
ValuesReader-->>CLI: ScalarValueCollection or SeriesValueCollection
CLI-->>CLI: render table or JSON
flowchart TD
DatasetFilter --> select_datasets
select_datasets --> PolymorphicEntityResolution
PolymorphicEntityResolution --> FacetAndJoinFilters
FacetAndJoinFilters --> RankOverVersionKey
RankOverVersionKey --> LatestRows
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 12
🧹 Nitpick comments (4)
packages/climate-ref/src/climate_ref/results/datasets.py (1)
63-67: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueConsider a concrete type instead of
Anyfor timestamps.
created_at/updated_atare typedAny, weakening the "typed DTO" contract the rest of the results layer establishes. A concrete type (e.g.datetime.datetime) would preserve type safety forto_pandas()consumers.packages/climate-ref/tests/unit/cli/test_datasets.py (1)
92-106: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winMake the multi-file dataset deterministic for this regression test.
first()is unordered, so the extra file may be added to a dataset that the CLI would not select first. In that case, a regressed dataset-level--limit 1path could still return one row and pass. Pick the dataset using the same ordering as the CLI query, or assert the unbounded--include-filesoutput first and target that dataset.packages/climate-ref/src/climate_ref/results/frames.py (1)
79-79: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueReaching into
entity._cv_dimensions(protected attribute) across module boundaries.
collect_facetsaccesses a private-by-convention attribute on the model classes from a different module. Consider exposing this as a public classmethod/property onMetricValuefor a cleaner boundary.packages/climate-ref/src/climate_ref/results/values.py (1)
189-207: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winInconsistent empty-result schema vs.
ExecutionGroupCollection.to_pandas().Unlike
ExecutionGroupCollection.to_pandas()(executions.py, Lines 246-279), which explicitly lists columns so callers get a stable empty table,ScalarValueCollection/SeriesValueCollection.to_pandas()return a DataFrame with zero columns whenitemsis empty (dynamic columns derived fromdict(v.dimensions), so nothing is emitted for an empty list). This is fine for the CLI (which special-caseslen(collection) == 0), but directReaderconsumers (notebooks, future API) calling.to_pandas()on an empty collection get an inconsistent shape.Also applies to: 239-268
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 16a54b94-2742-4b0e-af00-95693261ec49
📒 Files selected for processing (33)
packages/climate-ref/src/climate_ref/cli/__init__.pypackages/climate-ref/src/climate_ref/cli/diagnostics.pypackages/climate-ref/src/climate_ref/cli/executions.pypackages/climate-ref/src/climate_ref/datasets/base.pypackages/climate-ref/src/climate_ref/datasets/utils.pypackages/climate-ref/src/climate_ref/migrations/versions/2026-07-02T0338_f6a7b8c9d0e1_add_version_key.pypackages/climate-ref/src/climate_ref/models/dataset.pypackages/climate-ref/src/climate_ref/models/dataset_query.pypackages/climate-ref/src/climate_ref/results/__init__.pypackages/climate-ref/src/climate_ref/results/_converters.pypackages/climate-ref/src/climate_ref/results/_query.pypackages/climate-ref/src/climate_ref/results/artifacts.pypackages/climate-ref/src/climate_ref/results/datasets.pypackages/climate-ref/src/climate_ref/results/diagnostics.pypackages/climate-ref/src/climate_ref/results/executions.pypackages/climate-ref/src/climate_ref/results/frames.pypackages/climate-ref/src/climate_ref/results/outliers.pypackages/climate-ref/src/climate_ref/results/values.pypackages/climate-ref/src/climate_ref/solve_helpers.pypackages/climate-ref/tests/unit/cli/test_datasets.pypackages/climate-ref/tests/unit/cli/test_diagnostics.pypackages/climate-ref/tests/unit/cli/test_executions.pypackages/climate-ref/tests/unit/cli/test_executions/test_inspect.txtpackages/climate-ref/tests/unit/cli/test_root.pypackages/climate-ref/tests/unit/datasets/test_cmip6.pypackages/climate-ref/tests/unit/datasets/test_dataset_query.pypackages/climate-ref/tests/unit/datasets/test_datasets.pypackages/climate-ref/tests/unit/datasets/test_migrations_version_key.pypackages/climate-ref/tests/unit/results/test_artifacts.pypackages/climate-ref/tests/unit/results/test_datasets.pypackages/climate-ref/tests/unit/results/test_diagnostics.pypackages/climate-ref/tests/unit/results/test_executions.pypackages/climate-ref/tests/unit/results/test_results.py
Fixes across the CLI, dataset query builder, and results read layer raised in review of the results read layer PR: - diagnostics list: always validate/select --column, even when the filtered result set is empty (to_pandas already emits explicit columns for an empty collection). - executions inspect: fall back to the plain filename when an output path would escape the results root instead of propagating the ValueError. - executions values: report the outlier count and --include-unverified hint when every value in a page was flagged as an outlier, instead of a bare "No scalar values found.". - coerce_catalog_times: coerce start_time/end_time independently so a catalog with only one of the two columns doesn't KeyError. - DatasetFilter facets and MetricValueFilter diagnostic/provider filters: guard against a bare str value being iterated character-by-character. - DatasetCollection.to_pandas: emit explicit base columns for an empty collection, matching the other collection types. - reader.datasets: eager-load polymorphic dataset subtypes for mixed (source_type=None) listings to avoid an N+1 lazy load per row. - select_execution_statistics: resolve to a single tie-broken latest execution id per group before joining, so two executions tied on created_at no longer double-count that group's status. - get_execution_group_and_latest: eager-load diagnostic/provider, used by every caller of this query. - OutlierPolicy.method: restrict to the two implemented values. Adds regression tests for the empty-column, tie-count, missing-column guard, and all-outliers-hidden behaviour, plus a --kind series --format json coverage gap for the values command.
Description
Introduces
climate_ref.results, a typed, ORM-free read layer that gives notebooks, the CLI, and (eventually) the ref-app API one shared definition of each results query instead of each consumer re-implementing SQLAlchemy. A singleReaderentry point exposes focused per-domain sub-readers, each returning frozen, session-detached DTOs with ato_pandas()helper, so a DataFrame built inside awith Database(...)block stays valid afterwards.reader.values— scalar/series metric-value DTOs, DB-wide and per-query facets, and outlier detection.kind(model vs reference) is surfaced as a first-class field and stripped from thedimensionsmapping so it appears once. Backsref executions values.reader.executions— execution-group and execution listing plus per-(provider, diagnostic) statistics. The existingref executions list-groupsandstatscommands now go through the reader.reader.datasets— dataset listing with latest-version deduplication pushed into SQL via a newdataset.version_keycolumn (kept in sync by a mapper event, with a backfill migration).DatasetAdapter.load_catalogand the reader share one query builder, so the two can no longer drift.reader.diagnostics— new domain plus a newref diagnostics listcommand that shows each diagnostic's provider, promoted version, all-version execution-group count, and promoted-version successful/inflight/total counts.reader.artifacts— config-independent resolution of output, log, and bundle paths.Checklist
Please confirm that this pull request has done the following:
changelog/Summary by CodeRabbit
Summary by CodeRabbit
New Features
diagnostics listCLI view with filtering, selectable columns, JSON/table output, and pagination.executions values(scalar + series) with table/JSON output, paging, and outlier-aware presentation.Bug Fixes
--limitbehaves correctly for--include-files.