FIX BED-8687: Replace fields that use type Any with fixed types to fix DLT schema validation errors #14
d3vzer0 wants to merge 16 commits into
Pull request overview
This PR updates the GitHub Actions workflow-related models to replace Any-typed fields with more stable, explicitly typed structures in order to reduce DLT schema validation errors caused by type variance across extracted rows.
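The type variance described above can be illustrated with a small check over extracted rows. This is a minimal sketch using only the standard library; `find_mixed_type_columns` and the sample rows are hypothetical and are not part of this PR or of DLT's API.

```python
from collections import defaultdict
from typing import Any, Iterable, Mapping


def find_mixed_type_columns(rows: Iterable[Mapping[str, Any]]) -> dict[str, set[str]]:
    """Return columns whose non-null values have more than one Python type."""
    seen: dict[str, set[str]] = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            if value is not None:  # nulls do not pin a column to a type
                seen[column].add(type(value).__name__)
    return {column: types for column, types in seen.items() if len(types) > 1}


# Hypothetical rows: `runs_on` arrives as a string in one row and a list in another.
rows = [
    {"name": "build", "runs_on": "ubuntu-latest"},
    {"name": "deploy", "runs_on": ["self-hosted", "linux"]},
]
mixed = find_mixed_type_columns(rows)
print(mixed)
```

A column reported here is the kind of field that an `Any` annotation lets through with differing shapes in the same run.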
Changes:
- Tightened workflow YAML parsing models (`WorkflowStepDefinition`, `WorkflowJobDefinition`, `WorkflowDocument`) by replacing several `Any` fields with explicit unions / mappings.
- Introduced a `Container` model and updated job/workflow permission typing.
- Updated `WorkflowJob` to normalize `runs_on` into a stable string when emitting graph node properties.
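As an illustration of the `runs_on` normalization mentioned in the last item, here is a minimal sketch. It assumes the three shapes GitHub Actions accepts for `runs-on` (a string, a list of strings, or a mapping with `group` and `labels`). The helper name `normalize_runs_on`, the comma separator, and the `group:` prefix are assumptions; the PR's actual output format is not shown here.

```python
from typing import Any


def normalize_runs_on(value: Any) -> str:
    """Collapse the shapes `runs-on` can take into one comma-separated string."""
    if value is None:
        return ""
    if isinstance(value, str):
        return value
    if isinstance(value, (list, tuple)):
        return ",".join(str(item) for item in value)
    if isinstance(value, dict):
        # Mapping form: optional `group` plus `labels` (a string or a list).
        parts = []
        group = value.get("group")
        if group is not None:
            parts.append(f"group:{group}")  # hypothetical prefix, chosen for illustration
        labels = value.get("labels")
        if labels is not None:
            parts.append(normalize_runs_on(labels))
        return ",".join(part for part in parts if part)
    return str(value)  # fallback for unexpected scalars
```

Whatever format is chosen, the point is that every row emits the same type for the property, so the extracted schema does not vary between jobs.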
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `src/openhound_github/models/workflow.py` | Replaces `Any` in workflow YAML parsing models and adds a `Container` model. |
| `src/openhound_github/models/workflow_job.py` | Narrows `runs_on` typing and adds a normalization property used when building graph nodes. |
Comments suppressed due to low confidence (1)
src/openhound_github/models/workflow.py:59
`with_`/`env` are now typed as `dict[str, str]`, but the shared validator still returns `dict[str, Any]` and doesn't coerce values. If YAML contains non-string scalars (or other types), this can either fail validation or reintroduce mixed DLT types. Consider normalizing keys/values to strings in the validator and updating the return type accordingly.
env: dict[str, str] = Field(default_factory=dict)
@field_validator("with_", "env", mode="before")
@classmethod
def dict_or_empty(cls, value: Any) -> dict[str, Any]:
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/openhound_github/models/workflow.py:60
`with_`/`env` are now typed as `dict[str, str]`, but this validator still returns the raw mapping (`dict[str, Any]`) without coercing keys/values. YAML commonly produces non-string scalars (including `null` for empty values), which can cause Pydantic validation to fail and the whole workflow document to be dropped. Normalize keys/values to `str` here (consistent with the job-level `env`/`secrets` normalization).
with_: dict[str, str] = Field(default_factory=dict, alias="with")
env: dict[str, str] = Field(default_factory=dict)
@field_validator("with_", "env", mode="before")
@classmethod
def dict_or_empty(cls, value: Any) -> dict[str, Any]:
    return value if isinstance(value, dict) else {}
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/openhound_github/models/workflow.py:60
`with_`/`env` are now typed as `dict[str, str]`, but this validator still returns `dict[str, Any]` and doesn't normalize values. If a workflow YAML contains non-string scalars (e.g., `null` for an empty value), `WorkflowDocument.model_validate()` can now fail and `document` becomes `None`, dropping all workflow-derived rows. Coerce keys/values to strings here (and map YAML `null` to an empty string) to keep parsing robust and the emitted schema stable.
@field_validator("with_", "env", mode="before")
@classmethod
def dict_or_empty(cls, value: Any) -> dict[str, Any]:
    return value if isinstance(value, dict) else {}
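The coercion suggested in this comment could look roughly like the sketch below. It is written as a plain function so it runs without Pydantic; the name `coerce_str_mapping` is hypothetical, and wiring it in as the body of the `mode="before"` validator (with the return type changed to `dict[str, str]`) is an assumption about how the fix might be applied, not the change actually made in the PR.

```python
from typing import Any


def coerce_str_mapping(value: Any) -> dict[str, str]:
    """Return a dict[str, str]; non-dicts become {}, None values become ""."""
    if not isinstance(value, dict):
        return {}
    return {
        # YAML `null` loads as None; map it to an empty string as the review suggests.
        str(key): "" if item is None else str(item)
        for key, item in value.items()
    }


# Hypothetical `with:` block as a YAML loader might produce it.
step_with = coerce_str_mapping({"node-version": 20, "cache": None, "debug": True})
print(step_with)
```

Note that `str(True)` yields `"True"` rather than YAML's `true`; if the original spelling matters downstream, booleans would need separate handling.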
…f RunsOn is a RunsOn model or dict
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/openhound_github/models/workflow.py:60
`with_`/`env` are now typed as `dict[str, str]`, but the pre-validator still returns `dict[str, Any]` without coercion. This can lead to Pydantic validation errors (e.g., numeric YAML scalars) and also defeats the goal of keeping the extracted schema stable.
Consider normalizing both keys and values to strings here (mirroring the `WorkflowJobDefinition` normalizer).
@field_validator("with_", "env", mode="before")
@classmethod
def dict_or_empty(cls, value: Any) -> dict[str, Any]:
    return value if isinstance(value, dict) else {}
Replaced the `Any` field type with stable types to prevent DLT schema validation issues (when different field types are collected during the same run) for the following models: