OSAC-1323: Update cluster-fulfillment.md to match current codebase#32
OSAC-1323: Update cluster-fulfillment.md to match current codebase#32tzvatot wants to merge 5 commits into
Conversation
- Replace cloudkit-controller references with osac-operator - Fix proto paths to include osac/ directory level - Fix host_class to host_type - Fix node_sets: name is a map key, not a field - Fix playbook name from playbook_cloudkit_ to playbook_osac_ - Fix CLOUDKIT_TEMPLATE_COLLECTIONS to osac_template_collections - Fix cloudkit.service.metallb_ingress to osac.service.metallb_ingress - Add Signal RPC to private API operations list Assisted-by: Claude Code <noreply@anthropic.com> Signed-off-by: Elad Tabak <etabak@redhat.com>
Replace incorrect ClusterAvailable/NodesReady/InfrastructureReady/ TemplateApplied with actual condition types from the codebase: - ClusterOrder CRD: Accepted, Progressing, ControlPlaneAvailable, Available - Fulfillment Service API: PROGRESSING, READY, FAILED, DEGRADED Assisted-by: Claude Code <noreply@anthropic.com> Signed-off-by: Elad Tabak <etabak@redhat.com>
Add new section documenting cluster catalog items, the primary user-facing abstraction for ordering clusters. Covers field definitions, publication controls, tenant scoping, and the relationship between catalog items and underlying templates. Assisted-by: Claude Code <noreply@anthropic.com> Signed-off-by: Elad Tabak <etabak@redhat.com>
…illment.md Add public API section documenting GetKubeconfig, GetPassword, and other tenant-facing operations with their REST endpoints. Fix cluster status to distinguish state (3 values) from conditions (4 types), add node_sets to status fields, and mark hub as private API only. Assisted-by: Claude Code <noreply@anthropic.com> Signed-off-by: Elad Tabak <etabak@redhat.com>
…nt.md Update spec fields to match current code: use JSON tag casing, document templateParameters as JSON string, nodeRequests with resourceClass/numberOfNodes, and add pullSecret, sshPublicKey, releaseImage, and network fields. Add status section documenting phase, conditions, clusterReference, jobs history, and desiredConfigVersion. Assisted-by: Claude Code <noreply@anthropic.com> Signed-off-by: Elad Tabak <etabak@redhat.com>
WalkthroughThis PR updates ChangesCluster-Fulfillment Documentation Alignment
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Risk Assessment:
Mitigation: Verify that actual proto files, CRD types in Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 11✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
The reason these docs are out of sync is because many features have been added without appropriate enhancement requests. For a trivial example, there has been no discussion or agreement on the catalog feature carried out with the public OSAC team. This review should be done before document updates, not after. |
The catalog items EP exists (enhancement-proposals#17) and was merged. The doc update simply brings the architecture doc in line with what's already implemented and merged. |
|
I realize that an enhancement proposal was created for the catalog. However, there was no open discussion and review of this as part of the public project before it was merged. Open items such as testing were not resolved, which would certainly be unacceptable to the MOC. The fact that it was merged doesn't justify updating the architecture document that is public without discussion with at least the MOC people who are participating in the public deployment of OSAC. |
Thanks for the feedback, Heidi. The catalog items enhancement proposal (enhancement-proposals#17) was open for over 3 months before being merged. AFAIK we don't hold a dedicated meeting or discussion for every EP - the PR review process is the discussion. That was the window to raise concerns, request changes, or flag gaps in the design. That said, if there are specific gaps or concerns you'd like to address now, the best path forward would be to either open a new enhancement proposal with the proposed changes, or create a Jira issue so we can discuss the details and plan accordingly. Happy to collaborate on either. Regarding this PR specifically - it's a doc update to bring the architecture document in line with what's already been implemented and merged. Keeping the docs out of sync doesn't help anyone. |
|
Thanks @tzvatot |
|
Thanks for your collaboration. We are working on Jira issues for all of our MOC requirements. |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
architecture/cluster-fulfillment.md (1)
160-172:⚠️ Potential issue | 🟠 MajorFix stale Ansible playbook/variable references in cluster fulfillment docs (will break provisioning).
In
architecture/cluster-fulfillment.md(lines 160-172), the doc referencesosac-aap/playbook_osac_create_hosted_cluster.ymland theosac_template_collectionsvariable, but this repository does not contain the referenced playbook file andosac_template_collectionsdoes not appear in any*.yml/*.yaml.
- Risk (major): operators following these instructions may execute a non-existent playbook / miss required variable wiring, causing hosted cluster creation to fail and extending downtime.
- Update the playbook path to the correct location (or link the correct repository) and ensure
osac_template_collectionsis defined/used consistently with the referenced templates.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@architecture/cluster-fulfillment.md` around lines 160 - 172, The doc references a non-existent playbook and an undefined variable: update the Cluster Creation Playbook reference and variable usage so operators can actually run provisioning; specifically replace or correct the playbook path `osac-aap/playbook_osac_create_hosted_cluster.yml` with the real playbook location (or add a link to the external repo that contains it) and ensure the variable `osac_template_collections` is defined and consistently named where templates are included (the section describing Template Execution and any dynamic include statements that reference the template collection), or rename references to match the actual variable used in the codebase so the docs and playbook variable wiring align.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@architecture/cluster-fulfillment.md`:
- Line 228: The doc currently defines validation_schema but doesn't state
where/when it's enforced; update the cluster-fulfillment.md entry for the
validation_schema field to explicitly state the enforcement point(s) (e.g.,
validated at API ingress via admission webhook, validated again during
controller reconciliation, and/or validated in the CLI before submit), describe
the security boundary (server-side vs client-side), and note any differences in
behavior (errors returned at request time vs reconciliation/failure states) so
integrators know whether client validation is sufficient and what to expect when
constraints are violated.
- Around line 60-61: Add a new "Security" subsection documenting authentication,
authorization, and audit requirements for the credential endpoints
(GetKubeconfig, GetPassword and their HTTP variants GetKubeconfigViaHttp,
GetPasswordViaHttp): specify supported authentication mechanisms (bearer token
and/or mTLS), require strong token validation and session expiry, define an
authorization model (RBAC roles and tenant-scoped checks ensuring only cluster
owners/admins can call these methods), mandate audit logging of all calls
including caller identity, timestamp, cluster ID and outcome, and recommend
additional controls (MFA for UI access, IP allowlists, least-privilege roles,
and short-lived credentials) plus examples of required HTTP headers and error
responses for unauthorized/forbidden access.
- Around line 35-41: The documentation references a non-existent proto file and
an unverified "Signal" RPC in the Private API Operations list; update the
"Private API Operations" section to remove or mark the Signal RPC as unverified
and either (a) point to the correct proto file if it exists elsewhere or (b)
state that Signal is deprecated/unsupported until the proto
(clusters_service.proto) is located and its RPCs confirmed; specifically edit
the "Private API Operations" list (the entry mentioning
`fulfillment-service/proto/private/osac/private/v1/clusters_service.proto` and
the `Signal` RPC) to reflect the verified contract or remove the misleading
claim so clients/servers aren’t misled.
- Around line 52-63: The document references a missing proto
(fulfillment-service/proto/public/osac/public/v1/clusters_service.proto) and
lists RPCs like GetKubeconfig, GetKubeconfigViaHttp, GetPassword,
GetPasswordViaHttp and HTTP mappings
(/api/fulfillment/v1/clusters/{id}/kubeconfig and /password) that can’t be
validated; fix by adding the authoritative clusters_service.proto into the repo
at that path (or update the document to the correct existing proto path), ensure
the proto contains the listed RPCs and options (including any HTTP annotations
and response types), and then update architecture/cluster-fulfillment.md to
exactly match the proto message and HTTP binding names (List, Get, Create,
Update, Delete, optimistic lock field) so the documented secret-bearing
endpoints and authorization mappings are identical to the source.
---
Outside diff comments:
In `@architecture/cluster-fulfillment.md`:
- Around line 160-172: The doc references a non-existent playbook and an
undefined variable: update the Cluster Creation Playbook reference and variable
usage so operators can actually run provisioning; specifically replace or
correct the playbook path `osac-aap/playbook_osac_create_hosted_cluster.yml`
with the real playbook location (or add a link to the external repo that
contains it) and ensure the variable `osac_template_collections` is defined and
consistently named where templates are included (the section describing Template
Execution and any dynamic include statements that reference the template
collection), or rename references to match the actual variable used in the
codebase so the docs and playbook variable wiring align.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: osac-project/coderabbit/.coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 49edf2d5-a9d0-405c-9f0e-48edd7fdc1a6
📒 Files selected for processing (1)
architecture/cluster-fulfillment.md
OSAC-1323: Update cluster-fulfillment.md to match current codebase
Jira: https://redhat.atlassian.net/browse/OSAC-1323
Summary
The cluster fulfillment architecture doc had significant gaps compared to the current codebase: missing documentation for catalog items and the public API, incorrect status condition names, outdated component names (cloudkit -> osac), and wrong field/path references. This PR brings the document in line with the current state of fulfillment-service, osac-operator, and osac-aap.
Changes
New content:
Corrections:
osac/directory levelhost_classtohost_type, clarify node_sets uses map keys not a name fieldTesting
Documentation-only change. All referenced file paths verified to exist in the current codebase.
Acceptance Criteria
Summary by CodeRabbit