MEP-20 L3 only #282
| ## metal-core | ||
| metal-core will need to support additional configuration templates for the boot vrf. |
We could also install metal-core so that it talks to metal-apiserver from within this boot-vrf, which would completely eliminate the need for weird routes on the switches.
| The L3 only boot and registration process can be described as follows: | ||
| - The metal-bmc will regularly scan every server to check whether iPXE is configured as the boot ISO payload. This is an additional task for the metal-bmc, which already scans all servers on a regular basis to gather power metrics etc. | ||
| - If the boot ISO is set to iPXE, the boot source override must be set to CDROM instead of PXE from network, and a reboot must be triggered (this is the migration step to this approach; it is not performed when a machine is allocated). |
If we don't plan on removing support for the old PXE boot with this change, it could make sense to track the boot mode of each machine in metal-api. The migration step could then be triggered and tracked by metal-api.
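The scan-and-migrate step from the two bullets above could be sketched roughly as follows. This is only an illustration of the decision logic, not metal-bmc code: `BootState`, `Action`, `isIPXE` and `planMigration` are hypothetical names, detecting the iPXE ISO by its file name is an assumption, and the values `"Pxe"` and `"Cd"` assume the BMC is driven via Redfish, where these are the boot source override targets for network and CD boot.

```go
package main

import (
	"fmt"
	"strings"
)

// BootState is a hypothetical snapshot of one server as seen by the
// metal-bmc scan. The field names are illustrative, not metal-bmc types.
type BootState struct {
	BootISO      string // boot ISO payload currently configured, empty if none
	BootOverride string // boot source override target, e.g. "Pxe" or "Cd"
}

// Action is what the scan decides to do for one server.
type Action struct {
	SetOverrideToCd bool
	Reboot          bool
}

// isIPXE is an assumed heuristic: the boot ISO counts as iPXE when its
// name contains "ipxe". A real check would depend on how the ISO is attached.
func isIPXE(bootISO string) bool {
	return strings.Contains(strings.ToLower(bootISO), "ipxe")
}

// planMigration returns the migration action for one scanned server.
// Servers without an iPXE ISO, or already booting from CD, are left alone,
// so repeated scans do not trigger repeated reboots.
func planMigration(s BootState) Action {
	if !isIPXE(s.BootISO) || s.BootOverride == "Cd" {
		return Action{}
	}
	return Action{SetOverrideToCd: true, Reboot: true}
}

func main() {
	servers := []BootState{
		{BootISO: "ipxe.iso", BootOverride: "Pxe"}, // needs migration
		{BootISO: "ipxe.iso", BootOverride: "Cd"},  // already migrated
		{BootISO: "", BootOverride: "Pxe"},         // still on classic PXE
	}
	for _, s := range servers {
		fmt.Printf("%+v -> %+v\n", s, planMigration(s))
	}
}
```

Keeping the decision separate from the BMC calls makes the scan idempotent and easy to test. If the boot mode ends up being tracked in metal-api as suggested, the same check could be driven from there instead.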
| The placement therefore follows from the role given to `metal-boot`. If it only handles the lightweight control functions such as token issuance, `boot.ipxe`, DNS and NTP, placing a container on each leaf is acceptable. The small control steps stay well within the `ip2me` budget, and this also fits the suggestion from the design notes that `metal-boot` could be deployed on each switch with a shared anycast address for redundancy. The downside is that it exposes additional services on critical infrastructure, so the container still needs proper hardening. If `metal-boot` must instead act as a complete proxy that also carries bulk traffic, it should be placed on a fabric-reachable host such as a management server. From there the proxied traffic is forwarded in hardware and never punted to a switch CPU, so CoPP does not apply. | ||
| The metal-image-cache-sync is currently placed on the management servers. One of the stated goals is to remove the need for connections between the production infrastructure and the management infrastructure. Since placing or proxying the image cache on the switches is not viable, the image cache has to move to a different location. The image cache can either be hosted on a metal-stack provisioned machine, or on a server outside of metal-stack's scope. |
When placing the image cache on a metal-stack provisioned machine, how would the bootstrap work here? Temporary server outside of metal-stack?
The image cache is entirely optional, so for the first machine, pulling the image directly would only slow down the installation.
| - Enable automated IPv6 address acquisition via SLAAC (RFC 4862) driven by Router Advertisements (RFC 4861) instead of DHCP | ||
| - IPv6 in a dedicated Boot VRF instead of a Boot VLAN. | ||
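To make the SLAAC step concrete, here is a minimal sketch of how a host can form its address from a /64 prefix announced in a Router Advertisement and its own MAC address, using the modified EUI-64 interface identifier from RFC 4291, Appendix A. The function name `slaacEUI64` and the example prefix and MAC are illustrative assumptions. Note also that many current operating systems derive the interface identifier differently (for example stable opaque identifiers per RFC 7217), so the address a machine actually picks in the Boot VRF may not follow this scheme.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
)

// slaacEUI64 combines an IPv6 /64 prefix with a modified EUI-64 interface
// identifier derived from a 48-bit MAC address and returns the resulting
// address as a string.
func slaacEUI64(prefix, mac string) (string, error) {
	p, err := netip.ParsePrefix(prefix)
	if err != nil {
		return "", err
	}
	if !p.Addr().Is6() || p.Bits() != 64 {
		return "", fmt.Errorf("need an IPv6 /64 prefix, got %s", prefix)
	}
	hw, err := net.ParseMAC(mac)
	if err != nil {
		return "", err
	}
	if len(hw) != 6 {
		return "", fmt.Errorf("need a 48-bit MAC, got %d bytes", len(hw))
	}

	addr := p.Masked().Addr().As16() // upper 64 bits: the announced prefix
	addr[8] = hw[0] ^ 0x02           // flip the universal/local bit
	addr[9] = hw[1]
	addr[10] = hw[2]
	addr[11] = 0xff // insert ff:fe between the two MAC halves
	addr[12] = 0xfe
	addr[13] = hw[3]
	addr[14] = hw[4]
	addr[15] = hw[5]
	return netip.AddrFrom16(addr).String(), nil
}

func main() {
	// Documentation prefix (RFC 3849) and an arbitrary example MAC.
	addr, err := slaacEUI64("2001:db8:0:1::/64", "52:54:00:12:34:56")
	if err != nil {
		panic(err)
	}
	fmt.Println(addr)
}
```

With an EUI-64 based identifier, the address of a machine can be predicted from its MAC address and the prefix announced in the Boot VRF, which may help when correlating boot requests with known hardware.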
| This approach requires that metal-apiserver, metal-hammer, ipxe and a new component running in the partition and connected to the boot-vrf (`metal-boot` for now) are IPv6 ready. |
Can we get rid of iPXE and its complexity completely?
I don't think so. I am pretty sure we would end up with a much more complex solution without it.
Description
DRAFT L3 only network
Please only review the Readme.md. The files in the `ai` folder were generated by AI during the design process and will be removed in the final MEP. I kept them only for reference during the review process.
Used AI-Tools ✨