MEP-20 L3 only #282

Open: majst01 wants to merge 12 commits into main from layer-3-only

@majst01 majst01 commented Jun 9, 2026

Contributor

Description

DRAFT L3 only network

Please only review the Readme.md. The files in the ai folder were generated by AI during the design process and will be removed in the final MEP. I kept them only for reference during the review process.

Used AI-Tools ✨

  • Qwen3.6 used for generation of ideas in the ai folder.

@metal-robot metal-robot Bot added the area: documentation Affects the documentation area. label Jun 9, 2026
@metal-robot metal-robot Bot added this to Development Jun 9, 2026

netlify Bot commented Jun 9, 2026

Deploy Preview for metal-stack-io ready!

Name Link
🔨 Latest commit da77748
🔍 Latest deploy log https://app.netlify.com/projects/metal-stack-io/deploys/6a47741abed7980008ba056f
😎 Deploy Preview https://deploy-preview-282--metal-stack-io.netlify.app

Comment thread community/04-Proposals/MEP20/ai/mep-ra-slaac-boot.md Outdated
Comment thread community/04-Proposals/MEP20/README.md Outdated
Comment thread community/04-Proposals/MEP20/README.md
Comment thread community/04-Proposals/MEP20/README.md Outdated
Comment thread community/04-Proposals/MEP20/README.md Outdated
Comment thread community/04-Proposals/MEP20/README.md Outdated
Comment thread community/04-Proposals/MEP20/README.md Outdated
Comment thread community/04-Proposals/MEP20/README.md Outdated
Comment thread community/04-Proposals/MEP20/README.md Outdated
Comment thread community/04-Proposals/MEP20/README.md Outdated

## metal-core

metal-core will need to support additional configuration templates for the boot vrf.

Contributor Author

We could also install metal-core in a way that it talks to metal-apiserver from within this boot-vrf, which completely eliminates the need for weird routes on the switches.

The L3 only boot and registration process can be described as follows:

- metal-bmc regularly checks every server for whether iPXE is configured as the boot ISO payload. This is an additional task for metal-bmc, which already scans all servers on a regular basis to gather power metrics etc.
- If the boot ISO is set to iPXE, the boot source override must be set to CDROM instead of PXE from network, and a reboot must be triggered (this is done as part of the migration to this approach, not when a machine is allocated).
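
The two steps above can be sketched roughly as follows. This is only an illustration, not the proposed metal-bmc implementation: the `machineState` type, the `needsMigration` and `migrationPatch` helpers, the substring check on the ISO name, and the choice of `Continuous` are all assumptions. The sketch also reads the parenthetical as "skip allocated machines", which is an interpretation. Only the Redfish property names `BootSourceOverrideTarget` and `BootSourceOverrideEnabled` and the values `Cd` and `Pxe` come from the Redfish ComputerSystem schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// bootOverride mirrors the relevant fields of the "Boot" object
// of a Redfish ComputerSystem resource.
type bootOverride struct {
	Target  string `json:"BootSourceOverrideTarget"`
	Enabled string `json:"BootSourceOverrideEnabled"`
}

// machineState is a hypothetical view of what a periodic scan learned
// about one server.
type machineState struct {
	BootISO        string // virtual media currently inserted (assumed field)
	OverrideTarget string // current BootSourceOverrideTarget, e.g. "Pxe" or "Cd"
	Allocated      bool   // assumption: allocated machines are left alone
}

// needsMigration reports whether the server has an iPXE ISO attached
// but is not yet set to boot from it.
func needsMigration(m machineState) bool {
	isIPXE := strings.Contains(strings.ToLower(m.BootISO), "ipxe")
	return isIPXE && !m.Allocated && m.OverrideTarget != "Cd"
}

// migrationPatch builds the JSON body for a PATCH on the system resource
// that switches the boot source override to the virtual CD.
func migrationPatch() ([]byte, error) {
	return json.Marshal(map[string]bootOverride{
		"Boot": {Target: "Cd", Enabled: "Continuous"},
	})
}

func main() {
	m := machineState{BootISO: "ipxe.iso", OverrideTarget: "Pxe"}
	if needsMigration(m) {
		body, err := migrationPatch()
		if err != nil {
			panic(err)
		}
		// A real client would send this body and then trigger a reboot.
		fmt.Printf("PATCH body: %s\n", body)
	}
}
```

Keeping the decision (`needsMigration`) separate from the BMC call makes it easy to run the check inside the existing scan loop without touching machines that are already migrated.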

Contributor

If we don't plan on removing support for the old PXE boot with this change, it could make sense to track the boot mode of each machine in metal-api. The migration step could then be triggered and tracked by metal-api.

@majst01 majst01 changed the title L3 only MEP-20 L3 only Jun 21, 2026
Comment thread community/04-Proposals/MEP20/README.md Outdated

The placement therefore follows from the role given to `metal-boot`. If it only handles the lightweight control functions such as token issuance, `boot.ipxe`, DNS and NTP, placing a container on each leaf is acceptable. The small control steps stay well within the `ip2me` budget, and this also fits the suggestion from the design notes that `metal-boot` could be deployed on each switch with a shared anycast address for redundancy. The downside is that it exposes additional services on critical infrastructure, so the container still needs proper hardening. If `metal-boot` must instead act as a complete proxy that also carries bulk traffic, it should be placed on a fabric-reachable host such as a management server. From there the proxied traffic is forwarded in hardware and never punted to a switch CPU, so CoPP does not apply.
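
The placement rule in this paragraph boils down to one decision, which can be written down as a small sketch. The `role` and `placement` types and the `placeMetalBoot` function are hypothetical names used only for illustration; nothing here is part of `metal-boot`.

```go
package main

import "fmt"

// role describes what metal-boot is expected to do (hypothetical type).
type role struct {
	ControlOnly bool // token issuance, boot.ipxe, DNS, NTP
	BulkProxy   bool // additionally proxies bulk traffic
}

// placement names where metal-boot would run (hypothetical type).
type placement string

const (
	onEachLeaf   placement = "container on each leaf, shared anycast address"
	onFabricHost placement = "fabric-reachable host, e.g. a management server"
)

// placeMetalBoot applies the rule from the paragraph above: bulk traffic
// must not be punted to a switch CPU, so a full proxy goes on a host.
func placeMetalBoot(r role) placement {
	if r.BulkProxy {
		return onFabricHost
	}
	return onEachLeaf
}

func main() {
	fmt.Println(placeMetalBoot(role{ControlOnly: true}))
	fmt.Println(placeMetalBoot(role{ControlOnly: true, BulkProxy: true}))
}
```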

The metal-image-cache-sync is currently placed on the management servers. One of the stated goals is to remove the need for connections between the production infrastructure and the management infrastructure. Since placing or proxying the image cache on the switches is not viable, the image cache has to move to a different location. The image cache can either be hosted on a metal-stack provisioned machine, or on a server outside of metal-stack's scope.

Contributor

When placing the image cache on a metal-stack provisioned machine, how would the bootstrap work here? Temporary server outside of metal-stack?

Contributor Author

The image cache is totally optional, so for the first machine, pulling the image directly would only slow down the installation.

- Enable automated IPv6 address acquisition via SLAAC (RFC 4862) driven by Router Advertisements (RFC 4861) instead of DHCP
- IPv6 in a dedicated Boot VRF instead of a Boot VLAN.

This approach requires that metal-apiserver, metal-hammer, ipxe and a new component running in the partition and connected to the boot-vrf (`metal-boot` for now) are IPv6 ready.

Can we get rid of iPXE and its complexity completely?

Contributor Author

I don't think so. I am pretty sure we would end up with a much more complex solution without it.

@majst01 majst01 marked this pull request as ready for review July 3, 2026 08:34
@majst01 majst01 requested a review from a team as a code owner July 3, 2026 08:34
Labels

area: documentation Affects the documentation area.

7 participants