[AMD] Add MiniMax-M3-FP4 MI355X ATOMESH update 0623 by seungrokj · Pull Request #1940 · SemiAnalysisAI/InferenceX

seungrokj · 2026-06-26T01:28:32Z

Summary

- Add minimaxm3-fp4-mi355x-atom-disagg CI recipe: multi-node disaggregated prefill/decode (PD) on MI355X via ATOM for MiniMax-M3-MXFP4
- Refactor server_atom.sh to eliminate all hardcoded `MODEL_NAME == "DeepSeek-V4-Pro"` and other per-model checks: all model-specific config (env vars, parallel flags, MTP flags, KV cache flags, HF overrides) is now driven from models_atom.yaml, matching the server_vllm.sh pattern
- Extend the models_atom.yaml schema with new fields env, tp_dp_flags, tp_dp_env, ep_dp_flags, ep_dp_env, mtp_flags, kv_cache_flags, hf_overrides; add entries for MiniMax-M3-MXFP4 and MiniMax-M3-MXFP8 with EAGLE3 MTP flags
- Fix model HuggingFace path: amd/MiniMax-M3-MXFP8 → MiniMaxAI/MiniMax-M3-MXFP8 in minimaxm3-fp8-mi355x-atom-disagg
- Image bump for both FP4 and FP8 MI355X ATOM recipes: rocm/atom-dev:MiniMax-M3-20260623

Fields added to `models_atom.yaml`

| Field | Purpose |
| --- | --- |
| `env` | Space-separated `KEY=VALUE` pairs exported unconditionally |
| `tp_dp_flags` | Parallel flags for TP+DPA mode |
| `tp_dp_env` | Env vars exported only in TP+DPA mode |
| `ep_dp_flags` | Parallel flags for EP+DPA mode |
| `ep_dp_env` | Env vars exported only in EP+DPA mode |
| `mtp_flags` | Flags prepended to `SPEC_ARGS` before `$DECODE_MTP_SIZE` |
| `kv_cache_flags` | Full `--kv_cache_dtype` flag string |
| `hf_overrides` | JSON string passed to `--hf-overrides` |
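
The PR description does not show how server_atom.sh consumes these fields, so the following is only a minimal sketch under stated assumptions. The variable names (`MODEL_ENV`, `TP_DP_FLAGS`, `EP_DP_FLAGS`, `MTP_FLAGS`), the helper functions, and all flag values are hypothetical placeholders; the only behavior taken from the table is that `env` pairs are exported unconditionally, that parallel flags differ between TP+DPA and EP+DPA mode, and that `mtp_flags` precede the MTP size in `SPEC_ARGS`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: consume per-model fields after they have been read
# from models_atom.yaml. All values below are placeholders, not real entries.
MODEL_ENV="EXAMPLE_A=1 EXAMPLE_B=on"   # `env`: exported unconditionally
TP_DP_FLAGS="--example-tp-flag"        # `tp_dp_flags`
EP_DP_FLAGS="--example-ep-flag"        # `ep_dp_flags`
MTP_FLAGS="--example-spec-flag"        # `mtp_flags`

# Export each space-separated KEY=VALUE pair.
# Limitation of this format: a value cannot itself contain a space.
export_pairs() {
    local pair
    for pair in $1; do
        export "$pair"
    done
}

# Pick the parallel flags for the active mode.
# ASSUMPTION: the mode is signalled by a "true"/"false" string argument.
select_parallel_flags() {
    if [ "$1" = "true" ]; then
        printf '%s' "$EP_DP_FLAGS"
    else
        printf '%s' "$TP_DP_FLAGS"
    fi
}

# Place the model's MTP flags before the MTP size.
build_spec_args() {
    printf '%s %s' "$MTP_FLAGS" "$1"
}

export_pairs "$MODEL_ENV"
PARALLEL_ARGS="$(select_parallel_flags "true")"
SPEC_ARGS="$(build_spec_args 3)"
echo "EXAMPLE_A=$EXAMPLE_A PARALLEL_ARGS=$PARALLEL_ARGS SPEC_ARGS=$SPEC_ARGS"
```

Keeping the per-model values in data and the selection logic in small functions is what lets a new model be added without touching the script, which is the stated goal of the refactor.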

`minimaxm3-fp4-mi355x-atom-disagg` Recipe Details

- Image: rocm/atom-dev:MiniMax-M3-20260623
- Model: amd/MiniMax-M3-MXFP4
- Framework: atom-disagg, multi-node disaggregated PD
- Search space: ISL=8192 and ISL=1024, OSL=1024, 1P1D TP4, conc 1–512
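
To make the size of that search space concrete, here is a small enumeration sketch. It assumes concurrency doubles from 1 to 512; the description only gives the range, so the real recipe may use a different step. The function name `list_points` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical enumeration of the search space listed above.
# ASSUMPTION: concurrency doubles from 1 to 512 (the step is not stated).
OSL=1024

list_points() {
    local isl conc
    for isl in 8192 1024; do
        conc=1
        while [ "$conc" -le 512 ]; do
            echo "isl=$isl osl=$OSL conc=$conc"
            conc=$((conc * 2))
        done
    done
}

POINTS="$(list_points)"
echo "$POINTS"
```

Under the doubling assumption this yields 10 concurrency levels per input length, so 20 benchmark points for the single 1P1D TP4 topology.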

PR Review Checklist

- Verified that, at the time of writing, this is the latest version of PR_REVIEW_CHECKLIST.md.
- Verified that the general code quality meets the InferenceX standard and does not make the code quality any worse.
- Verified that this PR has passed PR validation. Please link to the GitHub Action workflow that shows this.
- Verified that this PR passes evals. Please link to the GitHub Action workflow that shows this.
- Verified that speculative decoding PRs use chat templates to align the AL distribution with real-world usage.
- If a company claims that they support vLLM/SGLang as first class LLM inference engines on their hardware, I have verified that the respective vLLM/SGLang submission has been made before additional frameworks (TRT-LLM, ATOM, etc.). The only exceptions are for new hardware, such as MI455X UALoE72, Vera Rubin NVL72, Rubin NVL8, etc., and for new model architectures where there is an actual reason why vLLM/SGLang does not fundamentally support them yet.
- Verified that the single-node recipes are similar to the official vLLM recipes and/or the SGLang cookbook:
  - If they are not, I have verified that a PR has been opened in the vLLM recipe repo or SGLang repo and linked it below in the additional detail section.
- If any of the above criteria cannot reasonably be satisfied, I have provided additional reasoning below.

🤖 Generated with Claude Code

github-actions · 2026-06-26T01:28:39Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

github-actions · 2026-06-26T02:09:26Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=28211509856
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=28211509856

functionstackx · 2026-06-26T03:02:00Z

@seungrokj plz rebase this PR now that 9f02343 is merged to master, so that we can avoid the delay of resolving conflicts after u do a full perf sweep

…, SPEC_DECODING guard
- Replace fragile eval "$(python3 -c "...")" with heredoc + source tempfile to avoid nested quote escaping issues that caused MODEL_ENVS to be empty at runtime
- Fix PREFILL/DECODE_ENABLE_EP comparison from numeric -gt 1 to string = "true" to match the "true"/"false" values set by launch scripts
- Fix SPEC_DECODING guard from hardcoded "mtp" to any non-none/non-empty value so EAGLE3 and future methods also activate SPEC_ARGS from models_atom.yaml
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
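
The three fixes in this commit message can be illustrated with a short sketch. This is not the code from server_atom.sh: the Python body is a stand-in for the real YAML lookup, the helper names `ep_enabled` and `spec_enabled` are hypothetical, and all values are placeholders. It assumes `python3` is on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the three fixes named in the commit message above.

# Fix 1: emit shell assignments from a quoted heredoc into a temp file, then
# source it. The quoted delimiter ('PY') stops the shell from expanding
# anything inside the Python body, so no nested quote escaping is needed.
tmp_env="$(mktemp)"
python3 - > "$tmp_env" <<'PY'
import shlex

# Stand-in for values that would be looked up in models_atom.yaml.
fields = {
    "MODEL_ENVS": "EXAMPLE_A=1 EXAMPLE_B=on",
    "MTP_FLAGS": "--example-spec-flag",
}
for name, value in fields.items():
    print(f"{name}={shlex.quote(value)}")
PY
. "$tmp_env"    # `.` is the POSIX spelling of `source`
rm -f "$tmp_env"

# Fix 2: the flag holds the strings "true"/"false", so compare as a string.
# A numeric test such as [ "true" -gt 1 ] is an error and never succeeds.
ep_enabled() {
    [ "$1" = "true" ]
}

# Fix 3: treat any value other than empty or "none" as an enabled method,
# rather than matching only the literal "mtp".
spec_enabled() {
    [ -n "$1" ] && [ "$1" != "none" ]
}

DECODE_ENABLE_EP="true"
SPEC_DECODING="eagle3"
DECODE_MTP_SIZE=3

SPEC_ARGS=""
if spec_enabled "$SPEC_DECODING"; then
    SPEC_ARGS="$MTP_FLAGS $DECODE_MTP_SIZE"
fi

if ep_enabled "$DECODE_ENABLE_EP"; then
    MODE="ep_dp"
else
    MODE="tp_dp"
fi

echo "MODEL_ENVS=$MODEL_ENVS MODE=$MODE SPEC_ARGS=$SPEC_ARGS"
```

Writing to a temp file also leaves something to inspect when a variable unexpectedly comes back empty, which is harder with an inline `eval`.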

github-actions · 2026-06-26T03:06:43Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=28213636410
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=28213636410

github-actions · 2026-06-26T03:12:06Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=28214515407
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=28214515407

github-actions · 2026-06-26T03:16:54Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=28214688792
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=28214688792

github-actions · 2026-06-26T03:18:27Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=28214844378
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=28214844378

github-actions · 2026-06-26T03:22:51Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=28214881569
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=28214881569

github-actions · 2026-06-26T04:45:57Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=28215040536
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=28215040536

seungrokj requested a review from a team June 26, 2026 01:28

seungrokj added the AMD label Jun 26, 2026

seungrokj requested review from 1am9trash, billishyahao, chunfangamd and yctseng0211 as code owners June 26, 2026 01:28

github-project-automation Bot added this to InferenceMAX Board Jun 26, 2026

seungrokj added a commit that referenced this pull request Jun 26, 2026

[AMD] add perf-changelog entry for minimaxm3-fp4-mi355x-atom-disagg and server_atom.sh refactor (PR #1940)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

9269a16

seungrokj changed the title ~~[AMD] Add MiniMax-M3-FP4 MI355X ATOM disagg + refactor server_atom.sh for YAML-driven model config~~ [AMD] Add MiniMax-M3-FP4 MI355X ATOMESH update 0623 Jun 26, 2026

seungrokj added the evals-only and full-sweep-enabled labels Jun 26, 2026 (evals-only: suppress throughput and run only eval jobs; combine with all-evals to expand selection)

claude Bot reviewed Jun 26, 2026

Comment thread benchmarks/multi_node/amd_utils/models_atom.yaml

Comment thread benchmarks/multi_node/amd_utils/server_atom.sh Outdated

Comment thread benchmarks/multi_node/amd_utils/server_atom.sh

seungrokj and others added 5 commits June 26, 2026 12:05

[AMD] refactor server_atom.sh and models_atom.yaml for model-specific ATOM config; add minimaxm3-fp4-mi355x-atom-disagg
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

6dd76d4

[AMD] add perf-changelog entry for minimaxm3-fp4-mi355x-atom-disagg and server_atom.sh refactor (PR #1940)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

1b557ee

[AMD] add env dump in server_atom.sh and minimaxm3-fp4-mi355x-atom-disagg launch script
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

33525e0

[AMD] cap minimaxm3-fp8-mi355x-atom-disagg conc to 256; fix missing newline in models_atom.yaml
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

21e9281

seungrokj force-pushed the amd/m3_atom_pd_fp4_0623 branch from baaf92b to 21e9281 Compare June 26, 2026 03:06

[AMD] update amd-master.yaml: image bumps, search space tweaks for MiniMax-M3 ATOM recipes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

805d9f6

[AMD] restore minimaxm3-fp4/fp8-mi355x-atom recipes; bump all ATOM images to 20260623
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

345ed5b

[AMD] clean up minimaxm3-fp4-mi355x-atom search space; revert fp8-disagg image to 20260622
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

f3a74bb

Merge branch 'main' into amd/m3_atom_pd_fp4_0623

3a1faca

seungrokj removed the full-sweep-enabled label Jun 26, 2026

Merge branch 'main' into amd/m3_atom_pd_fp4_0623

ee301d0

seungrokj wants to merge 10 commits into main from amd/m3_atom_pd_fp4_0623

2 participants
