Skip to content

fix: fail fast on invalid serve parsers#4701

Open
CUHKSZzxy wants to merge 1 commit into
InternLM:mainfrom
CUHKSZzxy:fix/parser-fail-fast-dp-supervision
Open

fix: fail fast on invalid serve parsers#4701
CUHKSZzxy wants to merge 1 commit into
InternLM:mainfrom
CUHKSZzxy:fix/parser-fail-fast-dp-supervision

Conversation

@CUHKSZzxy

Copy link
Copy Markdown
Collaborator

Summary

  • Validate serve parser names before expensive engine startup so invalid parser configs fail fast.
  • Reuse the same parser validation in response parser setup while preserving legacy reasoning parser aliases.
  • Surface DP child startup failures to the launcher and terminate sibling process groups instead of blocking behind a long-running join.

Tests

  • python -m py_compile lmdeploy/serve/parsers/response_parser.py lmdeploy/serve/parsers/__init__.py lmdeploy/serve/openai/api_server.py lmdeploy/serve/openai/launch_server.py tests/test_lmdeploy/serve/parsers/test_parser_config.py
  • pytest tests/test_lmdeploy/serve/parsers/test_parser_config.py -q
  • Bad-parser serve reproduction fails fast before model loading with a clear parser validation error.

Assistance

Assisted with Codex + GPT-5.5 xHigh Fast, reviewed manually

Copilot AI review requested due to automatic review settings June 23, 2026 13:11

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves robustness of the OpenAI server startup path by validating configured parser names earlier (before expensive engine/model initialization) and by making DP launch behavior fail fast when any child server process fails.

Changes:

  • Added validate_parser_names() to centrally validate reasoning/tool parser registry names (including legacy reasoning aliases) and reused it in both serve() and ResponseParser.set_parsers().
  • Updated the DP launcher to surface child failures promptly and proactively terminate remaining server process groups rather than blocking behind join().
  • Added unit tests covering parser name validation and set_parsers() behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/test_lmdeploy/serve/parsers/test_parser_config.py Adds tests for parser-name validation and parser setup error handling.
lmdeploy/serve/parsers/response_parser.py Introduces validate_parser_names() and reuses it in BaseResponseParser.set_parsers().
lmdeploy/serve/parsers/init.py Re-exports validate_parser_names from the parsers package.
lmdeploy/serve/openai/launch_server.py Improves failure propagation and process-group termination logic for DP server launching.
lmdeploy/serve/openai/api_server.py Validates parser names early in serve() to fail fast before engine startup.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +166 to +170
alive_processes.remove(process)
if process.exitcode != 0:
logger.error(f'Server process {process.pid} exited with code {process.exitcode}')
terminate_processes(alive_processes)
raise RuntimeError(f'Server process {process.pid} exited with code {process.exitcode}')
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants