[Enhance]: Add validation for expert parallelism settings #1199
base: main
Conversation
Add validation to prevent simultaneous use of the `--enable-expert-parallel` and `expert-tensor-parallel-size` configurations. These settings are mutually exclusive; checking for the conflict prevents unexpected behavior and improves error tracing. If both settings are enabled concurrently, the system now raises an error, making it easier to identify and resolve configuration issues.

Signed-off-by: Jade Zheng <[email protected]>
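A minimal, self-contained sketch of the mutual-exclusion check this PR describes; the helper name, attribute names, and the exception type are illustrative assumptions, not the PR's exact code (the real check lives in `check_ascend_config` in `vllm_ascend/ascend_config.py`):

```python
# Hedged sketch of the validation described above; names and the exception
# type are assumptions, not the actual implementation in this PR.
def validate_expert_parallel_settings(enable_expert_parallel: bool,
                                      expert_tensor_parallel_size: int) -> None:
    """Raise if both expert-parallel modes are requested at once."""
    if enable_expert_parallel and expert_tensor_parallel_size > 1:
        raise AssertionError(
            "--enable-expert-parallel and expert_tensor_parallel_size > 1 "
            "are mutually exclusive; enable only one of them.")


# Example: this call would raise, mirroring the new behavior described above.
# validate_expert_parallel_settings(True, 2)
```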
Pull Request Overview
Adds a validation step in `check_ascend_config` to prevent simultaneous use of `--enable-expert-parallel` and an `expert_tensor_parallel_size` greater than 1, avoiding conflicting configurations.
- Import `TYPE_CHECKING` and annotate `vllm_config` with `VllmConfig` for better type safety (a sketch of this pattern follows the list).
- Introduce a runtime check that raises an error when expert parallelism flags conflict.
- Preserve existing experimental ACL graph warnings.
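For reference, a minimal sketch of the `TYPE_CHECKING` annotation pattern mentioned above, assuming the `vllm.config.VllmConfig` import path; the function body is elided:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported only for static type checkers; avoids importing vllm.config
    # at runtime just for the annotation.
    from vllm.config import VllmConfig


def check_ascend_config(vllm_config: "VllmConfig", enforce_eager: bool) -> None:
    ...
```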
Comments suppressed due to low confidence (1)
vllm_ascend/ascend_config.py:171
- Consider adding a unit test to verify that enabling both `enable_expert_parallel` and `expert_tensor_parallel_size > 1` reliably raises the intended error.
# for expert parallelism
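A hedged pytest sketch of such a test; the `make_vllm_config` fixture and the broad `Exception` match are placeholder assumptions, not the actual contents of `tests/ut/test_ascend_config.py`:

```python
import pytest

from vllm_ascend.ascend_config import check_ascend_config


def test_expert_parallel_settings_conflict(make_vllm_config):
    # make_vllm_config is a hypothetical fixture/helper that builds a
    # VllmConfig with the requested parallelism options.
    vllm_config = make_vllm_config(enable_expert_parallel=True,
                                   expert_tensor_parallel_size=2)
    # Enabling both options should raise; the exact exception type depends on
    # the implementation, so the match here is intentionally broad.
    with pytest.raises(Exception):
        check_ascend_config(vllm_config, enforce_eager=False)
```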
Signed-off-by: Jade Zheng <[email protected]>
Co-authored-by: Copilot <[email protected]>
@@ -164,3 +167,10 @@ def check_ascend_config(vllm_config, enforce_eager):
    "ACL Graph is currently experimental. Please "
    "raise an issue on https://github.com/vllm-project/vllm-ascend/issues"
    " if you encourage any Error")

# for expert parallelism
Move this validation into `check_and_update_config` in `platform.py`?
This function will be called within `check_and_update_config`.
@Yikun @wangxiyuan ready to merge.
Can you update `tests/ut/test_ascend_config.py` as well?
This pull request has conflicts, please resolve those before we can evaluate the pull request.