[CI]Add Disaggregated PD Nightly Test for Qwen3-235B and Qwen3-VL-235B #5502
Code Review
This pull request adds a nightly CI test for the Qwen3-235B-A22B model. The new YAML configuration file looks reasonable. However, the changes to the run.sh script introduce a critical issue by hardcoding a pull request number to fetch code. This is a very brittle approach that will likely break the CI in the future and should be removed before merging.
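One way to avoid the hardcoded pull request number is to let the caller pass it in and fall back to the already checked-out revision when none is given. The following is a minimal sketch, not the project's actual run.sh: the variable name `PR_NUMBER`, both function names, and the local branch name `pr-<n>` are hypothetical. It relies only on GitHub's `pull/<n>/head` ref convention.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: PR_NUMBER and the function names are assumptions,
# not taken from the PR under review.

# Print the git refspec for a GitHub pull request; reject non-numeric input.
pr_refspec() {
    local n="$1"
    case "$n" in
        ''|*[!0-9]*) echo "invalid PR number: '$n'" >&2; return 1 ;;
    esac
    echo "pull/${n}/head:pr-${n}"
}

# Fetch and check out the PR only when the caller explicitly requests it.
checkout_pr_if_requested() {
    local n="${PR_NUMBER:-}"
    if [ -z "$n" ]; then
        echo "PR_NUMBER not set; using the checked-out revision"
        return 0
    fi
    local refspec
    refspec="$(pr_refspec "$n")" || return 1
    git fetch origin "$refspec" && git checkout "pr-${n}"
}
```

A script built this way would call `checkout_pr_if_requested` once near the top, and a developer who needs an unmerged change could run it as `PR_NUMBER=<n> ./run.sh`. The nightly job would leave the variable unset, so nothing in the committed script is tied to a specific pull request.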
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
@@ -0,0 +1,111 @@
test_name: "test Qwen3-235B-A22B pd online"
@Angazenn please take a look at this PR, which adds a test for Qwen3-235B.
Please refer to three-node-a3-pd-disaggregation for the server launch scripts, as this is the typical PD-disaggregation setting for Qwen3-235B-A22B.
I have revised it based on the content of the document.
Signed-off-by: MrZ20 <[email protected]>
modify config
add nightly test entrance
modify
end
What this PR does / why we need it?
This PR adds online Disaggregated Prefill/Decode performance and accuracy tests for the Qwen3-235B-A22B and Qwen3-VL-235B-A22B-Instruct models to the Nightly test suite.
These test configurations simulate deploying large MoE and vision-language models in a dual-node (32 NPU) environment, using Mooncake to transfer the KV cache between the Prefill node and the Decode node.
Test Configuration
Qwen3-235B-A22B
Qwen3-VL-235B-A22B-Instruct
Does this PR introduce any user-facing change?
How was this patch tested?
Nightly test action on CI:
https://github.com/vllm-project/vllm-ascend/actions/runs/20734804044/job/59529925424?pr=5442
Results:
Qwen3-235B-A22B (52m13s)
Qwen3-VL-235B-A22B-Instruct (43m2s)