Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- 5. Please use English, otherwise it will be closed.
Describe the bug
The logs indicate that the server started successfully:
Reproduction
model: Qwen3-235B-instruct-2507-w8a8
hardware: 2 * 800I A2, standalone, NPUs connected back-to-back using 8 fiber lines
deployment: 1P + 1D with PD disaggregation
Environment
prefill node:
python -m sglang.launch_server --model-path /data/modelscope/Qwen3-235B-A22B-Instruct-2507-w8a8 \
    --trust-remote-code \
    --mem-fraction-static 0.92 \
    --attention-backend ascend \
    --device npu \
    --host $ifip \
    --port 8001 \
    --tp-size 8 \
    --dp-size 1 \
    --nnodes 1 \
    --node-rank 0 \
    --disaggregation-mode prefill \
    --disaggregation-bootstrap-port 8998 \
    --disaggregation-transfer-backend ascend \
    --dist-init-addr $prefillip:6688 \
    --context-length 4096 \
    --disable-radix-cache \
    --cuda-graph-bs 8 \
    --quantization w8a8_int8 &
sleep 200s
python -m sglang_router.launch_router --pd-disaggregation \
    --prefill http://$prefillip:8001 \
    --decode http://$decodeip:8001 \
    --host $shost --port $sport &
decode node:
python -m sglang.launch_server --model-path /data/modelscope/Qwen3-235B-A22B-Instruct-2507-w8a8 \
    --trust-remote-code \
    --mem-fraction-static 0.92 \
    --attention-backend ascend \
    --device npu \
    --host $ifip \
    --port 8001 \
    --tp-size 8 \
    --dp-size 1 \
    --nnodes 1 \
    --node-rank 0 \
    --disable-radix-cache \
    --disaggregation-mode decode \
    --disaggregation-bootstrap-port 8998 \
    --disaggregation-transfer-backend ascend \
    --dist-init-addr $decodeip:6688 \
    --cuda-graph-bs 8 \
    --context-length 4096 \
    --quantization w8a8_int8 2>&1 &
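
A note on the fixed `sleep 200s` before the router launch on the prefill node: a readiness poll might make the startup order less timing-dependent. The following is only a rough sketch, not something taken from the sglang docs for this setup. `wait_until` is a hypothetical helper, and the usage comment assumes that both servers answer `GET /health` with a success status once they are ready, which has not been verified for the Ascend PD-disaggregation path.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper (not part of sglang): re-runs COMMAND until it
# succeeds or TIMEOUT_SECONDS have elapsed.
# Returns 0 if COMMAND succeeded, 1 on timeout.
wait_until() {
    timeout_s="$1"
    interval_s="$2"
    shift 2
    deadline=$(( $(date +%s) + timeout_s ))
    while :; do
        # Probe once; discard its output so only the exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Give up once the deadline has passed.
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep "$interval_s"
    done
}

# Possible usage in place of `sleep 200s` (assumes a /health endpoint,
# with $prefillip and $decodeip set as in the commands above):
#   wait_until 600 5 curl -sf "http://$prefillip:8001/health" &&
#   wait_until 600 5 curl -sf "http://$decodeip:8001/health" &&
#   echo "both servers answered; the router can be launched now"
```

The probe command is passed as arguments so the same helper can wrap `curl`, a port check, or any other readiness test.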