
Commit 20767a0

[CI/UT] Fix disaggregated prefill ci (#1313)
### What this PR does / why we need it?
Use eager mode to run the disaggregated prefill CI.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with the existing tests.

---------

Signed-off-by: MengqingCao <[email protected]>
1 parent 9cbce42 commit 20767a0
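
For context, `--enforce-eager` tells vLLM to skip graph capture/compilation and run the model in eager mode, which sidesteps graph-mode issues on the CI runners. A minimal sketch of a prefill-side launch using the flags touched by this patch; the model path, port, and KV-transfer JSON below are illustrative placeholders, not the values used by `setup_pd.sh`:

```bash
# Illustrative only: the flags mirror the ones in this patch, but the model
# path, port, and KV-transfer config are placeholders, not the CI's real values.
export KV_CONFIG='{"kv_connector": "ExampleConnector", "kv_role": "kv_producer"}'  # placeholder JSON

vllm serve /path/to/deepseek-model \
    --port 8100 \
    --served-model-name Deepseek \
    --max-model-len 2000 \
    --trust-remote-code \
    --enforce-eager \
    --kv-transfer-config "$KV_CONFIG"
```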

File tree

2 files changed: +7 -1 lines changed

.github/workflows/vllm_ascend_test_pd.yaml
tests/e2e/pd_disaggreate/setup_pd.sh

.github/workflows/vllm_ascend_test_pd.yaml

Lines changed: 5 additions & 1 deletion
@@ -41,7 +41,11 @@ jobs:
     if: ${{ contains(github.event.pull_request.labels.*.name, 'pd-test') && contains(github.event.pull_request.labels.*.name, 'ready-for-test') || github.event_name == 'schedule' }}
     strategy:
       matrix:
-        vllm_verison: [main, v0.9.1]
+        vllm_verison: [
+          # revert me when V1 disaggregation prefill is merged in main
+          # main,
+          v0.9.1
+        ]
     name: vLLM Ascend prefilling decoding disaggregation test
     runs-on: linux-arm64-npu-static-8
 

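With `main` commented out of the matrix, only the `v0.9.1` entry runs until V1 disaggregated prefill is merged into vLLM's main branch. Conceptually, the matrix value selects which vLLM ref the job builds against; a rough sketch of that idea follows, where the clone URL, install step, and variable wiring are assumptions rather than the workflow's actual steps:

```bash
# Hypothetical sketch -- not the real workflow steps.
# ${{ matrix.vllm_verison }} would expand to "v0.9.1" while "main" stays commented out.
VLLM_VERSION="v0.9.1"
git clone --depth 1 --branch "$VLLM_VERSION" https://github.com/vllm-project/vllm.git
pip install -e ./vllm   # build vLLM at the selected ref before running the PD test
```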
tests/e2e/pd_disaggreate/setup_pd.sh

Lines changed: 2 additions & 0 deletions
@@ -66,6 +66,7 @@ function run_prefill_instance() {
     --served-model-name Deepseek \
     --max-model-len 2000 \
     --trust-remote-code \
+    --enforce-eager \
     --kv-transfer-config "$KV_CONFIG"
 }
 
@@ -119,6 +120,7 @@ function run_decode_instance() {
     --max-num-batched-tokens 2000 \
     --trust-remote-code \
     --gpu-memory-utilization 0.9 \
+    --enforce-eager \
     --kv-transfer-config "$KV_CONFIG"
 }
 

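Once both instances come up with `--enforce-eager`, they expose vLLM's OpenAI-compatible HTTP API under the served model name `Deepseek`. A generic smoke-test request against one instance might look like the sketch below; the port and payload are placeholders, and the repository's real test wiring (proxy, ports, prompts) is not shown here:

```bash
# Generic smoke test against a served instance -- port and prompt are placeholders;
# this is not the request the repository's PD disaggregation test actually sends.
curl -s http://localhost:8100/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
          "model": "Deepseek",
          "prompt": "Hello, world",
          "max_tokens": 16
        }'
```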