Pull requests: HabanaAI/vllm-hpu-extension

#393 Add FusedSDPA QKV slice support for APC (opened Nov 18, 2025 by ikurtchen)
#381 Enable chunked prefill on aice 1.22 (opened Oct 23, 2025 by YuJiankang)
#380 [WA] bypass the GLM OOM issue (opened Oct 15, 2025 by czhu15)
#362 Enable chunked prefill (opened Sep 14, 2025 by jzhoulon)
#359 [HS-6944] Fix for deepseek distill models (opened Sep 10, 2025 by nazneenn)
#354 [aice/v.1.22] refactor chunk size code (opened Sep 1, 2025 by ranzhejiang)
#341 Fix for Llama4 models (targets main) (opened Aug 19, 2025 by vidyasiv)
#325 Add flag pin_memory to call from hpu.py in vllm (opened Aug 5, 2025 by xuechendi)
#318 Add Calibration Script for SGLang FP8 (opened Jul 29, 2025 by SKRohit)
#289 Add block_softmax_adjustment and block_softmax kernels (opened Jul 16, 2025 by czhu15)
#247 Add pre-commit static checks (opened Jun 30, 2025 by kzawora-intel)
#224 Exponential bucketing tweaks (opened Jun 13, 2025 by madamczyk-intel)
#200 Add useful internal vllm test (opened May 27, 2025 by nirda7, draft)
#159 Optimized MoE on Gaudi (opened Apr 18, 2025 by gyou2021, draft)