Issues: flashinfer-ai/flashinfer
#682 · Deprecation Notice: Python 3.8 Wheel Support to End in future... · opened Dec 18, 2024 by yzh119
#1095 · How to combine APIs to implement the same functionality as torch.nn.functional.scaled_dot_product_attention · opened May 27, 2025 by alao556
#1074 · test_sampling.cu is not updated to the newer sampling kernel interface · opened May 20, 2025 by 842974287
#1057 · C++ test example fails for test_single_prefill [bug] · opened May 14, 2025 by swmobile
#1053 · Build fails because of "unknown" in metadata during installation · opened May 11, 2025 by IzhanVarsky
#1034 · Can flashinfer's CutlassSegmentGEMMSM90Run function be used for LoRA computation on H20? · opened Apr 23, 2025 by chenhongyu2048
#1027 · flashinfer.decode.single_decode_with_kv_cache: Floating point exception (core dumped) · opened Apr 20, 2025 by MenHimChan
#1023 · [Bug] FP8 scaling factors (k_scale/v_scale) not taking effect in BatchPrefillWithPagedKVCacheWrapper · opened Apr 17, 2025 by cscyuge
#1022 · Low performance of POD Attention compared to BatchPrefillWithPagedKVCache · opened Apr 17, 2025 by Edenzzzz
#978 · top_k_top_p_sampling_from_logits incompatible with torch.compile + CUDAGraph · opened Mar 28, 2025 by sharvil