Open
Hi, I get an error when I run the flashinfer.decode.single_decode_with_kv_cache
example program, which is:
import torch
import flashinfer
kv_len = 4096
num_qo_heads = 32
num_kv_heads = 32
head_dim = 128
q = torch.randn(num_qo_heads, head_dim).half().to("cuda:0")
k = torch.randn(kv_len, num_kv_heads, head_dim).half().to("cuda:0")
v = torch.randn(kv_len, num_kv_heads, head_dim).half().to("cuda:0")
o = flashinfer.single_decode_with_kv_cache(q, k, v)
o.shape
The error message is: Floating point exception (core dumped)
What causes this error?
My environment is:
- GPU: RTX 3060
- Torch version: 2.6
- CUDA version: 12.4
- FlashInfer version: 0.2.5+cu124torch2.6
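In case more detail is useful, a small script along these lines could collect the same environment information programmatically. This is only a sketch: collect_env is a hypothetical helper name, and the distribution names flashinfer-python and flashinfer are assumptions about how the package was installed.

```python
import platform
from importlib import metadata


def collect_env():
    """Gather version details for a bug report (hypothetical helper name)."""
    info = {"python": platform.python_version()}

    # Look up installed package versions without importing the packages.
    # The two flashinfer distribution names are assumptions; whichever one
    # is not installed is simply reported as such.
    for dist in ("torch", "flashinfer-python", "flashinfer"):
        try:
            info[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            info[dist] = "not installed"

    # GPU details need torch itself; skip them if it cannot be imported.
    try:
        import torch
    except ImportError:
        return info

    # CUDA version that torch was built against (None for CPU-only builds).
    info["torch_cuda"] = torch.version.cuda
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
        major, minor = torch.cuda.get_device_capability(0)
        info["compute_capability"] = f"{major}.{minor}"
    return info


for key, value in collect_env().items():
    print(f"{key}: {value}")
```

The compute capability line may be worth including, since a mismatch between the GPU architecture and the architectures a prebuilt wheel targets is one possible thing to rule out.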
Looking forward to your reply!