[webgpu] Apply Flash Attention if sliding window exceeds KV cache length #5076
windows_tensorrt.yml
on: pull_request
Windows GPU TensorRT CI Pipeline
41m 16s
Windows GPU TensorRT CI Pipeline Test Job
1h 3m
Artifacts
Produced during runtime
| Name | Size | Digest | |
|---|---|---|---|
| build-artifacts (Expired) | 1.71 GB | sha256:c0bb5734ee76d9966d9fe6c829862bb8ae383a374fa1731138c2c8aa65f60454 | |