[benchmarks][ci] Initial integration of sglang kernels to benchmarks #3796
base: main
b43b5e2 to bbd10a8
34e464a to 29e2711
29e2711 to 4b7241c
1296327 to 7d4d837
9651ebe to e609f5d
@LiyangLingIntel the tests are passing. Is this PR still going to wait on #3748 and #3749?
@etiotto Yes, these 2 depend on the new agama release.
e609f5d to 3853afa
48b96ec to 458e06d
The third-party benchmark passed here: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/15210819443/job/42784278091
@Egor-Krivov @vlad-penkin Please take another look.
Let's move all sglang benchmarks to the benchmarks/triton_kernels_benchmark subfolder so they can be part of the package.
I would not suggest doing so.
- Currently we do not have a performance goal for sglang, so there is no need to integrate these kernels into triton_kernels_benchmark. It is better to keep that package focused on high-priority kernels, which also helps save CI time.
- SGLang changes frequently and we do not pin to any stable sglang commit. Moving these sglang benchmarks into triton_kernels_benchmark would make it fragile.
Could you spell out our strategy for third-party kernel benchmarks systematically, including sglang, liger-kernels, and other potential third parties (like FlagGems)?
For which kernels do we want to track peak performance, and for which do we only want to know the achievable performance or just ensure there is no regression?
Should all third parties follow the same strategy, or are there priorities among them?
> Currently we do not have performance goal on sglang. There is no need to integrate these kernels to triton_kernels_benchmark, it's better to make it focus on high priority kernels. It can help save our CI time.
AFAIK we have the experimental category just for this. We can also disable their launch in CI regardless of the location of the benchmark, right?
> SGLang is changing frequently and we do not pin to any stable commits of sglang. Moving these sglang benchmark to triton_kernels_benchmark would make it fragile.
We have, I believe, a well-established procedure for working with third-party sources (like PyTorch and the spirv-llvm translator): we use a commit pin. If we need to periodically check the pin against main, we can set up a workflow for this.
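To make the commit-pin idea above concrete, here is a minimal sketch of how a pinned third-party dependency could be resolved before installation. It is an illustration only: the pin-file format (one commit SHA per file, `#` comments allowed), the helper names `parse_pin` and `pip_requirement`, and the example repository URL are all assumptions, not something taken from this repository. The only external behavior relied on is pip's `git+<url>@<commit>` VCS requirement syntax.

```python
import re

# Hypothetical repository URL, used only for illustration.
EXAMPLE_REPO = "https://example.com/org/third-party.git"


def parse_pin(text: str) -> str:
    """Return the commit SHA from the contents of a (hypothetical) pin file.

    Assumed format: the first non-empty, non-comment line holds a
    7-40 character lowercase hexadecimal commit SHA.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if not re.fullmatch(r"[0-9a-f]{7,40}", line):
            raise ValueError(f"not a commit SHA: {line!r}")
        return line
    raise ValueError("pin file contains no commit SHA")


def pip_requirement(repo_url: str, commit: str) -> str:
    """Build a pip VCS requirement string that installs exactly `commit`."""
    return f"git+{repo_url}@{commit}"


if __name__ == "__main__":
    pin_text = "# pinned third-party commit\n4b7241c\n"
    print(pip_requirement(EXAMPLE_REPO, parse_pin(pin_text)))
```

A periodic workflow could then compare the pinned SHA with the current head of the third-party repository and open an update PR when they differ; that part is not sketched here.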
Let's also integrate benchmarks into the package. The sample integration is
- name: Install benchmark dependencies
  id: install
  run: |
    pip install transformers pandas pytest

    cd benchmarks
    pip install .
    pip install intel-pti==0.12.2
Why is this line needed?
Because the sglang benchmark has very little to do with Triton, we want to leverage our benchmarking tools for profiling. These lines are the same as
cd benchmarks
Port prefill attn and decode attn from sglang
Add validation
temp add extend attention
disable debug ir dump
Update three stage attention benchmark
Add sglang kernel benchmark to action
use 1e-3 atol
remove sglang benchmark from triton-benchmarks
Fix setup bdist_wheel
Add sglang to thirdparty test
Address review comments
Remove sglang from tests
Fix CI
Address review comments
Integrate sglang prefill/decode/extend kernel to benchmarks
Port prefill attn and decode attn from sglang
Add validation
temp add extend attention
disable debug ir dump
Update three stage attention benchmark
Add sglang kernel benchmark to action
use 1e-3 atol
remove sglang benchmark from triton-benchmarks
Fix setup bdist_wheel
Add sglang to thirdparty test
Address review comments
Remove sglang from tests
Adjust params term
Adjust tflops computation
fix bugs
rtol atol
Move fp8 gemm to sglang benchmark
Address review comments
Fix CI
XPU not found
458e06d to 10da795
10da795 to ead512a
Initial enablement of the sglang benchmarks.
Adds the sglang prefill/decode/extend attention and fp8 quant gemm kernels to the third-party benchmarks.