[benchmarks][ci] Initial integration of sglang kernels to benchmarks #3796

Open
wants to merge 5 commits into
base: main
from

Conversation

LiyangLingIntel
Contributor

@LiyangLingIntel LiyangLingIntel commented Mar 31, 2025

Initial enablement of the sglang benchmarks.
Adds the sglang prefill/decode/extended attention kernels and the fp8 quant GEMM to the third-party benchmark.

@LiyangLingIntel LiyangLingIntel self-assigned this Mar 31, 2025
@LiyangLingIntel LiyangLingIntel force-pushed the liyang/init_sglang_benchmark branch 3 times, most recently from b43b5e2 to bbd10a8 Compare April 9, 2025 02:34
@LiyangLingIntel LiyangLingIntel force-pushed the liyang/init_sglang_benchmark branch 2 times, most recently from 34e464a to 29e2711 Compare April 10, 2025 06:07
@LiyangLingIntel LiyangLingIntel marked this pull request as ready for review April 10, 2025 07:47
@LiyangLingIntel
Contributor Author

Benchmark is still blocked by #3748, #3749.
Let's merge this PR once the blocking issues are resolved by the new agama release, expected in late April.

@LiyangLingIntel LiyangLingIntel force-pushed the liyang/init_sglang_benchmark branch from 9651ebe to e609f5d Compare April 15, 2025 08:14
@etiotto etiotto marked this pull request as draft April 17, 2025 14:27
@etiotto
Contributor

etiotto commented Apr 24, 2025

@LiyangLingIntel the tests are passing. Is this PR still going to wait on #3748 and #3749?

@LiyangLingIntel
Contributor Author

> @LiyangLingIntel the tests are passing. Is this PR still going to wait on #3748 and #3749?

@etiotto Yes, both depend on the new agama release.
The target workflow is the Triton third-party benchmark; it is not part of CI and is scheduled to run once per day.

@LiyangLingIntel LiyangLingIntel linked an issue May 15, 2025 that may be closed by this pull request
@LiyangLingIntel LiyangLingIntel force-pushed the liyang/init_sglang_benchmark branch from e609f5d to 3853afa Compare May 22, 2025 08:21
@LiyangLingIntel LiyangLingIntel marked this pull request as ready for review May 22, 2025 08:32
@LiyangLingIntel
Contributor Author

Verified that the blocking issues #3748 and #3749 are fixed on rolling agama 2507.

@LiyangLingIntel LiyangLingIntel force-pushed the liyang/init_sglang_benchmark branch 5 times, most recently from 48b96ec to 458e06d Compare May 23, 2025 12:53
@LiyangLingIntel
Contributor Author

LiyangLingIntel commented May 26, 2025

@whitneywhtsang
Contributor

@Egor-Krivov @vlad-penkin Please take another look.

Contributor

Let's move all sglang benchmarks to the benchmarks/triton_kernels_benchmark subfolder so it can be part of the package.

Contributor Author

I would not suggest doing so.

  1. Currently we do not have a performance goal for sglang. There is no need to integrate these kernels into triton_kernels_benchmark; it is better to keep that package focused on high-priority kernels, which also helps save CI time.
  2. SGLang changes frequently and we do not pin to any stable sglang commit. Moving these sglang benchmarks into triton_kernels_benchmark would make it fragile.

Could you lay out our strategy for third-party kernel benchmarks systematically, including sglang, liger-kernels, and other potential third parties (like FlagGems)?
For which kernels do we want to track peak performance, and for which do we just want to know the achievable performance or only ensure there is no regression?
Should all third parties follow the same strategy, or are there priorities among them?

Contributor

@anmyachev anmyachev May 30, 2025

> Currently we do not have performance goal on sglang. There is no need to integrate these kernels to triton_kernels_benchmark, it's better to make it focus on high priority kernels. It can help save our CI time.

AFAIK we have an experimental category just for this. We can also disable their launch in CI regardless of where the benchmark lives, right?

> SGLang is changing frequently and we do not pin to any stable commits of sglang. Moving these sglang benchmark to triton_kernels_benchmark would make it fragile.

I believe we have a well-established procedure for working with third-party sources (such as PyTorch and the SPIRV-LLVM translator): we use a commit pin. If we need to periodically check the pin against the main branch, we can set up a workflow for this.
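As a rough illustration of the commit-pin idea, the sketch below reads a pinned hash from a file and builds a pip VCS requirement for it. The pin file name, its format (one hash, with `#` comment lines allowed), and the helper names are all hypothetical; this is not the repository's actual pin mechanism.

```python
from pathlib import Path


def read_pin(pin_file: Path) -> str:
    """Return the pinned commit: the first non-empty, non-comment line."""
    for line in pin_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line
    raise ValueError(f"no commit hash found in {pin_file}")


def pip_requirement(repo_url: str, commit: str) -> str:
    """Build a pip VCS requirement that installs exactly the pinned commit."""
    return f"git+{repo_url}@{commit}"


def pin_is_stale(pinned: str, upstream_head: str) -> bool:
    """Report whether the upstream head differs from the pinned commit."""
    return pinned != upstream_head
```

A scheduled workflow could call `pin_is_stale` with the current upstream head and, when it returns true, run the benchmarks against the newer commit before updating the pin. Depending on the third-party repository layout, the requirement string may also need a `#subdirectory=` fragment.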


- name: Install benchmark dependencies
  id: install
  run: |
    pip install transformers pandas pytest

    cd benchmarks
    pip install .
    pip install intel-pti==0.12.2
Contributor

Why is this line needed?

Contributor Author

Because the sglang benchmark has very little to do with Triton, we want to leverage our benchmarking tools for profiling. These lines are the same as

.

LiyangLingIntel and others added 3 commits May 29, 2025 01:14
Port prefill attn and decode attn from sglang

Add validation

temp add extend attention

disable debug ir dump

Update three stage attention benchmark

Add sglang kernel benchmark to action

use 1e-3 atol

remove sglang benchmark from triton-benchmarks

Fix setup bdist_wheel

Add sglang to thirdparty test

Address review comments

Remove sglang from tests

Fix CI

Address review comments

Integrate sglang prefill/decode/extend kernel to benchmarks

Port prefill attn and decode attn from sglang

Add validation

temp add extend attention

disable debug ir dump

Update three stage attention benchmark

Add sglang kernel benchmark to action

use 1e-3 atol

remove sglang benchmark from triton-benchmarks

Fix setup bdist_wheel

Add sglang to thirdparty test

Address review comments

Remove sglang from tests

Adjust params term

Adjust tflops computation
fix bugs

rtol

atol

Move fp8 gemm to sglang benchmark
Address review comments

Fix CI XPU not found
@LiyangLingIntel LiyangLingIntel force-pushed the liyang/init_sglang_benchmark branch from 458e06d to 10da795 Compare May 29, 2025 02:33
@LiyangLingIntel LiyangLingIntel force-pushed the liyang/init_sglang_benchmark branch from 10da795 to ead512a Compare May 29, 2025 07:18
7 participants