Update default models to be benchmarked continuously #11610

guangy10 · 2025-06-12T18:20:21Z

Summary

Promoted Qwen3-0.6B to be the default model: it is small enough to run quickly and still covers most of the advanced changes in both etLLM and optimum-executorch.
Removed tinyllama, since nobody cares about its perf. We shouldn't use the device farm for correctness testing.
~~Added google/gemma-3-1b-it to apple perf (private)~~

pytorch-bot · 2025-06-12T18:20:25Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11610

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 7 Pending, 2 Unrelated Failures

As of commit 170f15f with merge base b59f5cc:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

pull / unittest-editable / linux / linux-job (gh) (detected as infra flaky with no log or failing log classifier)

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest / linux / linux-job (gh) (trunk failure)
/pytorch/executorch/backends/vulkan/runtime/api/containers/Tensor.cpp:651:17: error: no matching constructor for initialization of 'vkcompute::api::vTensor::TextureLimits'

This comment was automatically generated by Dr. CI and updates every 15 minutes.

.github/workflows/android-perf-private-device-experiment.yml

guangy10 · 2025-06-12T22:37:57Z

Have to delist google/gemma-3-1b-it from continuous benchmarking.

@kimishpatel @cbilgin FYI, we still can't run google/gemma-3-1b-it on-device due to bug #11618. The issue wasn't caught on the Android side because of another bug in the Android benchmark app, #11620.

guangy10 · 2025-06-13T00:25:09Z

Can't merge; trying a rebase with no code changes.

guangy10 requested a review from huydhn June 12, 2025 18:20

facebook-github-bot added the CLA Signed label Jun 12, 2025

guangy10 marked this pull request as ready for review June 12, 2025 18:20

guangy10 added the release notes: none label Jun 12, 2025

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 18:30 — with GitHub Actions Inactive

guangy10 requested a review from kimishpatel June 12, 2025 19:13

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 19:46 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 20:06 — with GitHub Actions Inactive

guangy10 force-pushed the tweak_default_benchmark_models branch from 982e172 to 5b28246 Compare June 12, 2025 20:20

guangy10 force-pushed the tweak_default_benchmark_models branch from 5b28246 to 5276934 Compare June 12, 2025 20:30

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 20:59 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 21:59 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 22:05 — with GitHub Actions Inactive

guangy10 force-pushed the tweak_default_benchmark_models branch from 5276934 to b096669 Compare June 12, 2025 22:34

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 22:52 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 22:53 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 23:13 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 23:32 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 23:35 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 12, 2025 23:54 — with GitHub Actions Inactive

shoumikhin approved these changes Jun 13, 2025

Update default models to be benchmarked continuously

170f15f

guangy10 force-pushed the tweak_default_benchmark_models branch from b096669 to 170f15f Compare June 13, 2025 00:24

guangy10 temporarily deployed to upload-benchmark-results June 13, 2025 00:56 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 13, 2025 01:10 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 13, 2025 01:26 — with GitHub Actions Inactive

guangy10 temporarily deployed to upload-benchmark-results June 13, 2025 01:27 — with GitHub Actions Inactive

guangy10 merged commit f5b711f into main Jun 13, 2025
132 of 134 checks passed

guangy10 deleted the tweak_default_benchmark_models branch June 13, 2025 01:30
