Cast Nodes Fusion #24842

nenad1002 · 2025-05-22T20:09:26Z

Description

A chain of Cast nodes can end up casting back to the original type, which makes the earlier casts in the chain redundant. This fusion removes those extra nodes.
For example,
A ('float32') -> Cast (to='float16') -> Cast (to='int4') -> Cast (to='float32') -> Cast (to='float16') -> B
reduces to
A ('float32') -> Cast (to='float16') -> B
Every Cast node along the path must have exactly one input and one output to be considered for the fusion.
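
To make the rule concrete, here is a minimal sketch on a toy representation, where a chain is just the list of `to` types. It assumes one reading of the description: everything up to and including the last Cast that returns to the source type is dropped. The function name `fuse_cast_chain` is hypothetical, and the actual logic in cast_elimination.h may differ.

```python
from typing import List


def fuse_cast_chain(source_type: str, cast_targets: List[str]) -> List[str]:
    """Collapse a linear chain of Cast nodes (toy model, not the ORT implementation).

    source_type  -- element type produced by the node feeding the chain
    cast_targets -- the `to` attribute of each Cast, in graph order; every Cast
                    is assumed to have exactly one input and one output
    """
    # Find the last Cast that brings the value back to the source type.
    last_round_trip = -1
    for index, target in enumerate(cast_targets):
        if target == source_type:
            last_round_trip = index

    # Everything up to and including that Cast is treated as redundant.
    # If no Cast returns to the source type, the chain is left unchanged.
    return cast_targets[last_round_trip + 1:]


chain = ["float16", "int4", "float32", "float16"]
print(fuse_cast_chain("float32", chain))  # ['float16']
```

Note that dropping the intermediate casts changes the numeric result whenever one of them is lossy (for example the int4 step above), so the fused graph is not bit-for-bit equivalent to the original.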

Motivation and Context

Gemma3 ONNX models used to have double casting, and many new models created by the model builder might have it as well. Extra Casts can reduce accuracy and increase inference time.

github-actions

You can commit the suggested changes from lintrunner.

onnxruntime/core/optimizer/cast_elimination.h

tianleiwu · 2025-05-23T16:57:32Z

We used to have similar Cast-removal logic, but it caused a lot of accuracy issues. It was reverted in #17953 to be more conservative. I suggest adding an option in onnxruntime_session_options_config_keys.h. The default would be off, and users can turn it on if needed.

It is also easy to process the model offline, for example with remove_cascaded_cast_nodes in onnxruntime/onnxruntime/python/tools/transformers/onnx_model.py (line 667 in ad7b0e3):

def remove_cascaded_cast_nodes(self):

So another way to avoid the issue is to use such post-processing in the model builder.
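
A rough sketch of what that offline step could look like, assuming the `OnnxModel` wrapper from the transformers tools is importable from an installed onnxruntime package. The wrapper function name `strip_cascaded_casts` and the file paths are hypothetical; only `remove_cascaded_cast_nodes` is taken from the file referenced above.

```python
def strip_cascaded_casts(input_path: str, output_path: str) -> None:
    """Load a model, drop Cast -> Cast cascades, and save the result."""
    # Imported lazily so this module can be loaded without onnx/onnxruntime installed.
    import onnx
    from onnxruntime.transformers.onnx_model import OnnxModel

    # Wrap the ModelProto in the helper class that provides the graph utilities.
    model = OnnxModel(onnx.load(input_path))

    # The method referenced above; it can change numerics, so validate accuracy afterwards.
    model.remove_cascaded_cast_nodes()

    model.save_model_to_file(output_path)


# Example call (paths are placeholders):
# strip_cascaded_casts("model.onnx", "model_no_cascaded_casts.onnx")
```

Because this runs outside the runtime, the model builder can decide per model whether the rewrite is safe, which matches the conservative default suggested above.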

nenad1002 · 2025-05-23T18:26:24Z

OK, interesting, I did not know this could cause an accuracy drop. I can go either way: add a feature filter option, or just rely on offline model processing. It would be nice to have onnxruntime remove them for us automatically, though, since we often end up with multiple casts when experimenting.

nenad1002 added 3 commits May 22, 2025 16:32

Cast fusion

72d58df

Exit early on a hit

5d290cb

Add test files

776b6ea

nenad1002 changed the title ~~[DO NOT REVIEW YET] Fusion Cast~~ [DO NOT REVIEW YET] Draft PR - Fusion Cast May 22, 2025

nenad1002 added 5 commits May 22, 2025 20:50

Add check later

9c01f08

Fix a bug in the check

d8510c8

Fix special case where Cast has multiple outputs

1d31f0d

Fix special case where Cast has multiple outputs

d1c5431

Improve comments

e8bff72

nenad1002 changed the title ~~[DO NOT REVIEW YET] Draft PR - Fusion Cast~~ Draft PR - Fusion Cast May 22, 2025

github-actions bot reviewed May 22, 2025

onnxruntime/core/optimizer/cast_elimination.h

Fix linter

e892009

nenad1002 changed the title ~~Draft PR - Fusion Cast~~ Cast Nodes Fusion May 23, 2025

nenad1002 marked this pull request as ready for review May 23, 2025 15:04

nenad1002 requested a review from tianleiwu May 23, 2025 16:08
