[ET-VK][Ops] dequantize_per_channel shaders and impl #12435


Merged: 9 commits merged on Jul 14, 2025

Conversation

pytorchbot
Collaborator

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #12207 by @ahmtox
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/ahmtox/35/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/35/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/ahmtox/34/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/35/orig
@diff-train-skip-merge

morelos added 8 commits July 13, 2025 21:36
Pull Request resolved: #12199

# Context

A few operators have been recently created, namely:
- quantize_per_tensor
- quantize_per_token
- dequantize_per_tensor
- dequantize_per_token
- choose_qparams.tensor
- choose_qparams_per_token_asymmetric

These operators have no namespace associated with them. Since we are aligning with the ATen implementations, which live in the quantized_decomposed namespace, this diff adds that namespace. Furthermore, our operators need to match the inputs of the ATen versions, so we also pass dtypes.

# Changes

The primary change is adding the quantized_decomposed namespace to all the operators named above. We also change the testing framework to pass the dummy dtypes expected by the ATen implementation. Finally, we change the `choose_qparams` logic to pass eps through properly: eps is a relevant variable and cannot be defaulted, even though the existing op_quantize CPU reference in ExecuTorch does not make distinct use of it.
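To illustrate why eps must be passed through rather than defaulted, here is a minimal sketch of the standard asymmetric affine qparams computation (the function name, defaults, and exact clamping are assumptions based on the common decomposed-quantization recipe, not this PR's code):

```python
def choose_qparams(x, quant_min=-128, quant_max=127, eps=1e-5):
    # Include zero in the observed range so that 0.0 is exactly representable.
    min_val = min(min(x), 0.0)
    max_val = max(max(x), 0.0)
    scale = (max_val - min_val) / (quant_max - quant_min)
    # eps guards against a zero (or denormal) scale for near-constant inputs;
    # silently defaulting it changes results, hence it is a real parameter.
    scale = max(scale, eps)
    zero_point = quant_min - round(min_val / scale)
    zero_point = max(quant_min, min(quant_max, zero_point))
    return scale, zero_point
```

For an all-zero input the scale collapses to eps, which is exactly the case where an implementation that ignores eps diverges from ATen.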
ghstack-source-id: 295972783
@exported-using-ghexport

Differential Revision: [D77746144](https://our.internmc.facebook.com/intern/diff/D77746144/)
Pull Request resolved: #12200

# Context

Certain quantization operators need scales and zero points stored with buffer storage. Since the existing op_registry does not allow specifying the memory or storage layout of input parameters, we specify that the optimal storage type is buffer, so that a conversion pass is added to ensure the inputs are also buffers.

# Changes

This moves the quantized_decomposed operators into their own registration, while also specifying that buffer storage is preferred.
ghstack-source-id: 295972779
@exported-using-ghexport

Differential Revision: [D77746131](https://our.internmc.facebook.com/intern/diff/D77746131/)
Pull Request resolved: #12201

# Context

We need this conversion so that certain operators can handle floating-point values that must be 64-bit. This predominantly applies to choose_qparams.tensor, which expects a 64-bit output.

# Changes

Simply adds a conversion from float64 to Vulkan fp32.
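The effect of this narrowing can be illustrated with a round trip through IEEE-754 binary32 (a standalone sketch, not the actual conversion code in this diff):

```python
import struct

def to_fp32(x: float) -> float:
    # Python floats are 64-bit; packing with "f" narrows to binary32,
    # mimicking the fp64 -> Vulkan fp32 conversion this diff enables.
    return struct.unpack("f", struct.pack("f", x))[0]
```

Values exactly representable in binary32 (e.g. 1.0) survive unchanged, while others (e.g. 0.1) pick up the expected narrowing error.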
ghstack-source-id: 295972781
@exported-using-ghexport

Differential Revision: [D77746137](https://our.internmc.facebook.com/intern/diff/D77746137/)
Pull Request resolved: #12203

# Context

quantize_per_channel was not perfectly aligned with the ATen implementation and produced errors when a non-default axis was specified. The bug went unnoticed because the test suite had only one test for the whole operator. To align more closely with ATen, this change replaces the old `apply_over_dim_list` approach with a single-loop implementation that computes the channel index directly.

# Changes

We change the core logic of quantize_per_channel to align with ATen's implementation, replacing the `apply_over_dim_list` approach with a single loop and direct channel-index calculation. We also add more comprehensive tests for quantize_per_channel so that such a bug is not missed again.
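The single-loop-with-direct-channel-index idea can be sketched as follows (a NumPy illustration of the standard per-channel affine recipe, q = clamp(round(x / scale) + zero_point, quant_min, quant_max); names and layout assumptions are mine, not the PR's C++/GLSL code):

```python
import numpy as np

def quantize_per_channel(x, scales, zero_points, axis, quant_min, quant_max):
    out = np.empty(x.shape, dtype=np.int64)
    flat = x.reshape(-1)        # contiguous row-major view
    out_flat = out.reshape(-1)
    # For flat index i, the channel along `axis` is (i // inner) % n_channels,
    # where `inner` is the number of elements in the dims after `axis`.
    inner = int(np.prod(x.shape[axis + 1:], dtype=np.int64)) if axis + 1 < x.ndim else 1
    n_channels = x.shape[axis]
    for i in range(flat.size):
        c = (i // inner) % n_channels
        q = int(np.round(flat[i] / scales[c])) + int(zero_points[c])
        out_flat[i] = min(max(q, quant_min), quant_max)
    return out
```

A single loop with this index formula handles any axis uniformly, which is exactly the axis-handling the old approach got wrong.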
ghstack-source-id: 295972782
@exported-using-ghexport

Differential Revision: [D77746130](https://our.internmc.facebook.com/intern/diff/D77746130/)
Pull Request resolved: #12204

# Context

In order to properly enable dynamic quantization, we create the quantize_per_channel operator, as it is useful to have in the pipeline.

# Changes

This creates the wrapper for the CPU reference implementation, along with a dummy reference implementation to test against.
ghstack-source-id: 295972785
@exported-using-ghexport

Differential Revision: [D77746132](https://our.internmc.facebook.com/intern/diff/D77746132/)
Pull Request resolved: #12205

# Context

We need to enable the core logic for quantize_per_channel in the vulkan shader. This implements the shader itself and its cpp header.

TODO: add more of a description regarding the operator

# Changes

This creates an extension of the existing files for quantize_per_channel.
ghstack-source-id: 295972786
@exported-using-ghexport

Differential Revision: [D77746140](https://our.internmc.facebook.com/intern/diff/D77746140/)
Pull Request resolved: #12206

# Context

In order to properly enable dynamic quantization, we create the dequantize_per_channel operator, as it is useful to have in the pipeline. For more context on the ATen-to-ETen change: the optionals did not correctly handle const and ref cases, so this change primarily adds functionality to handle those template mismatches.

# Changes

This creates the wrapper for the CPU reference implementation, along with a dummy reference implementation to test against. We also add an ATen-to-ETen test case for the new changes.
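For reference, per-channel dequantization is the inverse of the quantize recipe, dq = (q - zero_point) * scale with scale and zero_point broadcast along the channel axis. A NumPy sketch (an illustration of the standard math, not the PR's reference implementation):

```python
import numpy as np

def dequantize_per_channel(q, scales, zero_points, axis):
    # Reshape the per-channel parameters so they broadcast along `axis`.
    shape = [1] * q.ndim
    shape[axis] = -1
    s = np.asarray(scales, dtype=np.float64).reshape(shape)
    zp = np.asarray(zero_points, dtype=np.int64).reshape(shape)
    return (q - zp).astype(np.float64) * s
```

Round-tripping a tensor through quantize then dequantize should recover the input up to one quantization step per element.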
ghstack-source-id: 295972788
@exported-using-ghexport

Differential Revision: [D77746138](https://our.internmc.facebook.com/intern/diff/D77746138/)
Pull Request resolved: #12207

# Context

We need to enable the core logic for dequantize_per_channel in the vulkan shader. This implements the shader itself and its cpp header.

TODO: add more of a description regarding the operator

# Changes

This creates an extension of the existing files for dequantize_per_channel.
ghstack-source-id: 295972778
@exported-using-ghexport

Differential Revision: [D77746141](https://our.internmc.facebook.com/intern/diff/D77746141/)
@pytorchbot pytorchbot requested a review from SS-JIA as a code owner July 14, 2025 14:58

pytorch-bot bot commented Jul 14, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12435

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 91 Pending

As of commit ae226d6 with merge base 8f3eb3e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 14, 2025
Base automatically changed from gh/ahmtox/34/orig to main July 14, 2025 15:36

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@SS-JIA SS-JIA merged commit 63431bd into main Jul 14, 2025
95 of 96 checks passed
@SS-JIA SS-JIA deleted the gh/ahmtox/35/orig branch July 14, 2025 15:49