
[QNN EP] Fix 16x16 MatMul translation #24846

Open
wants to merge 1 commit into base: main

Conversation

quic-tirupath
Contributor

Description

  • QNN's 16x16 FC op doesn't support asymmetric int16 weights.
  • QNN's 16x16 MatMul doesn't support an asymmetric int16 weight initializer.
  • Insert a Convert op to convert the asymmetric uint16 weight to a symmetric int16 weight (see the sketch below).
  • Add unit tests to verify 16x16 MatMul translations.
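
For reference, a minimal sketch of the arithmetic behind the inserted Convert op: an asymmetric uint16 value q with scale s and zero point zp encodes real = s * (q - zp), so subtracting zp while keeping the same scale yields an equivalent symmetric int16 encoding with zero point 0. The helper name below is hypothetical and is not the actual QNN EP implementation.

```cpp
// Sketch of re-encoding an asymmetric uint16 quantized weight as symmetric int16.
#include <algorithm>
#include <cstdint>
#include <vector>

std::vector<int16_t> ConvertUint16AsymToInt16Sym(const std::vector<uint16_t>& q_u16,
                                                 int32_t zero_point /* asymmetric offset */) {
  std::vector<int16_t> q_s16;
  q_s16.reserve(q_u16.size());
  for (uint16_t q : q_u16) {
    // Shift by the zero point; saturate in case the shifted value
    // falls outside the int16 range [-32768, 32767].
    int32_t shifted = static_cast<int32_t>(q) - zero_point;
    shifted = std::clamp(shifted, -32768, 32767);
    q_s16.push_back(static_cast<int16_t>(shifted));
  }
  return q_s16;  // Same scale as before, zero point now 0 (symmetric).
}
```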

Motivation and Context

  • This fix schedules 16x16 MatMul ops on the QNN HTP accelerator.
  • This improves the inference time of models that contain 16x16 MatMul operators.

@HectorSVC added the ep:QNN (issues related to QNN execution provider) label on May 23, 2025
@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline


Azure Pipelines successfully started running 5 pipeline(s).

Labels
ep:QNN issues related to QNN execution provider
Projects
None yet
2 participants