
Add bias support to QQLinear #3215

Open

mdepree wants to merge 1 commit into ml-explore:main from mdepree:qqlinear-bias

Conversation

mdepree commented Mar 6, 2026

Proposed changes

Currently, QQLinear does not support bias terms, which prevents quantization of models that use biased linear layers. This limitation is noted in the existing docstring: "Note: This layer does not support a bias term yet." This PR adds support for an optional bias term to the QQLinear layer, bringing it in line with Linear and QuantizedLinear.
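In effect, a biased linear layer computes y = W x + b, with the bias add skipped when no bias was configured. The following is a minimal, self-contained C++ sketch of that optional-bias forward pass; it is illustrative only and is not MLX's actual QQLinear implementation (which operates on quantized weights):

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Illustrative sketch (not MLX code): a linear forward pass with an
// optional bias, mirroring the behavior this PR adds to QQLinear.
std::vector<float> linear(
    const std::vector<std::vector<float>>& weight, // shape [out, in]
    const std::vector<float>& x,                   // shape [in]
    const std::optional<std::vector<float>>& bias) // shape [out], or nullopt
{
  std::vector<float> y(weight.size(), 0.0f);
  for (size_t o = 0; o < weight.size(); ++o) {
    for (size_t i = 0; i < x.size(); ++i) {
      y[o] += weight[o][i] * x[i];
    }
    // The bias is added after the matmul; in the quantized case this
    // happens after dequantization of the accumulated output.
    if (bias) {
      y[o] += (*bias)[o];
    }
  }
  return y;
}
```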

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

@angeloskath angeloskath requested a review from nastya236 March 7, 2026 00:59
nastya236 (Collaborator) left a comment

Thanks for adding this. Unfortunately, the PR is incomplete: adding a bias also requires changes to how layouts are initialized for the QQ matmul in mlx/backend/cuda/quantized/cublas_qqmm.h. To support a bias, the implementation needs to set the matmul epilogue, similarly to how it is handled in regular GEMM and Matmul::eval_gpu().
Happy to iterate on this with you.
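For context, fusing a bias add into a cuBLASLt matmul is typically done by setting epilogue attributes on the matmul descriptor. The sketch below uses real cuBLASLt attribute names (CUBLASLT_EPILOGUE_BIAS, CUBLASLT_MATMUL_DESC_BIAS_POINTER), but how this would be wired into cublas_qqmm.h is an assumption, not the actual MLX change:

```cpp
#include <cublasLt.h>

// Hedged sketch: enable a fused bias epilogue on a cuBLASLt matmul
// descriptor. Integration with MLX's cublas_qqmm.h is hypothetical.
void set_bias_epilogue(cublasLtMatmulDesc_t desc, const void* bias_ptr) {
  // Ask cuBLASLt to add the bias vector as part of the matmul epilogue.
  cublasLtEpilogue_t epilogue = CUBLASLT_EPILOGUE_BIAS;
  cublasLtMatmulDescSetAttribute(
      desc, CUBLASLT_MATMUL_DESC_EPILOGUE, &epilogue, sizeof(epilogue));
  // Point the epilogue at the bias vector (one element per output row).
  cublasLtMatmulDescSetAttribute(
      desc, CUBLASLT_MATMUL_DESC_BIAS_POINTER, &bias_ptr, sizeof(bias_ptr));
}
```

This mirrors how bias fusion is done for ordinary GEMM with cuBLASLt; the reviewer's suggestion is to follow the same pattern already used by Matmul::eval_gpu().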

