[float8] Support power of 2 scales with PerRow scales for inference #2182

Open
danielvegamyhre opened this issue May 7, 2025 · 0 comments · May be fixed by #2256

@danielvegamyhre
Contributor

Summary

  • float8 training with rowwise scales uses power-of-2 scales by default, to reduce quantization error.
  • float8 inference with Float8DynamicActivationFloat8WeightConfig using PerRow scaling doesn't support power-of-2 scales.
  • Users have reported wanting to use power-of-2 scales for inference after training with them.
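
To make the request concrete, here is a minimal pure-Python sketch of what "power-of-2 rowwise scales" means. It is not the torchao implementation: `rowwise_scales` and `round_down_to_power_of_2` are hypothetical names for illustration, and the sketch assumes the float8 e4m3 format with a maximum finite value of 448 and a scale defined as `max_value / row_amax`.

```python
import math

# Assumption: float8 e4m3 target format, largest finite value 448.
E4M3_MAX = 448.0


def round_down_to_power_of_2(scale: float) -> float:
    """Return the largest power of 2 that is <= scale (scale must be > 0)."""
    # frexp gives scale = m * 2**e with 0.5 <= m < 1, so 2**(e-1) <= scale.
    _, exponent = math.frexp(scale)
    return math.ldexp(1.0, exponent - 1)


def rowwise_scales(matrix, power_of_2=True, eps=1e-12):
    """Compute one scale per row (hypothetical helper, not a torchao API)."""
    scales = []
    for row in matrix:
        # Clamp the row's absolute maximum to avoid dividing by zero.
        amax = max(max(abs(x) for x in row), eps)
        scale = E4M3_MAX / amax
        if power_of_2:
            # Rounding down keeps amax * scale within the float8 range.
            scale = round_down_to_power_of_2(scale)
        scales.append(scale)
    return scales


weights = [[1.0, -3.5], [0.25, 100.0]]
print(rowwise_scales(weights, power_of_2=False))
print(rowwise_scales(weights, power_of_2=True))  # [128.0, 4.0]
```

Rounding down rather than to the nearest power of 2 is a choice made in this sketch so that the scaled row maximum never exceeds the representable range. Multiplying by a power of 2 changes only the exponent of a value, which is why such scales avoid extra rounding error during scaling.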

cc @drisspg @vkuzo

@danielvegamyhre danielvegamyhre linked a pull request May 23, 2025 that will close this issue