Open
Labels
feature: New feature or request
Description
currently, we use float32 tensors almost everywhere, which yields very large models.
after discussion with @martinjaggi: training is hard to do without float32, but inference can probably use uint8 tensors, shrinking trained models by up to 4x (1 byte per value instead of 4).
note: check that the model is still behaving correctly after quantization
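
to make the idea concrete, here is a minimal sketch of affine (min/max) uint8 quantization, in plain Python. this is not the project's implementation: the function names `quantize` and `dequantize`, the per-tensor min/max scheme, and the sample weights are all assumptions chosen for illustration. it also shows one cheap sanity check for the note above (the reconstruction error), which is only a proxy for checking actual model behaviour.

```python
from array import array


def quantize(weights: array) -> tuple[array, float, float]:
    """Map float32 weights onto uint8 codes 0..255 (affine quantization)."""
    lo, hi = min(weights), max(weights)
    # Guard against a constant tensor, where hi == lo would give a zero scale.
    scale = (hi - lo) / 255 or 1.0
    codes = array("B", (round((w - lo) / scale) for w in weights))
    return codes, scale, lo


def dequantize(codes: array, scale: float, lo: float) -> array:
    """Recover approximate float32 weights from the uint8 codes."""
    return array("f", (lo + c * scale for c in codes))


# Hypothetical weights standing in for one tensor of a trained model.
weights = array("f", [-1.0, -0.25, 0.0, 0.5, 1.0])
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)

# Storage per value: 4 bytes for float32 ("f") versus 1 byte for uint8 ("B").
size_ratio = (weights.itemsize * len(weights)) / (codes.itemsize * len(codes))
# Rounding to the nearest code bounds the error by about half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"size ratio: {size_ratio}, max error: {max_error:.5f}, step: {scale:.5f}")
```

the `scale` and `lo` values have to be stored next to the codes, which is why the saving is "up to" 4x and not exactly 4x. a small reconstruction error does not guarantee unchanged predictions, so comparing the quantized model's accuracy against the float32 model on held-out data is still needed.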