Consider further speeding up division by const-generic specialization. #4

@eddyb

The main implementation of the ieee::sig::div ("arbitrary-precision division") function could be moved into a separate function with a const-generic parameter, const SPECIALIZE_FOR_KNOWN_DIVISOR: u128, which it would use like this:

// The parameter being `0` is like `None` - could use `Option<NonZeroU128>` in the future.
if SPECIALIZE_FOR_KNOWN_DIVISOR != 0 {
    assert_eq!(divisor[0], SPECIALIZE_FOR_KNOWN_DIVISOR);
    assert!(is_all_zeros(&divisor[1..]));
}

(Hopefully this is enough for LLVM to specialize the rest of the body, but the specialization can be forced further if necessary.)
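
If the assertions alone turn out not to be enough, one possible way to force the specialization is to use the constant itself in place of the runtime value once the check has passed. This is a toy sketch only: `div_rem_small`, the `u64` limb type, and the single-limb divisor are assumptions made for illustration and do not reflect the actual ieee::sig::div body.

```rust
/// Toy schoolbook division of a little-endian multi-limb number by one limb.
/// Divides `dividend` in place and returns the remainder.
/// Hypothetical helper, not the real `ieee::sig::div`.
fn div_rem_small<const SPECIALIZE_FOR_KNOWN_DIVISOR: u64>(
    dividend: &mut [u64],
    divisor: u64,
) -> u64 {
    // `0` still means "no known divisor". Otherwise, check the runtime value
    // once and then use the constant itself, so every `/` and `%` below has a
    // compile-time-constant right-hand side in this instantiation.
    let d: u64 = if SPECIALIZE_FOR_KNOWN_DIVISOR != 0 {
        assert_eq!(divisor, SPECIALIZE_FOR_KNOWN_DIVISOR);
        SPECIALIZE_FOR_KNOWN_DIVISOR
    } else {
        assert_ne!(divisor, 0, "division by zero");
        divisor
    };
    let d = u128::from(d);

    let mut rem: u128 = 0;
    // Walk from the most significant limb down, carrying the remainder.
    for limb in dividend.iter_mut().rev() {
        let cur = (rem << 64) | u128::from(*limb);
        *limb = (cur / d) as u64; // fits: rem < d implies cur / d < 2^64
        rem = cur % d;
    }
    rem as u64
}
```

Whether this is needed at all would have to be checked by inspecting the generated code; the sketch only shows where the constant could be threaded through.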

Then ieee::sig::div would become a "dispatch" fn that invokes one of N+1 instantiations of the const-generic implementation: one for each of N "commonly used divisors" (10 comes to mind, though IIRC there may be a whole sequence of powers of 5 for the conversion from decimal strings), plus one 0 instantiation (which isn't specialized at all). Because the code still performs the same division in every instantiation, we're only relying on the optimizer to actually turn the divisions into multiplications.
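
A minimal sketch of the dispatch shape described above, under loud assumptions: `div_impl` is a toy stand-in with `u64` limbs that only handles single-limb divisors, its signature is not the real one, and the divisor list (5, 10, 25) is a placeholder rather than a measured set of common divisors.

```rust
/// Local helper mirroring the `is_all_zeros` call in the snippet above.
fn is_all_zeros(limbs: &[u64]) -> bool {
    limbs.iter().all(|&limb| limb == 0)
}

/// Toy stand-in for the const-generic division body (hypothetical).
/// Divides the little-endian `dividend` in place and returns the remainder.
fn div_impl<const SPECIALIZE_FOR_KNOWN_DIVISOR: u64>(
    dividend: &mut [u64],
    divisor: &[u64],
) -> u64 {
    if SPECIALIZE_FOR_KNOWN_DIVISOR != 0 {
        assert_eq!(divisor[0], SPECIALIZE_FOR_KNOWN_DIVISOR);
        assert!(is_all_zeros(&divisor[1..]));
    } else {
        // Toy limitation, not part of the proposal: one-limb divisors only.
        assert!(is_all_zeros(&divisor[1..]), "toy handles one-limb divisors");
        assert_ne!(divisor[0], 0, "division by zero");
    }
    let d = u128::from(divisor[0]);
    let mut rem: u128 = 0;
    // Schoolbook division from the most significant limb down.
    for limb in dividend.iter_mut().rev() {
        let cur = (rem << 64) | u128::from(*limb);
        *limb = (cur / d) as u64;
        rem = cur % d;
    }
    rem as u64
}

/// The "dispatch" fn: N specialized instantiations plus the `0` fallback.
/// `divisor` must be non-empty.
fn div(dividend: &mut [u64], divisor: &[u64]) -> u64 {
    let fits_one_limb = is_all_zeros(&divisor[1..]);
    match (fits_one_limb, divisor[0]) {
        (true, 5) => div_impl::<5>(dividend, divisor),
        (true, 10) => div_impl::<10>(dividend, divisor),
        (true, 25) => div_impl::<25>(dividend, divisor),
        _ => div_impl::<0>(dividend, divisor),
    }
}
```

Every arm calls the same body, so the results are identical across instantiations; only the code the optimizer emits for the division is expected to differ.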


Whatever we do to the division algorithm, we shouldn't forget to add benchmarks first (unless the "from decimal" benchmark would cover enough interesting cases).
