The docs say the FPU is deeply pipelined (e.g., ~10 stages for FMA) and can issue one op per cycle, except for FDIV, FSQRT, and subnormals, but I can't find the latency for FADD.D. Is there a canonical number, or does it depend on the config? If it's the latter, what's the recommended way to read the add pipeline depth from FpuCore.scala? Any pointers would help. Thanks!