Consider the following LLVM IR:
declare half @llvm.fma.f16(half %a, half %b, half %c)
define half @do_fma(half %a, half %b, half %c) {
  %res = call half @llvm.fma.f16(half %a, half %b, half %c)
  ret half %res
}
On targets without native half FMA support, LLVM turns this into the equivalent of:
declare float @llvm.fma.f32(float %a, float %b, float %c)
define half @do_fma(half %a, half %b, half %c) {
  %a_f32 = fpext half %a to float
  %b_f32 = fpext half %b to float
  %c_f32 = fpext half %c to float
  %res_f32 = call float @llvm.fma.f32(float %a_f32, float %b_f32, float %c_f32)
  %res = fptrunc float %res_f32 to half
  ret half %res
}
This is a miscompilation, however: float does not have enough precision to perform a fused multiply-add for half without double rounding becoming an issue, because the exact result is rounded once to float and then a second time to half. For instance (raw bits of each half are in brackets): do_fma(48.34375 (0x520b), 0.000013887882 (0x00e9), 0.12438965 (0x2ff6)) = 0.12512207 (0x3001), but LLVM's lowering to a float FMA gives an incorrect result of 0.125 (0x3000).
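The example above can be reproduced outside of LLVM. The following is a minimal Python sketch, not LLVM's own code: it assumes that `struct`'s `e` and `f` formats round to nearest, ties to even, when narrowing a double, and the helper names are made up for illustration. It relies on the exact value of a * b + c for these inputs fitting in a double, so a single narrowing step models each lowering.

```python
import struct
from fractions import Fraction

def half_from_bits(bits):
    """Decode 16 raw bits as an IEEE 754 binary16 value (as a Python float)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def half_to_bits(x):
    """Round a double to half and return the raw 16 bits."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def round_to_float32(x):
    """Round a double to the nearest float, returned as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

a = half_from_bits(0x520B)  # 48.34375
b = half_from_bits(0x00E9)  # ~0.000013887882 (subnormal)
c = half_from_bits(0x2FF6)  # ~0.12438965

# Exact rational value of a * b + c, and the same value computed in double.
exact = Fraction(a) * Fraction(b) + Fraction(c)
sum_f64 = a * b + c
assert Fraction(sum_f64) == exact  # no rounding happened in double

# float lowering: one rounding to float (what a float FMA returns), then to half.
float_fma = round_to_float32(sum_f64)
via_float_bits = half_to_bits(float_fma)

# Single rounding of the exact value straight to half.
correct_bits = half_to_bits(sum_f64)

print(hex(via_float_bits))  # prints 0x3000
print(hex(correct_bits))    # prints 0x3001
```

The intermediate `float_fma` is 0.12506103515625, which lies exactly halfway between 0x3000 and 0x3001, so the second rounding resolves the tie to the even value 0x3000 even though the exact result is slightly above the midpoint.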
A correct lowering would need to use double (or larger). A double FMA is not required, as double is large enough to represent the result of half * half without any rounding, so llvm.fmuladd (which may or may not be fused) is sufficient. In summary, a correct lowering would look something like this:
declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
define half @do_fma(half %a, half %b, half %c) {
  %a_f64 = fpext half %a to double
  %b_f64 = fpext half %b to double
  %c_f64 = fpext half %c to double
  %res_f64 = call double @llvm.fmuladd.f64(double %a_f64, double %b_f64, double %c_f64)
  %res = fptrunc double %res_f64 to half
  ret half %res
ret half %res
}
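As a sanity check of the double-based lowering, the sketch below compares it against an exact reference on random inputs. It is a sampling test under stated assumptions, not a proof: the helper names are hypothetical, Python floats stand in for double, `struct`'s `e` format is assumed to round ties to even, and inputs whose exact result would overflow half are skipped.

```python
import random
import struct
from fractions import Fraction

def half_from_bits(bits):
    """Decode 16 raw bits as an IEEE 754 binary16 value (as a Python float)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def round_double_to_half(x):
    """Round a double to the nearest half, modelling fptrunc double to half."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def round_exact_to_half(q):
    """Round an exact rational to half in a single step (ties to even).

    Only valid for |q| < 65504, so overflow never has to be handled.
    """
    if q == 0:
        return 0.0
    mag = abs(q)
    # e = floor(log2(mag)), clamped to the minimum normal exponent of half.
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    e = max(e, -14)
    ulp = Fraction(2) ** (e - 10)  # spacing of half values near mag
    steps = round(mag / ulp)       # rounding a Fraction is ties-to-even
    result = float(steps * ulp)    # exactly representable, no extra rounding
    return -result if q < 0 else result

def random_finite_half(rng):
    """Pick a random half bit pattern, skipping infinities and NaNs."""
    while True:
        bits = rng.getrandbits(16)
        if (bits >> 10) & 0x1F != 0x1F:
            return half_from_bits(bits)

def count_mismatches(samples, seed=0):
    """Count inputs where the double lowering differs from the exact result."""
    rng = random.Random(seed)
    mismatches = 0
    checked = 0
    while checked < samples:
        a, b, c = (random_finite_half(rng) for _ in range(3))
        exact = Fraction(a) * Fraction(b) + Fraction(c)
        if abs(exact) >= 65504:
            continue  # result would overflow half; outside this sketch's scope
        checked += 1
        # a * b is exact in double; the add rounds once, then we narrow to half.
        via_double = round_double_to_half(a * b + c)
        if via_double != round_exact_to_half(exact):
            mismatches += 1
    return mismatches

print(count_mismatches(2000))
```

Comparing values rather than raw bits means the sign of a zero result is not checked here; extending the reference to cover signed zeros, overflow, and NaN propagation would be needed for a complete test.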