x-ref: #20637. A couple of observations: writing pow(x, 4) in C does not result in three multiplications, it results in a call to the libm pow function, just like in Julia. The expression x*x*x*x is numerically distinguishable from and less accurate than calling pow(x, 4):
julia> f1(x) = x^4
f1 (generic function with 1 method)
julia> f2(x) = x*x*x*x
f2 (generic function with 1 method)
julia> f1(0.1)
0.00010000000000000002
julia> f2(0.1)
0.00010000000000000003
Note that this is also distinct from (x*x)*(x*x), since floating-point multiplication is not associative:
julia> f3(x) = (x*x)*(x*x)
f3 (generic function with 1 method)
julia> f3(0.1)
0.00010000000000000005
This last f3 version is even more efficient (only two multiplies), but even less accurate than f2.
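To put numbers on "less accurate", here is a minimal sketch that compares each variant against a BigFloat reference. The helper names `reference` and `err_in_ulps` are made up for this example, and it assumes the default BigFloat precision (256 bits) is enough to treat the reference as exact.

```julia
# The three ways of computing x^4 discussed in this comment.
f1(x) = x^4
f2(x) = x*x*x*x
f3(x) = (x*x)*(x*x)

# Hypothetical helper: high-precision reference value. big(x) converts the
# Float64 input exactly, so only the power itself is rounded (at 256 bits).
reference(x) = big(x)^4

# Hypothetical helper: error of the Float64 result y, expressed in units of
# the Float64 spacing (eps) at the reference value.
function err_in_ulps(y, x)
    ref = reference(x)
    return Float64(abs(big(y) - ref) / eps(Float64(ref)))
end

x = 0.1
for (name, f) in (("x^4", f1), ("x*x*x*x", f2), ("(x*x)*(x*x)", f3))
    println(rpad(name, 12), " = ", f(x), "   error in ulps: ", err_in_ulps(f(x), x))
end
```

The exact figures depend on the input and on the pow implementation in use, so it is worth trying a range of x values rather than relying on 0.1 alone.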
I'm going to close this as "not a bug" since this behavior is correct and intentional, and just leave #20637 as the issue tracking making code generation for literal power expressions generate more efficient (and ideally no less accurate) code.
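For anyone who wants the multiplication-only code today and can live with the accuracy trade-off described above, one option is a small user-side helper. This is only a sketch: `unrolled_pow` is a hypothetical name, not something in Base and not what #20637 proposes, and it assumes a non-negative integer exponent known at compile time.

```julia
# Hypothetical helper, NOT part of Base: expand x^N into multiplications by
# repeated squaring. N is passed as Val(N) so the recursion is resolved at
# compile time. Assumes N >= 0; negative exponents are not handled.
@inline unrolled_pow(x, ::Val{0}) = one(x)
@inline unrolled_pow(x, ::Val{1}) = x
@inline function unrolled_pow(x, ::Val{N}) where {N}
    half = unrolled_pow(x, Val(N ÷ 2))   # x^(N ÷ 2)
    sq = half * half                     # x^(2 * (N ÷ 2))
    return isodd(N) ? sq * x : sq        # one extra multiply for odd N
end

# For N = 4 this evaluates (x*x)*(x*x), i.e. the same operations as f3.
println(unrolled_pow(0.1, Val(4)))
println(unrolled_pow(2.0, Val(10)))
```

Because it groups the multiplications like f3, it inherits f3's rounding behavior rather than that of pow.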
I wanted to write functions of the form x^m*y^n for integer m and n. For m and n less than 4, the generated code seems reasonable:
define double @julia_foo_60944(double, double) #0 !dbg !5 {
top:
  %2 = fmul double %0, %0
  %3 = fmul double %2, %0
  %4 = fmul double %1, %1
  %5 = fmul double %4, %1
  %6 = fmul double %3, %5
  ret double %6
}
As soon as either m or n is 4 or higher, function evaluations get orders of magnitude slower and the generated code seems messed up, though I am no LLVM expert:
define double @julia_foo_60943(double, double) #0 !dbg !5 {
top:
  %2 = call double @llvm.pow.f64(double %0, double 4.000000e+00)
  %3 = fadd double %0, 4.000000e+00
  %notlhs = fcmp ord double %2, 0.000000e+00
  %notrhs = fcmp uno double %3, 0.000000e+00
  %4 = or i1 %notrhs, %notlhs
  br i1 %4, label %L12, label %if
if: ; preds = %top
  call void @jl_throw(i8** inttoptr (i64 140367321804096 to i8**))
  unreachable
L12: ; preds = %top
  %5 = fmul double %1, %1
  %6 = fmul double %5, %1
  %7 = fmul double %6, %2
  ret double %7
}
To me this seems to be a bug, as I would expect f2 to produce code similar to f3:
define double @julia_foo_60945(double, double) #0 !dbg !5 {
top:
  %2 = fmul double %0, %0
  %3 = fmul double %2, %0
  %4 = fmul double %3, %0
  %5 = fmul double %4, %1
  %6 = fmul double %5, %1
  %7 = fmul double %6, %1
  ret double %7
}
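For reference, listings like the three above can be reproduced with something along these lines. This is a sketch under assumptions: the names `foo3`, `foo4` and `foo_mul` are made up here, the exponents are read off the IR (x^3*y^3, x^4*y^3, and the hand-written product), and `using InteractiveUtils` is what a script on Julia 1.x needs to get `@code_llvm`. Other Julia versions may well emit different IR.

```julia
using InteractiveUtils  # provides @code_llvm outside the REPL (Julia 1.x)

# Hypothetical definitions matching the three IR listings.
foo3(x, y)    = x^3 * y^3              # both exponents below 4
foo4(x, y)    = x^4 * y^3              # one exponent equal to 4
foo_mul(x, y) = x*x*x*x * y*y*y        # the same power written as multiplies

# Print the LLVM IR generated for Float64 arguments.
@code_llvm foo3(1.0, 2.0)
@code_llvm foo4(1.0, 2.0)
@code_llvm foo_mul(1.0, 2.0)
```

Comparing the `foo4` and `foo_mul` output side by side shows whether a pow call or a chain of fmul instructions was generated on a given setup.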
This behaviour has been produced using the following setup: