
indicator for fast FMA #9855

Closed
simonbyrne opened this issue Jan 20, 2015 · 16 comments
Labels
maths Mathematical functions

Comments

@simonbyrne
Contributor

Now that we have an fma function, it would be useful to have some way of determining whether it is more efficient than the naive x*y + z, particularly in algorithms that use double-double-style arithmetic (C provides the FP_FAST_FMA macro for this purpose).

From #8112 (comment), it seems that the best option is to expose TargetLowering::isFMAFasterThanFMulAndFAdd (presumably in base/sysinfo.jl?)

@ivarne
Member

ivarne commented Jan 20, 2015

Isn't this the problem muladd from #9840 is intended to solve?

@simonbyrne
Contributor Author

Not quite: they are related but distinct.

  • The purpose of muladd is to compute x*y+z in the fastest way possible (e.g. in a Horner evaluation scheme for a polynomial).
  • The purpose of this is to determine whether fma is fast (i.e. implemented in hardware) or slow (i.e. emulated in software), in which case you might want to use a different approach altogether. An example is here: if you have a hardware fma, the computation reduces to 5 operations; otherwise you need 12 (using a software fma would most likely be even slower).
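To illustrate why the branch matters (a sketch of the standard error-free product transformations, not code from this issue): with a hardware fma the exact rounding error of a product costs one extra instruction, while the fma-free alternative (Dekker's splitting) needs many more flops.

```julia
# With fma: the error of x*y in two operations.
function two_prod_fma(x::Float64, y::Float64)
    p = x * y
    e = fma(x, y, -p)   # exact rounding error of the product
    return p, e
end

# Without fma: Dekker's algorithm, splitting each operand into high/low halves.
function two_prod_dekker(x::Float64, y::Float64)
    p = x * y
    s = 2.0^27 + 1.0                     # splitting constant for Float64
    cx = s * x; hx = cx - (cx - x); lx = x - hx
    cy = s * y; hy = cy - (cy - y); ly = y - hy
    e = ((hx * hy - p) + hx * ly + lx * hy) + lx * ly
    return p, e
end
```

Both return the same (p, e) pair in the absence of overflow; the difference is purely operation count.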

@simonbyrne
Contributor Author

As a rough point of reference, a software fma is about 10x slower than a (non-fused) multiply-add on my computer.

@nalimilan
Member

Couldn't a function be designed so that it would automatically use the fastest solution? Maybe I'm being naive though.

@simonbyrne
Contributor Author

@nalimilan I'm not sure what you mean: do you mean something like specifying two code paths and letting the compiler pick the one it likes the most?

@nalimilan
Member

@simonbyrne No, I wonder whether u+muladd(fma(-u,f,2(f-u)),g,q) couldn't be automatically translated to the detailed steps you show here if fma is known to be slow. But I guess it's more complex than that.

@simonbyrne
Contributor Author

Ah I see: you could do that, but it would probably be around twice as slow as the alternative: a decent software fma (e.g. openlibm) does a few extra operations to avoid things like overflow and double rounding, which are not needed in this particular case.

@eschnett
Contributor

Another point: fma is guaranteed to avoid rounding the intermediate result, which improves accuracy, and this is what allows alternative algorithms to be used. muladd, on the other hand, may be fast, but may still round the intermediate result. For example, ARM has both vmla and fma instructions -- the former rounds the intermediate result, the latter doesn't.
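The single-rounding guarantee is easy to observe (my example, not from the thread): pick an x whose square needs more than 53 significant bits.

```julia
x = 1.0 + 2.0^-27        # exactly representable in Float64
# Exact square: 1 + 2^-26 + 2^-54; the 2^-54 term is below half an ulp of the
# result, so a plain multiply rounds it away, while fma keeps it.
x * x - (1.0 + 2.0^-26)      # 0.0: the intermediate product was rounded
fma(x, x, -(1.0 + 2.0^-26))  # 2^-54: fma uses the exact product
```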

@JeffreySarnoff
Contributor

When using double-double algorithms for extended-precision math, or to get Float64 results from Float32-friendly GPUs, fma is essential (it is possible to do without it, but everything takes much, much longer and gets more complicated). It is the fact that fma rounds only once that matters here.

After trying it both ways, I am using fma everywhere possible -- whether or not the fma is slow (software-emulated); the alternative is backwards-facing and confounding for careful numerics.

@musm
Contributor

musm commented Sep 20, 2016

How can TargetLowering::isFMAFasterThanFMulAndFAdd be exposed? It's a C++ function.

@eschnett
Contributor

The canonical way would be to either introduce an intrinsic function that returns a Bool, or to provide a C wrapper that can be called via ccall. Both require changes to Julia's core.

@simonbyrne
Contributor Author

There is currently an attempt to do this in the code by comparing muladd to fma; however, it doesn't seem to work:

const FMA_NATIVE = muladd(nextfloat(1.0),nextfloat(1.0),-nextfloat(1.0,2)) == -4.930380657631324e-32

as on my machine (which does have FMA) I get:

julia> Base.Math.FMA_NATIVE
false

@JeffreySarnoff
Contributor

The sign is wrong, try this:

const FMA_NATIVE = muladd(nextfloat(1.0),nextfloat(1.0),-nextfloat(1.0,2)) == 4.930380657631324e-32
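The positive sign can be checked by hand (my arithmetic, not part of the thread): with ε = eps(1.0) = 2^-52, nextfloat(1.0) = 1 + ε and nextfloat(1.0, 2) = 1 + 2ε, so the exact residual is (1+ε)² - (1+2ε) = ε² = 2^-104 ≈ +4.93e-32. A fused multiply-add returns exactly that positive value, while an unfused multiply rounds (1+ε)² to 1 + 2ε and yields 0.0.

```julia
# Sketch: why the corrected constant has a positive sign.
ε = eps(1.0)                        # 2^-52; nextfloat(1.0) == 1 + ε
r = fma(1 + ε, 1 + ε, -(1 + 2ε))    # exact: (1+ε)^2 - (1+2ε) = ε^2
r == 2.0^-104                       # true; 2^-104 == 4.930380657631324e-32
```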

@simonbyrne
Contributor Author

🤦‍♂

@simonbyrne
Contributor Author

(fixed in #32318)

@simonbyrne
Contributor Author

I'll close this for now; please move all discussion to #33011.
