Trap floating point exceptions #27705
I'd say it would be relatively easy to get this working on linux as there's the It will require minor changes to the runtime (see, eg, https://github.com/JuliaLang/julia/blob/master/src/signals-unix.c#L743 ) so that the SIGFPE is turned into something other than a As to the correct julia API for calling
See also #5234 for somewhat related discussion. |
As it turns out, we can do the following (at least on Linux x86_64 with Julia >= 0.6) without changing the runtime:
# Bits for x86 FPU control word
const FE_INVALID = 0x1
const FE_DIVBYZERO = 0x4
const FE_OVERFLOW = 0x8
const FE_UNDERFLOW = 0x10
const FE_INEXACT = 0x20
fpexceptions() = ccall(:fegetexcept, Cint, ())
function setfpexceptions(f, mode)
    prev = ccall(:feenableexcept, Cint, (Cint,), mode)
    try
        f()
    finally
        ccall(:fedisableexcept, Cint, (Cint,), mode & ~prev)
    end
end
[edit: fixed some brokenness]
Thence,
julia> x = 0.0
0.0
julia> 1.0/x
Inf
julia> setfpexceptions(FE_DIVBYZERO) do
           1.0/x
       end
ERROR: DivideError: integer division error
Stacktrace:
 [1] /(::Float64, ::Float64) at ./float.jl:0
 [2] setfpexceptions(::##1#2, ::UInt8) at /home/tcfoster/sigfpe.jl:13
Unfortunately the system throws an integer division error, but at least you get a backtrace. |
Seems like a good thing to have official support for and throw the right exception. |
Yep. I wonder how these are best mapped to exceptions. The IEEE 754 standard defines five standard exceptions. We could just map these to our existing exceptions where possible:
Alternatively we could just define a new |
That seems like the best option since, as you say, trapping FPE is basically a debugging tool, and it would be annoying to have them caught by code that is not expecting them. Maybe an abstract HardwareFPException with more specific exceptions inheriting from it? |
That would also work and is easy to implement. On balance I'm inclined to have a single type for simplicity, given that it's probably a debugging tool and that we don't catch exceptions by type in any case. |
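For illustration, the single-type option might look roughly like the sketch below. Everything here is hypothetical: `FloatingPointTrap`, `fpe_names`, and `FPE_FLAG_NAMES` are made-up names, not anything in Base, and the flag values are simply the x86_64 ones quoted earlier in this thread (they differ on other architectures).

```julia
# Hypothetical sketch only: none of these names exist in Base.
# Flag values are the x86_64 ones quoted earlier in this thread.
const FPE_FLAG_NAMES = (
    (0x01, :invalid),
    (0x04, :divbyzero),
    (0x08, :overflow),
    (0x10, :underflow),
    (0x20, :inexact),
)

# A single exception type that records which IEEE 754 exceptions were raised.
struct FloatingPointTrap <: Exception
    flags::UInt8
end

# Decode a flag mask into the names of the exceptions it contains.
fpe_names(flags::Integer) =
    Symbol[name for (bit, name) in FPE_FLAG_NAMES if flags & bit != 0]

# Print the decoded names instead of the raw mask.
function Base.showerror(io::IO, e::FloatingPointTrap)
    print(io, "FloatingPointTrap: ", join(fpe_names(e.flags), ", "))
end
```

With one type carrying the mask, code that wants to distinguish cases can still inspect `e.flags`, while nothing accidentally catches it as a `DivideError` or `DomainError`.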
I just discovered significant prior discussion related to these issues, particularly at @simonbyrne are you still interested in thinking about floating point exceptions? This issue is slightly different from the previous ones, in that it asks whether we should have a way to turn SIG_FPEs into julia exceptions immediately via the signal handler. That should be fairly easy, but I'm not completely sure about the correct API. Currently I think it should be a debugging tool only, perhaps emitting a single I do think the prior discussion (eg, #5234 (comment)) shows that using dynamically scoped FPE masks leads to inherently non-composable code, and should not be used for "real work". This is also my experience in trying to turn on |
The other issue is that LLVM isn't aware of floating point exceptions, so it may reorder operations or propagate constants in a way that prevents exceptions from being triggered. The situation has changed somewhat with the addition of LLVM constrained intrinsics, but we need to figure out how to integrate them. My current thinking is that floating point exceptions and rounding should be done using Cassette.jl, as this would let you overload the necessary intrinsics and allow users to add custom hooks. |
That's interesting, thanks. I figured having a solid general solution for FPEs would require some fairly deep integration with the compiler. Would that subsume the feature request in this issue (ie, the ability to do simple fail-fast SIGFPE trapping for debugging)? To me these seem like they might be somewhat orthogonal features. |
My comment was specifically referring to your concerns about the dynamic scoping, but you're right they are somewhat orthogonal. I actually did try this out on a branch 4 years ago, and I was surprised how well it worked given my scant knowledge of C, but there were a few issues that would need to be figured out. |
Hah, I had an extremely similar branch with the following relevant commit: adeaa4b Enabling and disabling the FPE processor flags seemed pretty ugly and system dependent when I looked into it. |
So, if we were going to implement a version of this for debugging purposes, how about the following concrete and minimal proposal:
|
The interface is a bit low-level; adding exception types for every exception would allow for an interface like |
Yes, I'm not sure about the bitmasks. But I think you want
old_fpes = setfpe(new_fpes)
some_code_to_be_debugged()
setfpe(old_fpes)
and this seems like the simplest way to achieve it with the least number of new functions and types. I guess
setfpe(new_fpes) do
    some_code_to_be_debugged()
end
though I'm not sure we should encourage that! |
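A rough sketch of how the proposed `setfpe` could be emulated today, assuming Linux with glibc (where `fegetexcept`, `feenableexcept` and `fedisableexcept` are available as GNU extensions). `setfpe` itself is only the hypothetical name from the proposal above, not an existing function, and this does nothing about the exception type that ends up being thrown.

```julia
# Hypothetical `setfpe`: set the trap mask to exactly `new_fpes`
# and return the mask that was active before the call.
# Assumes Linux/glibc; these C functions are GNU extensions.
function setfpe(new_fpes::Integer)
    old_fpes = ccall(:fegetexcept, Cint, ())
    # Turn off traps that were enabled but are no longer wanted.
    ccall(:fedisableexcept, Cint, (Cint,), old_fpes & ~new_fpes)
    # Turn on the requested traps.
    ccall(:feenableexcept, Cint, (Cint,), new_fpes)
    return old_fpes
end

# do-block form: restore the previous mask even if `f` throws.
function setfpe(f::Function, new_fpes::Integer)
    old_fpes = setfpe(new_fpes)
    try
        return f()
    finally
        setfpe(old_fpes)
    end
end
```

Returning the old mask keeps the surface down to one function, at the cost of leaving it to the caller to restore the mask in the non-do-block form.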
Given all the trouble we had with |
Ok, so the single exception type and support for recognizing SIGFPE is the minimal possible change, though testing this properly will also require |
This issue has been quiet for a long time - but this would be a very useful debugging tool! Just arrived here searching for the ability to use |
One thing that may make |
This would be a killer feature. Especially in the age of machine learning. |
This is really cool. I'm imagining defining a debug mode where all sNaN demo, first a normal qNaN and then the sNaN:
julia> x = NaN
NaN
julia> setfpexceptions(FE_INVALID) do
           2.0*x
       end
NaN
julia> x = reinterpret(Float64,8189<<50)
NaN
julia> setfpexceptions(FE_INVALID) do
           2.0*x
       end
ERROR: DivideError: integer division error
Stacktrace:
 [1] *(x::Float64, y::Float64)
   @ Base ./float.jl:410
 [2] (::var"#19#20")()
   @ Main ./REPL[43]:2
 [3] setfpexceptions(f::var"#19#20", mode::UInt8)
   @ Main ./REPL[18]:4
 [4] top-level scope
   @ REPL[43]:1
Another fun use case is to use
julia> x = Float64.(1:256);
julia> function mysum(x) # simd
           s = zero(eltype(x))
           for i ∈ eachindex(x)
               @fastmath s += x[i]
           end
           s
       end
mysum (generic function with 1 method)
julia> setfpexceptions(FE_INEXACT) do
           mysum(x)
       end
32896.0
julia> x[5] = 1e18 # too big for exact
1.0e18
julia> x[101] = -1e18 # cancels
-1.0e18
julia> mysum(x) # intermediate rounding inside SIMD code
32812.0
julia> Float64(mysum(big.(x)))
32790.0
julia> setfpexceptions(FE_INEXACT) do
           mysum(x)
       end
ERROR: DivideError: integer division error
Stacktrace:
 [1] add_fast
   @ ./fastmath.jl:172 [inlined]
 [2] mysum(x::Vector{Float64})
   @ Main ./REPL[59]:4
 [3] (::var"#35#36")()
   @ Main ./REPL[69]:2
 [4] setfpexceptions(f::var"#35#36", mode::UInt8)
   @ Main ./REPL[18]:4
 [5] top-level scope
   @ REPL[69]:1
In theory, you could |
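The sNaN in the demo above is built from a raw bit pattern, so it may help to see how the signaling/quiet distinction is encoded. A minimal sketch, assuming the usual convention on x86 and ARM that the top fraction bit (bit 51 of a Float64) is the "quiet" bit; `is_signaling_nan` and `make_snan` are hypothetical helpers, not Base functions.

```julia
# Bit 51 is the most significant fraction bit of a Float64. On x86 and ARM
# it acts as the "quiet" bit: set for a qNaN, clear for an sNaN.
const QUIET_BIT = UInt64(1) << 51

# Hypothetical helper: true only for NaNs whose quiet bit is clear.
is_signaling_nan(x::Float64) =
    isnan(x) && (reinterpret(UInt64, x) & QUIET_BIT) == 0

# Hypothetical helper: build an sNaN with a nonzero payload in the low 51 bits.
# The payload must be nonzero, otherwise the bit pattern would be Inf.
function make_snan(payload::Integer = 1)
    0 < payload < QUIET_BIT || throw(ArgumentError("payload must be in 1:2^51-1"))
    return reinterpret(Float64, 0x7ff0000000000000 | UInt64(payload))
end
```

Under this convention `reinterpret(Float64, 8189<<50)` has bits `0x7ff4000000000000` (quiet bit clear, bit 50 set), whereas Julia's literal `NaN` is `0x7ff8000000000000`, which is why only the former trips `FE_INVALID`.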
Unfortunately, this doesn't seem to work on my M1/ARM Linux. |
For anyone reading this thread looking for a temporary solution while there remains no supported way of trapping these exceptions, if you want to replicate c42f's code snippet from earlier in this thread on other architectures, you can modify it thus:
if Sys.ARCH == :x86_64
    const FE_INVALID = 0x1
    const FE_DIVBYZERO = 0x4
    const FE_OVERFLOW = 0x8
    const FE_UNDERFLOW = 0x10
    const FE_INEXACT = 0x20
elseif Sys.ARCH == :aarch64
    const FE_INVALID = 0x1
    const FE_DIVBYZERO = 0x2
    const FE_OVERFLOW = 0x4
    const FE_UNDERFLOW = 0x8
    const FE_INEXACT = 0x10
else
    error("You need to look up the corresponding values for FE exceptions in your architecture, which is: $(Sys.ARCH)")
end
fpexceptions() = ccall(:fegetexcept, Cint, ())
function setfpexceptions(f, modes...)
    mode = foldl(|, modes)
    prev = ccall(:feenableexcept, Cint, (Cint,), mode)
    try
        f()
    finally
        ccall(:fedisableexcept, Cint, (Cint,), mode & ~prev)
    end
end
Also made a modification to it so you can set multiple flags in the arguments:
setfpexceptions(FE_DIVBYZERO, FE_INVALID, FE_OVERFLOW) do
    # your code here
end
Tested on Julia 1.10.4 on two Linux machines, one with x86 and the other M1 arm architecture. |
Some languages and compilers allow trapping of floating point exceptions, e.g. gfortran -ffpe-trap https://gcc.gnu.org/onlinedocs/gfortran/Debugging-Options.html
Is it possible to have a similar functionality in julia? That would be very useful to debug a NaN or Inf suddenly appearing in a program.
#6170 looks related