NaN vs wild (or, what's a DomainError, really?) #5234

jiahao · 2013-12-25T22:45:53Z

(Context: this issue has come up recently in #4967 and elsewhere)

We currently don't treat the results of indeterminate computations consistently between real and complex arithmetic.

Example 1: cosine

julia> cos(Inf)
ERROR: DomainError
 in cos at math.jl:277

julia> cos(complex(Inf,0))
NaN - 0.0im

Example 2: inverse

inverse of 0

julia> inv(0.0)
Inf

julia> inv(complex(0.0, 0.0))
NaN + NaN*im

This illustrates a problem with infinity and signed zero. Unlike real infinities, of which there are only two FloatingPoint values, you can have 13 possible representations of complex infinities which convey different information about the phase (argument) of the complex infinity:

complex(+Inf, +0.0) #phase is exactly 0 or approaches 0 from the first quadrant
complex(+Inf, +Inf) #phase is in (0, pi/2), i.e. the first quadrant excluding the edges
complex(+0.0, +Inf) #phase is exactly pi/2 or approaches pi/2 from the first quadrant
complex(-0.0, +Inf) #phase approaches pi/2 from the second quadrant
complex(-Inf, +Inf) #phase is in (pi/2, pi), i.e. the second quadrant excluding the edges
complex(-Inf, +0.0) #phase is exactly pi or approaches pi from the second quadrant
complex(-Inf, -0.0) #phase approaches pi from the third quadrant
complex(-Inf, -Inf) #phase is in (pi, 3pi/2), i.e. the third quadrant excluding the edges
complex(-0.0, -Inf) #phase approaches 3pi/2 from the third quadrant
complex(+0.0, -Inf) #phase is exactly 3pi/2 or approaches 3pi/2 from the fourth quadrant
complex(+Inf, -Inf) #phase is in (3pi/2, 2pi), i.e. the fourth quadrant excluding the edges
complex(+Inf, -0.0) #phase approaches 2pi from the fourth quadrant
complex(NaN, NaN) #phase cannot be determined to lie in exactly one of the above regions
#                  and hence the infinity has no valid non-NaN representation in floating point

The result is correct if we work with unsigned zeros. However, after accounting for the signed zeros in complex(+0.0, +0.0), this result should be complex(+Inf, +Inf). A DivideError might also be a reasonable alternative here.
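To make the phase information in the list above concrete, here is a minimal sketch that asks `angle` for the phase of a few of those infinities. It assumes a current Julia where `angle(z)` is `atan(imag(z), real(z))`. Note that `angle` reports phases in [-pi, pi] rather than the [0, 2pi) convention used in the list, so third- and fourth-quadrant phases come out negative. The name `samples` is only for illustration.

```julia
# Phases of a few complex infinities, as reported by `angle`.
# The sign of a zero component decides which side of the axis is meant.
samples = [
    complex(Inf, 0.0),    # along the positive real axis
    complex(Inf, Inf),    # interior of the first quadrant
    complex(0.0, Inf),    # along the positive imaginary axis
    complex(-Inf, 0.0),   # negative real axis, approached from the second quadrant
    complex(-Inf, -0.0),  # negative real axis, approached from the third quadrant
    complex(Inf, -0.0),   # positive real axis, approached from the fourth quadrant
]

for z in samples
    println(z, "  =>  phase ", angle(z))
end
```

The two representations of the negative real axis differ only in the sign of the zero imaginary part, yet they yield phases of opposite sign, which is the information a single `complex(NaN, NaN)` throws away.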

inverse of infinities

The inverse mapping is also problematic:

julia> inv(Inf)
0.0

julia> inv(complex(Inf,0))
0.0 - 0.0im

julia> inv(complex(Inf,Inf))
NaN + NaN*im

inverse of indeterminates

julia> inv(NaN)
NaN

julia> inv(complex(0,NaN))
NaN + NaN*im

julia> inv(complex(NaN, 0))
NaN + NaN*im

julia> inv(complex(NaN, NaN))
NaN + NaN*im

Is it meaningful to distinguish between these three possible complex NaNs? It seems silly in this example, but consider also:

julia> complex(0, NaN) + complex(0, NaN)
complex(0.0,NaN)

julia> complex(0, NaN) * complex(0, NaN)
NaN + NaN*im

etc.

Example 3: roots (nonintegral powers)

julia> sqrt(-1)
ERROR: DomainError

julia> sqrt(complex(-1))
0.0 + 1.0im

julia> (-1)^(-1/2)
NaN

julia> complex(-1)^(-1/2)
6.123233995736766e-17 - 1.0im

Whereas sqrt(-1) throws the notorious DomainError, the inverse square root (as computed by x->x^(-1/2)) does not; it returns a NaN instead. This use of NaN is sanctioned by IEEE 754 in the specific case of a real operation with no real output.

tl;dr: if a floating-point computation returns an indeterminate value, when should it return a NaN (or any of its complex variants), and when should it throw an error like DivideError or DomainError? By my reading of IEEE 754, both of these behaviors are allowed (throwing an error would correspond to a signaling NaN which is trapped by the error handler). Each of these examples in isolation shows valid behavior; however, we should be consistent.


StefanKarpinski · 2013-12-25T23:16:57Z

I think we should throw an error whenever the condition was explicitly checked for anyway.

jiahao · 2014-01-10T04:53:57Z

To summarize the prevailing thinking with @StefanKarpinski @loladiro @JeffBezanson:

Operations whose results would produce NaNs should throw Exceptions if the cost of testing a NaN would not dominate the cost of the operation.

By this rule of thumb, elementary arithmetic on real floating-point numbers would be allowed to return NaNs if necessary, but more complicated operations like complex division should throw a DivideError.
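As a rough illustration of that rule of thumb, the sketch below wraps complex division in a zero check that throws a DivideError. `checked_div` is a hypothetical name used only here; it is not how Base implements `/`, and it makes no claim about what Base does for a zero divisor.

```julia
# Hypothetical wrapper illustrating the rule of thumb; not Base's behaviour.
function checked_div(a::Complex, b::Complex)
    # One comparison against zero is cheap next to the complex division itself.
    iszero(b) && throw(DivideError())
    return a / b
end

q = checked_div(complex(1.0, 2.0), complex(0.0, 1.0))   # ordinary division

threw = try
    checked_div(complex(1.0, 0.0), complex(0.0, 0.0))
    false
catch err
    err isa DivideError   # the zero divisor is reported immediately
end
```

Real `+`, `-`, `*`, `/` get no such wrapper under this rule, because the check would cost about as much as the operation.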

ivarne · 2014-01-10T07:38:06Z

I don't understand. What operations should be allowed to return NaN, (unless the argument is NaN)? Only /(a::FloatingPoint, b::FloatingPoint)?

jiahao · 2014-01-10T08:01:57Z

I think we only have a clear idea that real elementary arithmetic (+, -, *, /) should be allowed to produce and propagate NaNs, since the cost of testing for a NaN, or for inputs that would produce NaNs, would be comparable to (or greater than) the cost of the operation itself. We don't have a clear idea about much else.

ivarne · 2014-01-10T08:16:44Z

So all other functions (including those that do not return NaN for any real input) must check for NaN on input and throw DomainError?

JeffBezanson · 2014-01-10T08:24:39Z

No, NaN input can just propagate. We mostly don't want to produce NaNs from non-NaNs.

StefanKarpinski · 2014-01-10T08:28:39Z

The poster child is complex division: rather than generating a complex NaN when dividing by zero, just raise an error. Since complex division is so, um, complex, checking for a zero divisor is not exactly going to affect its performance in any meaningful way. Raising an error is simpler and alerts the user to the problem sooner.

nalimilan · 2014-01-10T08:56:44Z

Why not, but if that's the only operation which would be affected, is it worth introducing an inconsistency? It would be good to have a simple rule for which functions return NaN and which throw an error (I mean a user-understandable rule, not something based on computing costs).

JeffBezanson · 2014-01-10T09:05:07Z

My guess at the moment is that the rule is: only real floating-point +, -, *, / can produce NaNs from non-NaNs.

JeffBezanson · 2014-01-10T09:09:57Z

I would also add complex floating-point +, -, *, / to that list. The 1/complex(0.0,0.0) case is a divide-by-zero error, but you can still get NaNs from complex / in other ways.

simonbyrne · 2014-11-19T16:51:28Z

As outlined in this discussion, a more systematic solution would be to take advantage of floating-point exception masking. This would allow users to turn exceptions on and off as required (e.g. have exceptions on for development/debugging, and off for production code), reduce the overhead (by fewer range checks) and generally make functions more vectorisation friendly.

elextr · 2014-11-20T03:38:29Z

As noted in the discussion linked above, it becomes necessary to turn off hardware floating point exceptions in some cases:

- calling libraries that take the valid design decision to allow NaNs to appear and then propagate and to test for them at the end
- calling libraries that are not expecting exceptions
- calling libraries that are written in non-exception aware languages (C, Fortran etc)
- preventing exceptions escaping Julia code that is called from languages that are not exception aware

So it becomes rather difficult to decide when to allow floating point (and by extension complex) exceptions and when to disallow them, and the one piece of Julia code may be used in many situations.

simonbyrne · 2014-11-20T11:21:36Z

How about the following proposal:

1. We provide functions for masking/unmasking different floating point exceptions.
2. In a Julia session, the invalid exception is unmasked (i.e. 0.0/0.0 throws InvalidException) by default, but all others are masked (1.0/0.0 returns an Inf). We provide a command line argument (--mask-invalid?) to disable this behaviour.
3. Julia code embedded in other languages, or called via cfunction should use the floating point masks of the caller.
4. We can provide wrappers (ccallmask?) to enable masking for the duration of the ccall.
5. Functions and libraries should not change floating point masks without resetting them to their original setting.
6. Functions which return floating point values should be made to trigger appropriate behaviour, especially for invalid values (see point 2 below). The IEEE standard (chapter 9) provides guidance for basic math functions, which openlibm seems to follow. DomainErrors can be retained for integer-valued functions.

I feel that this should address the above concerns of @elextr, as well as address the needs of @eschnett and @mlubin mentioned on the mailing list discussion, and generally make us much closer to getting the proverbial Kahan tick of approval.

Possible issues:

1. Users will not be able to reliably catch InvalidException (unless, of course, they explicitly disable masking beforehand).
2. There doesn't appear to be a set way to trigger an InvalidException. Just using 0.0/0.0 doesn't work, as LLVM changes it to a constant NaN. The openlibm trick of using return (x-x)/0.0, where x is the argument to the function, seems to work though: this should always raise an invalid exception unless x is already a NaN (which is the usual desired behaviour).
3. LLVM will sometimes rearrange operations involving MXCSR access incorrectly (see LLVM bug #6393). This shouldn't be a big issue for the above, as most changes will occur either globally (invoked by the user in the REPL), or around ccalls (for which LLVM should keep the correct order). But it does mean that you won't be able to enable it reliably for short snippets of Julia code.
   - Given the lack of activity on this issue, it doesn't seem like something LLVM are rushing to address. Perhaps this could be a good GSOC project for a student interested in compilers?
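For reference, the `(x-x)/0.0` idiom mentioned in the list above can be written as the small sketch below. `invalid_result` is a hypothetical name. With floating point exceptions masked (the usual state of a Julia session) it only shows the value that comes back; whether the invalid flag is actually raised depends on the compiler not folding the expression, which this sketch does not check.

```julia
# Hypothetical helper mirroring the `(x - x) / 0.0` idiom.
# The result is computed from the runtime argument rather than from literals.
invalid_result(x::AbstractFloat) = (x - x) / zero(x)

a = invalid_result(1.5)   # NaN: 0.0 / 0.0 is an invalid operation
b = invalid_result(Inf)   # NaN: Inf - Inf is already invalid
c = invalid_result(NaN)   # NaN: a NaN argument simply propagates
```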

elextr · 2014-11-20T13:09:08Z

@simonbyrne looks like a good basic proposal, except for the notes below.

> In a Julia session, the invalid exception is unmasked (i.e. 0.0/0.0 throws InvalidException) by default, but all others are masked (1.0/0.0 returns an Inf). We provide a command line argument (--mask-invalid?) to disable this behaviour.

That's OK, but it forces all ccalls to mask it off. So a default of all (or most?) exceptions on is not going to be any worse, and that will provide the benefits to pure Julia code. Otherwise all off.

> We can provide wrappers (ccallmask?) to enable masking for the duration of the ccall.

If at least one exception is going to be enabled by default (as above), then it is probably better for the default ccall to turn them all off and restore the mask afterwards, with ccallmask reserved for the exceptional circumstances where it is known to be safe to leave some on. Turning all exceptions off is going to be needed for nearly all ccalls, and doing it manually is a subtlety that shouldn't be required of newcomers to Julia. Note that this will also need to be applied to all ccalls inside Julia and its libraries and packages, wherever it can't be proven that the ccalled code is exception safe or cannot generate exceptions.

> Julia code embedded in other languages, or called via cfunction should use the floating point masks of the caller.

> Functions and libraries should not change floating point masks without resetting them to their original setting.

I would just add "and must catch all exceptions" to both of these.

simonbyrne · 2014-11-20T13:34:21Z

I would disagree that ccall should mask by default, as it does add to the overhead of calling external functions.

elextr · 2014-11-20T13:37:13Z

Thinking about it some more (and at this point I am talking about Julia code only, the issues with foreign code remain as above) if the FPE masks can be changed:

- code that is written expecting exceptions but which is called with exceptions off might still work if any NaNs just propagate. But it may also fail if it finds an unexpected NaN.
- code that is written expecting NaNs to propagate, and then tests for them, will fail if called with exceptions on, since it will never get to the test if a NaN occurs.

So to avoid failures all Julia FP code is going to have to do one of:

- every function will have to set and restore the FPE mask to match its expectations, or
- callers must always know what the expectations of functions they call are and set and restore the masks if the called function needs something different to the current setting, or
- callers must know which functions expect the same FP mask as the caller is using and only call those functions.

None of these is particularly attractive, the first is very noisy and the other two are exceptionally error prone. At this point I admit I don't have a "nice" solution.

StefanKarpinski · 2014-11-20T13:37:44Z

I fear that this would work about as well as dynamically scoped rounding modes – i.e. not well at all :-\

simonbyrne · 2014-11-20T14:20:36Z

@elextr I don't think this should be an issue: the way I see it, the whole point of raising floating point exceptions is so that you can get rid of whatever is causing them, you don't actually want to try/catch them (which is why I don't think my issue 1 above is such a problem). So a function should not expect an exception to be raised (because the function will simply stop anyway if one is raised), and should always accept NaN as an argument.

Basically, masks should only be changed for two reasons:

- by the user in the REPL
- around a ccall for "exception-unaware" numerical libraries

I would go so far as to say that packages which modify masks in any other way shouldn't be accepted in METADATA unless they can provide a good justification for why they need to do it.

simonbyrne · 2014-11-20T14:46:17Z

@StefanKarpinski I guess the flip side is that I feel that DomainError is basically trying to be an InvalidException, except that it's slow, inconsistent and inflexible.

eschnett · 2014-11-20T16:41:46Z

@simonbyrne It is often very difficult to ensure ahead of time that no nans appear in a calculation. Thus in my code, I often rather accept that the result is a nan, and fix up the problem later. These calculations happen in a tight inner loop, and handling exceptions in this loop is a non-starter, as it would prevent vectorization.

I appreciate that many people don't like having to deal with nans explicitly, because in most cases, nans result from errors e.g. in the input data. However, I quite like this part of the IEEE standard, and would like a way to use nans in Julia.

I'm not saying this should be the default, or that all library functions should be nan-safe in this way -- but there needs to be a way to use nans for efficiency in an HPC environment. I'd be happy to use a macro similar to @inbounds for this. I'd also be happy to re-write Julia's standard math functions (sqrt, exp, log) myself. However, there needs to be a reasonable way for experts to do so.
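A minimal sketch of that pattern, under stated assumptions: `nan_sqrt` and `sqrt_all!` are hypothetical names, and the fix-up step (zeroing the bad entries) stands in for whatever the application would really do. The point is only that the loop body contains no throwing branch; the sketch does not measure whether the compiler actually vectorizes it.

```julia
# Hypothetical NaN-returning square root: branch-free and never throws.
# `sqrt(abs(x))` is always defined, and `ifelse` picks NaN for negative inputs.
nan_sqrt(x::Float64) = ifelse(x < 0, NaN, sqrt(abs(x)))

function sqrt_all!(out::Vector{Float64}, xs::Vector{Float64})
    @inbounds @simd for i in eachindex(out, xs)
        out[i] = nan_sqrt(xs[i])   # no exception handling inside the loop
    end
    return out
end

xs = [4.0, -1.0, 9.0]
out = sqrt_all!(similar(xs), xs)

bad = findall(isnan, out)   # locate the problem entries after the loop
out[bad] .= 0.0             # application-specific fix-up, here just zeroing
```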

mlubin · 2014-11-20T16:45:01Z

Maybe the solution is to introduce alternate versions of the built-in math functions which return NaN instead of throwing an exception?

simonbyrne · 2014-11-20T16:50:30Z

@mlubin The problem is then the functions that call the built-in ones, and the ones that call those...

I do think the IEEE exception functionality can be made to work in a sensible fashion, and that it provides our best hope for ensuring consistency and interoperability with other programs.

mlubin · 2014-11-20T16:52:35Z

@simonbyrne, any library/package functions that call log should be expecting it to throw an exception; changing that behavior with a flag seems crazy.

mlubin · 2014-11-21T23:07:55Z

> What I belatedly came to realise, is that being consistent throughout the program becomes impossible once you use external libraries that may be written to the opposite strategy to yours.

Agreed here. The logical alternative is to provide versions of the basic functions that return NaNs. This at least allows everyone to be consistent according to the situation. I'm guessing that typically one would want a DomainError to be thrown for user friendliness, but in specialized cases, e.g., vectorization, interacting with C code, etc., switching log to log_nan (horrible name) makes more sense.
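A sketch of what such a variant might look like, using the placeholder name `log_nan` from the comment above. This is an illustration only, not the definition used by any package, and it assumes a Julia version in which `log` throws a DomainError for negative real arguments.

```julia
# `log_nan` is a placeholder name, not a Base function.
# Negative inputs give NaN of the matching float type; everything else defers to `log`.
log_nan(x::Real) = x < 0 ? oftype(float(x), NaN) : log(x)

quiet = log_nan(-1.0)   # NaN instead of an exception
same  = log_nan(1.0)    # identical to log(1.0)

# The throwing version, for comparison.
strict_threw = try
    log(-1.0)
    false
catch err
    err isa DomainError
end
```

Callers pick the variant that matches their situation: `log` where an early, loud failure is wanted, `log_nan` where the NaN is tested for later.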

staticfloat · 2014-11-21T23:16:10Z

log�()? ;)

elextr · 2014-11-21T23:33:11Z

@mlubin that requires that the code of every library that you want to use (including the C ones) is changed to use log_nan since that's what the current behaviour of log is.

It would be better if the new behaviour has the new name that generates exceptions, eg ex_log as an example of an equally bad name :)

mlubin · 2014-11-22T17:04:02Z

Following the open-source philosophy, I've created a package which solves my problem: https://github.com/mlubin/NaNMath.jl.

simonbyrne · 2014-11-24T15:26:23Z

I had a bit of time on Friday to try out my idea:
https://github.com/simonbyrne/julia/tree/fpe
when combined with my mxcsr script gives some interesting results.

On Linux it seems to work fairly well, the only problem being that after throwing one error it seems to re-enable the mask. Hopefully there is a sensible way to fix this. One idea would be to use the "official" feenableexcept interface, but I haven't had a chance to look at it yet.

On OS X, my experience was more frustrating: it does throw exceptions at the correct times, but it doesn't seem to return the correct exception code that indicates the type of exception (Invalid, Inexact, DivByZero, etc). Unfortunately, it seems that OS X doesn't officially support throwing floating point exceptions at all (e.g. see here), so we might have a hard time getting Apple to fix it.

If anyone has used this sort of stuff in C or C++, I'd appreciate any help or suggestions that I could try.

eschnett · 2015-08-31T22:41:42Z

@simonbyrne This example http://www-personal.umich.edu/~williams/archive/computation/fe-handling-example.c can be found by chasing the link you provided. It seems to provide an implementation of the necessary functionality for OS X, both for PPC (!) and Intel Macs.

The respective function calls translate to a few machine instructions, so we don't need any Apple support for this.

simonbyrne · 2015-09-01T00:17:26Z

Thanks @eschnett. It looks pretty similar to what I was doing, but I'll try it out. I seem to recall that the problem was the signals returned by siginfo were completely arbitrary.

StefanKarpinski · 2016-11-10T17:40:29Z

Bump.

oscardssmith · 2016-11-10T18:12:46Z

What would happen if functions had a nan mode in them, so sqrt would be sqrt(x,nan=false) with a default value? To me this seems to let libraries do what they want, let users do what they want, and not mess with code generation.
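One way this suggestion might look in practice is sketched below. `sqrt_mode` and its `nan` keyword are hypothetical; `Base.sqrt` takes no such argument, and in Julia the flag would be a keyword argument written after a semicolon.

```julia
# Hypothetical function and keyword; Base.sqrt has no `nan` option.
function sqrt_mode(x::Real; nan::Bool=false)
    if x < 0
        # The caller opted in to NaN: return it in the matching float type.
        nan && return oftype(float(x), NaN)
        throw(DomainError(x, "sqrt_mode needs a non-negative argument unless nan=true"))
    end
    return sqrt(x)
end

strict = sqrt_mode(4.0)              # default mode, ordinary result
quiet  = sqrt_mode(-1.0; nan=true)   # NaN instead of a DomainError

mode_threw = try
    sqrt_mode(-1.0)                  # default mode still throws
    false
catch err
    err isa DomainError
end
```

The flag is chosen at each call site, so a library can fix its own mode without touching any global state.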

simonp0420 mentioned this issue Dec 26, 2013

RFC: Add error checking to Amos package-based Airy and Bessel methods. Correct # arguments in ccall to :zbiry. #4967

Merged

JeffBezanson added a commit that referenced this issue Dec 27, 2013

add DomainError for ^. part of #5234

1668d1b

jiahao mentioned this issue Jan 7, 2014

numeric equality ignoring type with NaNs equal #5314

Closed

jiahao mentioned this issue Jan 11, 2014

Unexpected return value of NaN when raising to power #5361

Closed

nalimilan mentioned this issue Apr 29, 2014

Inconsistent handling of NaNs by median and mean functions #6486

Closed

jiahao mentioned this issue Sep 22, 2014

should min(0.0, NaN) be NaN? #7866

Closed

jiahao added the error handling Handling of exceptions by Julia or the user label Nov 20, 2014

mlubin mentioned this issue Nov 22, 2014

Register NaNMath JuliaLang/METADATA.jl#1771

Merged

simonbyrne mentioned this issue Jan 8, 2015

Vectorizing sqrt #9672

Closed

andreasnoack mentioned this issue Jan 15, 2015

Should complex numbers be closed at infinity by default? #9790

Closed

jiahao added the complex Complex numbers label Feb 1, 2015

simonbyrne mentioned this issue Apr 16, 2015

RFC: basic floating point exception functionality #6170

Closed

4 tasks

eschnett mentioned this issue Aug 26, 2015

Constant Folding openlibm functions #9942

Open

jiahao mentioned this issue Feb 12, 2016

fix #14111 dividing sparse vector by constant should give sparse vector #14963

Closed

StefanKarpinski added design Design of APIs or of the language itself and removed needs decision A decision on this change is needed labels Sep 13, 2016

StefanKarpinski added this to the 0.6.0 milestone Sep 13, 2016

vtjnash removed this from the 0.6.0 milestone Dec 22, 2016

StefanKarpinski mentioned this issue Feb 10, 2017

API consistency review #20402

Closed

19 tasks

stevengj mentioned this issue Nov 12, 2017

Faster, more correct complex^complex #24570

Merged

3 tasks

vtjnash mentioned this issue Dec 14, 2017

have a single bad index error type? #25005

Closed

c42f mentioned this issue Aug 16, 2018

Trap floating point exceptions #27705

Open

nalimilan mentioned this issue Sep 3, 2018

Mean of an empty collection #28777

Open

simonbyrne mentioned this issue Feb 14, 2019

Fix incorrect sign of atanh(complex(x,y)) if x == -1 #31061

Merged

musm mentioned this issue Jul 24, 2020

Revisit erroring behaviour of special functions #36786

Open

nalimilan mentioned this issue Jan 23, 2021

Allow passing 0-length vectors to cor JuliaStats/Statistics.jl#70

Open

kgryte mentioned this issue Nov 28, 2022

Add complex number support to equal data-apis/array-api#528

Merged

KristofferC mentioned this issue Dec 26, 2022

Complex division:when divisor is 0+0im #47999

Closed
