NaN vs wild (or, what's a DomainError, really?) #5234
Comments
I think we should throw an error whenever the condition was explicitly checked for anyway. |
To summarize the prevailing thinking with @StefanKarpinski @loladiro @JeffBezanson: Operations whose results would produce NaNs should throw By this rule of thumb, elementary arithmetic on real floating-point numbers would be allowed to return NaNs if necessary, but more complicated operations like complex division should throw a |
I don't understand. What operations should be allowed to return NaN, (unless the argument is NaN)? Only |
I think we only have a clear idea that real elementary arithmetic (+, -, *, /) should be allowed to produce and propagate NaNs, since the cost of testing for a NaN, or for inputs that would produce NaNs, would be comparable to (or greater than) the cost of the operation itself. Beyond that, not much is settled. |
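As a small illustration of the point about elementary arithmetic, these are the standard IEEE 754 cases where real +, -, *, / yield a NaN from non-NaN operands. The name `nan_producers` is only for this sketch.

```julia
# Real elementary arithmetic that yields NaN from non-NaN operands (IEEE 754).
nan_producers = (0.0 / 0.0, Inf - Inf, 0.0 * Inf, Inf / Inf)

# Every entry is NaN, and none of the operands was NaN.
all(isnan, nan_producers)

# A NaN operand simply propagates through further arithmetic.
isnan(NaN + 1.0)
```

Guarding each of these with a branch would cost about as much as the operation itself, which is the argument for leaving them unchecked.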
So all other functions (also those that do not return NaN for any real input) must check for NaN on input and throw DomainError? |
No, NaN input can just propagate. We mostly don't want to produce NaNs from non-NaNs. |
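One way to read this rule is as a wrapper that only objects when a NaN appears from non-NaN arguments. The helper `nan_checked` below is hypothetical and not part of Base; it is a sketch of the rule as stated, not a proposal for how Base functions would be implemented.

```julia
# Hypothetical helper (not in Base): apply `f`, let NaN inputs propagate,
# but throw if a NaN result is produced from non-NaN arguments.
function nan_checked(f, args::Real...)
    result = f(args...)
    if isnan(result) && !any(isnan, args)
        throw(DomainError(args, "$f produced NaN from non-NaN input"))
    end
    return result
end

nan_checked(+, 1.0, 2.0)       # ordinary result passes through unchanged
nan_checked(+, NaN, 1.0)       # NaN input propagates to a NaN result
# nan_checked(-, Inf, Inf)     # would throw DomainError under this rule
```

A post-hoc check like this adds a branch to every call, so it only makes sense where the wrapped operation is expensive relative to the test.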
The poster child is complex division: rather than generating a complex NaN when dividing by zero, just raise an error. Since complex division is so, um, complex, checking for a zero divisor is not exactly going to affect its performance in any meaningful way. Raising an error is simpler and alerts the user to the problem sooner. |
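A minimal sketch of what such a check could look like. The function name `checked_cdiv` is made up for illustration, and the choice of `DivideError` over `DomainError` is an assumption; the thread does not settle which error type would be used.

```julia
# Hypothetical checked complex division (not Base behaviour): test the
# divisor once up front, then fall through to the ordinary algorithm.
function checked_cdiv(z::Complex, w::Complex)
    iszero(w) && throw(DivideError())
    return z / w
end

checked_cdiv(complex(1.0, 2.0), complex(2.0, 0.0))    # ordinary quotient
# checked_cdiv(complex(1.0, 2.0), complex(0.0, 0.0))  # would throw DivideError
```

The single `iszero` test is small next to the scaling and branching that a robust complex division already performs.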
Why not, but if that's the only operation which would be affected, is it worth introducing an inconsistency? It would be good to have a simple rule for which functions return NaN and which throw an error (I mean a user-understandable rule, not something based on computing costs). |
My guess at the moment is that the rule is: only real floating-point +, -, *, / can produce NaNs from non-NaNs. |
I would also add complex floating-point +, -, *, / to that list. The |
As outlined in this discussion, a more systematic solution would be to take advantage of floating-point exception masking. This would allow users to turn exceptions on and off as required (e.g. have exceptions on for development/debugging, and off for production code), reduce the overhead (by fewer range checks) and generally make functions more vectorisation friendly. |
As noted in the discussion linked above, it becomes necessary to turn off hardware floating point exceptions in some cases:
So it becomes rather difficult to decide when to allow floating point (and by extension complex) exceptions and when to disallow them, and the same piece of Julia code may be used in many situations. |
How about the following proposal:
I feel that this should address the above concerns of @elextr, as well as address the needs of @eschnett and @mlubin mentioned on the mailing list discussion, and generally make us much closer to getting the proverbial Kahan tick of approval. Possible issues:
|
@simonbyrne looks like a good basic proposal, apart from the notes below.
That's OK, but it forces all ccalls to mask it off. So a default of all (or most?) exceptions on is not going to be any worse and that will provide the benefits to pure Julia code. Otherwise all off.
If at least one exception is going to be enabled by default (as above) then it is probably better that the default ccall turns them all off and restores the mask and ccallmask is used for the exceptional circumstances where it is known safe to leave some on. Turning all exceptions off is going to be needed for nearly all ccalls and doing it manually is a subtlety that shouldn't be required of newcomers to Julia. Note this will also need to be applied to all ccalls inside Julia, and its libraries and packages where it can't be proven that the code ccalled is exception safe or cannot generate exceptions.
I would just add "and must catch all exceptions" to both of these. |
I would disagree that |
Thinking about it some more (and at this point I am talking about Julia code only, the issues with foreign code remain as above) if the FPE masks can be changed:
So to avoid failures all Julia FP code is going to have to do one of:
None of these is particularly attractive, the first is very noisy and the other two are exceptionally error prone. At this point I admit I don't have a "nice" solution. |
I fear that this would work about as well as dynamically scoped rounding modes – i.e. not well at all :-\ |
@elextr I don't think this should be an issue: the way I see it, the whole point of raising floating point exceptions is so that you can get rid of whatever is causing them, you don't actually want to Basically, masks should only be changed for two reasons:
I would go so far as to say that packages which modify masks in any other way shouldn't be accepted in METADATA unless they can provide a good justification for why they need to do it. |
@StefanKarpinski I guess the flip side is that I feel that |
@simonbyrne It is often very difficult to ensure ahead of time that no nans appear in a calculation. Thus in my code, I often rather accept that the result is a nan, and fix up the problem later. These calculations happen in a tight inner loop, and handling exceptions in this loop is a non-starter, as it would prevent vectorization. I appreciate that many people don't like having to deal with nans explicitly, because in most cases, nans result from errors e.g. in the input data. However, I quite like this part of the IEEE standard, and would like a way to use nans in Julia. I'm not saying this should be the default, or that all library functions should be nan-safe in this way -- but there needs to be a way to use nans for efficiency in an HPC environment. I'd be happy to use a macro similar to |
Maybe the solution is to introduce alternate versions of the built-in math functions which return NaN instead of throwing an exception? |
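A rough sketch of what such alternate versions might look like. The names `nan_sqrt` and `nan_log` are invented here for illustration and are not claimed to match any existing package's API.

```julia
# Hypothetical NaN-returning variants: same results as Base on valid input,
# but NaN instead of a DomainError on negative input.
nan_sqrt(x::T) where {T<:AbstractFloat} = x < 0 ? T(NaN) : sqrt(x)
nan_log(x::T) where {T<:AbstractFloat} = x < 0 ? T(NaN) : log(x)

# In an inner loop the bad value shows up as a NaN in the result
# instead of interrupting the computation with an exception.
total = sum(nan_sqrt, [4.0, -1.0, 9.0])
isnan(total)
```

Because both branches return the same floating-point type, the functions stay type-stable, which matters for the tight-loop use case raised earlier in the thread.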
@mlubin The problem is then the functions that call the built-in ones, and the ones that call those... I do think the IEEE exception functionality can be made to work in a sensible fashion, and that it provides our best hope for ensuring consistency and interoperability with other programs. |
@simonbyrne, any library/package functions that call |
Agreed here. The logical alternative is to provide versions of the basic functions that return |
|
@mlubin that requires that the code of every library that you want to use (including the C ones) is changed to use It would be better if the new behaviour has the new name that generates exceptions, eg |
Following the open-source philosophy, I've created a package which solves my problem: https://github.com/mlubin/NaNMath.jl. |
I had a bit of time on Friday to try out my idea: On Linux it seems to work fairly well, the only problem being that after throwing one error it seems to re-enable the mask. Hopefully there is a sensible way to fix this. One idea would be to use the "official" On OS X, my experience was more frustrating: it does throw exceptions at the correct times, but it doesn't seem to return the correct exception code that indicates the type of exception (Invalid, Inexact, DivByZero, etc). Unfortunately, it seems that OS X doesn't officially support throwing floating point exceptions at all (e.g. see here), so we might have a hard time getting Apple to fix it. If anyone has used this sort of stuff in C or C++, I'd appreciate any help or suggestions that I could try. |
@simonbyrne This example http://www-personal.umich.edu/~williams/archive/computation/fe-handling-example.c can be found by chasing the link you provided. It seems to provide an implementation of the necessary functionality for OS X, both for PPC (!) and Intel Macs. The respective function calls translate to a few machine instructions, so we don't need any Apple support for this. |
Thanks @eschnett. It looks pretty similar to what I was doing, but I'll try it out. I seem to recall that the problem was the signals returned by siginfo were completely arbitrary. |
Bump. |
What would happen if functions had a nan mode in them, so sqrt would be sqrt(x,nan=false) with a default value? To me this seems to let libraries do what they want, let users do what they want, and not mess with code generation. |
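To make the keyword idea concrete, here is a sketch using a made-up function `sqrt_kw` so as not to redefine Base's `sqrt`. The keyword name `nan` follows the comment above; everything else is an assumption.

```julia
# Hypothetical keyword-controlled variant: throws by default,
# returns NaN for negative input when called with nan=true.
function sqrt_kw(x::AbstractFloat; nan::Bool=false)
    if x < 0
        nan && return oftype(x, NaN)
        throw(DomainError(x, "sqrt_kw needs a nonnegative argument unless nan=true"))
    end
    return sqrt(x)
end

sqrt_kw(4.0)               # same as sqrt on valid input
sqrt_kw(-1.0; nan=true)    # returns NaN instead of throwing
# sqrt_kw(-1.0)            # would throw DomainError
```

One consequence of this design is that library code calling `sqrt_kw` has to forward the keyword if callers are to control the behaviour all the way down, which is the same propagation problem raised for the alternate-function approach.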
(Context: this issue has come up recently in #4967 and elsewhere)
We currently don't treat the results of indeterminate computations consistently between real and complex arithmetic.
Example 1: cosine
Example 2: inverse
inverse of 0
This illustrates a problem with infinity and signed zero. Unlike real infinities, of which there are only two FloatingPoint values, you can have 13 possible representations of complex infinities which convey different information about the phase (argument) of the complex infinity:
The result is correct if we work with unsigned zeros. However, after accounting for the signed zeros in complex(+0.0, +0.0), this result should be complex(+Inf, +Inf). A DivideError might also be a reasonable alternative here.
inverse of infinities
The inverse mapping is also problematic:
inverse of indeterminates
Is it meaningful to distinguish between these three possible complex NaNs? It seems silly in this example, but consider also:
etc.
Example 3: roots (nonintegral powers)
Whereas sqrt(-1) returns the notorious DomainError, the inverse square root (as computed by x->x^-1/2) does not, but returns a NaN instead. This use of NaN is sanctioned by IEEE 754 in the specific case of a real operation with no real output.
tl;dr: if a floating-point computation returns an indeterminate value, when should it return a NaN (or any of its complex variants), and when should it throw an error like DivideError or DomainError? By my reading of IEEE 754, both of these behaviors are allowed (throwing an error would correspond to a signaling NaN which is trapped by the error handler). Each of these examples in isolation shows valid behavior; however, we should be consistent.