Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent behavior for min, findmin, and indmin wrt negative zero #14621

Closed
kgryte opened this issue Jan 10, 2016 · 31 comments
Closed

Inconsistent behavior for min, findmin, and indmin wrt negative zero #14621

kgryte opened this issue Jan 10, 2016 · 31 comments
Labels
docs This change adds or pertains to documentation

Comments

@kgryte
Copy link

kgryte commented Jan 10, 2016

The behavior of min accommodates negative zero such that

julia> min( 0.0, -0.0 )
-0.0

while indmin, which uses findmin, does not accommodate negative zero

julia> findmin( [2.0, 1.0, 0.0, -0.0] )
(0.0,3)
julia> indmin( [2.0, 1.0, 0.0, -0.0] )
3

What is the reasoning behind this discrepancy? I would expect the behavior to be consistent, but possibly there exists a rationale for when to consider and not consider the distinction between positive and negative zero in core functions.

@zhmz90
Copy link
Contributor

zhmz90 commented Jan 10, 2016

julia> -0.0 == 0.0
true

julia> -0.0 === 0.0
false

Details at Floating-point zero

min{T<:AbstractFloat}(x::T, y::T) = ifelse((y < x) | (signbit(y) > signbit(x)),
                                    ifelse(isnan(y), x, y), ifelse(isnan(x), y, x))
function findmin(a)
    if isempty(a)
    throw(ArgumentError("collection must be non-empty"))
    end
    s = start(a)
    mi = i = 1
    m, s = next(a, s)
    while !done(a, s)
        ai, s = next(a, s)
        i += 1
        if ai < m || m!=m
            m = ai
            mi = i
        end
    end
    return (m, mi)
end

Hope those above will help you.

@kgryte
Copy link
Author

kgryte commented Jan 10, 2016

@zhmz90 Thanks for your comment, but this does not answer my question. I know the actual implementation, as I actually link to the relevant files and lines in my original post. I am more curious why findmin, findmax, indmin, and indmax do not check (by design) for negative zero.

@timholy
Copy link
Member

timholy commented Jan 10, 2016

This seems to be a side effect of the implementation for fixing #10729. I don't think functions should ever deliberately distinguish between -0.0 and 0.0.

@simonbyrne
Copy link
Contributor

I agree with @timholy on this. I think if we are to change this we would have to think hard about this.

As far as I know, the only real case where signed zeros should give "distinguishable" results is for things like 1/x.

@StefanKarpinski
Copy link
Member

Arguably, this should reflect whatever isles does, which is currently to distinguish -0.0 and 0.0, but then again by that criterion if there is a NaN it should be the maximum of an array, which is weird as well.

@kgryte
Copy link
Author

kgryte commented Jan 10, 2016

When comparing to other platforms, not a whole lot of consistency. For example,

  • R does not seem to make a distinction.

    which.min( list(1, 2, 0, -0) ) # => 3, not 4
    
  • Python does not make a distinction.

    min( 0.0, -0.0 ) # => 0.0
    min( -0.0, 0.0 ) # => -0.0
  • Java does make a distinction.

  • GNU Octave and, I believe, MATLAB do make a distinction.

  • BLAS routine idamax (reference implementation) does not make a distinction.

@StefanKarpinski Regarding NaN, I suppose this is one motivation for MATLAB having NaN versions of various functions; e.g., nanmax, nanmin, nanmean.

Admittedly, a bit awkward that -0.0 == 0.0 is true and then they are treated as unequal in implementations. A possible gotcha if people are unaware.

However, I can imagine making a distinction patches imperfect machine arithmetic, where -0.0 is the result of underflow, and thus, the result of some computation is actually smaller than 0.0, but just cannot be represented.

@timholy
Copy link
Member

timholy commented Jan 10, 2016

I think we should file a bug against IEEE 754 and get rid of signed zeros 😄.

@StefanKarpinski
Copy link
Member

And NaN while we're at it.

@simonbyrne
Copy link
Contributor

@StefanKarpinski What does isless have to do with this?

@kgryte Your examples seem to be referring to 2 different functions:

a) The result of min(0.0,-0.0)
b) The result of "find the index of the smallest element in an array"

For (a), I've found:

Language min(0.0,-0.0)/min(-0.0,0.0)
R left
Python left
Octave -0.0
Java -0.0
JavaScript -0.0

No languages seem to treat zeros differently for (b).

From my understanding of it, the general idea in the IEEE754 spec is that 0.0 and -0.0 are to be treated as equal and should give equal results (note that this includes the current behaviour of min). The exception is when the function is undefined at 0, in which case the result can be obtained from the appropriate limit. For example:

julia> 1.0/0.0
Inf

julia> 1.0/-0.0
-Inf

julia> atan2(0.0,-1.0)
3.141592653589793

julia> atan2(-0.0,-1.0)
-3.141592653589793

For a more detailed justification, see "Branch Cuts for Complex Elementary Functions, or Much Ado About Nothing's Sign Bit", by William Kahan.

@nalimilan
Copy link
Member

Agreed, there's no problem here. findmin returns the first element equal (in the sense of ==) to the minimum. The fact that it returns -0.0 gives you a bit more information if you're interested in it, but shouldn't have an effect in the cases where you wouldn't care about the difference between -0.0 and 0.0. So that's a win-win.

@kgryte
Copy link
Author

kgryte commented Jan 11, 2016

@simonbyrne Thanks for link. Yes, my examples interleave min and findmin because that is partly the point. If I look for a minimum value and I get back a particular element in a list, intuitively, I would hope that I would also be able to return the index for that particular element. Perhaps subtle bugs do not play out in practice, but I feel like this could be problematic when, say, comparing bit sequences.

@kgryte
Copy link
Author

kgryte commented Jan 11, 2016

@simonbyrne And actually, your example is also illustrative. Here is a contrived example:

julia> atan2(min(0.0, -0.0), -1.0)
-3.141592653589793

julia> atan2(min(-0.0, 0.0), -1.0)
-3.141592653589793

but then

julia> a = [0.0,-0.0]

julia> atan2(a[indmin(a)], -1.0)
3.141592653589793

julia> b = [-0.0,0.0]

julia> atan2(b[indmin(b)], -1.0)
-3.141592653589793

Just by switching the order of -0.0 and 0.0 I have computed different results. min will consistently return the same result, but indmin won't, as the observed behavior is dependent on element order. For this particular use case, I fail to see why element order should be a determining factor.

@simonbyrne
Copy link
Contributor

I was originally somewhat skeptical of the changes to min/max, but I did come to appreciate that it could be useful for cases like 1/max(0.0,x), and as @nalimilan pointed out, there's no real downside to it.

On the other hand, this would be a breaking change, and give different results than any other language. Thus there is a significant onus to establish why it would be worthwhile, and so far I don't think those examples are very convincing (if the array has values on both sides of the branch cut, why would you take the minimum?).

One possible option might be to provide the comparison function as an argument, similar to sort, so that you could optionally use isless.

@StefanKarpinski
Copy link
Member

@simonbyrne, isless is relevant (as you're alluding to at the end here) as an ordering in which -0.0 and 0.0 are distinct.

@tomasaschan
Copy link
Member

A suggestion I haven't seen in this thread (sorry for barging in on the discussion - stumbled on this while looking for something else and found it interesting...) is to change isless so that

isless(-0.0, 0.0) == isless(0.0, -0.0) == false

with the motivation that isless(-0.0,0.0) == (0.0 == -0.0) (the current behavior) doesn't really make sense anyway.

@StefanKarpinski
Copy link
Member

A possibly non-obvious implication of that is that you would need to use a stable sort for floating-point numbers – or simulate doing so for the zeros at least, which is tricky since we don't know where the zeros need to end up until after the sorting has been done. I don't know how to do this efficiently without just using a stable sort, which is far less efficient than using an unstable sort like quicksort.

@nalimilan
Copy link
Member

I remember discussing this in the past, you should be able to find the issue in GitHub.

@simonbyrne
Copy link
Contributor

As we're unlikely to change behaviour here, can we close this?

@StefanKarpinski
Copy link
Member

I'd like to have the logic here fully spelled out in a clear way.

@simonbyrne
Copy link
Contributor

How about: signed zeros are equivalent, and should give equivalent results, unless the function either
a. operates explicitly on the sign bit (e.g. abs, copysign) or bit patterns (reinterpret)
b. is undefined or in some way ambiguous at 0 (e.g 1/0, complex branch cuts)

@eschnett
Copy link
Contributor

There are three functions to compare numbers, ==, ===, and isequal. Since === and == cover the "strict" and "non-strict" cases, we can choose isequal either way.

But what about < and isless? The operators < and == go together -- is there a notion of ordering that corresponds to ===, i.e. where -0.0 comes before +0.0? Currently, that's isless. (Maybe the operator <<< could be introduced, if that wasn't confusing wrt. the existing >>>?)

A different bag are SIMD vectors. There, the operators == and < should ideally return vectors of booleans. To make this work nicely, containers would need to use isequal and isless for comparisons in all cases.

@StefanKarpinski
Copy link
Member

@simonbyrne, that's a pretty good start but what about min and max? In some sense their behavior can be described as "not harmful" since it's correct if you care about the sign of zero and also correct if you don't.

@StefanKarpinski
Copy link
Member

I'm going to propose this modification to @simonbyrne's proposal for rules:


Functions should obey the identity that f(-0.0) == f(0.0) unless

  1. f explicitly operates on the sign bit or bit patterns – e.g. signbit, copysign, reinterpret.
  2. f is undefined or in some way ambiguous at zero – e.g 1/0, complex branch cuts.

Note that this leaves room for functions like min to return the "lesser" of the two zeros:

julia> min(-0.0, 0.0)
-0.0

julia> min(0.0, -0.0)
-0.0

julia> max(-0.0, 0.0)
0.0

julia> max(0.0, -0.0)
0.0

Or to otherwise return the more appropriate of the two signed zeros:

julia> sign(0.0)
0.0

julia> sign(-0.0)
-0.0

This does not leave room for indmin or findmin to distinguish between -0.0 and 0.0:

julia> indmin([2.0, 1.0, 0.0, -0.0])
3

julia> indmin([2.0, 1.0, -0.0, 0.0])
3

julia> findmin([2.0, 1.0, 0.0, -0.0])
(0.0,3)

julia> findmin([2.0, 1.0, -0.0, 0.0])
(-0.0,3)

Since both indmin and findmin return an index, they must return the same index, regardless of which of the zeros may or may not be negative. On the other hand, minimum can return the "lesser" zero:

julia> minimum([2.0, 1.0, 0.0, -0.0])
-0.0

julia> minimum([2.0, 1.0, -0.0, 0.0])
-0.0

julia> minimum([2.0, 1.0, 0.0, 0.0])
0.0

@StefanKarpinski StefanKarpinski added the docs This change adds or pertains to documentation label Jan 26, 2016
@StefanKarpinski
Copy link
Member

I'm adding the documentation label here. Where's an appropriate place to put this in the docs?

@StefanKarpinski
Copy link
Member

Interesting point here – the documentation for indmin and findmin nowhere guarantees which minimum it will return in the case that there are multiple, so any of them should do, which sort of makes it seem plausible to allow returning the index and value of -0.0 when present:

help?> findmin
search: findmin findmin! findmax findmax!

  findmin(A, dims) -> (minval, index)

  For an array input, returns the value and index of the minimum over the
  given dimensions.

  findmin(itr) -> (x, index)

  Returns the minimum element and its index.

help?> indmin
search: indmin findmin findmin!

  indmin(itr) -> Integer

  Returns the index of the minimum element in a collection.

Of course, that violates the above rules, but maybe that's an argument for different rules?

@nalimilan
Copy link
Member

+1 for the rules above. As for where to document this, maybe it's the occasion to start a section in the manual about conventions one should respect when programming in Julia, which could also be used to document the expectations of standard interfaces? It could go right after the style guide.

I would think indmin/indmax should also document that they return the first minimum (considering -0.0 and 0.0 equal), as I don't see any reason to return anything else, and this guaranty can be useful.

@kgryte
Copy link
Author

kgryte commented Jan 26, 2016

As long as behavior is clearly documented and the design decision is clear, the proposed rules sound reasonable.

Just to circle back to my original concern regarding consistency and the reason why

julia> a = [2.0, 1.0, 0.0, -0.0]

julia> minimum(a)
-0.0

julia> a[indmin(a)]
0.0

return different results. The main gist is that min, max, minimum, maximum, et al accommodate the special use case of distinguishing 0.0 and -0.0, but this is not a general design principle. And importantly, I, as a user, should not expect consistent universal behavior and should be cognizant of which functions to use when I do care about the sign of 0.

In terms of documentation, might be beneficial to actually list the operations where the sign of 0 is considered and possibly add a bit of background as to why the sign is considered in the first place.

The fact that various numeric computing environments seem to implement the same functionality differently suggests that no "rules" exist, beside the bare minimum set forth by IEEE 754, detailing when the sign of 0 should be considered or not. Awareness of this fact is important when, say, porting a program from one environment to another, as interpretations may differ when considering signs.

@StefanKarpinski
Copy link
Member

I have to say it's tempting to have indmin and findmin return the index of a negative zero if present.

@eschnett
Copy link
Contributor

@StefanKarpinski It's expensive, since one may have to generate multiple machine instructions for this. LLVM's intrinsic minnum does not distinguish between the sign of zeros, so one has to follow up with extracting the sign bit.

@StefanKarpinski
Copy link
Member

We're already returning -0.0 for minimum([2.0, 1.0, 0.0, -0.0]).

@JeffBezanson
Copy link
Member

Fixed by #23155

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs This change adds or pertains to documentation
Projects
None yet
Development

No branches or pull requests

9 participants