Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make findmin/findmax behavior match min/max #23209

Closed
StefanKarpinski opened this issue Aug 11, 2017 · 11 comments
Closed

make findmin/findmax behavior match min/max #23209

StefanKarpinski opened this issue Aug 11, 2017 · 11 comments
Assignees
Labels
maths Mathematical functions
Milestone

Comments

@StefanKarpinski
Copy link
Member

See #23155. These should be made to match and documentation updated accordingly.

@StefanKarpinski StefanKarpinski added the maths Mathematical functions label Aug 11, 2017
@StefanKarpinski StefanKarpinski added this to the 1.0 milestone Aug 11, 2017
@KlausC
Copy link
Contributor

KlausC commented Aug 11, 2017

My idea would be to remove the special treatment of NaN from the documentation and implementation of [f]indmin etc. Which are the exact requirements / test cases for floats?

@StefanKarpinski
Copy link
Member Author

What behavior are you proposing then? Treating NaN as a maximal value larger than Inf?

@KlausC
Copy link
Contributor

KlausC commented Aug 11, 2017

In the case, the collection contains any NaN, I could propose to make all functions return the first Nan detected and the corresponding index. That would be in line with max, min, maximum, minimum, which all return NaN in that case.
Unfortunately that is not in line with the implementations of findmin... for array dimension reduction (minimun per row etc.), nor for sparse arrays, which both ignore NaN.
In any case, a change would break existing software.
I think, we should ask for more opinions on that.

Compromise: default behaviour stay as it is, introduce a new flag, which allows to return NaN.

I saw, that in Octave, the NaNs are ignored not only by findmax... but also by max... . So another option would be to change those and leave findmax allone. But a lot of thoughts have already been spent in this forum on the NaN and I don't know, if there was a agreed policy.

@StefanKarpinski
Copy link
Member Author

We're still pre-1.0 which means some breakage is expected and acceptable. Since switching from not getting a NaN to getting a NaN is likely to break code in a rapidly visible way, I think we should just change find{min,max} to match min and max and put a breaking change entry in NEWS.md. The decision was made a while ago to make NaNs poisonous in min/max operations and find{min,max} were just overlooked when that change was made. There does need to be some way to efficiently ignore NaNs, but that's a separate issue which can be generalized to ignoring any subset of values.

@KlausC
Copy link
Contributor

KlausC commented Aug 11, 2017

So I will change:

findmin(A::SparseMatrixCSC) in Base.SparseArrays at sparse/sparsematrix.jl:1782
findmin(A::SparseMatrixCSC{Tv,Ti}, region) where {Tv, Ti} in Base.SparseArrays at sparse/sparsematrix.jl:1780
findmin(A::AbstractArray{T,N} where N, region) where T in Base at reducedim.jl:654
findmin(a) in Base at array.jl:1645

and same for findmax to deliver NaN, whenever the relevant regions(s) contain any NaN.

@StefanKarpinski
Copy link
Member Author

Yes, exactly. Thank you for tackling this – it's greatly appreciated.

@KlausC
Copy link
Contributor

KlausC commented Aug 12, 2017

I am nearly through with the reducedim -case. But found 2 additional inconsistencies.

  1. Corner case: A contains only Inf and -Inf.
julia> A = [Inf -Inf];
julia> findmin(A)
(-Inf, 2)
julia> findmin(A, 1)
([Inf -Inf], [0 2])
julia> findmax(A, 1)
([Inf -Inf], [1 0])

The index is improperly set to zero. (Actually I have the impression it is uninitialized memory)

  1. The usage of typemin and typemax prevents the application to types like BigInt and String,
    which both have a proper ordering, but lack the boundaries.
julia> A = ["a", "b"];
julia> findmin(A)
("a", 1)
julia> findmin(A,1)
ERROR: MethodError: no method matching typemax(::Type{String})

I want to fix both cases by 2 measures:

  1. (Verify) initialization of the output index array with 0.
  2. discard the use of type{min,max}
    Then use the index value 0 as an indication for the infinity value, currently stored in the output data array.

@StefanKarpinski
Copy link
Member Author

Great, I'm glad you're kicking the tires on this – I suspect it hasn't been touched in some time. It seems that initializing with the first slice would address both problems at the same time, no? If there is no first slice, then it should fallback to whatever minimum(T[]) does but initializing with that would be problematic since we don't really know if that's going to work or not until we try it.

@KlausC
Copy link
Contributor

KlausC commented Aug 12, 2017

In principle, yes. I could not yet figure out, how to do that, though. I will try it out for the minimum(array, region cases in reducedim.jl. The second advice is not realistic, because minimum(T[]) throws exception, as we decided in the discourse about extending the domain of minimum to empty collections. https://discourse.julialang.org/t/extend-the-domain-of-maximum-and-minimum-to-include-empty-collection/5135/22. Nevertheless it is not required, because minimum(array, region) does not return any values (matrix dimension of result collapses to zero).

@StefanKarpinski
Copy link
Member Author

But throwing the same exception as minimum is exactly what findmin should do in that case.

@JeffBezanson
Copy link
Member

Fixed by #23227

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maths Mathematical functions
Projects
None yet
Development

No branches or pull requests

4 participants