Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get rid of special-casing of ranges in == and isequal() for AbstractArrays #16364

Closed
nalimilan opened this issue May 14, 2016 · 14 comments · Fixed by #16401
Closed

Get rid of special-casing of ranges in == and isequal() for AbstractArrays #16364

nalimilan opened this issue May 14, 2016 · 14 comments · Fixed by #16401
Assignees
Labels
breaking This change will break code
Milestone

Comments

@nalimilan
Copy link
Member

The AbstractArray default == operator special-cases ranges:

function (==)(A::AbstractArray, B::AbstractArray)
    if size(A) != size(B)
        return false
    end
    if isa(A,Range) != isa(B,Range)
        return false
    end
    for (a, b) in zip(A, B)
        if !(a == b)
            return false
        end
    end
    return true
end

isequal is defined in the same way.

This is obviously not great. A new array type is considered as equal to a standard Array if their elements are all equal, but it is considered as different from a range containing the same elements. This inconsistency is visible even inside Base:

julia> sparse([1,2]) == [1,2]
true

julia> [1,2] == 1:2
false

So it's not clear whether two AbstractArrays need to be of the same type or not to be ==. I can see two consistent solutions:

  1. remove the special-case for ranges (i.e. return true when elements are equal)
  2. change if isa(A,Range) != isa(B,Range) to typeof(A) !== typeof(B)

The second choice would clearly be much more disruptive than the first, and it would make == almost useless for arrays.

@timholy
Copy link
Member

timholy commented May 14, 2016

#13565

@nalimilan
Copy link
Member Author

Ah, funny how I keep bumping into this issue without remembering... :-)

Anyway, should we reopen one of the older issues about hashing ranges? This looks like a significant inconsistency in the language to me (and equality is already complex to master without it...).

@timholy
Copy link
Member

timholy commented May 14, 2016

The concern is that it's an unsolvable problem, in which case reopening the issue won't help.

@nalimilan
Copy link
Member Author

Well, a few ideas were advanced here: #12226 (comment)

It would be really sad that == wouldn't give the expected result just to ensure fast hashing of ranges. It seems to me that == is a much more common operation on ranges than hash (let alone hashing float ranges...).

@timholy
Copy link
Member

timholy commented May 14, 2016

👍 Skip the discussion, go straight for the PR, then 😄. EDIT: meaning, fix the hashing algorithm so you get the best of all worlds.

@nalimilan
Copy link
Member Author

nalimilan commented May 17, 2016

I didn't expect I would be able to do something about it, but it might be easier than I/we thought. Please comment on #16401.

@StefanKarpinski StefanKarpinski added this to the 0.6.0 milestone Sep 13, 2016
@StefanKarpinski
Copy link
Member

I rather suspect this is not going to happen in 0.6.

@StefanKarpinski
Copy link
Member

Note that this is also not breaking – it's technically an optimization.

@StefanKarpinski StefanKarpinski removed this from the 0.6.0 milestone Feb 2, 2017
@nalimilan
Copy link
Member Author

Note that this is also not breaking – it's technically an optimization.

You mean that we should make ranges and arrays hash equal by iterating over all of their elements, and try to optimize this later?

@StefanKarpinski
Copy link
Member

Ah yes, I thought that was what we were already doing, actually!

@StefanKarpinski
Copy link
Member

I kind of suspect that hashing ranges is not all that common, tbh.

@nalimilan
Copy link
Member Author

Yes, that's also my opinion. == is much more useful.

@mbauman
Copy link
Member

mbauman commented Feb 15, 2017

So should we squeeze the breaking portion of this into 0.6? It's a very easy change to make — just delete some code and fix a few tests. But changing the algorithmic complexity of hash from O(1) to O(n) could be terribly breaking… if anyone happens to be hashing large ranges their program suddenly becomes (effectively) non-terminating. hash(1:typemax(Int64)) goes from taking under a microsecond to over a hundred millennia.

Or we could alternatively add a depwarn for hash of relatively large ranges.

@mbauman mbauman added the breaking This change will break code label Feb 15, 2017
@StefanKarpinski
Copy link
Member

I'd vote for leaving this alone until we have a real plan for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants