Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Make LinearIndices consistent with linearindices for vectors #26682

Merged
merged 3 commits into from
Apr 4, 2018

Conversation

nalimilan
Copy link
Member

Change LinearIndices(x::AbstractVector) to use the same indices as axes(x, 1). This is consistent with linearindices(x) and allows indexing into x.
Change CartesianIndices to use custom axes so that it works with indices returned by LinearIndices for vectors.

This matches the description of linear indices in the manual, and should allow deprecating linearindices in favor of LinearIndices if we want. See #25901 (comment).

If we want this PR, docstrings could be improved to be more explicit.

Before:

julia> include("test/TestHelpers.jl");

julia> x = TestHelpers.OAs.OffsetArray(1:3, (3,))
Main.TestHelpers.OAs.OffsetArray{Int64,1,UnitRange{Int64}} with indices 4:6:
 1
 2
 3

julia> linearindices(x)
4:6

julia> LinearIndices(x)
LinearIndices{1,Tuple{UnitRange{Int64}}} with indices 4:6:
 1
 2
 3

julia> CartesianIndices(x)
3-element CartesianIndices{1,Tuple{UnitRange{Int64}}}:
 CartesianIndex(4,)
 CartesianIndex(5,)
 CartesianIndex(6,)

julia> y = TestHelpers.OAs.OffsetArray([1 2; 3 4], (3, 2))
Main.TestHelpers.OAs.OffsetArray{Int64,2,Array{Int64,2}} with indices 4:5×3:4:
 1  2
 3  4

julia> linearindices(y)
Base.OneTo(4)

julia> LinearIndices(y)
LinearIndices{2,Tuple{UnitRange{Int64},UnitRange{Int64}}} with indices 4:5×3:4:
 1  3
 2  4

julia> CartesianIndices(y)
2×2 CartesianIndices{2,Tuple{UnitRange{Int64},UnitRange{Int64}}}:
 CartesianIndex(4, 3)  CartesianIndex(4, 4)
 CartesianIndex(5, 3)  CartesianIndex(5, 4)

After:

julia> LinearIndices(x)
LinearIndices{1,Tuple{UnitRange{Int64}}} with indices 4:6:
 4
 5
 6

julia> CartesianIndices(x)
CartesianIndices{1,Tuple{UnitRange{Int64}}} with indices 4:6:
 CartesianIndex(4,)
 CartesianIndex(5,)
 CartesianIndex(6,)

julia> LinearIndices(y)
LinearIndices{2,Tuple{UnitRange{Int64},UnitRange{Int64}}} with indices 4:5×3:4:
 1  3
 2  4

julia> CartesianIndices(y)
CartesianIndices{2,Tuple{UnitRange{Int64},UnitRange{Int64}}} with indices 4:5×3:4:
 CartesianIndex(4, 3)  CartesianIndex(4, 4)
 CartesianIndex(5, 3)  CartesianIndex(5, 4)

@nalimilan nalimilan added the arrays [a, r, r, a, y, s] label Apr 2, 2018
@mbauman
Copy link
Member

mbauman commented Apr 2, 2018

I think LinearIndices can now have an IndexLinear IndexStyle. I like how this improves the symmetry between LinearIndices and CartesianIndices. In short — both are idempotent, but one returns the n-dimensional tuple and the other returns the linear index.

This seems like the right thing to do.

@@ -418,6 +419,7 @@ module IteratorsMD
@inbounds result = reshape(1:prod(dims), dims)[(I .- first.(iter.indices) .+ 1)...]
return result
end
@inline Base.getindex(iter::LinearIndices{1,R}, i::Int) where {R} = i
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really true that this definition could apply to all LinearIndices and not just LinearIndices{1}?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only used for LinearIndices here. Do you suggest it could work for any dimensionality, i.e. that we could remove the more general method above?

Regarding switching to IndexLinear, I'm concerned about performance, given the comment that @timholy has put in the method above. I've tested it, and without bounds checks this test fails:

@test_throws BoundsError LinearIndices(i)[2]

Given that checking bounds against the tuple of axes is going to be slow in the multidimensional case, maybe we need to keep IndexCartesian? I must say I don't understand where the bounds are checked currently since the method above uses @inbounds.

Copy link
Member

@mbauman mbauman Apr 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that method is sketchy — it's not checking bounds at all. I'm proposing:

shell> git diff
diff --git a/base/multidimensional.jl b/base/multidimensional.jl
index 54124ed0b3..d56386b88b 100644
--- a/base/multidimensional.jl
+++ b/base/multidimensional.jl
@@ -410,16 +410,13 @@ module IteratorsMD
     LinearIndices(A::AbstractArray) = LinearIndices(CartesianIndices(A))

     # AbstractArray implementation
-    Base.IndexStyle(::Type{LinearIndices{N,R}}) where {N,R} = IndexCartesian()
+    Base.IndexStyle(::Type{LinearIndices{N,R}}) where {N,R} = IndexLinear()
     Base.axes(iter::LinearIndices{N,R}) where {N,R} = iter.indices
     Base.size(iter::LinearIndices{N,R}) where {N,R} = length.(iter.indices)
-    @inline function Base.getindex(iter::LinearIndices{N,R}, I::Vararg{Int, N}) where {N,R}
-        dims = length.(iter.indices)
-        #without the inbounds, this is slower than Base._sub2ind(iter.indices, I...)
-        @inbounds result = reshape(1:prod(dims), dims)[(I .- first.(iter.indices) .+ 1)...]
-        return result
+    @inline function Base.getindex(iter::LinearIndices, i::Int)
+        @boundscheck checkbounds(iter, i)
+        i
     end
-    @inline Base.getindex(iter::LinearIndices{1,R}, i::Int) where {R} = i
 end  # IteratorsMD

Amazingly, it seems to have identical performance comparing srand(0); @btime L[i,j,k] setup=(i=rand(1:3);j=rand(1:4);k=rand(1:5)) with L = LinearIndices(rand(3,4,5)). And it fixes this case:

# This PR + my patch above:
julia> L = LinearIndices(rand(3,4,5));
julia> srand(0); @btime L[i,j,k] setup=(i=rand(1:3);j=rand(1:4);k=rand(1:5))
  31.524 ns (0 allocations: 0 bytes)
22
julia> L[3,4,6]
ERROR: BoundsError: attempt to access 3×4×5 LinearIndices{3,Tuple{Base.OneTo{Int64},Base.OneTo{Int64},Base.OneTo{Int64}}} at index [3, 4, 6]

# A 7-day old master: bf45de8c33 (2018-03-26 16:43 UTC)
julia> srand(0); @btime L[i,j,k] setup=(i=rand(1:3);j=rand(1:4);k=rand(1:5))
  31.048 ns (0 allocations: 0 bytes)
22
julia> L[3,4,6]
72

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I had in mind too. Glad to hear it doesn't affect performance. I've updated the PR.

Change LinearIndices(x::AbstractVector) to use the same indices as axes(x, 1).
This is consistent with linearindices(x) and allows indexing into x.
Change CartesianIndices to use custom axes so that it works with indices returned by
LinearIndices for vectors.
Base.IndexStyle(::Type{CartesianIndices{N,R}}) where {N,R} = IndexCartesian()
@inline Base.getindex(iter::CartesianIndices{N,R}, I::Vararg{Int, N}) where {N,R} = CartesianIndex(first.(iter.indices) .- 1 .+ I)
@inline Base.getindex(iter::CartesianIndices{N,R}, I::Vararg{Int, N}) where {N,R} = CartesianIndex(I)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, on second thought — we need to add @boundscheck checkbounds(iter, I) to this implementation, too. Now's the time to do it since this is definitely a performance gain right now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I've added a commit (AFAICT that's checkbounds(iter, I...)).

Copy link
Member

@mbauman mbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@nalimilan nalimilan merged commit b64a8e0 into master Apr 4, 2018
@nalimilan nalimilan deleted the nl/linearindices branch April 4, 2018 13:50
@martinholters
Copy link
Member

Um, can someone explain the change to axes(::CartesianIndices) to me? That breaks indexing with CartesianIndices:

julia> A=rand(5,5);

julia> A[CartesianIndices((2:3, 3:5))]
ERROR: MethodError: no method matching similar(::Array{Float64,2}, ::Type{Float64}, ::Tuple{UnitRange{Int64},UnitRange{Int64}})

Is that intended? (If I understand correctly, this tries to return an array for indices (2:3, 3:5), i.e. an offset array, which fails.)
Similarly, but probably less important:

julia> map(identity, CartesianIndices((2:3, 3:5)))
ERROR: MethodError: no method matching similar(::CartesianIndices{2,Tuple{UnitRange{Int64},UnitRange{Int64}}}, ::Type{CartesianIndex{2}}, ::Tuple{UnitRange{Int64},UnitRange{Int64}})

Of course, collecting the CartesianIndices beforehand works, but allocates an unnecessary array.

@mbauman
Copy link
Member

mbauman commented Apr 9, 2018

Dang, that's unfortunate. Yes, your understanding is correct. The properties I'm striving for here are:

  • A[keys(A)] == A[LinearIndices(A)] == A[CartesianIndices(A)] == A
  • LinearIndices(A)[CartesianIndices(A)] == LinearIndices(A)
  • CartesianIndices(A)[LinearIndices(A)] == CartesianIndices(A)

… although now upon closer examination it looks like that third one is still broken for OffsetArrays.

@nalimilan
Copy link
Member Author

These things are so tricky...

What's the use case for indexing an array with a CartesianIndices object constructed manually? Sorry, I'm not familiar enough with these to figure out whether it should work or not.

Regarding the failure of CartesianIndices(A)[LinearIndices(A)] == CartesianIndices(A) for OffsetArrays, I've traced it down to this:

include("test/TestHelpers.jl")
A = TestHelpers.OAs.OffsetArray([1 4; 2 3], (3,2))
CartesianIndices(A)[1:4]
Base._unsafe_getindex!(collect(CartesianIndices(A)), CartesianIndices(A), 1:4, 1:4)

It appears that _unsafe_getindex! assumes indices start at (1,1), which is of course not correct for non-standard indexing. I'm not sure yet what's the correct way of implementing this (it works fine for OffsetArrays, which probably use a different function).

@martinholters
Copy link
Member

What's the use case for indexing an array with a CartesianIndices object constructed manually?

I'm not sure; I just stumbled upon this because it broke a test in Compat.jl (ref JuliaLang/Compat.jl#529 / JuliaLang/Compat.jl@956bda9).

I think the problem is that CartesianIndices now describes an object primarily used for converting other (in particular: linear) indices to cartesian ones. But at the same time, CartesianRange was deprecated to CartesianIndices, were it's basically used to hold, well, a cartesian range. Unfortunately, these two uses seem to require similar, but maybe not identical objects. Consider the following table for possibilities of indexing A::Matrix:

single element range
A[3] A[2:5]
A[2,3] A[1:3,2:3]
A[CartesianIndex(4,3)] A[CartesianRange(2:3,1:5)] (Julia 0.6 only)

Note that we currently don't have anything for the lower right entry. Not that I have a use case, but it leaves me a bit uneasy to leave a gap there. Also note that LinearIndices doesn't occur there. Another reason for me to believe that LinearIndices and CartesianIndices play another role than CartesianRange. So maybe we actually need both, CartesianIndices and CartesianRange?

@nalimilan
Copy link
Member Author

Good summary. All of this comes from the fact that we cannot distinguish linear from cartesian indices when indexing into vectors, so linear indices do not always start at 1.

Maybe we need a separate CartesianRange type. Though it sounds too bad to add a separate type just because indexing with a CartesianIndices object doesn't work when it does not cover exactly the indices of the indexed array. Maybe we should introduce methods to get this working? IIUC, in theory the similar(::CartesianIndices{2,Tuple{UnitRange{Int64},UnitRange{Int64}}}, ::Type{CartesianIndex{2}}, ::Tuple{UnitRange{Int64},UnitRange{Int64}}) method which isn't defined is supposed to return an OffsetArray (and indeed that's what it does after loading test/TestHelpers.jl). Would it be an heresy to return a plain Array instead?

@martinholters
Copy link
Member

The current status makes indexing with CartesianIndices idempotent, so for c::CartesianIndices, we have A[c] == A[c][c] (assuming a suitable OffsetArray to be defined and returned by similar, e.g. by loading test/TestHelpers.jl). But I admittedly don't know why that is desirable.

@mbauman
Copy link
Member

mbauman commented Apr 10, 2018

The more important part is that we can use these objects to convert between linear and cartesian indices.

There's an alternative here that I thought would be more disruptive, but we could give it a shot: have axes for OffsetArrays return idempotent ranges (via Base.Slice). Then the axes of a CartesianIndices could be the axes of the contained ranges.

@martinholters
Copy link
Member

IIUC, that would make A[r1, ..., rN] == A[CartesianIndices((r1, ..., rN))], and also A[axes(A)...] == A[CartesianIndices(A)] == A, where A could possibly be an OffsetArray, and r1, ..., rN can be UnitRanges, OneTos, or Slices? So far, that sounds quite sane to me. What are the downsides that I've missed?

@martinholters
Copy link
Member

Would a resolution here potentially be breaking? (I.e., do we need to hurry?)

@mbauman
Copy link
Member

mbauman commented May 8, 2018

Yes, it would. I just started on exploring potential solutions yesterday afternoon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants