Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes, tests, & the future of Base.Slice #21052

Closed
wants to merge 1 commit into from
Closed

Conversation

timholy
Copy link
Member

@timholy timholy commented Mar 16, 2017

@mbauman added Base.Slice in #19730 as part of a major simplification of indexing rules. I confess I didn't really spend long enough thinking about it, and just mentally said to myself, "Colon got renamed." What I missed is that it's really quite simple and deserving of independent existence. If isa(r, Base.Slice), then r is an AbstractUnitRange that should satisfy the following identities (which nearly imply one another):

  • r[i] == i for any in-bounds i;
  • r[r] === r
  • collect(r) == collect(indices(r, 1))

Now, thinking of this as "Colon got renamed" isn't strictly wrong, because in 0.5 Colon() satisfies the first two of these (and indices(:) is not defined). But the name Slice doesn't convey this; consequently, for Julia 1.0 I think we should rename it, as well as reparametrize it more along the lines of other AbstractUnitRange objects. In my mind the leading contenders are IdentityRange (because of the first property) and IdempotentRange (because of the second).

More immediately, this type turns out to have useful properties that we're not currently exploiting or testing; in particular, it's a simple mechanism for creating "indices-preserving views", i.e. v = view(a, Base.Slice(ind1), Base.Slice(ind2), ...) satisfies v[i, j, ...] == a[i, j, ...] but might nevertheless represent a sub-region of a (and does not necessarily start indexing at 1). Wanting to exploit this, I noticed that the test suite for Base.Slice was not very extensive. I decided to add quite a few tests, and unsurprisingly I discovered (and then fixed) a number of bugs.

There's also a looming issue to address: without anyone really noticing, Base.Slice turns out to have introduced a non-1-indices AbstractArray type to Base. (I and others have put a lot of work into supporting non-1 arrays in Base, but Base does not actually allow you to create such an object without defining new types or loading packages.) The problem, then, is that unlike all other AbstractArrayss in Base, we can't support arithmetic operations on Base.Slice. For example, we can't return a conceptually-correct value for r::Base.Slice + 1 unless r happens to start indexing at 1. I see two options:

  • Leave it like this, with arithmetic operations throwing errors
  • Support arithmetic operations by bringing OffsetArrays into Base

Given where we are, going back to Colon seems like a step backwards, so I don't really consider that to be a viable option.

This is in preparation for a later reparametrization and renaming
Copy link
Member

@mbauman mbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, I just realized that if Slice no longer always represents all indices within a dimension then the index style computation for SubArray will need to change. That will significantly slow down many views with : since they'll be indexed cartesian much more often. I'm no longer as certain that using this type more generally is a good idea.

Just as a side-note: As it stands, Base should only create offset Slices when you're using OffsetArrays in the first place.

@@ -440,13 +440,15 @@ false
checkindex(::Type{Bool}, inds::AbstractUnitRange, i) = throw(ArgumentError("unable to check bounds for indices of type $(typeof(i))"))
checkindex(::Type{Bool}, inds::AbstractUnitRange, i::Real) = (first(inds) <= i) & (i <= last(inds))
checkindex(::Type{Bool}, inds::AbstractUnitRange, ::Colon) = true
checkindex(::Type{Bool}, inds::AbstractUnitRange, ::Slice) = true
checkindex(::Type{Bool}, inds::AbstractUnitRange, s::Slice) =
(first(s) >= first(inds)) & (last(s) <= last(inds))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little unfortunate. I suppose it's not a deal-breaker, but it stinks if we need to check bounds for indexing with colons.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was my main concern too. Colon is effectively an unlimited Slice, which is why we never needed to check bounds before; now that it's finite, we have to check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we resolve this by making Colon a subtype of AbstractUnitRange?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that'd put us right back to where we are now, only s/Slice/Colon/.

In order for Colon to be a range, it must know where to start and how long it is. That can only be resolved upon indexing (within to_indices). Thus, there must be some way of constructing a colon with a given start and length. But if a user ends up using that same construction then it might not accurately describe an entire dimension. And now we're right back to where we started, except this time it's a more visible, exported object that users regularly construct.

@timholy
Copy link
Member Author

timholy commented Mar 16, 2017

Just as a side-note: As it stands, Base should only create offset Slices when you're using OffsetArrays in the first place.

Agreed, unless people create them manually: v = view(1:8, Base.Slice(2:4)) is a bug on current master, despite the absence of any OffsetArrays besides Slice itself. As long as one views Slice as a purely internal type, perhaps we can live with it; if that's the solution, the docstring should be updated to say "don't create these manually."

@timholy
Copy link
Member Author

timholy commented Mar 16, 2017

Let's get some data on how bad this would be for performance: @nanosoldier runbenchmarks(ALL, vs = ":master").

@mbauman
Copy link
Member

mbauman commented Mar 16, 2017

I think the largest regressions would come from deleting these methods:

julia/base/subarray.jl

Lines 44 to 47 in 9f90b9a

# Slices may begin a section which may be followed by any number of Slices
viewindexing(I::Tuple{Slice, Slice, Vararg{Any}}) = (@_inline_meta; viewindexing(tail(I)))
# A UnitRange can follow Slices, but only if all other indices are scalar
viewindexing(I::Tuple{Slice, UnitRange, Vararg{ScalarIndex}}) = IndexLinear()

Slice(convert(I, max(first(r), first(s)):min(last(r), last(s))))
intersect(r::Slice, s::Slice) =
Slice(max(first(r), first(s)):min(last(r), last(s)))
reverse(r::Slice) = error("reverse is not supported for Base.Slice")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this #7512 or avoiding a fallback?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently:

julia> reverse(Base.Slice(-10:10))
10:-1:-10

julia> indices(reverse(Base.Slice(-10:10)))
(Base.OneTo(21),)

Those indices should be the same as the indices of the original Slice object (-10:10).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that Slices problem, or indices' ?

Copy link
Member

@mbauman mbauman Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general AbstractArray reverse would throw an error for an offset array, but OrdinalRange just always creates a 1-indexed array step range via colon().

Copy link
Member

@mbauman mbauman Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… and the correct result isn't representable without general offset array support. Slice itself cannot represent reverse(Base.Slice(-10:10)) — which should be an offset array with values 10:-1:-10 and indices -10:10. It could be implemented like I did for length and friends above, where it succeeds for 1-based slices but errors otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Base, we can't currently create an array with indices -10:10 unless it also holds values -10:10 (courtesy of Base.Slice). It's a byproduct of being able to create some but not all types of OffsetArrays.

@timholy
Copy link
Member Author

timholy commented Mar 16, 2017

Perhaps we could add a fromColon::Bool type parameter to Base.Slice?

@mbauman
Copy link
Member

mbauman commented Mar 16, 2017

Yeah, I'd be on board with that. We could make it easy to construct Slice{…, #=fromColon=# false} with Slice(…) outer constructors, and colon conversions could use the "expert" inner constructor with Slice{…, true}(…). That'd make it pretty hard to mistakenly do the wrong thing.

Maybe call it entire_dimension or some such that describes what it does rather than where it came from?

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@timholy
Copy link
Member Author

timholy commented Jul 2, 2018

Superseded by #27038

@timholy timholy closed this Jul 2, 2018
@timholy timholy deleted the teh/idempotent_range branch July 2, 2018 11:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants