Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: exportable tuple-manipulation utilities #20370

Closed
wants to merge 1 commit into from
Closed

Conversation

timholy
Copy link
Member

@timholy timholy commented Feb 1, 2017

Fancy indexing almost invariably involves manipulation of tuples. Currently we either force users to exploit the unexported Base.tail and Base.front, or isolate type-unstable code in separate functions using @noinline. This proposes the following syntax:

t = (1,2,3)
rf = ReferenceFront((5,5))
front, back = split(t, rf)  # front == (1,2), back == (3,) (front has the same length as the reference)
rb = ReferenceBack((5,5))
front, back = split(t, rb)  # front == (1,), back == (2,3) (back has the same length as the reference)

It uses a reference-tuple to make the output inferrable. These operations are composable to implement all sorts of fancy indexing. For example, if you want to perform an operation along dimension d, and you have a "reference slice index" ipost for dimensions d+1, ..., n, then

indstmp, indsback = split(indices(A), ReferenceBack(ipost))
indsfront, indsd = split(indstmp, ReferenceBack((true,))

will allow iteration

for i in indsd
    for ipre in CartesianRange(indsfront)
        # do something with A[ipre, i, ipost]
    end
end

This is incomplete (I'd add support for CartesianIndex and CartesianRange). If folks like this or something similar to it, perhaps the central question is whether one should throw an error if the tuple is shorter than the reference, or whether we should return what's available so that t == (front..., back...).

@mbauman
Copy link
Member

mbauman commented Feb 1, 2017

Ah, so the values in you pass in the tuple to Reference(Front|Back) don't really matter — all that matters is the length of the tuple? I think it'd be a little more natural to just specify the number of dimensions as a type parameter — e.g., ReferenceFront{2} or ReferenceBack{3}.

As far as your central question, I don't think there's necessarily a right choice. The permissive behavior is sometimes useful, but it can also really throw you for a loop. We effectively have both behaviors in base right now — IteratorsMD.split is permissive but front and tail throw errors. I've added _maybetail in a case where I needed the permissive behavior, and I've goofed in using IteratorsMD.split where I didn't expect the returned tuple to be smaller than I asked for.

This still doesn't quite feel right. Really, the holy grail here would be to add special cases for indexing with constant UnitRanges in inference like we already have for constant scalars. My impression is that it'd be a lot of work, though… and may not even be possible given the general nature of the colon operator. We don't even infer t[end-1] right now, let alone t[1:end-1].

@ararslan ararslan added the collections Data structures holding multiple items, e.g. sets label Feb 1, 2017
@timholy
Copy link
Member Author

timholy commented Feb 1, 2017

I agree that making sz[3:end-2] inferrable would be best. Perhaps it's worth not having exportable interfaces, holding out hope for the best option.

Re the interface here, I thought of your option but disliked how it was basically just two flavors of Val, and thought that holding the tuple might be a little nicer while also add some new possibilities for customization (don't know if that's good or bad...). Now that I think about it, we could just re-use the existing flavor of Val: Val{(:front,N)} and Val{(:back,N)}. That seems like enough of a simplification that, were I to re-do this, I'd just go that way (as far as the internal implementation goes, that is). However, I'm inclined to wait and see if the inference magic materializes.

@timholy timholy closed this Feb 1, 2017
@mbauman
Copy link
Member

mbauman commented Feb 1, 2017

Sorry — I didn't mean to immediately poo-poo this! I don't think we should let perfect get in the way of good enough here… especially if the inference improvements are as hard as I think they are. I think there's value in getting a "good enough" interface here in the mean time.

@timholy
Copy link
Member Author

timholy commented Feb 1, 2017

No, I didn't take it that way. I merely decided that you had made some excellent points. If you'd like me to back off that opinion... 😄

I'm happy to discuss this further here in this issue. I just suspect it can stay closed until there's something we're excited about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants