Efficient getindex implementations for subsetting AcceleratedArrays #4

kcajf · 2019-09-13T18:02:54Z

Hi @andyferris - this project looks great, and is something I am considering extending / building on. I already have some hacky implementations of something similar, but not as nicely integrated or as general.

What are your thoughts on adding a family of getindex(A::AcceleratedVector, idx::AbstractVector{Int}) methods? I.e. indexing into AcceleratedVectors with integer StepRanges, integer Vectors, etc., and returning AcceleratedVectors of the same type.

For SortIndex, this would be quite easy. When indexing with StepRanges, we would have to check the step is positive. For indexing with arrays, we would have to check the array is sorted first. Since we would be constructing SortIndexes from known-sorted arrays, it would be good to have a SortIndex constructor that doesn't do any checks. This might also be useful generally.
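To make the SortIndex case concrete, here is a minimal sketch of the idea, not the AcceleratedArrays implementation. `SortedVec` and its `check` keyword are hypothetical names invented for illustration; the keyword stands in for the "constructor that doesn't do any checks" mentioned above.

```julia
# Hypothetical wrapper for a vector known to be sorted (not the AcceleratedArrays API).
struct SortedVec{T} <: AbstractVector{T}
    data::Vector{T}
    # `check = false` skips the O(n) sortedness test when the caller already knows it holds.
    function SortedVec{T}(data::Vector{T}; check::Bool = true) where {T}
        check && !issorted(data) && throw(ArgumentError("data is not sorted"))
        return new{T}(data)
    end
end
SortedVec(data::Vector{T}; check::Bool = true) where {T} = SortedVec{T}(data; check = check)

Base.size(v::SortedVec) = size(v.data)
Base.getindex(v::SortedVec, i::Int) = v.data[i]

# A range with a positive step is increasing, so the result is still sorted.
function Base.getindex(v::SortedVec, r::AbstractRange{Int})
    step(r) > 0 || return v.data[r]              # negative step: fall back to a plain Vector
    return SortedVec(v.data[r]; check = false)
end

# An arbitrary index vector preserves sortedness only if it is itself sorted.
function Base.getindex(v::SortedVec, idx::AbstractVector{Int})
    issorted(idx) || return v.data[idx]          # unsorted indices: fall back to a plain Vector
    return SortedVec(v.data[idx]; check = false)
end
```

With this sketch, `SortedVec([1, 3, 5, 7, 9])[2:4]` stays a `SortedVec`, while indexing with `5:-1:1` or `[4, 1]` returns an ordinary `Vector`. The return type therefore depends on the index values, which a real implementation might want to avoid.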

For HashIndex, the hash table would have to be modified, but there are likely lots of optimisations / shortcuts to minimise the work. For example, when indexing with UnitRange{Int}s, we could add to a global integer offset that is subtracted from the values in the Dict when they are accessed. Similar to the above, we might want a more direct HashIndex constructor that accepts a pre-built dictionary & offset, etc.
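The offset idea for UnitRange indexing could look roughly like the following. This is a sketch under assumptions: `OffsetHash`, `restrict` and `findpositions` are hypothetical names, and the layout (a Dict from value to positions) is chosen for illustration rather than taken from the package's HashIndex.

```julia
# Hypothetical structure, not the AcceleratedArrays HashIndex.
struct OffsetHash{T}
    positions::Dict{T, Vector{Int}}  # value => positions in the original array
    offset::Int                      # subtracted from stored positions on access
    len::Int                         # length of the current window
end

# Build the table once from the full data.
function OffsetHash(data::AbstractVector{T}) where {T}
    positions = Dict{T, Vector{Int}}()
    for (i, x) in pairs(data)
        push!(get!(() -> Int[], positions, x), i)
    end
    return OffsetHash{T}(positions, 0, length(data))
end

# Restrict to a unit range by shifting the offset; the Dict is shared, not rebuilt.
function restrict(h::OffsetHash{T}, r::UnitRange{Int}) where {T}
    (isempty(r) || (1 <= first(r) && last(r) <= h.len)) || throw(BoundsError(h, r))
    return OffsetHash{T}(h.positions, h.offset + first(r) - 1, length(r))
end

# Positions of `x` within the window; stored entries outside the window are skipped.
function findpositions(h::OffsetHash, x)
    stored = get(h.positions, x, Int[])
    return [p - h.offset for p in stored if 1 <= p - h.offset <= h.len]
end
```

Restricting is O(1) here because nothing is rehashed, but lookups still pay for every stale entry of the original table, so this only helps when the window covers a sizeable part of the original array.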

I'd be happy to do some initial work on this after hearing your thoughts.

andyferris · 2019-09-14T06:14:23Z

Yes, that sounds great - the other one I’ve thought of is the way indexing a sorted array with a sorted array gives a sorted result (and we also know that ranges are naturally sorted...). So indexing can basically “preserve” SortIndex (as well as uniqueness).

When I think of hashes - I think it might be simplest to just drop them (except for maybe uniqueness?). I’m trying to think of compelling end-use cases.

andyferris · 2019-09-14T06:16:08Z

(To say it differently - indexing is definitely in the “roadmap” as it is in my head but it’s been a matter of finding time for me to contribute to this project - and help is always appreciated!)

kcajf · 2019-09-14T12:27:51Z

> When I think of hashes - I think it might be simplest to just drop them (except for maybe uniqueness?). I’m trying to think of compelling end-use cases.

What do you mean by this?

andyferris · 2019-09-14T21:23:32Z

I just mean - it seems quite complicated when indexing to preserve the hash table for some (possibly repeated?) subset. I also think it could be slow - if you want to shrink the size of the hash table you need to recompute hashes etc (slow for large selections), and for small selections the overhead of the hash table is large while not providing any acceleration benefits. It seems it might be best to leave the decision in the hands of the user, who can always accelerate the result when it makes sense for their use case.

ivirshup · 2020-02-23T02:49:15Z

Would it be reasonable to encode propagation through getindex in its type? Something like HashIndex{Propagating}?
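One way such a type-level switch might work is sketched below. Every name here (`Propagation`, `Propagating`, `NonPropagating`, `TaggedVec`, `buildlookup`) is hypothetical and none of it exists in AcceleratedArrays; the sketch simply rebuilds the lookup table on indexing, which is the cost discussed earlier in the thread.

```julia
# Hypothetical names throughout; AcceleratedArrays does not define these.
abstract type Propagation end
struct Propagating <: Propagation end
struct NonPropagating <: Propagation end

# A vector with a value => positions lookup and a type-level propagation tag.
struct TaggedVec{T, P<:Propagation} <: AbstractVector{T}
    data::Vector{T}
    lookup::Dict{T, Vector{Int}}
end

function buildlookup(data::Vector{T}) where {T}
    d = Dict{T, Vector{Int}}()
    for (i, x) in pairs(data)
        push!(get!(() -> Int[], d, x), i)
    end
    return d
end

# Convenience constructor that builds the lookup from the data.
TaggedVec{T, P}(data::Vector{T}) where {T, P<:Propagation} =
    TaggedVec{T, P}(data, buildlookup(data))

Base.size(v::TaggedVec) = size(v.data)
Base.getindex(v::TaggedVec, i::Int) = v.data[i]

# Dispatch on the tag: a Propagating vector rebuilds its lookup for the subset...
Base.getindex(v::TaggedVec{T, Propagating}, idx::AbstractVector{Int}) where {T} =
    TaggedVec{T, Propagating}(v.data[idx])

# ...while a NonPropagating vector returns a plain Vector and drops the lookup.
Base.getindex(v::TaggedVec{T, NonPropagating}, idx::AbstractVector{Int}) where {T} =
    v.data[idx]
```

Because the choice lives in the type, the decision is made once by whoever constructs the array, and downstream code that merely subsets it needs no extra methods.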

My use-case is using an accelerated array for the dimensions of a DimensionalArray (from DimensionalData). I'd like to be able to subset a DimensionalArray, and have the result still have accelerated indices. I currently don't see a way to enforce this without some type-piracy.

kcajf mentioned this issue Sep 13, 2019

WIP: The Plan JuliaCollections/AxisArraysFuture#1

Open

ivirshup mentioned this issue Feb 23, 2020

Checking whether dimension is ordered currently requires values to have order rafaqz/DimensionalData.jl#93

Closed
