-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Towards array nirvana #7941
Comments
Another fun ingredient in my own personal array nirvana will be https://github.com/timholy/NamedAxesArrays.jl. More joy from stagedfunctions 😄. But I don't propose that for inclusion in base. |
@timholy the contiguous rank of StridedView is important, consider: v1 = view(a,:,1:2:7)
v2 = view(v1,:,2) With the type system in ArrayViews, it can determine statically that v2 is a contiguous view. |
I see, we probably want both then. |
The StridedView part is tightly integrated with ContiguousView. It is possible that lower rank slices of a non-contiguous Strided view is contiguous. The ArrayViews type system preserves this piece of information, so that something above can be determined statically. |
Although how one decides which one wants is an interesting question---I guess we could return your ArrayView types when the parent is an array and no slicing is desired, and the ArrayViewAPL view type otherwise? |
Looking forward to 0.4 already... |
Jutho, you've contributed a lot already, and it would be lovely to have your help implementing some of this! |
@timholy ArrayViewAPL is what I've been dreaming of since discovering Julia! Hopefully sliceview can become the standard indexing semantics. I'll be happy to test and help in any way I can once this lands in master (or is usable without having to jump over lots of branch fences)! |
@timholy I’d be happy to help when time permits. Let me know if you had anything specific in mind that I can contribute to. On 10 Aug 2014, at 14:15, Tim Holy [email protected] wrote:
|
+1 for named axes arrays |
@lindahua, I added "Move |
Great list! Just to get a broader context, we also want more efficient strings (and possibly BigInts), which also require any-length, immutable homogeneous arrays. Similar enough to tuples to make one think. Instead of mutable fixed-size arrays, we might want to use the "assignable cell" model, where you have a mutable cell that can hold a single value. Then you just store different array values into it at different times. Local mutations of single elements can be optimized. |
@timholy Thanks for taking the lead on this. I don't have particular preference as to whether to use ArrayViews or ArrayViewsAPL or a hybrid of both, or something afresh, as long as it is efficient enough for common cases. I think there are two issues here can be discussed separately.
|
I remember this has been brought up a while ago in some issue. But it would be useful to mention it here. It can be useful to support syntax like: a[..., i]
# so it means a[:,i] when ndims(a) == 2
# and a[:,:,i] when ndims(a) == 3, etc |
Those are good questions. What's funny is that I am not actually sure they all need to be settled first. For example, in ArrayViewsAPL I've implemented both the current Regarding Bounds-checking upon construction is probably more important to settle early (this is #4044). I now suspect that we may need to have the base types check bounds upon construction. People who want to pull the tricks in #4044 might need to implement an AbstractArray type that doesn't check upon construction but has a Overall, my view is that the real effort here is in the "underlying technologies" section of the list. I'd include the changes to the parser in this. The rest of the core "view" infrastructure (ignoring fixed-size arrays, etc) seems likely to be something one could finish banging out in a couple of days. Writing tests and dealing with the consequences will take rather longer. |
We can also consider whether we want more kinds of indexes, such as #1032. This is important since it may ask for a more general approach to |
Good thought. I think stagedfunctions will make it a lot easier to support a diversity of indexes. Prior to stagedfunctions, we'd have had to worry about what to do if argument 1 was a
With a single function we can basically handle anything we want to support, and all the tricky work is done at compile time by straightforward Julia code. You couldn't write something more runtime efficient by hand if you tried. It's a total game-changer. |
I share your enthusiasm for staged functions, but not for |
Fair enough. |
I will try to rewrite ArrayViews with staged functions and see how much it may simplify codes. |
Still, critical information like whether views are contiguous should be encoded by the type itself (this is essential for type stability). |
I see that "collaborator" tag, so feel free to add them to the bullet list. |
While we're at this array overhaul, can we try to make it so that switching between APL and Matlab style indexing is relatively simple? I.e. Just a matter of surface syntax. |
That was my idea with ArrayViewsAPL: just choose
or
Seemed the safest way given that we don't yet know what we want. |
Since these enhancements are going to break most array code anyway, what is the status of {} arrays going forward? The syntax seems like an historical vestige. I feel it would be conceptually simpler to have just one array syntax. |
I have been following from a distance, but I am wondering if we are any closer now than 3 months ago. Quite a bit has happened with staged functions, sub arrays and cartesian iterators since then, which makes me hopeful that we are closer to replacing array indexing with views. Is it likely we can achieve this in 0.4? |
Some and hopefully most of the work should be done in #9150. The main issue is to handle |
Oh, and I forgot perhaps even the most important one: it would be insane to turn EDIT: Demo: julia> A = reshape(1:15, 3, 5)
3x5 Array{Int64,2}:
1 4 7 10 13
2 5 8 11 14
3 6 9 12 15
julia> b = slice(A, 1:2, 1)
2-element SubArray{Int64,1,Array{Int64,2},(UnitRange{Int64},Int64),2}:
1
2
julia> b[3]
3
julia> b = sub(A, 1, :)
1x5 SubArray{Int64,2,Array{Int64,2},(UnitRange{Int64},Colon),2}:
1 4 7 10 13
julia> b[2,2]
5 |
Indeed I did, thanks for catching. (Corrected above.) |
It looks like much of this will make its way into 0.4. Should we refactor this issue into the bits that we want in 0.4, and a separate issue for what will be in 0.5? |
Indeed, the large majority has been in 0.4 for some time---that's what all the checkmarks at the top are for 😄. If you want to split out separate issues for the remaining items, I'm fine with that, but I don't really see the point. |
#10525 was long overdue for a spot on that list, so I added it. |
I was just initially puzzled at seeing the 0.5 milestone. We can leave it as it is. |
The question I really had was - from a 0.4 perspective, which of the unchecked ones we want to get done, or are we already there? |
I think we should still aim for extensible bounds checking removal, #7799, in 0.4. |
I was puzzled about the 0.5 milestone too. I wondered if you had changed it (I didn't). That said, I think we won't get most of the remaining items by 0.4, so bumping the milestone seems reasonable. Realistically, I expect us to check only one more box: I think we will get "sweet custom array types" (aka #10525, hopefully supported by #10911 if I can get my act together in time). This is a really big step---one that I did not anticipate at the time I opened this issue---and it will be amazing to have. That would also open the way for fixing Re "return views from indexing": I think the big show-stopper is #7799, since our views don't check bounds (in my opinion, we'd have a catastrophic loss of safety if we switched to returning views before addressing that issue). It's also worth reminding folks that @carlobaldassi has raised a number of concerns specifically about |
In case you hadn't seen it, this is in the issue's log, squeezed between comments--it was changed on March 7 by @vtjnash. |
@vtjnash went around and randomized all the milestones :-| |
I'm looking at this right now in the context of #10525. I think I may be able to rectify this since it puts ( |
To get back on that, my current thinking is that a lot of methods should be specialized to work on BitArray Views efficiently. It's going to take time -- and to increase BTW I'm assuming that in most relevant cases views of views can be flattened, otherwise this is probably not going to work. The only issue which I don't think we can really solve is treating efficiently views with generic, non contiguous indexing. However, this is only going to make a real difference when indexing repeatedly into the view, and maybe for nested views. |
I believe the last two items could be closed |
Superseded by JuliaLang/LinearAlgebra.jl#255. If I missed anything from this issue, please add it there. |
For 0.4 several of us want to see significant improvements in arrays. I'm opening this meta-issue to help organize the effort. I'll be brief here and link out to issues/packages for more detailed explanations. Please add to these bullet points.
Underlying technologies
The first two are essential, the third is nearly essential.
(WIP: Add Cartesian product iteration. Fixes #1917 #6437)(Efficient cartesian iteration (new version of #6437) #8432)stagedfunctions
can be coaxed into doing this for you)[a, b]
non-concatenating make [a, b] not concatenate #3737/make{a,b}
give better-typed arrays [original title: make [a,b] non-concatenating] #2488I'm guessing the approach for implementing #7799 will also allow one to specify manual inlining (which is where the idea was originally proposed), which is the main bottleneck for #6437. So that's almost a 2-for-1 deal.
Implementation
The first two are essential.
In my opinion, we likely just want theContiguousView
part of it.AbstractVector
index? (These would not be "strided".)Colon
translation out of the parser (RFC: Move Colon translation out of the parser #10331)Opportunities
AbstractArray
s, RFC: Give AbstractArrays smart and performant indexing behaviors for free #10525ArrayViewsAPL is something I've not yet broadly announced. While the APL in the package's title is in reference to #5949 (for which it can be a test bed to see how we'd like it and how much would break), its real purpose is to exploit stagedfunctions for creating efficient and general view operations. Please see the README for explanations.
The text was updated successfully, but these errors were encountered: