
range indexing should produce a subarray, not a copy #3701

Closed
stevengj opened this issue Jul 12, 2013 · 35 comments
Labels: breaking (This change will break code)

@stevengj
Member

I would prefer that range indexing of an Array (e.g. X[1:10, :]) created a SubArray (a view/reference), not a copy. This seems more in the spirit of Julia's pass-by-reference semantics, and would eliminate some of the performance gotchas with range indexing.

It might also make future loop-devectorization optimization easier, because subarray references can be devectorized back into references to the original array without worrying that you will be changing the semantics.

It would reduce Matlab compatibility, but we already do that with pass-by-reference and so I doubt many users will be surprised by having the same reference semantics for range indexing.

This is something that has come up informally several times on the mailing list, but I didn't see any issue for it.
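For concreteness, a minimal sketch of the two behaviors being contrasted here, in current syntax (the proposal would make the plain slice alias the parent array instead of copying it):

```julia
X = collect(reshape(1:40, 10, 4))

Y = X[1:10, :]        # today: a copy; writing to Y leaves X untouched
Y[1, 1] = -1
X[1, 1] == 1          # true

Z = view(X, 1:10, :)  # a SubArray: writes go through to X
Z[1, 1] = -1
X[1, 1] == -1         # true
```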

@StefanKarpinski
Member

Thanks for making the issue. We should definitely do this.

@ViralBShah
Member

+1

@timholy
Member

timholy commented Jul 13, 2013

At the moment, the issue is not that it would reduce Matlab compatibility, but that it would almost certainly reduce performance---for many operations SubArrays have nowhere near the performance of Arrays. #3503 will help when it can be merged (but there are a few "deep" issues that need to be addressed first).

@stevengj
Member Author

@timholy, there is no technical reason (as far as I can tell) why accessing a SubArray should not generate the same performance (and in fact, almost exactly the same code) as accessing the same data directly in the original Array. If that's not the case now, then this is a performance bug that should be fixed, but I see that as a somewhat orthogonal issue to this one.

(There will be cases where it is advantageous for cache reasons to make a contiguous copy of a subarray that is discontiguous in memory, but of course the user can always call copy explicitly in that case.)

@lindahua
Contributor

As @timholy mentioned, we are working on an improved version of subarray -- the current implementation is way too slow.

However, the main obstacle is performance degradation. Even a very simple immutable wrapper over Array would lead to 3x - 5x slow down --- things often do not get inlined. I believe Jeff has been working on improving the compiler and hopefully such issues would be resolved when it is done.
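For context, a minimal sketch of the kind of thin immutable wrapper being described, written in present-day syntax with hypothetical names; every access goes through an extra method that must be inlined to match raw Array performance:

```julia
struct VecWindow{T} <: AbstractVector{T}
    parent::Vector{T}   # the wrapped Array
    offset::Int         # first parent index minus one
    len::Int
end
Base.size(w::VecWindow) = (w.len,)
Base.@propagate_inbounds Base.getindex(w::VecWindow, i::Int) = w.parent[w.offset + i]

v = collect(1.0:100.0)
w = VecWindow(v, 10, 20)   # wraps v[11:30] without copying
w[1] == v[11]              # true
```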

@timholy
Member

timholy commented Jul 13, 2013

@stevengj, there are two issues. Dahua mentioned one, inlining the wrapper.

The other principal hurdle is linear indexing, which is heavily used. Suppose A is a 10x20 Array; then I can access the memory location of each element using a simple for loop. But if A is a 10x20 SubArray of a 27x43 Array, I need to do something more complicated. These operations have never been as well optimized as they could be.

Heck, even getting good performance from Arrays for operations like getindex took some work, e.g., make_arrayind_loop_nest. I never got around to doing that for SubArrays, although in principle it's a pretty small change. The issue in part is that you might need something like this even for "simple" operations like taking the sum of all elements of the subarray, whereas with arrays you can confine such complexity to a few operations.

@StefanKarpinski has been doing some work on the linear indexing problem, hopefully this will help.
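A rough sketch of the extra index arithmetic described above (the function and variable names are illustrative only, not part of Base):

```julia
# Linear index k into a 10x20 view of a 27x43 column-major parent, where the
# view starts at parent position (r0, c0), must be mapped back to a parent index:
function view_linear_to_parent(k, viewrows, parentrows, r0, c0)
    i = (k - 1) % viewrows + 1          # row within the view
    j = (k - 1) ÷ viewrows + 1          # column within the view
    return (c0 + j - 2) * parentrows + (r0 + i - 1)
end

# For a plain Array, linear index k is just k; for the SubArray, every access
# pays for a divrem and a multiply unless the compiler can hoist or simplify them.
view_linear_to_parent(1, 10, 27, 3, 5) == (5 - 1) * 27 + 3   # true
```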

@stevengj
Member Author

Regarding inlining, that is a compiler issue; there is no technical reason why subarray accesses could not be inlined with zero overhead. It seems like a bad idea to decide fundamental language semantics based on temporary performance issues.

Regarding linear indexing, exactly the same issue applies to looping over a portion of the original Array directly. As I said, there are cases where one would want to make a contiguous copy of a subarray for performance reasons—the appropriate performance comparison here is between looping over a portion of an Array versus looping over a SubArray abstraction of the same memory locations.
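The comparison being drawn here, as a small sketch: both loops read the same memory in the same order, so in principle they can compile to the same code.

```julia
A = rand(100, 100)

sum_direct(A) = begin
    s = 0.0
    for i in 10:50
        s += A[i, 7]          # loop over a portion of the Array directly
    end
    s
end

sum_view(A) = begin
    s = 0.0
    for v in view(A, 10:50, 7)
        s += v                # loop over the SubArray wrapping the same memory
    end
    s
end

sum_direct(A) == sum_view(A)   # true
```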

@StefanKarpinski
Member

Regarding inlining, that is a compiler issue; there is no technical reason why subarray accesses could not be inlined with zero overhead. It seems like a bad idea to decide fundamental language semantics based on temporary performance issues.

Absolutely. We're all for range slices producing views. These performance issues are clearly surmountable.

@toivoh
Contributor

toivoh commented Jul 13, 2013

The linear indexing issue with subarrays was actually one reason that I started to work on the broadcast stuff. I think that with the proper broadcast primitives, we could handle most interesting uses of linear indexing, almost as efficiently and in a more powerful way (though generating a bit more code than the linear indexing approach.) But we're not there yet.

@JeffBezanson
Member

I strongly agree with having a good design and improving the compiler to match it. I know it will be possible to generate much better code for subarrays than we do now.

@quinnj
Member

quinnj commented Aug 21, 2014

Is this already implemented in @lindahua's ArrayViews branch? #5556

@StefanKarpinski
Member

Since we're in the chaos period, I'd be down to just rebase that branch and merge it as soon as @lindahua gets a chance to do so.

@timholy
Member

timholy commented Aug 21, 2014

We can add it, but we can't remove SubArrays until the AbstractArray problem is solved. That's basically waiting on stagedfunctions.

@tknopp
Contributor

tknopp commented Aug 21, 2014

To answer @quinnj's question: no, #5556 provides the infrastructure to make this happen, but the getindex methods still have to be adapted.

@vtjnash vtjnash modified the milestones: 0.5, 0.4 Mar 7, 2015
@JeffBezanson JeffBezanson modified the milestones: 0.4, 0.5 Mar 8, 2015
@ViralBShah ViralBShah modified the milestones: 0.5, 0.4.0 Jun 5, 2015
@Sisyphuss

Is this issue the reason why the v0.5 release will be called "Arraymaddon"?

@StefanKarpinski
Member

One of several.

@tknopp
Contributor

tknopp commented Feb 26, 2016

@stevengj: Since you initiated this issue, it would be very interesting to hear your opinion on the recent discussion of this topic in #13157.

@JeffBezanson JeffBezanson modified the milestones: 0.6.0, 0.5.0 Mar 30, 2016
@vtjnash vtjnash modified the milestones: 1.0, 0.6.0 Dec 22, 2016
@vtjnash
Member

vtjnash commented Dec 22, 2016

Current plan is to revisit this after all immutables can be inlined. See #9150

@KristofferC
Member

I'm pretty sure this was decided against; the thinking now is to keep returning copies, but with a more convenient syntax for views (@view).

@fjarri
Contributor

fjarri commented Jun 26, 2017

@KristofferC, is there a discussion/rationale leading to this decision available somewhere online? From several issues here in this repo I got an impression that people were generally in favor of returning views by default as soon as some general performance issues are fixed. I am really curious as to why the opposite decision was finally made.

@timholy
Member

timholy commented Jun 26, 2017

Probably some useful discussion in #7941 and #13157. We also have @views now and dot-fusion so there really isn't much need for it.

@fjarri
Contributor

fjarri commented Jun 26, 2017

@timholy, thanks. I knew about @views (and dot-fusion seems orthogonal to this), it's just that I come from numpy, and almost never needed copied (or even mutable) slices in my numeric code, so the decision seemed a bit counter-intuitive. In fact, I only discovered that this is the case recently, when I got some unexpected performance drop from using slices; before that I was under the impression that slicing works just as it does in numpy.

@mbauman
Member

mbauman commented Jun 26, 2017

There are two major issues:

  • In my experience, the performance story is really quite mixed. It tends to only be a performance win when you're slicing inside Matlab/Numpy-style vectorized inner loops. In some other cases it can be a huge performance loss. Modern CPU architectures really like predictable and cache-efficient memory locality, and copying is a great way to ensure that happens. I used to think it'd be a bigger win across the board, but that's simply not the case. It will be interesting to see how stack-allocated views might change this, but I don't think it'll help the outer-loop cases. Note that there are other compiler optimizations that could help the copying-slices-within-the-inner-loop case (in some of my applications, I've gotten the best performance by copying slices into a pre-allocated buffer with a simple custom getindex! function; a sketch follows this comment).
  • Changing the default semantics would be an upgrader's nightmare. You might not exploit the independent mutability of slices, but many lines of code out there do. Not only would folks need to vet their code for semantic compatibility... they'd also need to profile and try copy(X[Y]) to see if they want to opt back out to retain their original performance. There'd need to be a really compelling story for it to be worth the churn. I used to think it'd be more compelling.

As it stands, every single indexing expression in Julia can be performed as a view. You can opt-in to using views over large sections of code with @views begin … end. Even entire modules can be written this way:

module M
@views begin
    # … Your code
end
end
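A hypothetical illustration of the pre-allocated-buffer pattern mentioned in the first bullet above; getindex! is not a Base function, just an illustrative name for a copy-into-buffer helper:

```julia
function getindex!(buf::AbstractVector, A::AbstractMatrix, ::Colon, j::Integer)
    @inbounds for i in axes(A, 1)
        buf[i] = A[i, j]      # fill the reusable buffer with column j of A
    end
    return buf
end

A = rand(1000, 1000)
buf = Vector{Float64}(undef, 1000)
for j in axes(A, 2)
    col = getindex!(buf, A, :, j)   # reuses buf instead of allocating A[:, j] each time
    # ... work on col ...
end
```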

@timholy
Member

timholy commented Jun 26, 2017

and dot-fusion seems orthogonal to this

One of the best reasons for returning views was to minimize the amount of copying in complex expressions; fusion is really a better solution to the same problem.

@mbauman
Member

mbauman commented Jun 26, 2017

One thing that could help there is hammering out a definition for A.[X] so indexing syntaxes can also participate in fusion.

@StefanKarpinski
Member

The options seem to be

A.[X] := getindex.(A, X)
A.[X] := (x->A[x]).(X)

I've found myself wanting both on occasion. IIRC, there's a discussion about this somewhere.
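A small sketch of how the two candidate meanings differ, expressed in current syntax with hypothetical data:

```julia
# Meaning 1: broadcast getindex over A and X elementwise -> index each container
A = [[1, 2], [3, 4], [5, 6]]   # a vector of vectors
X = [2, 1, 2]
getindex.(A, X)                # == [A[1][2], A[2][1], A[3][2]] == [2, 3, 6]

# Meaning 2: index the single array B at each element of the index array
B = [10, 20, 30]
(x -> B[x]).([2, 1, 2])        # == [20, 10, 20]; equivalently getindex.(Ref(B), [2, 1, 2])
```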

@mbauman
Member

mbauman commented Jun 26, 2017

Yes, as have I. I've wanted them both exclusively of each other (within a few lines) when working with nested JSON-like data. #19169.

@fjarri
Contributor

fjarri commented Jun 27, 2017

It tends to only be a performance win when you're slicing inside Matlab/Numpy style vectorized inner loops. In some other cases it can be a huge performance loss.

To clarify my usage pattern, I'm talking about something like this (a stencil-type computation, essentially; see the sketch after this comment). Omitting @views leads to roughly five times worse performance, which is what I would expect for large arrays, since creating a view is a constant-time operation in the size of the array, as opposed to linear-time copying. Of course, the views in my example contain large contiguous parts; perhaps the situation is different for sparse ones.

One of the best reasons for returning views was to minimize the amount of copying in complex expressions; fusion is really a better solution to the same problem.

I guess you can look at it that way. The problem is that loop fusion and views prevent the creation of temporary arrays arising from two different kinds of expression: arithmetic and slicing, respectively. If fusion also worked for the latter (e.g. as @mbauman suggested), it would indeed be a better solution to this problem.
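A hypothetical stencil-style kernel of the kind described above: without @views, each slice on the right-hand side would allocate a copy of the sliced elements on every call.

```julia
function laplacian!(out, u)
    # @views turns the RHS slices into SubArrays, so the fused broadcast
    # reads straight from u without copying the slices
    @views out[2:end-1] .= u[1:end-2] .- 2 .* u[2:end-1] .+ u[3:end]
    return out
end

u = rand(1_000_000)
out = similar(u)
laplacian!(out, u)
```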

@damiendr
Contributor

damiendr commented Aug 3, 2017

I find myself wanting to use this pattern all the time: a .= b[:,i]
But it seems that the right way to do this is: a .= view(b,:,i), which reads a bit less clearly (besides not being able to use end).

a = zeros(1000)
b = zeros(1000,10)

function f1(a,b,i)
    a[:] = b[:,i]        # b[:,i] allocates a fresh copy, which is then written into a
    nothing
end
function f2(a,b,i)
    a .= b[:,i]          # the slice still allocates a copy before the broadcast assignment
    nothing
end
function f3(a,b,i)
    a .= view(b,:,i)     # no copy: the broadcast reads directly from b's memory
    nothing
end

@time f1(a,b,1)
  0.000011 seconds (6 allocations: 8.109 KiB)
@time f2(a,b,1)
  0.000006 seconds (6 allocations: 8.109 KiB)
@time f3(a,b,1)
  0.000002 seconds (4 allocations: 160 bytes)

Preserving the semantics of copying on slicing, could the compiler optimise away the copy in a .= b[:,i]? That would seem logical for a dot-operation, barring further use of the RHS through the return value.

I find the idea of module-wide @views begin ... end dangerous as it makes the semantics modal.

@StefanKarpinski
Member

It is entirely possible that the compiler could at some point recognize that it doesn't actually need to make a copy in these cases. We've also considered the syntax b@[:,i] for view slices, which would considerably reduce the difference in awkwardness. Some people do hate that syntax, though.

@KristofferC
Member

KristofferC commented Aug 3, 2017

@views function f2(a,b,i)
    a .= b[:,i]
    nothing
end

isn't so bad and that way you don't have to put it on the entire module. Optimization would, of course, be better.

@damiendr
Contributor

damiendr commented Aug 4, 2017

@views on a function is more acceptable already, though I agree that optimisation would be better.

b@[:,i] sounds fine to me for times when the intent is to create a view that will be used as such later in the code, though I don't mind view() in that case either. But doesn't the dot syntax already imply that one wants to use views?

From the doc:

If the left-hand side is an array-indexing expression, e.g. X[2:end] .= sin.(Y), then it translates to broadcast! on a view, e.g. broadcast!(sin, view(X, 2:endof(X)), Y), so that the left-hand side is updated in-place.

How about a similar rule for the RHS, e.g. indexing produces a view whenever the array is an operand of a dot call?
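For reference, a small sketch of the documented LHS lowering quoted above, using the current name lastindex in place of endof:

```julia
X = rand(10); Y = rand(9)

X[2:end] .= sin.(Y)
# is essentially
broadcast!(sin, view(X, 2:lastindex(X)), Y)   # updates X[2:end] in place, no copy of the slice
```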

@ngharrison

Out of curiosity, is there any chance the default behavior will switch to views instead of copies in a future version of the language?

@ViralBShah
Member

If it were to happen, it couldn't happen before 2.0.

@ngharrison

Cool cool. Thanks to everyone for their great work on the language! It's very impressive :)
