Summing along a dimension of a PermutedDimsArray could be faster #38774
Comments
Most likely it's a cache-order effect. See https://julialang.org/blog/2013/09/fast-numeric/#write_cache-friendly_codes
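To make the cache-order point concrete, here is a minimal sketch (the function names `sum_colmajor` and `sum_rowmajor` are made up for illustration). Julia stores a `Matrix` column-major, so the loop whose inner index is the first dimension walks memory contiguously. For a `PermutedDimsArray(A, (2, 1))` the roles swap, because the wrapper indexes into the parent's memory.

```julia
# Inner loop runs over the first index: contiguous for a plain Matrix.
function sum_colmajor(A::AbstractMatrix)
    s = zero(eltype(A))
    for j in axes(A, 2), i in axes(A, 1)   # the last range is the innermost loop
        s += A[i, j]
    end
    return s
end

# Inner loop runs over the second index: strided for a plain Matrix,
# but contiguous for PermutedDimsArray(A, (2, 1)).
function sum_rowmajor(A::AbstractMatrix)
    s = zero(eltype(A))
    for i in axes(A, 1), j in axes(A, 2)
        s += A[i, j]
    end
    return s
end

A = reshape(collect(1:12), 3, 4)
P = PermutedDimsArray(A, (2, 1))           # P[i, j] == A[j, i], no copy

# Both loop orders give the same total; only the memory access pattern differs.
@show sum_colmajor(A) == sum_rowmajor(P)
```

Which of the two orders is faster for a given array should be measured rather than assumed, e.g. with `@btime` from BenchmarkTools as in the timings in this thread.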
I agree that Without figuring out high-tech things like #34847, is it possible that some
julia> view(x, 1:10, 1) |> typeof
SubArray{Float64, 1, Matrix{Float64}, Tuple{UnitRange{Int64}, Int64}, true}
julia> view(view(x, :, 1), 1:10) |> typeof
SubArray{Float64, 1, Matrix{Float64}, Tuple{UnitRange{Int64}, Int64}, true}
Also, I think
julia> summatcols(x) == vec(sum(x, dims=1))
true
julia> @btime sum($x, dims=1);
293.548 μs (1 allocation: 7.94 KiB)
julia> @btime sum($x, dims=2);
320.756 μs (5 allocations: 8.02 KiB)
julia> @btime sum($y, dims=1);
1.430 ms (1 allocation: 7.94 KiB)
julia> @btime sum($y, dims=2);
1.558 ms (5 allocations: 8.02 KiB)
Looking at a section of the Lines 279 to 284 in 527d6b6
A if it is a transpose, and flipping the order of the loops improves performance.
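One way to picture "flipping the loops" without touching the reduction internals is to reduce over the parent array along the matching parent dimension and re-wrap the result. This is only a sketch: `sum_via_parent` is a hypothetical helper, not something in Base and not the change discussed for the lines referenced above. It relies on dimension `d` of `PermutedDimsArray(A, perm)` being dimension `perm[d]` of `A`.

```julia
# Hypothetical helper (not part of Base): sum a PermutedDimsArray along `dims`
# by reducing its parent along the corresponding parent dimension.
function sum_via_parent(P::PermutedDimsArray{T,N,perm}; dims::Integer) where {T,N,perm}
    # Dimension `dims` of P is dimension perm[dims] of the parent.
    S = sum(parent(P); dims=perm[dims])
    # Re-wrap so the result has the same shape as sum(P; dims=dims).
    return PermutedDimsArray(S, perm)
end

x = reshape(collect(1:24), 4, 6)
y = PermutedDimsArray(x, (2, 1))           # 6×4 wrapper around x

# Integer data keeps the comparison exact regardless of summation order.
@show sum_via_parent(y; dims=1) == sum(y; dims=1)
@show sum_via_parent(y; dims=2) == sum(y; dims=2)
```

The reduction then runs in the parent's native memory order. Whether this is actually faster than `sum(y; dims=...)` on a given Julia version would need benchmarking.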
I've been wanting to store all my matrices with particular axes, but some matrices need fast iteration over columns and other matrices need fast iteration over rows. The solution to this is to use PermutedDimsArrays.
However, it looks like you don't get the full performance benefit of this strategy using views. Below is an MWE.
The last timing should be around 350 μs, but it is instead more than 3 times that.
Note that I think the @view may be the problem. Consider a scenario that only depends on the order of the loops. There, PermutedDimsArray works as expected.
xref #34847. I commented on that issue but it might not be that so I'm filing a new issue here.
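For anyone who wants to try the comparison described above, a rough sketch follows. It is not the code from this issue: `sumcols_views` is a hypothetical stand-in for a view-based column sum, and the 1000×1000 size is an assumption.

```julia
# Hypothetical stand-in for a view-based column sum; not the issue's original code.
function sumcols_views(A::AbstractMatrix)
    out = zeros(eltype(A), size(A, 2))
    for j in axes(A, 2)
        out[j] = sum(@view A[:, j])        # reduce one column through a view
    end
    return out
end

x = rand(1000, 1000)                           # columns are contiguous in memory
y = PermutedDimsArray(permutedims(x), (2, 1))  # same values as x, rows contiguous

# Correctness check (also serves as a warm-up so compilation is not timed).
@show maximum(abs.(sumcols_views(x) .- sumcols_views(y))) < 1e-8

# @elapsed is in Base; @btime from BenchmarkTools gives steadier numbers.
t_x = @elapsed sumcols_views(x)
t_y = @elapsed sumcols_views(y)
println("Matrix: ", t_x, " s, PermutedDimsArray: ", t_y, " s")
```

Here `y` holds the same values as `x` but with the opposite memory layout, so any timing difference between the two calls comes from how the views traverse memory.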
Thank you!