slow array allocation #4707
Could you post the matrix generator?
You'll need three functions for that example: `A = getDivGrad(8,8,8)`:

```julia
#----------------- Get the A matrix
function getDivGrad(n1,n2,n3)
    # the Divergence
    D1 = kron(speye(n3),kron(speye(n2),ddx(n1)))
    D2 = kron(speye(n3),kron(ddx(n2),speye(n1)))
    D3 = kron(ddx(n3),kron(speye(n2),speye(n1)))
    # DIV from faces to cell-centers
    Div = [D1 D2 D3]
    return Div*Div'
end

#----------------- 1D finite difference on staggered grid
function ddx(n)
    # generate 1D derivatives
    return spdiags(ones(n)*[-1 1],[0,1],n,n+1)
end

#------------- Build a diagonal matrix
function spdiags(B,d,m,n)
    # spdiags(B,d,m,n)
    # creates a sparse matrix from its diagonals
    d = d[:]
    p = length(d)
    len = zeros(p+1,1)
    for k = 1:p
        len[k+1] = int(len[k]+length(max(1,1-d[k]):min(m,n-d[k])))
    end
    a = zeros(int(len[p+1]),3)
    for k = 1:p
        # append the new d[k]-th diagonal in compact form
        i = max(1,1-d[k]):min(m,n-d[k])
        a[(int(len[k])+1):int(len[k+1]),:] = [i i+d[k] B[i+(m>=n)*d[k],k]]
    end
    return sparse(int(a[:,1]),int(a[:,2]),a[:,3],m,n)
end
```
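For reference, here is a rough Python/scipy translation of the two generator functions above, using scipy's built-in `diags` in place of the hand-rolled `spdiags`. The function names are mine and this is only a sketch for checking the operator's structure, not the code that was benchmarked:

```python
import numpy as np
import scipy.sparse as sp

def ddx(n):
    # 1D difference on a staggered grid: -1 on the main diagonal, +1 on the
    # first superdiagonal, mapping n+1 face values to n cell centers
    return sp.diags([-np.ones(n), np.ones(n)], [0, 1], shape=(n, n + 1))

def get_div_grad(n1, n2, n3):
    # divergence in each direction via Kronecker products with identities
    D1 = sp.kron(sp.eye(n3), sp.kron(sp.eye(n2), ddx(n1)))
    D2 = sp.kron(sp.eye(n3), sp.kron(ddx(n2), sp.eye(n1)))
    D3 = sp.kron(ddx(n3), sp.kron(sp.eye(n2), sp.eye(n1)))
    Div = sp.hstack([D1, D2, D3]).tocsr()  # DIV from faces to cell-centers
    return (Div @ Div.T).tocsr()           # 7-point Laplacian-type stencil

A = get_div_grad(8, 8, 8)
print(A.shape)  # (512, 512)
```

For an 8x8x8 grid the result is a 512x512 matrix with exactly seven nonzero diagonals (offsets 0, +/-1, +/-8, +/-64), matching the structure described in the original report.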
It would also be helpful to know how many cores you have, and to test with Matlab started single-threaded.
Comparison on a single core with Matlab, grid size 64^3:

```matlab
tic; for i=1:100, z = A*v; end; toc
```

```julia
tic(); for i=1:100, z=A*v; end; toc()
```
Hmm, I can't replicate a difference anywhere close to this. For me, Matlab runs in 0.9 seconds, slower than for you. Julia runs in 1.45 seconds, faster than for you. It would be great to close this gap, of course, but it's not the tenfold difference you see. Are you using a recently-compiled Julia? What platform? What version of Matlab? I'm testing on R2012b.
#4720 gets Julia's performance down to about 1.1 seconds, still a little slower than Matlab but now they're pretty close. Somewhat surprisingly to me, disabling bounds-checking was the most important change, but every tweak made some difference.
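For context on why bounds-checking shows up so prominently: a sparse mat-vec is a tight loop of indirect loads, so a check on every element access is a real per-iteration cost. Below is a minimal sketch in Python of the CSC column-sweep kernel (the same storage layout Julia's `SparseMatrixCSC` uses); each `indices[k]`/`data[k]` access in the inner loop is where a per-element bounds check would sit. This is an illustration of the kernel shape, not the Julia code from #4720:

```python
import numpy as np
import scipy.sparse as sp

def csc_matvec(A, v):
    # Computes y = A*v by sweeping columns: y += A[:, j] * v[j].
    # Every indices[k]/data[k] access below is a candidate for a
    # per-element bounds check in a safety-checked language.
    A = A.tocsc()
    y = np.zeros(A.shape[0])
    for j in range(A.shape[1]):
        vj = v[j]
        for k in range(A.indptr[j], A.indptr[j + 1]):
            y[A.indices[k]] += A.data[k] * vj
    return y
```

A quick check against scipy's own product confirms the kernel is correct; in compiled code, hoisting or eliding those checks is exactly the kind of tweak #4720 made.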
@ehaber99 which version of julia are you using? |
Also which OS? |
I'm running everything through Julia Studio 0.4.1 and downloaded the latest version. I replicated this on another Mac I have. I'll try tomorrow to see whether my students can replicate it on their systems.
@timholy - can you post the code that made Julia and Matlab around the same speed?
The test code is just your code; I translated the first two functions directly into Matlab, too. (Did you translate it differently somehow? You're not doing any row/column rearrangements, are you?) The code that provides a ~35% speed improvement for Julia is in #4720. If you were building Julia yourself, you could just git pull and rebuild. (But you're not, if you're using Studio.) However, that speed improvement is far too modest to account for the difference you're observing. I have no idea which Julia version Julia Studio 0.4.1 corresponds to.
Thanks, I have
By the way, my Matlab code is almost identical. No tricks ...
The only things I can recommend are asking Forio to update to the latest, building Julia yourself, or profiling to figure out the issue. I've definitely seen at least one example of code that should have been fast, but wasn't at some point in the last month, but was fast again when I tried the latest Julia. The details escape me, unfortunately.
Could this be a good thing to capture in our codespeed tests? |
I believe so. |
We need some code generation improvements here too. |
Cc @staticfloat We should certainly track this in code speed. Matvec performance is important and will be improved significantly as part of the iterative solvers work that is just starting out. |
I can't get such a large difference either. I am running Julia pre-#4720 and get
Perhaps @WestleyArgentum can provide a new Julia Studio build to try this out?
Sorry about the delay, we've been meaning to update since last week. There's a new build (0.4.2) out now though, and it includes rc2 |
No worries, you're updating quite frequently given that it's a little more involved than
@ViralBShah Would this be tracked by the axpy test in our BLAS suite? |
@staticfloat That is just
Oh, of course, this is for sparse vectors. Got it. |
This is for sparse matrix times dense vector. |
To me it seems that most of the confusion in this thread arose from the slightly different for-loop syntax of Julia and Matlab. In Julia, a comma after the loop range begins a second, nested iteration specification rather than the loop body, so the body must be separated from the range with a semicolon or a newline. Thus, in the above example, the loop does far more work in Julia than in Matlab, which explains the huge difference in timings. This may also be the reason why most people here could not reproduce this. However, on my MacBook Pro, Julia (Version 0.2.0-rc3) is still slower by a factor of 2 than Matlab R2013a.

```julia
#----------------- Julia (wrong) ----------------------
tic(); for i=1:100, z=A*v; end; toc()
#----------------- Julia (correct) ----------------------
tic(); for i=1:100; z=A*v; end; toc()
```

```matlab
%-------------- Matlab R2013a ---------------------
tic; for i=1:100, z = A*v; end; toc
```
Perhaps it's best to use multiline for loops here.
Sure, in this case that would have helped. I actually like Julia's ability to concisely define nested for loops by separating the ranges with commas. I think this pitfall for Matlab users may be a candidate for the Noteworthy Differences from other Languages section.
@lruthotto Good catch. The difference between MATLAB and Julia is not that big on my machine, which is:
It takes 0.31 seconds in Julia and 0.23 seconds in MATLAB. The
@lruthotto, nice catch indeed. @ehaber99, I'm sorry I didn't notice that as the reason earlier. @lruthotto, were you running Matlab single-threaded? Obviously it's a relevant comparison to check the multithreaded Matlab, but it's also worth knowing as it may explain the timing difference.
Yes, I ran the code in a single-threaded Matlab. @andreasnoackjensen: I also see a huge difference in allocation speed:

---- MATLAB ----
---- Julia ----

Maybe that is the real bottleneck on my machine? Here's my versioninfo():

Julia Version 0.2.0-rc3+8
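On the allocation question: a quick way to see whether allocation itself dominates is to compare allocating a fresh array every iteration against reusing a single preallocated buffer. A hedged Python sketch (function names are mine; absolute timings vary by machine, and on many systems `np.zeros` is served cheaply by `calloc`, so the gap there can be smaller than in a garbage-collected 2013-era Julia):

```python
import time
import numpy as np

def alloc_each(n, reps):
    # a fresh zeroed array every iteration, as zeros(n) inside a loop would do
    for _ in range(reps):
        z = np.zeros(n)
    return z

def alloc_once(n, reps):
    # one buffer allocated up front; the loop only pays the fill cost
    z = np.empty(n)
    for _ in range(reps):
        z.fill(0.0)
    return z

for f in (alloc_each, alloc_once):
    t0 = time.perf_counter()
    f(64**3, 100)
    print(f.__name__, time.perf_counter() - t0)
```

Both variants produce the same result; only the allocation traffic differs, which is what a GC-bound benchmark would be sensitive to.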
@JeffBezanson Do we need codegen improvements here, or is it memory allocation / GC improvements to get on par? Is it possible that Matlab is using SIMD instructions? |
* upstream/master (53 commits), including: Add sparse matvec to perf benchmark (JuliaLang#4707)
The header should either be changed to something like "slow array allocation", or the issue should be closed, because the original problem was caused by a typo.
I think we can close this. All that's left here is generic performance improvements. |
I am generating the 3D Laplacian in Matlab and Julia. For a 3D mesh of 64^3 I have a sparse matrix A which has 7 diagonals and is of size 262144×262144. I generate a random vector v of the right size, then run the following:

```julia
#----------------- Julia ----------------------
tic(); for i=1:100, z=A*v; end; toc()
# elapsed time: 4.248922994 seconds
```

```matlab
%-------------- matlab ---------------------
tic; for i=1:100, z = A*v; end; toc
% Elapsed time is 0.282192 seconds.
```

I am not sure why Julia is so much slower for mat-vec products.
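One benchmarking detail worth keeping in mind when reproducing timings like these: keep the result of the product so the work cannot be optimized away, and time several repetitions in one measurement. A minimal harness sketch in Python (the function name and the small stand-in matrix are mine, not from the original report):

```python
import time
import numpy as np
import scipy.sparse as sp

def bench_matvec(A, v, reps=100):
    # time `reps` mat-vec products; return elapsed seconds and the last result
    t0 = time.perf_counter()
    for _ in range(reps):
        z = A @ v
    return time.perf_counter() - t0, z

# small stand-in for the 262144x262144 seven-diagonal operator in the report
A = sp.random(1000, 1000, density=0.005, format='csr', random_state=1)
v = np.random.rand(1000)
elapsed, z = bench_matvec(A, v)
```

Returning `z` alongside the elapsed time mirrors the `z = A*v` assignment in the original loops and gives a value to sanity-check against a single product.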