Performance regression when indexing a 4+ dimensional array #11819

Closed
bfredl opened this issue Jun 23, 2015 · 8 comments · Fixed by #11833
Labels: performance (Must go faster), regression (Regression in behavior compared to a previous version)

Comments

@bfredl
Contributor

bfredl commented Jun 23, 2015

Recently, after updating Julia some time within the last week or so, I have been seeing a serious performance regression in some code that uses arrays with 4 or more dimensions (my original application code used 6-dimensional arrays):

function v4test()
    l = 10
    v = zeros(Float64, (l, l, l, l))
    for i4 = 1:l
        for i3 = 1:l
            for i2 = 1:l
                for i1 = 1:l
                    v[i1,i2,i3,i4] = i1*i2
                end
            end
        end
    end
    v
end
@time v4test();
@code_llvm v4test()

The LLVM code shows that setindex! isn't statically inlined; instead a dynamic call is made in the innermost loop. There is no problem with 3-dimensional arrays. I believe this is a different problem than #11787, as the regression is much bigger (50x compared to Julia 0.3) and only affects 4+ dimensional arrays.
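For comparison, here is a minimal 3-dimensional sketch of the same loop (the v3test name is just for illustration); in this case the indexing is inlined as expected:

function v3test()
    l = 10
    v = zeros(Float64, (l, l, l))
    for i3 = 1:l
        for i2 = 1:l
            for i1 = 1:l
                v[i1,i2,i3] = i1*i2
            end
        end
    end
    v
end
@time v3test();
@code_llvm v3test()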

I don't remember the last version that worked without the slowdown, but it was definitely after the Tuplecalypse.

@tkelman added the performance (Must go faster) label on Jun 23, 2015
@bfredl
Contributor Author

bfredl commented Jun 23, 2015

Incidentally, it seems to work well with Base.checkbounds(v, i1,i2,i3,i4); v[sub2ind(size(v),i1,i2,i3,i4)] = i1*i2. I'll use that as a workaround for now.
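Spelled out in the loop from above, the workaround looks roughly like this (v4test_linear is just a sketch of the idea, not my actual code):

function v4test_linear()
    l = 10
    v = zeros(Float64, (l, l, l, l))
    for i4 = 1:l, i3 = 1:l, i2 = 1:l, i1 = 1:l
        Base.checkbounds(v, i1, i2, i3, i4)            # manual bounds check
        v[sub2ind(size(v), i1, i2, i3, i4)] = i1*i2    # linear index avoids the dynamic setindex! call
    end
    v
end
@time v4test_linear();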

@simonster
Member

Most likely related to #9622 (see #9622 (comment)). Not sure exactly what commit of Julia you're on, but it sounds like updating may help, since 6a3c173 improved the situation a bit.

@bfredl
Contributor Author

bfredl commented Jun 23, 2015

I tried using the latest master, so I do have that commit. Does MAX_TUPLETYPE_LEN mean that static dispatch doesn't work with more than MAX_TUPLETYPE_LEN arguments? That would explain why it works with sub2ind directly, at least up to 7 dimensions.

@bfredl closed this as completed on Jun 23, 2015
@simonster reopened this on Jun 23, 2015
@simonster
Member

I can reproduce this on master too. Not sure if there's been a regression since 6a3c173 or if not all the cases are fixed, but I think it's a good idea to leave this issue open until it's fixed. This would be a bad performance regression to leave in the final release.

@simonster added the regression (Regression in behavior compared to a previous version) label on Jun 23, 2015
@mbauman
Member

mbauman commented Jun 23, 2015

Instead of relying upon Julia to do the sub2ind for the builtin Array types, let's go back to defining all of the getindex(A::Array, i1::Int, i2::Int, …) manually. It was an interesting experiment, but I think the additional complexity on the Julia side outweighs the necessity of these things being as fast as possible.
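For illustration, the 4-argument case might look roughly like the following sketch (written in terms of sub2ind and linear indexing just to show the shape; the real definitions would live in Base and may differ):

# Sketch only, not the actual Base code.
import Base: getindex, setindex!
getindex(A::Array, i1::Int, i2::Int, i3::Int, i4::Int) =
    A[sub2ind(size(A), i1, i2, i3, i4)]
setindex!(A::Array, x, i1::Int, i2::Int, i3::Int, i4::Int) =
    (A[sub2ind(size(A), i1, i2, i3, i4)] = x; A)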

@bfredl
Contributor Author

bfredl commented Jun 23, 2015

Perhaps MAX_TUPLETYPE_LEN could be increased to, say, 12 or 16? Whether or not it is changed, shouldn't there be a warning when the limit is exceeded? Even a somewhat confusing warning (due to vararg expansion deep down in a call stack and so on) is much better than a silent 50x overhead when adding an extra parameter to a function call.

@mbauman
Member

mbauman commented Jun 23, 2015

Aha, I figured it out. This one stems from removing map from to_index (6a3c173). While that was the right direction, tuples don't get the same sort of special treatment for recursive inlining that arguments do. The right thing to do here is to remove to_index(::Tuple) and instead always splat the arguments, with definitions like:

to_index() = ()
to_index(i1, I...) = (to_index(i1), to_index(I...)...)

(and also adjusting all the call sites to splat their tuples)… but that hits a stack overflow within femtolisp when computing the primes during bootstrap.
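For instance, a call site that currently passes a whole tuple to to_index would change roughly like this (hypothetical call site, shown only to illustrate the splatting):

# before: the index tuple is converted in one call
J = to_index(I)
# after: splat, so each element goes through the recursive definitions above
J = to_index(I...)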

@mbauman
Member

mbauman commented Jun 23, 2015

Ah, of course, those definitions won't work. Amazing how describing the problem helps you see it.
