Performance of nested functions #4428
Comments
Isn't part of the execution time compiling the nested function? |
I don't think it is the time spent compiling the nested function. I checked the generated code, and it is very different for the g and g1 functions. |
Hopefully @JeffBezanson will chime in then. It's beyond my depth at that point. |
Putting everything in a function gives me slightly different results:
julia> g(k) = 1.0/k
julia> test(1e7,g)
The hard-coded inline function wins by a factor >10. |
That's because you pass the function as an argument:
g(k) = 1.0/k
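function test(n)
    # f1 calls a nested function g1 defined inside its own body
    function f1(n)
        sum = 0.0
        g1(k) = 1.0/k
        for i = 1:n
            sum += g1(i)
        end
        sum
    end
    # f2 calls the global function g
    function f2(n)
        sum = 0.0
        for i = 1:n
            sum += g(i)
        end
        sum
    end
    # f3 has the expression written out by hand
    function f3(n)
        sum = 0.0
        for i = 1:n
            sum += 1.0/i
        end
        sum
    end
    @time f1(n)
    @time f2(n)
    @time f3(n)
end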
test(1e7)
My results:
Also, much more memory is allocated in the first case. |
I ran the script above, doubling n each time, for f(n), f(2n), f(4n), and f(8n). If the overhead were due to compiling the nested function, I would expect a constant baseline in each run of f1, but here the elapsed time doubles each time n is doubled. |
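The argument in the comment above can be sketched as a simple cost model, assuming elapsed time is a one-time compile overhead plus a fixed cost per call. The names `model` and `doubling_ratio` and all the constants are hypothetical and chosen only for illustration; they are not measurements from this thread.

```julia
# Hypothetical cost model: elapsed(n) = compile_overhead + per_call * n.
model(n; compile_overhead, per_call) = compile_overhead + per_call * n

# Ratio of modeled elapsed times when n is doubled.
doubling_ratio(n; kwargs...) = model(2n; kwargs...) / model(n; kwargs...)

# If a one-time compile cost dominated, doubling n would barely change the time.
r_compile = doubling_ratio(10^7; compile_overhead = 2.0, per_call = 1e-9)

# If a per-call cost dominates, the ratio approaches 2.
r_percall = doubling_ratio(10^7; compile_overhead = 0.01, per_call = 2.6e-7)
```

Under these made-up constants, a ratio close to 2 points at a per-call cost rather than a compile-time cost, which is the pattern reported above.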
I ran the script removing the sum variable (just making calls to the inline function g1 in f1) for n = 10^7. The memory allocated was ~ (10^7)*32 bytes. For n = 2*10^7, memory allocated was (2*10^7)*32 bytes. It looks like the compiler is not inlining g1, resulting in 32 extra bytes per call to the function. |
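The arithmetic above, and one way to repeat the measurement, might look like the following sketch. The helper names `expected_bytes` and `calls_only` are hypothetical, and the 32 bytes per call is the figure reported in this thread for the Julia version in use at the time. A current Julia version may well report far fewer bytes, or none, for the same loop.

```julia
# Figure reported in the thread: roughly 32 bytes allocated per call to g1.
bytes_per_call = 32
expected_bytes(n) = n * bytes_per_call

expected_small = expected_bytes(10^7)      # 320_000_000 bytes, about 305 MiB
expected_large = expected_bytes(2 * 10^7)  # 640_000_000 bytes

# Hypothetical helper: a loop that only calls a nested function, with no sum.
function calls_only(n)
    g1(k) = 1.0 / k
    for i in 1:n
        g1(i)
    end
    return nothing
end

calls_only(10)                          # run once so compilation is not measured
measured = @allocated calls_only(10^7)  # bytes allocated by the second run
```

Comparing `measured` against `expected_small` shows whether the per-call allocation described above still occurs on a given Julia version.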
Related issue #1864 |
I ran into this issue doing some benchmarking. In many cases anonymous or nested functions can either be lifted to global scope, i.e. close over no variables, or don't ever escape as values, i.e. could be re-written as global functions with extra arguments passed at the call site. Does Julia have a lambda lifting pass in its compiler? (My benchmarking is here: http://www.palladiumconsulting.com/2014/09/little-performance-explorations-julia/) |
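As a rough illustration of the manual rewrite described above, here is a minimal sketch of lifting a nested function by hand. The function names and the captured variable `scale` are hypothetical and not taken from the benchmarks linked above. It shows the transformation only and makes no claim about what the Julia compiler does internally.

```julia
# Closure version: g1 captures `scale` from the enclosing scope.
function sum_closure(n, scale)
    g1(k) = scale / k
    s = 0.0
    for i in 1:n
        s += g1(i)
    end
    return s
end

# Lifted version: the captured variable becomes an explicit argument
# of a top-level function, so no closure is created inside the caller.
g1_lifted(k, scale) = scale / k

function sum_lifted(n, scale)
    s = 0.0
    for i in 1:n
        s += g1_lifted(i, scale)
    end
    return s
end
```

Both versions perform the same floating-point operations in the same order, so `sum_closure(1000, 1.0)` and `sum_lifted(1000, 1.0)` should agree.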
Very nice series of articles. Thanks for sharing. -Erik Schnetter (replying to Sebastian Good's comment of Tue, Sep 30, 2014 at 11:38 AM) |
For the short term, see https://github.com/timholy/FastAnonymous.jl. Nice post, by the way; glad you were persistent and figured out how to get the performance you wanted. |
closing as a dup of #1864 |
I've found that using nested functions causes significant performance degradation. In the following sample script, the run time increased from 0.4s to 2.6s on my machine.