performance of captured variables in closures #15276

Open
timholy opened this issue Feb 28, 2016 · 93 comments
Assignees
Labels
compiler:lowering Syntax lowering (compiler front end, 2nd stage) performance Must go faster

Comments

@timholy
Member

timholy commented Feb 28, 2016

using Images: realtype

function ifi{T<:Real,K,N}(img::AbstractArray{T,N}, kern::AbstractArray{K,N}, border::AbstractString, value)
    if border == "circular" && size(img) == size(kern)
        out = real(ifftshift(ifft(fft(img).*fft(kern))))
    elseif border != "inner"
        prepad  = [div(size(kern,i)-1, 2) for i = 1:N]
        postpad = [div(size(kern,i),   2) for i = 1:N]
        fullpad = [nextprod([2,3], size(img,i) + prepad[i] + postpad[i]) - size(img, i) - prepad[i] for i = 1:N]
        A = padarray(img, prepad, fullpad, border, convert(T, value))
        krn = zeros(typeof(one(T)*one(K)), size(A))
        indexesK = ntuple(d->[size(krn,d)-prepad[d]+1:size(krn,d);1:size(kern,d)-prepad[d]], N)::NTuple{N,Vector{Int}}
        AF = ifft(fft(A).*fft(krn))
        out = Array(realtype(eltype(AF)), size(img))
    end
    out
end

Test:

julia> @code_warntype ifi(rand(3,3), rand(3,3), "replicate", 0)
Variables:
  #self#::#ifi
  img::Array{Float64,2}
  kern::Array{Float64,2}
  border::ASCIIString
  value::Int64
  prepad::Box
...

Now comment out the indexesK = ... line (the output of which is not used at all). Suddenly prepad is inferred as Array{Int, 1}.

@JeffBezanson
Member

This is due to that line capturing prepad in a closure.
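
Editor's sketch (not part of the original comment): the pattern can be reduced to a few lines, together with the `let` workaround described in the "Performance of captured variable" section of the Julia manual. This assumes a Julia 1.x release; the function names `scaler_boxed` and `scaler_let` are hypothetical and do not come from this thread.

```julia
# Hypothetical names; a minimal sketch of a captured variable that is
# assigned more than once, and of the `let` rebinding workaround.
function scaler_boxed(r::Int)
    if r < 0
        r = -r          # second assignment to `r` ...
    end
    return x -> x * r   # ... so the closure captures `r` through a Box
end

function scaler_let(r::Int)
    if r < 0
        r = -r
    end
    # `let` introduces a fresh binding that is assigned exactly once,
    # so the closure can store it with a concrete type.
    return let r = r
        x -> x * r
    end
end
```

Both versions return the same values; the difference should only show up in inference, for example when inspecting the functions with `@code_warntype`.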

@vtjnash vtjnash added performance Must go faster regression Regression in behavior compared to a previous version labels Mar 8, 2016
@andreasnoack andreasnoack added the multithreading Base.Threads and related functionality label Mar 17, 2016
@andreasnoack
Member

Just want to add that this has quite severe consequences for @threads for. Is the plan to change/improve the behavior of Box, or is it advised not to capture arrays in closures? The latter option would make things difficult for @threads for.

@KristofferC
Member

KristofferC commented Mar 19, 2016

This also has severe consequences for NLsolve (JuliaNLSolvers/NLsolve.jl#49). Previously things were inferred fine:

image

but now everything is just Box and Any, and performance is pretty much garbage.

image

Not sure how to work around this.

This feels like a quite significant regression. Now suddenly any variable captured by a closure poisons the variable for the whole outer function. Even if it is only read from.

Edit: The example below has been fixed.

Ex:

function newton_{T}(initial_x::Vector{T})

    nn = length(initial_x)
    xold = fill(convert(T, NaN), nn)
    fvec = Array(T, nn)
    f_calls = 1

    function fo(xlin::Vector)
        if xlin != xold
            print(f_calls)
        end
        return(dot(fvec, fvec) / 2)
    end

    return
end

@code_warntype newton_(rand(5))

Here, xold, f_calls and fvec are Box for the whole function. This used to work fine on 0.4.
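
Editor's sketch (not part of the original comment): one way to check whether a particular closure ended up with boxed captures, without reading the whole `@code_warntype` output, is to look at the field types of the closure object. This assumes Julia 1.1 or later (for `fieldtypes`), and it relies on the current implementation detail that captured variables are stored as fields of the closure and that boxes appear as `Core.Box`. The helper `has_boxed_capture` and the two example functions are hypothetical.

```julia
# Hypothetical helper: true if any captured variable of closure `f`
# is stored in a Core.Box (relies on how closures are currently lowered).
has_boxed_capture(f) = any(T -> T === Core.Box, fieldtypes(typeof(f)))

function make_reader()
    xs = [1, 2, 3]
    return () -> sum(xs)        # `xs` is assigned once and only read
end

function make_counter()
    n = 0
    return () -> (n += 1; n)    # the closure assigns to `n`
end

has_boxed_capture(make_reader())   # read-only capture
has_boxed_capture(make_counter())  # capture that the closure reassigns
```

Under those assumptions, the read-only capture is stored with its concrete type, while the variable that the closure reassigns is boxed.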

nalimilan added a commit to nalimilan/FreqTables.jl that referenced this issue Apr 22, 2016
Until the number of elements passed via varargs is inferred,
we need to use a wrapper function. Also remove access to lev[i]
from anonymous function, which is currently buggy on 0.5
(JuliaLang/julia#15276).
@davidanthoff
Contributor

Should this get a 0.5 label? In my projects, which worked well on 0.4, this seems to cause a significant increase in runtime (~30%).

@JeffBezanson
Member

Everything tagged "regression" will get attention before 0.5-final.

@JeffBezanson JeffBezanson self-assigned this Apr 26, 2016
@davidanthoff
Contributor

Thanks!

@JeffBezanson
Member

OK, I have a quick fix for many of the examples posted here. Not including @timholy's original example in this issue, of course :/

@davidanthoff
Contributor

I'll try my cases once this has landed in a windows nightly binary.

@JeffBezanson
Member

#16048 should be good now, #16050 not yet but also should be easy.

JeffBezanson added a commit that referenced this issue Apr 26, 2016
Before they were pessimized a bit in a way that's no longer necessary
after the function redesign.

helps #15276

@uniment

uniment commented Jan 19, 2023

what Jeff described only works if you are calling f inside the same function it was created in

Under what circumstance is this true? I don't see it:

julia> @btime (()->begin
           x::Int = 1
           f() = (x = x+1; x)
           for i=1:1000; f() end
           x
       end)()
  4.200 μs (490 allocations: 7.66 KiB)
1001

julia> @btime (()->begin
           x = Ref(1)
           f() = (x[] = x[] + 1; x[])
           for i=1:1000; f() end
           x[]
       end)()
  1.800 ns (0 allocations: 0 bytes)
1001

(Julia 1.9.0-beta3)

(inspired by this discourse discussion, which showed that global variables share similar performance characteristics w.r.t. typed global vs. const Ref)

@LilithHafner
Member

@c42f, if you happen to be rewriting lowering it would be nice to fix this issue.

@LilithHafner
Member

Ran into this again here
