workaround for more complicated gradient #1

tholdem · 2020-12-31T20:45:13Z

Hello,

Thank you for this workaround; it was very helpful for my understanding of the same bug. However, I'm trying to write an AD Hessian function using only Zygote, to avoid bugs I got with ForwardDiff. Unfortunately, my function is a bit more complicated than matrix multiplication, and I'm not sure whether deriving the gradient by hand is feasible.

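using Zygote, LinearAlgebra   # LinearAlgebra provides the identity `I`

function jacobian(f,x)
    y,back  = Zygote.pullback(f,x)
    k  = length(y)
    n  = length(x)
    J  = Matrix{eltype(y)}(undef,k,n)
    e_mat = Matrix(I,k,k)               # Bool identity; column i is the i-th seed
    @inbounds for i = 1:k
        J[i,:] = back(e_mat[:,i])[1]    # row i of J is the pullback of seed i
    end
    (J,)                                # returned as a 1-tuple
end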

hessian(f, x) = jacobian(x -> gradient(f, x)[1], x)

I got the following error:

Mutating arrays is not supported

    error(::String)@error.jl:33
    (::Zygote.var"#364#365")(::Nothing)@array.jl:58
    (::Zygote.var"#2245#back#366"{Zygote.var"#364#365"})(::Nothing)@adjoint.jl:59
    (::Zygote.var"#150#151"{Zygote.var"#2245#back#366"{Zygote.var"#364#365"},Tuple{Tuple{Nothing,Nothing},Tuple{Nothing}}})(::Nothing)@lib.jl:191
    (::Zygote.var"#1693#back#152"{Zygote.var"#150#151"{Zygote.var"#2245#back#366"{Zygote.var"#364#365"},Tuple{Tuple{Nothing,Nothing},Tuple{Nothing}}}})(::Nothing)@adjoint.jl:59
    #[email protected]:38[inlined]
    (::typeof(∂(λ)))(::Tuple{Array{Bool,2},Nothing})@interface2.jl:0
    #2209#[email protected]:59[inlined]
    (::typeof(∂(λ)))(::Tuple{Nothing,Array{Bool,2},Nothing})@interface2.jl:0
    [email protected]:15[inlined]
    (::typeof(∂(λ)))(::Tuple{Nothing,Array{Bool,2},Nothing})@interface2.jl:0
    F@Other: 1[inlined]
    (::typeof(∂(λ)))(::Tuple{Nothing,Array{Bool,1}})@interface2.jl:0
    #[email protected]:40[inlined]
    (::typeof(∂(λ)))(::Tuple{Array{Bool,1}})@interface2.jl:0
    [email protected]:49[inlined]
    (::typeof(∂(gradient)))(::Tuple{Array{Bool,1}})@interface2.jl:0
    #1@Local: 1[inlined]
    (::typeof(∂(#1)))(::Array{Bool,1})@interface2.jl:0
    (::Zygote.var"#41#42"{typeof(∂(#1))})(::Array{Bool,1})@interface.jl:40
    jacobian(::Function, ::Array{Float64,2})@Other: 8
    top-level scope@Local: 1[inlined]

Because I don't know what the gradient of the back function returned by Zygote.pullback is, I can't think of a way to build the Jacobian without mutating an array. I was wondering if you might have any insights on this. Thank you so much, and happy new year!
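One way to assemble a Jacobian without writing into a preallocated array is to collect the pulled-back rows in a comprehension and stack them. The sketch below is a minimal illustration, not a fix confirmed in this thread: `jacobian_from_pullback` is a hypothetical helper, and `back` is a hand-written stand-in for the closure returned by Zygote.pullback, used so that the block runs without Zygote. Removing the explicit setindex! does not by itself guarantee that Zygote can differentiate through gradient a second time.

```julia
using LinearAlgebra

# Assemble a k×n Jacobian from a pullback closure without writing into a
# preallocated array. `back` maps a length-k seed to a 1-tuple holding the
# vector-Jacobian product, the same shape Zygote.pullback's closure returns.
function jacobian_from_pullback(back, k)
    seeds = [Float64.(1:k .== i) for i in 1:k]      # one-hot seed vectors
    rows  = [vec(back(seed)[1]) for seed in seeds]  # row i of J, length n
    return permutedims(reduce(hcat, rows))          # stack the rows into k×n
end

# Hand-written stand-in (an assumption, so the sketch needs no Zygote):
# the pullback of x -> A*x is Δ -> A'Δ.
A = [0 3 2; 1 0 3; 2 1 0]
back = Δ -> (A' * Δ,)

J = jacobian_from_pullback(back, size(A, 1))
```

For the linear map x -> A*x the assembled J should equal A itself, which makes a convenient sanity check before swapping in a real pullback.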


rakeshvar · 2021-01-01T07:02:37Z

There should be a much simpler way of doing this.
Look at the hessian function in Zygote itself.

tholdem · 2021-01-01T07:05:43Z

Sorry if I wasn't clear. I'm trying to avoid Zygote.hessian because it uses functions from ForwardDiff and is not purely Zygote, which gives me lots of difficult bugs that would not happen with a purely Zygote hessian function.

rakeshvar · 2021-01-01T08:05:29Z

Hmmm... you are right. I did not know that.
The straightforward way to do this would be to define h(x) as in the example below, but that errors out. This seems to be a fundamental limitation of Zygote that cannot be surmounted by the trick you are using above. If it could be, they would not be using ForwardDiff in the first place. Maybe we can raise an issue there or on Discourse and see.

> n = 3
> A = reshape(0:(n^2-1), n, n) .% (n+1)
3×3 Array{Int64,2}:
 0  3  2
 1  0  3
 2  1  0

> H = 2*A'*A
3×3 Array{Int64,2}:
 10   4   6
  4  20  12
  6  12  26

> x1 = collect(0:(n-1))
3-element Array{Int64,1}:
 0
 1
 2

> f(x) = sum(abs2, A*x)
> g(x) = Zygote.gradient(x_ -> f(x_), x)[1]
> h(x) = Zygote.gradient(x_ -> g(x_), x)[1]
> f(x1)
86

> [g(x1) f'(x1)]
3×2 Array{Int64,2}:
 16  16
 44  44
 64  64

> [Zygote.hessian(f, x1) H]
3×6 Array{Int64,2}:
 10   4   6  10   4   6
  4  20  12   4  20  12
  6  12  26   6  12  26

> h(x1)
ERROR: BoundsError
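
For reference, the numbers in the session above agree with the closed form for this particular f. This is a worked derivation added for clarity, using only the A, x1, and H defined above:

$$
f(x) = \lVert A x \rVert_2^2 = x^\top A^\top A\, x,
\qquad
\nabla f(x) = 2 A^\top A\, x,
\qquad
\nabla^2 f(x) = 2 A^\top A = H .
$$

With $x_1 = (0, 1, 2)^\top$:

$$
A x_1 = (7, 6, 1)^\top,
\qquad
f(x_1) = 7^2 + 6^2 + 1^2 = 86,
\qquad
\nabla f(x_1) = 2 A^\top (7, 6, 1)^\top = (16, 44, 64)^\top .
$$

The Hessian is constant in x, so a working h(x1) would be expected to return H for any x1.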

tholdem · 2021-01-01T18:01:40Z

That makes sense. I've already opened an issue at FluxML/Zygote.jl#865, but I will give Discourse a try too. Thank you for your help!

rakeshvar · 2021-01-02T04:30:26Z

I already started a discussion on Discourse here. I did not think through some obvious things before posting, but it led to long discussions there that are very insightful. 🙂

tholdem · 2021-01-02T05:02:01Z

I really appreciate your help! Yes, I actually saw this earlier and was fascinated by the technical details. In short, ForwardDiff is the right tool for Zygote.hessian; I just need to figure out a way to make it compatible with my function. Definitely worth the time to dig into more. Cheers!

rakeshvar closed this as completed Jan 1, 2021
