Jacobian in loss function #953
Something is slightly wrong there. Second derivatives are not really well-supported, but can sometimes be made to work; you will have to experiment a bit, and start small.
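For what it's worth, forward-over-reverse is the combination that usually can be made to work: `ForwardDiff` takes the outer derivative of a `Zygote` gradient, since Zygote's reverse pass can run on ForwardDiff's `Dual` numbers. A minimal sketch (my own example, not from this thread):

```julia
using Zygote, ForwardDiff, LinearAlgebra

# Forward-over-reverse second derivative: ForwardDiff.jacobian of a
# Zygote.gradient gives the Hessian of a scalar function.
f(x) = sum(abs2, x)^2                # (Σ xᵢ²)²

x0 = [1.0, 2.0]
H = ForwardDiff.jacobian(x -> Zygote.gradient(f, x)[1], x0)

# Analytic check: ∇f = 4s·x with s = Σ xᵢ², so the Hessian is 4s·I + 8·x·x'
H_exact = 4 * sum(abs2, x0) * I + 8 * x0 * x0'
H ≈ H_exact
```

Starting from a small scalar function like this, and only then scaling up to the real loss, is usually the easiest way to find out which combination survives.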
Thanks for your helpful response. Working from your example, this seems to work, but modifying the loss function from the MWE above accordingly doesn't.
If I print out the gradients, it looks like they're missing:
It's not clear to me why the first approach works but the second doesn't.
I think what you're seeing is the gradient with respect to the closed-over parameters being dropped by the inner derivative call. Smaller example, using Zygote#master, after #968:

```julia
julia> using Zygote, ForwardDiff, LinearAlgebra

julia> W = rand(2); X = rand(2);

julia> G = gradient(Params([W, X])) do
           sum(ForwardDiff.gradient(x -> dot(x, W)^3, X))
       end
Grads(...)

julia> G[X]
2-element Vector{Float64}:
 9.779807803787289
 5.941561578834241

julia> G[W] === nothing
true
```

This is still true if you change it to call forward-over-reverse instead, although perhaps there is some context which ought to be passed to the inner call here, to inform it that we care about `W`:

```julia
julia> G2 = gradient(Params([W, X])) do
           sum(Zygote.forwarddiff(X -> Zygote.gradient(x -> dot(x, W)^3, X)[1], X))
       end
Grads(...)

julia> G2[X]
2-element Vector{Float64}:
 9.779807803787289
 5.941561578834241

julia> G2[W] === nothing
true
```
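One way around the dropped parameter gradient (my own sketch, not a suggestion from this thread): when the inner derivative is simple enough to write out analytically, Zygote only ever sees ordinary code in both `W` and `X`, so both receive gradients:

```julia
using Zygote, LinearAlgebra

W = rand(2); X = rand(2)

# d/dx [dot(x, W)^3] = 3 * dot(x, W)^2 .* W, written out by hand so that
# the dependence on W stays visible to the outer reverse-mode pass.
inner_grad(x, w) = 3 * dot(x, w)^2 .* w

G = gradient(Params([W, X])) do
    sum(inner_grad(X, W))
end

G[W] === nothing   # now false: W receives a gradient
```

This obviously doesn't scale to a full network's Jacobian, but it confirms that the problem is the nested AD call, not the `Params` machinery itself.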
This seems similar to #820 and FluxML/Flux.jl#1464, but the suggestions there don't seem to help in my case.
I'm trying to use the Jacobian of a specific layer's activations with respect to the network input in order to calculate a penalty in my loss function. I'm not sure if the problem is that my implementation is naive, or if what I'm trying to do just isn't supported. I'm using Flux 0.12.2 and Zygote 0.6.10 with Julia 1.6.1.
Running the following MWE, I get a few different outcomes, depending on which version of `jacobian` I call. Calling the `Zygote` and `ForwardDiff` methods results in the error `ERROR: LoadError: Mutating arrays is not supported`. Calling the `ReverseDiff` version gives me `ERROR: LoadError: Can't differentiate foreigncall expression`.

Should it be possible to get a Jacobian inside the loss function like this? If not, is there a better way to do it?
Thanks in advance--I appreciate any insight you can give me.
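For context (my own sketch, not from the issue): taking the Jacobian of a layer with respect to its input is fine on its own, with either package; the errors above only appear once you differentiate *through* that Jacobian for training:

```julia
using Flux, Zygote, ForwardDiff

# A small stand-in for "a specific layer's activations" — hypothetical
# network, not the one from the MWE.
m = Chain(Dense(3, 4, tanh), Dense(4, 2))
x = rand(Float32, 3)

J_fwd = ForwardDiff.jacobian(m, x)   # forward mode, 2×3
J_rev = Zygote.jacobian(m, x)[1]     # reverse mode, 2×3
J_fwd ≈ J_rev                        # the two modes agree
```

So the Jacobian penalty can at least be *evaluated*; the unsupported part is the outer gradient of that evaluation with respect to the model's parameters.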