
Calculation of the gradient of a loss function that requires the calculation of the gradient of a Flux model. #24

Closed
emmanuellujan opened this issue Sep 30, 2022 · 1 comment · Fixed by #25

Comments

@emmanuellujan (Member) commented Sep 30, 2022

This issue is about calculating the gradient of a loss function that itself requires computing the gradient of the energy, which is currently defined by a Flux neural network model. I did not find a clean and performant way to do this. For the moment I am computing the gradient of the neural network model "analytically": specifically, the gradient of a feed-forward neural network with relu activations (see here). In addition, I had to use Flux.destructure to extract the parameters of the model; a minimal sketch of this workaround is included below.
Links related to this issue:

Solving this issue would enable working with many types of neural network architectures, not only FFNNs. For example, it would allow experimenting with different models in this script, which helps to find the optimal hyperparameters of the model.
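
A minimal sketch of the current workaround, assuming a two-layer feed-forward network with relu activations. The parameter layout recovered from Flux.destructure and the helper names (unpack, grad_mlp) are illustrative assumptions, not the actual implementation:

using Flux

# Two-layer network and its flat parameter vector (plus re-builder)
mlp = Chain(Dense(2, 4, relu), Dense(4, 1))
θ, re = Flux.destructure(mlp)

# Assumed parameter layout in θ: W1 (4×2), b1 (4), W2 (1×4), b2 (1)
function unpack(θ)
    W1 = reshape(θ[1:8], 4, 2);   b1 = θ[9:12]
    W2 = reshape(θ[13:16], 1, 4); b2 = θ[17:17]
    return W1, b1, W2, b2
end

# Hand-written gradient of E(x) = sum(mlp(x)) with respect to the input x:
# dE/dx = W1' * (relu'(W1*x + b1) .* W2'), with relu'(z) = (z > 0)
function grad_mlp(θ, x)
    W1, b1, W2, _ = unpack(θ)
    z1 = W1 * x .+ b1
    return W1' * ((z1 .> 0) .* vec(W2))
end

grad_mlp(θ, Float32[0.5, 1.0])   # 2-element gradient w.r.t. x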

@emmanuellujan (Member, Author) commented with a minimal working example:

using Flux

# Input range: a grid of 2D points on [0, π] × [0, π]
xrange = 0:π/99:π
xs = [Float32.([x1, x2]) for x1 in xrange for x2 in xrange]

# Target function: E
E_analytic(x) = sin(x[1]) * cos(x[2])

# Analytical gradient of E
dE_analytic(x) = [cos(x[1]) * cos(x[2]), -sin(x[1]) * sin(x[2])]

# NN model approximating E
mlp = Chain(Dense(2, 4, Flux.σ), Dense(4, 1))
ps_mlp = Flux.params(mlp)
E(x) = sum(mlp(x))
# Gradient of the model output w.r.t. the input x;
# `first` extracts the gradient from the 1-tuple returned by `gradient`
dE(mlp, x) = first(gradient(x -> sum(mlp(x)), x))

# Loss: match the model's input gradient to the analytical gradient
loss(x, y) = Flux.Losses.mse(x, y)

# Training: differentiating the loss requires differentiating through dE,
# i.e. a nested (second-order) gradient of the Flux model
epochs = 10; opt = Flux.Adam(0.1)
for _ in 1:epochs
    g = gradient(() -> loss(reduce(vcat, dE.([mlp], xs)),
                            reduce(vcat, dE_analytic.(xs))), ps_mlp)
    Flux.Optimise.update!(opt, ps_mlp, g)
    l = loss(reduce(vcat, dE.([mlp], xs)),
             reduce(vcat, dE_analytic.(xs)))
    println("loss: ", l)
    # GC.gc()
end
  • Note: NeuralPDE, a project with similarities to this one (training MLPs using Julia ML abstractions), moved from Flux to Lux.
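
For context, a short sketch of what the same model looks like in Lux, where parameters are explicit rather than stored inside the layers. This is only an illustration of the explicit-parameter style (API names as in current Lux), not a tested fix for this issue:

using Lux, Random, Zygote

rng = Random.default_rng()
model = Chain(Dense(2 => 4, sigmoid), Dense(4 => 1))
ps, st = Lux.setup(rng, model)          # explicit parameters and state

# Energy and its gradient w.r.t. the input, with parameters passed explicitly
E(x, ps) = sum(first(model(x, ps, st)))
dE(x, ps) = only(Zygote.gradient(x -> E(x, ps), x))

dE(Float32[0.5, 1.0], ps)               # 2-element gradient w.r.t. x
# The outer gradient of the loss w.r.t. ps would still require
# differentiating through dE, which is the crux of this issue.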
