NNODE fails on simple logistic curve #634

NeuralPDE.jl fails on this simple example of a logistic curve. I'm not sure whether this would be helped by changing the neural net (as the output is between 0 and 1), though I've tried lots of structures, as well as trying to refine the optimization. ADAM+BFGS should work OK for this example.
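For reference, the logistic ODE here, $u' = u(1 - u)$ with $u(0) = u_0$, has the closed-form solution

$$u(t) = \frac{u_0 e^t}{1 + u_0 (e^t - 1)},$$

a sigmoid rising from $u_0 = 0.01$ toward 1; this is the curve the trained network needs to reproduce.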
Comments
I think the problem is not, strictly speaking, with NeuralPDE, but that depending on the initialization, the NN sometimes gets stuck in local minima. With networks this small, I think this is a real concern. Your code sometimes actually trains successfully for me using Adam, BFGS, or both. You could try making the network wider or deeper to increase the chances of training success.
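As an illustration of "wider or deeper", a minimal sketch with Lux (the layer sizes here are arbitrary, not a tested recommendation):

```julia
import Lux

# Wider (32 units) and deeper (three hidden layers) than the
# 1 -> 20 -> 1 chain used in the code below; sizes are illustrative.
luxchain = Lux.Chain(Lux.Dense(1, 32, Lux.σ),
                     Lux.Dense(32, 32, Lux.σ),
                     Lux.Dense(32, 32, Lux.σ),
                     Lux.Dense(32, 1))
```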
I've tried wider and deeper (and both at once), and sometimes I get a sigmoid-ish curve, but it's still way off the true solution, and with many more parameters than a simple basis-function approach would require. I was hoping that this example wouldn't require anything large or complex on the neural-net side. The kinds of problems I'm looking at will have the same problem as this simple logistic curve.
I'm more confused after playing around with it more. Sometimes the loss is really low at the end (like 1e-10) but the solution is totally wrong; for successful runs the loss is more like 1e-7. Sometimes I get NaN solutions/parameters without an error. I also tried using Lux instead of Flux, since the docs recommend that for getting double precision, but with Lux I was never able to train successfully: the loss got lower more consistently, but the solutions just don't seem to work.

Another thing I just realized is that the issue title references NNODE (https://docs.sciml.ai/NeuralPDE/stable/manual/ode/#NeuralPDE.NNODE) but the code doesn't actually use NNODE (or is it used internally?). Maybe the NNODE specialization would work better, as this is an ODE, but I could not get NNODE to work based on the example code there; I always get "Optimization algorithm not found. Either the chosen algorithm is not a valid solver …"
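On the double-precision point, a minimal sketch of forcing Float64 parameters on a Flux chain; `Flux.f64` is assumed to be available, as in recent Flux releases:

```julia
import Flux

# Convert the chain's parameters to Float64; single-precision weights
# can stall training at tight tolerances such as abstol = 1e-10.
chain = Flux.f64(Flux.Chain(Flux.Dense(1, 20, Flux.σ), Flux.Dense(20, 1)))
```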
Sorry, the subject comes from a discussion of this issue on the Julia Discourse.
Here's my code for using the NNODE solver:

```julia
using Flux
using NeuralPDE
using OrdinaryDiffEq, Optimisers
using Plots
import Lux, OptimizationOptimisers, OptimizationOptimJL
eq = (u, p, t) -> u .* (1 .- u)
tspan = (0.0, 10.0)
u0 = 0.01
prob = ODEProblem(eq, u0, tspan)
ode_sol = solve(prob, Tsit5(), saveat=0.1)
function run()
    # Flux chain kept for reference; NNODE below is given the Lux chain
    chain = Flux.Chain(Flux.Dense(1, 20, σ), Flux.Dense(20, 1))
    luxchain = Lux.Chain(Lux.Dense(1, 20, Lux.σ), Lux.Dense(20, 1))
    opt = OptimizationOptimJL.BFGS()
    # opt = OptimizationOptimisers.Adam(0.1)
    nnode_sol = solve(prob, NeuralPDE.NNODE(luxchain, opt), dt=1 / 10.0, verbose=true, abstol=1.0e-10, maxiters=5000)
    ts = tspan[1]:0.01:tspan[2]
    xpred = [nnode_sol(t) for t in ts]
    plot(ode_sol, label="ODE solution")
    plot!(ts, xpred, label="NNODE solution")
end

run()
```
I've been playing with this and applying a new training strategy to it, and the mean absolute difference between the NNODE solution and the ODE solution is about 0.02. With some other strategies this difference is sometimes very large, which confuses me for such a simple problem. Here is the code; I am going to try to make this better.
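A sketch of how that mean absolute difference might be computed, reusing `tspan`, `ode_sol`, and `nnode_sol` from the snippet above (this assumes `nnode_sol(t)` returns a scalar, as the plotting code above suggests):

```julia
using Statistics

# Evaluate both solutions on a common grid and compare pointwise.
ts = tspan[1]:0.1:tspan[2]
nn_vals  = [nnode_sol(t) for t in ts]
ode_vals = ode_sol(ts).u          # interpolate the Tsit5 solution
mean_abs_diff = mean(abs.(nn_vals .- ode_vals))
```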
Seems like QuadratureTraining works pretty well for this: I am getting the mean difference in solutions to be 0.01 or less fairly consistently. Here is the code and a graph.
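A sketch of what that likely looks like with the same setup as the earlier snippet; passing a `strategy` keyword to `NNODE` is per the NeuralPDE docs, but the default `QuadratureTraining()` arguments are an assumption and may vary across versions:

```julia
# Same prob, luxchain, and opt as in the code above, but with
# QuadratureTraining as the NNODE training strategy instead of the
# dt-based grid.
alg = NeuralPDE.NNODE(luxchain, opt; strategy = NeuralPDE.QuadratureTraining())
nnode_sol = solve(prob, alg, verbose = true, abstol = 1.0e-10, maxiters = 5000)
```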
👍 I think we should change the docs to use QuadratureTraining in the tutorials and then close this. GridTraining shouldn't be used (as the docs say in other places).
@ChrisRackauckas can we close this?