Introduce default_stepsize(M, O)
#180

The following code demonstrates how conjugate gradient descent fails on the Rosenbrock function using default parameters:
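(The snippet below is a reconstruction, not the original report: it assumes the standard Rosenbrock function, Manopt 0.3-style signatures, and the names M, f_manopt, g_manopt! and MutatingEvaluation used in the comments further down.)

using Manopt, Manifolds

M = Euclidean(2)

# Rosenbrock cost and its (in-place) Euclidean gradient; the names match the
# calls quoted in the comments below, the definitions themselves are assumed.
f_manopt(M, x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
function g_manopt!(M, X, x)
    X[1] = -2.0 * (1.0 - x[1]) - 400.0 * x[1] * (x[2] - x[1]^2)
    X[2] = 200.0 * (x[2] - x[1]^2)
    return X
end

conjugate_gradient_descent(M, f_manopt, g_manopt!, [0.0, 0.0];
    evaluation=MutatingEvaluation())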
Could we have default parameters that make it work?

Comments
If you find default parameters that do work – sure; I am not 100% sure where the current default is from, I think those are the ones from Manopt/Matlab. In general, setting good defaults is a really complicated thing to do – and remember that Rosenbrock is a really mean example. But - hehe - this one really explodes fast:

julia> cg_opts = conjugate_gradient_descent(M, f_manopt, g_manopt!, [0.0, 0.0]; evaluation=MutatingEvaluation(), debug=[:Iteration, :Iterate, :Cost, :Stop,"\n"])
Initial F(x): 1.000000
# 1 x: [2.0, 0.0]F(x): 1601.000000
# 2 x: [-3200.0, 800.0]F(x): 10484121674246400.000000
# 3 x: [2.6212863989790594e13, -3.2725775149997744e12]F(x): 47212597681438596372495799766368366330997806787448537088.000000
# 4 x: [-1.4408985664390341e43, 8.994538430325952e41]F(x): 4310559429836373893762014461052062964684649679539072155822222971503375833074684002488557111981322457934557086002722283200061060668117239830893990062402996910678581744400072704.000000
# 5 x: [2.393261832712779e132, -7.46974354390761e130]F(x): Inf
# 6 x: [NaN, NaN]F(x): NaN
Ah, I see, that is the same problem as with gradient descent. We have a constant stepsize as default for both. For example with the default Armijo:

julia> cg_opts = conjugate_gradient_descent(M, f_manopt, g_manopt!, [0.0, 0.0]; evaluation=MutatingEvaluation(), stepsize=ArmijoLinesearch(M), debug=[:Iteration, :Iterate, :Cost, :Stop,"\n",100])
Initial F(x): 1.000000
# 100 x: [0.3789170183626525, 0.1455948665234621]F(x): 0.386151
# 200 x: [0.4604486280239601, 0.21308291361774606]F(x): 0.291230
# 300 x: [0.5157252480309722, 0.26664128363827033]F(x): 0.234567
# 400 x: [0.5595346370785345, 0.3136420428474512]F(x): 0.194041
# 500 x: [0.5940718838053803, 0.3531472767461153]F(x): 0.164783
The algorithm reached its maximal number of iterations (500).
2-element Vector{Float64}:
0.5940718838053803
0.3531472767461153

It already looks better. The question here is – whom do we want to annoy with our default?
I think we could switch to the first case, since we now have the
...and of course you can tweak Armijo as well:
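For instance (an illustrative sketch only; the keyword names follow the ArmijoLinesearch constructor of newer Manopt versions and the values are arbitrary, this is not the snippet from the original comment):

cg_opts = conjugate_gradient_descent(M, f_manopt, g_manopt!, [0.0, 0.0];
    evaluation=MutatingEvaluation(),
    # assumed keyword names; tighter contraction and decrease than the defaults
    stepsize=ArmijoLinesearch(M; contraction_factor=0.9, sufficient_decrease=0.05),
    debug=[:Iteration, :Iterate, :Cost, :Stop, "\n", 100])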
or even switch to a different CG coefficient
yields
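The exact call and its output are not quoted above; as an illustrative sketch, assuming the coefficient keyword of conjugate_gradient_descent (FletcherReevesCoefficient is an arbitrary choice here, not necessarily the rule used in the comment), such a call could look like:

cg_opts = conjugate_gradient_descent(M, f_manopt, g_manopt!, [0.0, 0.0];
    evaluation=MutatingEvaluation(), stepsize=ArmijoLinesearch(M),
    # switch the CG update rule away from the default coefficient
    coefficient=FletcherReevesCoefficient(),
    debug=[:Iteration, :Iterate, :Cost, :Stop, "\n", 100])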
So it is again often the question: provide good defaults that work “okayish” – or optimise for a specific problem?
I think it's safe to assume that people who would like gradient-based optimization on a manifold do it on a manifold with a reasonable
Sure, we can switch to Armijo in Manopt 0.4. The safest way, of course, could be to have an “empty” default if none is defined (but that is a change to ManifoldsBase) and choose a constant stepsize then. But this means that when defining exp (i.e. a new manifold) one also has to set the default oneself.
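A hypothetical sketch of that fallback idea; the name my_default_stepsize and the dispatch on the manifold type are purely illustrative, not an existing API of Manopt or ManifoldsBase:

using Manopt, Manifolds

# Purely illustrative: a generic fallback to a constant stepsize, plus a
# manifold-specific override where a safer linesearch default is known.
my_default_stepsize(M) = ConstantStepsize(1.0)
my_default_stepsize(M::Sphere) = ArmijoLinesearch(M)

my_default_stepsize(Euclidean(2))  # falls back to the constant stepsize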
Currently I have no way of checking the curvature before setting the default, so I would switch to Armijo for all manifolds.
Could we have a function that provides such a default stepsize?
That sounds like a good idea :) Quite some things to keep in mind for 0.4, but sure, we can do a few PRs on main before releasing that one. Let me finish the large rework for costgrad first.
edit: Maybe default_linesearch(M, O)
Ah, and I would maybe call that default_stepsize(M, O) rather than default_linesearch(M, O).
I noticed two problems with this idea.
What should we do about these two points to make it usable and easy to implement?
Hm, I think this shouldn't dispatch on solver state but on solver type. So we may have something like
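The snippet itself is not quoted above; as a purely illustrative sketch of dispatching on the solver type rather than on a state instance (my_default_stepsize is a hypothetical name, and the state types shown are those of Manopt 0.4, called ...Options in 0.3):

using Manopt, Manifolds

# Purely illustrative: methods keyed on the type of the solver state,
# so a default can be chosen before the state is constructed.
my_default_stepsize(M, ::Type{<:GradientDescentState}) = ConstantStepsize(1.0)
my_default_stepsize(M, ::Type{<:ConjugateGradientDescentState}) = ArmijoLinesearch(M)

my_default_stepsize(Euclidean(2), ConjugateGradientDescentState)  # returns an ArmijoLinesearch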
It is good that I have a hierarchy established for the SolverStates (formerly Options), which identify the solver nearly uniquely. I think the type idea you posted is not 100% correct, since
So, your idea is to use something like
I think that also looks nicer than using the high-level functions like
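For comparison, the alternative of keying the default on the high-level solver function could, purely as an illustration, look like this (again a hypothetical sketch, not an existing API):

using Manopt, Manifolds

# Purely illustrative: dispatch on the high-level solver function itself
# instead of the solver state type; this is the variant compared against above.
my_default_stepsize(M, ::typeof(conjugate_gradient_descent)) = ArmijoLinesearch(M)
my_default_stepsize(M, ::typeof(gradient_descent)) = ConstantStepsize(1.0)

my_default_stepsize(Euclidean(2), conjugate_gradient_descent)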
Cool, thanks 🙂.