-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ScalarNonlinearFunction support #2059
Conversation
Here's a current demo with the Ipopt PR: julia> import Ipopt
julia> const MOI = Ipopt.MOI
MathOptInterface
julia> function main()
model = Ipopt.Optimizer()
x = MOI.add_variable(model)
con_f = MOI.ScalarNonlinearFunction{Float64}(:^, Any[x, 2])
MOI.add_constraint(model, con_f, MOI.LessThan(1.0))
MOI.set(model, MOI.ObjectiveSense(), MOI.MIN_SENSE)
my_f(x) = (x - 0.5)^2
MOI.set(model, MOI.UserDefinedFunction(:my_f, 1), (my_f,))
obj_f = MOI.ScalarNonlinearFunction{Float64}(:my_f, Any[x])
MOI.set(model, MOI.ObjectiveFunction{typeof(obj_f)}(), obj_f)
MOI.optimize!(model)
return MOI.get(model, MOI.VariablePrimal(), x)
end
main (generic function with 1 method)
julia> main()
This is Ipopt version 3.14.4, running with linear solver MUMPS 5.4.1.
Number of nonzeros in equality constraint Jacobian...: 0
Number of nonzeros in inequality constraint Jacobian.: 1
Number of nonzeros in Lagrangian Hessian.............: 2
Total number of variables............................: 1
variables with only lower bounds: 0
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 0
Total number of inequality constraints...............: 1
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 2.5000000e-01 0.00e+00 1.00e+00 -1.0 0.00e+00 - 0.00e+00 0.00e+00 0
1 6.2500000e-02 0.00e+00 4.90e-01 -1.7 2.50e-01 - 1.00e+00 1.00e+00f 1
2 1.1441994e-04 0.00e+00 1.74e-03 -1.7 2.39e-01 - 1.00e+00 1.00e+00h 1
3 7.8304570e-06 0.00e+00 2.89e-04 -2.5 6.50e-02 - 1.00e+00 1.00e+00h 1
4 1.5182438e-08 0.00e+00 2.74e-05 -3.8 2.72e-03 - 1.00e+00 1.00e+00h 1
5 1.6255746e-12 0.00e+00 5.28e-08 -5.7 1.29e-04 - 1.00e+00 1.00e+00h 1
6 2.8086636e-18 0.00e+00 6.35e-12 -8.6 1.29e-06 - 1.00e+00 1.00e+00h 1
Number of Iterations....: 6
(scaled) (unscaled)
Objective...............: 2.8086636097379302e-18 2.8086636097379302e-18
Dual infeasibility......: 6.3507046478545526e-12 6.3507046478545526e-12
Constraint violation....: 0.0000000000000000e+00 0.0000000000000000e+00
Variable bound violation: 0.0000000000000000e+00 0.0000000000000000e+00
Complementarity.........: 2.5091160166026672e-09 2.5091160166026672e-09
Overall NLP error.......: 2.5091160166026672e-09 2.5091160166026672e-09
Number of objective function evaluations = 7
Number of objective gradient evaluations = 7
Number of equality constraint evaluations = 0
Number of inequality constraint evaluations = 7
Number of equality constraint Jacobian evaluations = 0
Number of inequality constraint Jacobian evaluations = 7
Number of Lagrangian Hessian evaluations = 6
Total seconds in IPOPT = 0.045
EXIT: Optimal Solution Found.
0.4999999983240932 Things are plumbed end-to-end, including user-defined functions. I'm still undecided on how to implement nonlinear parameters and expressions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor comments in the code. I'm wondering what the API would look like to rewrite such functions using bridges, for instance for CP. (Or this issue has been discussed elsewhere and I'm not aware of it.)
I think, instead of a bridge, it'd likely be a transformation layer, like We don't really have tools for bridging based on the content of a constraint, only on the type, so you'd likely need to be able to see the entire problem in order to construct a valid transformation. |
I don't think it's about content vs type. If a transformation can be done independently for each constraint then it can be done by a bridge. For the non-breaking path, I don't need we need to be non-breaking for the solvers. Once we have the NL API set up, we can drop support for NLPBlock in Ipopt and tag Ipopt v2.0. What we need is to continue supporting solvers requiring NLPBlock from MOI until MOI v2.0. I still think transforming from NL expression to gradient/hessian oracle dealing with AD, |
No. Computing derivatives really needs the entirety of the problem in one go, whereas the bridges work on a constraint-by-constraint basis. Dealing with the derivatives is not complicated (look how small the Ipopt PR is jump-dev/Ipopt.jl#346) because we've abstracted all of the helpers into |
Why computing derivatives cannot be done on a constraint-by-constraint basis ? |
Because that's a muuuuuch larger change to our AD engines. It also makes it a lot more complicated to deal with sparsity etc. We don't really want to have to deal with 10,000 constraints, each with 10,000 instances of an AD tape. JuMP's reverse diff amortizes a lot of allocations over the constraints. It also makes it hard to have shared common subexpressions. |
Shared sub-expressions is a good reason. Sparsity of the hessian of the lagrangian is another one, although this could probably be refactored if we really wanted to. |
Could an MOI layer that does the transformation from |
To me, the question is: what would be the use case for this? Usually, the nonlinear solver requires only access to the full Jacobian and to the full Hessian, discarding the individual constraints. Seeing all the constraints as a single multivalued function has also its advantages, such as:
|
We should check and confirm support for complex-valued nonlinear functions (polynomials of complex variables) |
Here's a proof of concept of PATHSolver solving a nonlinear complementarity problem with Still an open question as to what the proper definition is. We can also revert the commit and start it in a separate PR if we decide to merge ScalarNonlinearFunction first. |
For VNF here you mean? This seems like the obvious approach to me - what alternatives are there? |
Some sort of vector-valued expression tree? Not sure. You could imagine |
#2094 is a resolution to the "how to we deal with parameters" question. Ipopt would declare support for |
|
Co-authored-by: Chris Coey <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM to merge.
In terms of breaking this in the future, I think we should do our best to stick to semver. So let's test this out and see what needs to change. Then MOI 2.0 can include any breaking changes to this API, plus deleting the old NLPBlockData
interface?
Sounds good. The actual interface is pretty minimal, so I think there are probably a bunch of missing methods and things that we'll need to add, rather than breaking changes. |
I'll merge now, but hold off from tagging for a couple of days to see if anything else pops up once I update other other packages to |
Overview
This PR is an implementation of the idea in #846, and is based on our work at the JuMP-level: jump-dev/JuMP.jl#3106.
Preview links
There are two key bits of documentation to read. The actual implementation is very straight forward:
One complicating issue is how to deal with user-defined function. I've added a new attribute:
I've also added
ListOfSupportedNonlinearOperators <: AbstractModelAttribute
andUnsupportedNonlinearOperator <: UnsupportedError
Solver support
Ipopt is adds support for
ScalarNonlinearFunction
in jump-dev/Ipopt.jl#346.What to do with NLPBlock
There's an open question about what we do with the existing NLPBlock stuff. The easiest answer is just to leave it, and say that solvers should support both, although they can error if both appear in the same model.
It'll be a harder decision at the JuMP-level, but we can wait until this is merged to figure that out.
Blockers
There's nothing major blocking this from my end. It's actually pretty lightweight. The harder part is JuMP and documenting the two different nonlinear approaches. But this can be added without breaking anything because it is strictly a feature addition.
Needs: