The role of A matrix... #11

Open
pangtao22 opened this issue Feb 5, 2022 · 0 comments

Whether we should use A for trajectory optimization has been a long-standing mystery, ever since Terry discovered that setting the q_u block of A to identity and the rest of A to 0 leads to better convergence for iRS-MPC. We looked at the problem again from a calculus perspective.

Background

It is assumed that the system consists of an un-actuated part, q_u, and an actuated part, q_a. There is a large part of the configuration space that results in inter-penetration (configuration space obstacles). The standard notation is C_obs vs C_free.

Changing q_u requires contact between the robot and the object. For bundled dynamics, this means that samples are only informative for estimating a B that is useful for planning if they lie on the boundary between C_obs and C_free.
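
As a rough illustration of why the samples need to reach the contact boundary, here is a minimal sketch on a hypothetical 1D, 2-cart toy model. The `step` function, the least-squares estimator `estimate_B_u`, and all numbers are assumptions made for illustration; none of this is the QuasistaticSimulator or the actual bundled-dynamics code.

```python
def step(q_a, q_u, u):
    """Hypothetical toy quasistatic step on a line.

    The actuated cart moves by u. The un-actuated cart sits to its right
    and is pushed along only if the actuated cart would otherwise overlap it.
    """
    q_a_next = q_a + u
    q_u_next = max(q_u, q_a_next)
    return q_a_next, q_u_next


def estimate_B_u(q_a, q_u, u_nom, half_width=0.1, n=101):
    """Least-squares slope of q_u_next w.r.t. u over a symmetric grid of samples."""
    dus = [half_width * (2 * i / (n - 1) - 1) for i in range(n)]
    ys = [step(q_a, q_u, u_nom + du)[1] for du in dus]
    du_mean = sum(dus) / n
    y_mean = sum(ys) / n
    cov = sum((d - du_mean) * (y - y_mean) for d, y in zip(dus, ys))
    var = sum((d - du_mean) ** 2 for d in dus)
    return cov / var


# Nominal state far from contact: no sample ever touches the object.
B_far = estimate_B_u(q_a=0.0, q_u=1.0, u_nom=0.0)

# Nominal state on the boundary of C_free: half of the samples push the object.
B_boundary = estimate_B_u(q_a=1.0, q_u=1.0, u_nom=0.0)
```

In this toy model, `B_far` is zero because no sample makes contact, while `B_boundary` is about 0.5 because only the samples that push contribute to the slope.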

What we think the problem is

Traditional non-linear control assumes that the dynamics function,

x_{t+1} = f(x_t, u_t),

has a valid linearization in a small neighborhood of a nominal state, i.e.

x_{t+1} = A * x_t + B * u_t,

which can then be used for synthesizing linear controllers.

But this is not true for an x that is in contact, because it lies on the boundary of C_free. Furthermore, derivatives on the boundary of the domain of the dynamics function are not well-defined. One-sided derivatives (and partial derivatives?) can be defined in C_free, and might be what differentiable simulators actually compute. But when A and B are used in MPC, the C_free constraint is dropped, and the optimizer may take the system into C_obs in order to minimize cost.
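
To make the one-sided-derivative point concrete, here is a minimal sketch on the same kind of hypothetical 1D, 2-cart toy model. It takes finite differences of q_u at the next step with respect to q_a, at a state where the carts are just touching. The `step` function is an assumption; in particular, it happens to be defined for penetrating states, which a real simulator may not be.

```python
def step(q_a, q_u, u):
    """Hypothetical toy quasistatic step: the actuated cart moves by u and
    pushes the un-actuated cart (to its right) if it would overlap it."""
    q_a_next = q_a + u
    q_u_next = max(q_u, q_a_next)
    return q_a_next, q_u_next


def gap(q_a, q_u):
    """Signed distance between the carts; negative means penetration (C_obs)."""
    return q_u - q_a


def one_sided_dqu_dqa(q_a, q_u, u, h=1e-6):
    """One-sided finite differences of q_u_next w.r.t. q_a."""
    base = step(q_a, q_u, u)[1]
    # Perturbing q_a away from the object keeps the state in C_free.
    toward_free = (base - step(q_a - h, q_u, u)[1]) / h
    # Perturbing q_a into the object starts from a state in C_obs.
    toward_obs = (step(q_a + h, q_u, u)[1] - base) / h
    return toward_free, toward_obs


# Carts just touching, zero commanded motion: the state is on the boundary of C_free.
toward_free, toward_obs = one_sided_dqu_dqa(q_a=1.0, q_u=1.0, u=0.0)
```

In this toy model the two one-sided differences disagree (0 on the C_free side, about 1 on the C_obs side), so there is no single value of that entry of A at the contact state, and only the C_free side uses states inside the domain.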

How to test our hypotheses

  • Correctly implement the chain rule for computing A in QuasistaticSimulator. Right now it is missing the term involving the derivatives w.r.t. contact points and normals.
  • Work through the differentiable simulator implementation line by line on a simple 1D, 2-cart system to confirm/disprove the theory that the auto-diffed derivative is some form of one-sided derivative.
  • In iRS-MPC, after each iteration, observe how much collision there is in the x trajectory solved by the optimizer.
  • Test whether adding to iRS-MPC the non-penetration constraints which appear in the quasistatic dynamics QP improves convergence.
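
The third check above could be prototyped along the following lines. This is a sketch on a hypothetical 1D, 2-cart toy model, not iRS-MPC itself: `rollout_linear`, `rollout_sim`, `max_penetration`, and the A and B values (a linearization taken while the carts are not in contact) are all assumptions for illustration.

```python
def step(q_a, q_u, u):
    """Hypothetical toy quasistatic step: the actuated cart moves by u and
    pushes the un-actuated cart (to its right) if it would overlap it."""
    q_a_next = q_a + u
    return q_a_next, max(q_u, q_a_next)


def rollout_linear(x0, us, A, B):
    """Roll out x_{t+1} = A x_t + B u_t for x = (q_a, q_u) and scalar u."""
    xs = [x0]
    for u in us:
        q_a, q_u = xs[-1]
        xs.append((A[0][0] * q_a + A[0][1] * q_u + B[0] * u,
                   A[1][0] * q_a + A[1][1] * q_u + B[1] * u))
    return xs


def rollout_sim(x0, us):
    """Roll out the toy simulator, which never lets the carts overlap."""
    xs = [x0]
    for u in us:
        xs.append(step(xs[-1][0], xs[-1][1], u))
    return xs


def max_penetration(xs):
    """Largest overlap depth q_a - q_u along a trajectory (0 if none)."""
    return max(max(0.0, q_a - q_u) for q_a, q_u in xs)


# Linearization of the toy model taken in free space: u moves only q_a.
A_free = [[1.0, 0.0], [0.0, 1.0]]
B_free = [1.0, 0.0]

x0 = (0.9, 1.0)          # actuated cart starts 0.1 to the left of the object
us = [0.05] * 4          # keep pushing to the right

pen_linear = max_penetration(rollout_linear(x0, us, A_free, B_free))
pen_sim = max_penetration(rollout_sim(x0, us))
```

Here the linear model lets the actuated cart pass through the object (`pen_linear` is about 0.1), while the toy simulator rollout has zero penetration. Logging a quantity like this after each iRS-MPC iteration would show how far the optimizer's x trajectory strays into C_obs.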