python multiprocessing consensus library

should be able to run in serial and in parallel, for debugging/sanity purposes. do simple consensus, and general consensus.

we want to send general messages up to the collecting process, such as local computations that will go into the dual variable calculations

we also want to send general messages down to the processors, such as telling them to reset their dual variables to zero if we detect ripples in the collecting process
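A minimal sketch of that two-way messaging, assuming one upward queue shared by all workers and one downward queue per worker; the names (run_worker, the "reset_dual"/"stop" message tags, the u_norm payload) are hypothetical placeholders, not part of any existing module:

#+BEGIN_SRC python
# Sketch only: message tags and payload contents are illustrative assumptions.
import multiprocessing as mp

def run_worker(i, up_q, down_q):
    """Worker loop: pull control messages down, push local results up."""
    u = 0.0  # placeholder for this worker's dual variable
    while True:
        msg = down_q.get()
        if msg == "stop":
            break
        if msg == "reset_dual":        # e.g. when the collector detects ripples
            u = 0.0
        # ... local prox / dual computation would go here ...
        up_q.put((i, {"u_norm": abs(u)}))   # send local info up to the collector

if __name__ == "__main__":
    n = 4
    up_q = mp.Queue()
    down_qs = [mp.Queue() for _ in range(n)]
    procs = [mp.Process(target=run_worker, args=(i, up_q, down_qs[i]))
             for i in range(n)]
    for p in procs:
        p.start()
    for q in down_qs:
        q.put("reset_dual")            # broadcast a control message down
    for _ in range(n):
        print(up_q.get())              # collect one message from each worker
    for q in down_qs:
        q.put("stop")
    for p in procs:
        p.join()
#+END_SRC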

partitioning

http://www.staff.science.uu.nl/~bisse101/Mondriaan/

problem form

primal

\begin{align*} \mbox{minimize}\ & c^T x \\
\mbox{subject to}\ & Ax = b \\ & Gx \leq_K h \end{align*}

dual

\begin{align*} \mbox{minimize}\ & b^T w + h^T z \\
\mbox{subject to}\ & A^T w + G^T z = -c \\ & z \in K^* \end{align*}

When we transform the primal output from QCML to dual form, and re-pack the matrices into a form that ECOS can handle, we expand the problem size, but that may or may not be a big issue.

simple consensus

We have a program \[ \mbox{minimize}\ \sum_{i=1}^n f_i(x_i) \quad \mbox{s.t.}\ x_i = z \ \forall i \]
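For reference (this definition is standard and not stated elsewhere in these notes), $\mbox{prox}_{f/\rho}$ below denotes the proximal operator of $f$ with penalty parameter $\rho$:

\[ \mbox{prox}_{f/\rho}(v) = \operatorname*{argmin}_x \left( f(x) + \frac{\rho}{2} \| x - v \|_2^2 \right) \]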

the iteration becomes

\begin{align*} x_i^{k+1} &= \mbox{prox}_{f_i/\rho}\left( \bar{x}^k - u_i^k \right) \\
u_i^{k+1} &= u_i^k + x_i^{k+1} - \bar{x}^{k+1} \end{align*}

after an iteration, we can compute the residuals:

\begin{align*} \| r^k \|^2_2 &= \sum_{i=1}^N \| x_i^k - \bar{x}^k \|_2^2 \\
\| s^k \|_2 &= \sqrt{N}\, \rho\, \| \bar{x}^k - \bar{x}^{k-1} \|_2 \end{align*}

but we can reformulate it into something more computationally friendly by shifting the indexing in time:

\begin{align*} u_i^{k+1} &= u_i^k + x_i^k - \bar{x}^k \\
x_i^{k+1} &= \mbox{prox}_{f_i/\rho}\left( \bar{x}^k - u_i^{k+1} \right) \end{align*}

Computationally, we see that each node maintains a state of $(x_i^k, u_i^k)$. The steps are as follows (a serial sketch of this loop appears after the list):

  1. $\bar{x}^k$ is sent to each node
  2. each node updates $u_i^{k+1}$ locally
  3. each node computes the prox locally to produce $x_i^{k+1}$
  4. each node sends out $x_i^{k+1}$
  5. $\bar{x}^{k+1}$ is computed somehow
  6. repeat
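A minimal serial sketch of these steps, assuming a hypothetical prox_list whose entries are plain callables implementing $\mbox{prox}_{f_i/\rho}(v)$; in the parallel version the inner loop would run in worker processes:

#+BEGIN_SRC python
# Sketch only: prox_list, dim, rho, n_iter, tol are assumptions for illustration.
import numpy as np

def simple_consensus(prox_list, dim, rho, n_iter=100, tol=1e-6):
    N = len(prox_list)
    x = np.zeros((N, dim))
    u = np.zeros((N, dim))
    xbar = np.zeros(dim)
    for k in range(n_iter):
        xbar_old = xbar.copy()
        for i, prox in enumerate(prox_list):
            u[i] += x[i] - xbar          # u_i^{k+1} = u_i^k + x_i^k - xbar^k
            x[i] = prox(xbar - u[i])     # x_i^{k+1} = prox_{f_i/rho}(xbar^k - u_i^{k+1})
        xbar = x.mean(axis=0)            # "computed somehow": plain average here
        r = np.sqrt(np.sum((x - xbar) ** 2))                    # primal residual norm
        s = np.sqrt(N) * rho * np.linalg.norm(xbar - xbar_old)  # dual residual norm
        if r < tol and s < tol:
            break
    return xbar
#+END_SRC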

general consensus

We describe the mapping from local variables to global ones with $G(i,j) = k$, meaning that the $j$th element of local variable $x_i$ corresponds to the $k$th element of the global variable. \begin{align*} u_i^{k+1} &= u_i^k + x_i^k - \tilde{z}_i^k \\
x_i^{k+1} &= \mbox{prox}_{f_i/\rho}\left( \tilde{z}_i^k - u_i^{k+1} \right) \\ z_g^{k+1} &= \frac{1}{k_g} \sum_{G(i,j)=g} \left(x_i^{k+1} \right)_j \end{align*}

Computationally, we have that each node maintains a state of $(x_i^k, u_i^k)$. The steps are:

  1. $z^k$ is computed in shared memory
  2. each node pulls its $\tilde{z}_i^k$
  3. each node updates $u_i^{k+1}$
  4. each node updates $x_i^{k+1}$ by computing the prox
  5. each node sends out its $x_i^{k+1}$
  6. the $x_i^{k+1}$ are averaged appropriately to produce $z^{k+1}$ (see the averaging sketch below)

The residuals are given by \begin{align*} \| r^k \|^2_2 &= \sum_{i=1}^N \| x_i^k - \tilde{z}_i^k \|_2^2 \\
\| s^k \|_2^2 &= \rho^2 \sum_{i=1}^N \| \tilde{z}_i^k - \tilde{z}_i^{k-1} \|_2^2 \end{align*}
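A minimal sketch of the local-to-global averaging step, assuming each node $i$ carries an integer index array global_idx[i] playing the role of $G(i,\cdot)$; the names here are hypothetical:

#+BEGIN_SRC python
# Sketch only: x_local, global_idx, global_dim are illustrative assumptions.
import numpy as np

def average_to_global(x_local, global_idx, global_dim):
    """z_g = (1/k_g) * sum over (i,j) with G(i,j)=g of (x_i^{k+1})_j."""
    z_sum = np.zeros(global_dim)
    counts = np.zeros(global_dim)          # k_g: how many local copies map to g
    for x_i, idx_i in zip(x_local, global_idx):
        np.add.at(z_sum, idx_i, x_i)       # scatter-add local entries into z
        np.add.at(counts, idx_i, 1)
    return z_sum / np.maximum(counts, 1)

# each node then pulls its own view: z_tilde_i = z[global_idx[i]]
#+END_SRC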

new ordering of alg

\begin{align*} x_i^{k+1} &= \mbox{prox}_{f_i/\rho}\left( \tilde{z}_i^k - u_i^k \right) \\
z_g^{k+1} &= \frac{1}{k_g} \sum_{G(i,j)=g} \left(x_i^{k+1} \right)_j \\ u_i^{k+1} &= u_i^k + x_i^{k+1} - \tilde{z}_i^{k+1} \end{align*}

Computationally, we have that each node maintains a state of $(x_i^k, u_i^k)$. The steps are:

  1. $z^k$ is computed in shared memory
  2. each node pulls its $\tilde{z}_i^k$
  3. each node updates $x_i^{k+1}$ by computing the prox
  4. each node sends out its $x_i^{k+1}$
  5. the $x_i^{k+1}$ are averaged appropriately to produce $z^{k+1}$
  6. each node pulls $\tilde{z}_i^{k+1}$ and updates $u_i^{k+1}$

The residuals are given by \begin{align*} \| r^k \|^2_2 &= \sum_{i=1}^N \| x_i^k - \tilde{z}_i^k \|_2^2 \\
\| s^k \|_2^2 &= \rho^2 \sum_{i=1}^N \| \tilde{z}_i^k - \tilde{z}_i^{k-1} \|_2^2 \end{align*}

rho updates

if $r^k$ is much bigger than $s^k$, then increase $\rho$, and vice versa. when $\rho$ changes, rescale $u$: $u^k \leftarrow \frac{\rho^k}{\rho^{k+1}} u^k$
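A minimal sketch of that residual-balancing rule; the factor-of-10 threshold (mu) and doubling/halving factor (tau) are illustrative choices, not fixed by these notes:

#+BEGIN_SRC python
# Sketch only: mu=10 and tau=2 are conventional but arbitrary constants.
def update_rho(rho, u, r_norm, s_norm, mu=10.0, tau=2.0):
    """Increase rho when the primal residual dominates, decrease it when the
    dual residual dominates, and rescale the scaled dual variable u."""
    rho_new = rho
    if r_norm > mu * s_norm:
        rho_new = tau * rho
    elif s_norm > mu * r_norm:
        rho_new = rho / tau
    u = u * (rho / rho_new)    # u^k <- (rho^k / rho^{k+1}) u^k
    return rho_new, u
#+END_SRC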

over/under relaxation

reordering rows

To reorder the rows, we first store the problem in a modified format \begin{align*} \mbox{minimize}\ & c^T x \\
\mbox{subject to}\ & Rx \leq_K s \end{align*} where $K$ is a product cone which includes the zero cone. This allows us to include the equality constraints we find in the primal form.
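A minimal sketch of building $R$ and $s$ by stacking the equality block on top of the inequality block, assuming scipy-sparse inputs; the per-row cone labels and the helper name are hypothetical, and SOC rows would need their own labels:

#+BEGIN_SRC python
# Sketch only: the (A, b, G, h) packing and the cone label strings are assumptions.
import numpy as np
import scipy.sparse as sp

def pack_rows(A, b, G, h):
    """Stack Ax = b (zero cone) on top of Gx <=_K h (remaining cone rows)."""
    R = sp.vstack([A, G], format="csc")
    s = np.concatenate([b, h])
    cone_array = ["zero"] * A.shape[0] + ["ineq"] * G.shape[0]
    return R, s, cone_array
#+END_SRC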

restarting

The below actually gives us a value of the optimal dual variable $u^*$.

as a good guess for a restart, we would like to have each prox stay at the current $\bar{x}$. That is, we want to find a $y$ such that \[ \bar{x} = \mbox{prox}_{f_i/\rho}\left(\bar{x} - y \right) \]

If $f_i(x) = c^T x$, then the solution is given by $u^* = -\frac{1}{\rho}c$.

the simple algorithm: \[ x_i^{k+1} = \mbox{prox}_{f_i/\rho}\left(\bar{x}^k + \frac{c}{\rho}\right) \] should converge to the correct solution, no?

NO! Adding the $\frac{c}{ρ}$ term cancels the $c$ term in the prox function, making the prox exactly the projection onto the convex set, without any consideration for the linear term. We might try to shrink towards the optimal value when we see ripples… we still need some of the dynamic $u$ updates so that we are not just solving a feasibility problem.
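To make the cancellation explicit (this just restates the standard prox calculation for a linear objective; $C_i$ is my shorthand for the local constraint set implicit in node $i$'s subproblem):

\[ \mbox{prox}_{f_i/\rho}\!\left(\bar{x}^k + \tfrac{c}{\rho}\right) = \operatorname*{argmin}_{x \in C_i} \left( c^T x + \tfrac{\rho}{2}\big\| x - \bar{x}^k - \tfrac{c}{\rho} \big\|_2^2 \right) = \operatorname*{argmin}_{x \in C_i} \tfrac{\rho}{2}\big\| x - \bar{x}^k \big\|_2^2 = \Pi_{C_i}\!\left(\bar{x}^k\right) \]

since expanding the square produces a $-c^T x$ term that cancels the objective's $c^T x$, and the remaining terms are constant in $x$.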

residuals

use direct primal, dual residuals

try ecos with large prox, but few inequalities.

primal and dual, do whole, do consensus. see how fast it goes.

split rows of A and columns of A^T

try comparing general and simple consensus. we need a data point to show that doing general consensus is the right thing to do.

design

prox object

the prox object should take in reduced (local) socp data and be able to return the prox.

the constructor should take in socp_data, but this data can have None objects for A, G, or c

  • we should wrap these objects with an xupdate
  • also wrap with a global variable index

this should allow us to use the same code for simple consensus, general consensus, and set intersection
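A minimal sketch of that interface; the class name, the socp_data key layout, and the unconstrained-linear fallback are hypothetical placeholders for whatever the implementation settles on:

#+BEGIN_SRC python
# Sketch only: class name, socp_data layout, and the solver hand-off are assumptions.
import numpy as np

class ProxOperator(object):
    """Wraps reduced (local) SOCP data and evaluates prox_{f_i/rho}(v)."""

    def __init__(self, socp_data, global_index=None):
        # any of 'A', 'G', or 'c' may be None for this local subproblem
        self.c = socp_data.get('c')
        self.A, self.b = socp_data.get('A'), socp_data.get('b')
        self.G, self.h = socp_data.get('G'), socp_data.get('h')
        self.global_index = global_index    # maps local entries to global z

    def prox(self, v, rho):
        """Return argmin_x f_i(x) + (rho/2)||x - v||^2.

        In the real library this would hand the regularized subproblem to a
        conic solver (e.g. ECOS); only the unconstrained linear case is
        handled here as a placeholder.
        """
        if self.A is None and self.G is None:
            c = self.c if self.c is not None else np.zeros_like(v)
            return v - c / rho
        raise NotImplementedError("constrained case: call a conic solver")
#+END_SRC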

todo

  • add something to compute the residuals
  • add a hook for computing progress (like cross validation progress, or distance to a known solution)
  • adding something to compute residuals will allow us to work with the new problem form for set intersection
  • add the set intersection code

set intersection

We transform a problem from the ECOS input format to a pure convex intersection problem by requiring primal feasibility, dual feasibility, and zero duality gap. We store the problem in the form $Rx \leq_K s$, where the type of each cone inequality (zero cone or LP cone) is described by a list like ‘cone_array’.

The resulting system is

(org mode)

| c^T | b^T | h^T | =    | 0  |
| A   | 0   | 0   | =    | b  |
| G   | 0   | 0   | \leq | h  |
| 0   | A^T | G^T | =    | -c |
| 0   | 0   | -I  | \leq | 0  |

(latex) \[ \begin{bmatrix} c^T & b^T & h^T \\
A & 0 & 0 \\ G & 0 & 0 \\ 0 & A^T & G^T \\ 0 & 0 & -I \end{bmatrix} \begin{bmatrix} x \\ y_1 \\ y_2 \end{bmatrix} \begin{matrix} = \\ = \\ \leq \\ = \\ \leq \end{matrix} \begin{bmatrix} 0 \\ b \\ h \\ -c \\ 0 \end{bmatrix} \]
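A minimal sketch of assembling that stacked system with scipy sparse blocks, assuming the same (c, A, b, G, h) packing as above and the variable split $[x;\ y_1;\ y_2]$; the helper name and the cone labels are hypothetical:

#+BEGIN_SRC python
# Sketch only: row-wise cone labels and the LP-cone-only assumption are illustrative.
import numpy as np
import scipy.sparse as sp

def build_intersection_system(c, A, b, G, h):
    """Stack zero duality gap, primal feasibility, and dual feasibility into
    one R [x; y1; y2] <=_K s system."""
    n, p, m = len(c), A.shape[0], G.shape[0]
    gap_row = [sp.csc_matrix(c.reshape(1, -1)),
               sp.csc_matrix(b.reshape(1, -1)),
               sp.csc_matrix(h.reshape(1, -1))]
    R = sp.bmat([gap_row,
                 [A,    None, None],
                 [G,    None, None],
                 [None, A.T,  G.T],
                 [None, None, -sp.identity(m)]], format="csc")
    s = np.concatenate([[0.0], b, h, -c, np.zeros(m)])
    cone_array = (["zero"] * (1 + p) + ["nonneg"] * m
                  + ["zero"] * n + ["nonneg"] * m)
    return R, s, cone_array
#+END_SRC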