<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title>Algorithm Design 14</title>
<url>/2023/12/29/Algorithm-Design-14/</url>
<content><![CDATA[<h2 id="chapter-10-local-search">Chapter 10 Local Search</h2>
<h3 id="game-theory-local-search">10.2 Game Theory & Local
Search</h3>
<ol type="1">
<li><p>Shapley Network Design Game</p>
<ol type="1">
<li><p>Description</p>
<ul>
<li><p>Given a directed graph <span
class="math inline">\(G(V,E)\)</span>, each edge <span
class="math inline">\(e\in E\)</span> having cost <span
class="math inline">\(c_e\)</span></p></li>
<li><p><span class="math inline">\(k\)</span> players, each having
source and target <span class="math inline">\((s_i,t_i)\)</span>. For
simplicity, let <span class="math inline">\(s_i=s\)</span>.</p>
<p>Player <span class="math inline">\(i\)</span> wants to go from <span
class="math inline">\(s_i\)</span> to <span
class="math inline">\(t_i\)</span>.</p></li>
<li><p>Set strategy for player <span class="math inline">\(i\)</span>:
<span class="math inline">\(\{\text{paths }s_i\to t_i\}\)</span></p>
<p>Strategy profile: <span
class="math inline">\(P=\{P_1,\cdots,P_k\}\)</span>, where <span
class="math inline">\(P_i\)</span> denotes a path <span
class="math inline">\(s_i\to t_i\)</span></p></li>
<li><p>Cost of player <span class="math inline">\(i\)</span>: <span
class="math display">\[
C_i(P)=\sum_{e\in P_i}\frac{c_e}{|\{j\in [k]:e\in P_j\}|}
\]</span></p></li>
<li><p>Each player is selfish, and wants to minimize its own
cost</p></li>
</ul></li>
<li><p>Nash Equilibrium (NE)</p>
<ol type="1">
<li><p>Def: We say a strategy profile <span
class="math inline">\(P=(P_1,\cdots,P_k)\)</span> is a Nash Equilibrium,
if <span class="math inline">\(\forall i\)</span>, <span
class="math inline">\(\forall P_i'\)</span> <span
class="math display">\[
C_i(P_1,\cdots,P_{i-1},P_i,P_{i+1},\cdots,P_k)\le
C_i(P_1,\cdots,P_{i-1},P_i',P_{i+1},\cdots,P_k)
\]</span> Similar to a "local minimum": no single player can improve by
unilaterally changing its strategy.</p>
<p>But some players can collaborate to make their costs
decrease.</p></li>
<li><p>Social Optimal</p>
<p>Def: Minimum Steiner Tree connecting <span
class="math inline">\(s,t_1,\cdots,t_k\)</span></p>
<ul>
<li><p>Steiner Tree: Given a set <span class="math inline">\(S\subseteq
V\)</span>, find a connected graph <span
class="math inline">\((V',E')\)</span> that <span
class="math inline">\(S\subseteq V'\subseteq V\)</span>, and
minimize the sum of cost of all edges in <span
class="math inline">\(E'\)</span>.</p>
<p>Gilbert-Pollak Conjecture: <span class="math inline">\(\frac{\texttt{Min
Steiner Tree}}{\texttt{Min Spanning Tree}}\ge \frac{\sqrt
3}{2}\)</span></p></li>
</ul></li>
<li><p>THM: There is a game where the social cost of the unique NE is
<span class="math inline">\(\Theta(\log k)\)</span> times the social
optimum.</p>
<ul>
<li><p>Remark: this THM indicates that NE can be extremely bad.</p></li>
<li><p>Example:</p>
<p><span class="math inline">\(V=s\cup
T\cup\{t_1,\cdots,t_k\}\)</span></p>
<p><span class="math inline">\(E=\{(s,t_i,\frac{1}{i}):i\in
[k]\}\cup\{(T,t_i,0):i\in [k]\}\cup(s,T,1+\epsilon)\)</span></p>
<p>Social Optimal: All players use <span class="math inline">\(s\to T\to
t_i\)</span>, with <span
class="math inline">\(C_i=\frac{1+\epsilon}{k}\)</span>. Total Cost =
<span class="math inline">\(1+\epsilon\)</span>.</p>
<p>NE: the state above is not a NE; in the unique NE, each player <span
class="math inline">\(i\)</span> uses <span class="math inline">\(s\to
t_i\)</span>, with <span class="math inline">\(C_i=\frac{1}{i}\)</span>.
Total Cost = <span class="math inline">\(H_k=\Theta(\log k)\)</span>.</p></li>
</ul></li>
<li><p>NE Properties</p>
<ol type="1">
<li><p>Existence: a NE can be reached by best response dynamics
(BRD).</p>
<p>If BRD terminates (no cycling), the resulting profile must be a
NE.</p></li>
<li><p>Price of Anarchy: <span
class="math inline">\(PoA=\frac{Cost(\texttt{worst
NE})}{Cost(\texttt{Social Optimal})}\)</span></p>
<p>Price of Stability: <span
class="math inline">\(PoS=\frac{Cost(\texttt{best
NE})}{Cost(\texttt{Social Optimal})}\)</span></p></li>
</ol></li>
<li><p>THM: <span class="math inline">\(PoS\le \mathcal O(\log
k)\)</span></p>
<p>Proof: [potential function method]</p>
<p>Let <span class="math inline">\(H(k)=\sum_{i=1}^k
\frac{1}{i}\)</span>, and <span class="math inline">\(X_e=|\{i\in
[k]:e\in P_i\}|\)</span> <span class="math display">\[
\Phi(P):=\sum_{e\in E}c_e H(X_e)
\]</span></p>
<ul>
<li><p>Key Lemma: Suppose player <span class="math inline">\(i\)</span>
wants to update from <span class="math inline">\(P_i\)</span> to <span
class="math inline">\(P_i'\)</span>, then <span
class="math display">\[
C_i(P_i,P\backslash P_i)-C_i(P_i',P\backslash
P_i)=\Phi(P_i,P\backslash P_i)-\Phi(P_i',P\backslash P_i)
\]</span> Proof: <span class="math display">\[
\begin{aligned}
&\quad C_i(P_i,P\backslash P_i)-C_i(P_i',P\backslash P_i)\\
&=\sum_{e\in P_i}\frac{c_e}{X_e}-\sum_{e\in
P_i'}\frac{c_e}{X_e'}\\
&=\sum_{e\in P_i\backslash P_i'}\frac{c_e}{X_e}-\sum_{e\in
P_i'\backslash P_i}\frac{c_e}{X_e+1}\\
&=\left(\sum_{e\in P_i\backslash P_i'}c_eH(X_e)-\sum_{e\in
P_i\backslash P_i'}c_eH(X_e-1)\right)-\left(\sum_{e\in
P_i'\backslash P_i}c_eH(X_e+1)-\sum_{e\in P_i'\backslash
P_i}c_eH(X_e)\right)\\
&=\left(\sum_{e\in P_i\backslash P_i'}c_eH(X_e)-\sum_{e\in
P_i\backslash P_i'}c_eH(X_e')\right)-\left(\sum_{e\in
P_i'\backslash P_i}c_eH(X_e')-\sum_{e\in P_i'\backslash
P_i}c_eH(X_e)\right)\\
&=\left(\sum_{e\in P_i\backslash P_i'}c_eH(X_e)+\sum_{e\in
P_i'\backslash P_i}c_eH(X_e)\right)-\left(\sum_{e\in
P_i'\backslash P_i}c_eH(X_e')+\sum_{e\in P_i\backslash
P_i'}c_eH(X_e')\right)\\
&=\Phi(P_i,P\backslash P_i)-\Phi(P_i',P\backslash P_i)
\end{aligned}
\]</span> <span class="math inline">\(\square\)</span></p></li>
<li><p>Consider BRD starting from the social optimum: <span
class="math inline">\(\Phi\)</span> decreases monotonically, and the
strategy space is finite, so BRD terminates at a NE <span
class="math inline">\(P_{NE}\)</span>.</p>
<p>Let <span class="math inline">\(P^*\)</span> denote the social
optimal profile. Since <span class="math inline">\(C(P)\le \Phi(P)\le
H_k\cdot C(P)\)</span> for every profile <span
class="math inline">\(P\)</span>, <span class="math display">\[
H_k\cdot C(P^*)\ge \Phi(P^*)\ge\Phi(P_{NE})\ge C(P_{NE})
\]</span> so <span class="math inline">\(PoS\le H_k=\mathcal O(\log
k)\)</span>.</p></li>
</ul></li>
<li><p>Undirected graph case</p>
<p>The worst-case example above no longer holds.</p>
<p>Current best lower bound: <span
class="math inline">\(PoA\ge1.8\)</span></p>
<p>Current best upper bound:</p>
<ul>
<li><span class="math inline">\(PoS\le \mathcal O(\log\log n)\)</span>
for broadcast game (each vertex has <span
class="math inline">\(t_i\)</span>, and <span
class="math inline">\(s_i=s\)</span>) [Fiat et al.]</li>
<li><span class="math inline">\(PoS\le \mathcal O(\log\log\log
n)\)</span> for broadcast game [Lee, Ligett]</li>
<li><span class="math inline">\(PoS\le \mathcal O(1)\)</span> for
broadcast game [Bilò et al. 2020]</li>
<li><span class="math inline">\(PoS\le \mathcal O(\frac{\log n}{\log
\log n})\)</span> for multicast game</li>
</ul></li>
</ol></li>
</ol></li>
</ol>
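<p>The lower-bound example and best response dynamics above can be checked numerically. The following is an illustrative sketch (not part of the lecture): player strategies are reduced to the two relevant paths, "shared" (via the edge <span class="math inline">\((s,T)\)</span>) and "direct" (via <span class="math inline">\((s,t_i)\)</span>), and BRD is run from the social optimum.</p>

```python
from fractions import Fraction

def best_response_dynamics(k, eps=Fraction(1, 100)):
    # Lower-bound instance: c(s,T) = 1 + eps, c(T,t_i) = 0, c(s,t_i) = 1/i.
    # Each player i either shares the edge (s,T) or goes direct via (s,t_i).
    strat = ["shared"] * k          # start at the social optimum
    changed = True
    while changed:
        changed = False
        for i in range(1, k + 1):
            sharers = strat.count("shared")
            if strat[i - 1] == "shared":
                stay, move = (1 + eps) / sharers, Fraction(1, i)
            else:
                stay, move = Fraction(1, i), (1 + eps) / (sharers + 1)
            if move < stay:         # strict improvement: deviate
                strat[i - 1] = "direct" if strat[i - 1] == "shared" else "shared"
                changed = True
    return strat

k = 8
strat = best_response_dynamics(k)
total = sum(Fraction(1, i) for i in range(1, k + 1) if strat[i - 1] == "direct")
if "shared" in strat:
    total += 1 + Fraction(1, 100)
# total equals H_8: the Theta(log k) social cost of the unique NE
```

<p>Starting from the social optimum, players <span class="math inline">\(k, k-1, \cdots\)</span> peel off the shared edge one by one, exactly the cascade described above, and the dynamics end with every player direct at total cost <span class="math inline">\(H_k\)</span>.</p>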
<h2 id="chapter-11-linear-programming">Chapter 11 Linear
Programming</h2>
<h3 id="totally-unimodular-matrix">11.1 Totally Unimodular Matrix</h3>
<ol type="1">
<li><p>Definitions</p>
<ol type="1">
<li>[ <strong><em>Totally Unimodular Matrix, TUM</em></strong> ] Matrix
<span class="math inline">\(A\)</span> is TUM if the determinant of
every square submatrix belongs to <span class="math inline">\(\{0,\pm
1\}\)</span></li>
<li>[ <strong><em>Integral Polytope/Polyhedron</em></strong> ] A
polytope <span class="math inline">\(P\)</span> is integral if every
vertex of <span class="math inline">\(P\)</span> is an integral
vector</li>
</ol></li>
<li><p>Prop: If <span class="math inline">\(A\)</span> is TUM, for every
invertible square submatrix <span class="math inline">\(U\)</span> (i.e.
<span class="math inline">\(\det(U)=\pm 1\)</span>) , <span
class="math inline">\(U^{-1}\)</span> is integral (every entry in <span
class="math inline">\(U^{-1}\)</span> is integral)</p>
<p>Proof: by Cramer's rule, <span
class="math inline">\(U_{ij}^{-1}=\frac{\det(U^*_{ji})}{\det(U)}\)</span>,
where the cofactor <span class="math inline">\(\det(U^*_{ji})\in \{0,\pm
1\}\)</span> (a submatrix determinant, up to sign) and <span
class="math inline">\(\det(U)=\pm 1\)</span>, so every entry is
integral. <span class="math inline">\(\square\)</span></p></li>
<li><p>THM [ Hoffman, Kruskal ]: <span class="math inline">\(A\)</span>
is TUM <span class="math inline">\(\iff\)</span> For any integral vector
<span class="math inline">\(\vec b\)</span>, <span
class="math inline">\(P=\{\vec x:A\vec x\le \vec b\}\)</span> is
integral</p>
<p>Proof: <span class="math inline">\(\Rightarrow\)</span></p>
<p>A vertex is the solution of a linear system <span
class="math inline">\(A'\vec x=\vec b'\)</span> ( <span
class="math inline">\(A'\)</span> consists of a subset <span
class="math inline">\(I\)</span> of the rows of <span
class="math inline">\(A\)</span> and has full rank, <span
class="math inline">\(\vec b'=\vec b_I\)</span> ).</p>
<p>Therefore, <span class="math inline">\(A'\)</span> is invertible,
and <span class="math inline">\(\vec x=A'^{-1}\vec b'\)</span>.
By Prop, <span class="math inline">\(A'^{-1}\)</span> is integral,
so <span class="math inline">\(\vec x\)</span> is integral.</p></li>
<li><p>THM [ Judging TUM ]: <span class="math inline">\(A\in \mathbb
R^{m\times n}\)</span> is TUM <span class="math inline">\(\iff\)</span>
<span class="math inline">\(\forall R\subseteq [m]\)</span>, there
exists a partition <span class="math inline">\(R=R_1\cup R_2\)</span>,
s.t. <span class="math inline">\(\forall j\in[n]\)</span>, <span
class="math inline">\(\sum\limits_{i\in R_1}a_{i,j}-\sum\limits_{i\in
R_2}a_{i,j}\in \{0,\pm 1\}\)</span>.</p>
<p>Proof: OMITTED</p></li>
<li><p>Example 1: Bipartite Matching</p>
<p>Maximize <span class="math inline">\(\sum_{e} x_e\)</span> , s.t.
<span class="math inline">\(\forall v,\sum_{v\in e}x_e\le 1\)</span>,
<span class="math inline">\(\forall e, x_e\ge 0\)</span>.</p>
<p><span class="math inline">\(A\in \mathbb R^{(|V|+|E|)\times
|E|}\)</span>, <span class="math inline">\(A_{v,e}=\mathbf 1(v\in
e)\)</span>, <span class="math inline">\(A_{|V|+e_i,e_j}=\mathbf
1(i=j)\)</span></p></li>
<li><p>Example 2: Consecutive "1" Matrix</p>
<p><span class="math inline">\(A\in \mathbb R^{m\times n}\)</span>,
<span class="math inline">\(\forall i\in [m]\)</span>, <span
class="math inline">\(\exists 1\le l_i\le r_i\le n\)</span>, <span
class="math inline">\(A_{i,j}=\mathbf 1(j\in [l_i,r_i])\)</span></p>
<p>Remark: Interval Scheduling</p></li>
<li><p>Example 3: Network Matrix</p>
<p>Given a directed tree <span class="math inline">\(T(V,E)\)</span>,
<span class="math inline">\(|E|=m\)</span>, and a set of ordered pairs
of vertices <span class="math inline">\(P\subseteq V\times V\)</span>,
<span class="math inline">\(|P|=k\)</span>.</p>
<p>Network Matrix: <span class="math inline">\(M\in \mathbb R^{m\times
k}\)</span> , if <span class="math inline">\(e=(u,v)\)</span>, <span
class="math inline">\(\bar e:=(v,u)\)</span> <span
class="math display">\[
M_{e,(v_1,v_2)}=\begin{cases}1&e\in Path(v_1,v_2)\\
-1&\bar e\in Path(v_1,v_2)\\
0&e,\bar e\notin Path(v_1,v_2)\end{cases}
\]</span> THM [ Tutte ]: A network matrix is TUM</p>
<p>Remark:</p>
<ul>
<li><p>The converse is almost correct: only <span
class="math inline">\(2\)</span> exceptional classes of matrices are TUM
but not network matrices</p>
<p>TUM = network matrices + (2 exceptional matrix classes & operations
combining them) [Seymour's decomposition]</p></li>
</ul></li>
<li><p>Remark: TDI, Matroid <span class="math inline">\(\sim\)</span>
TUM</p></li>
</ol>
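<p>The definition of TUM can be checked directly on small examples, such as the vertex-edge incidence matrix from Example 1 (shown here without the nonnegativity rows). This brute-force sketch enumerates every square submatrix, which is exponential and only sensible for tiny matrices:</p>

```python
from itertools import combinations

def det(m):
    # Laplace expansion along the first row; fine for tiny submatrices.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def is_tum(a):
    # TUM: every square submatrix has determinant in {0, +1, -1}.
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for ri in combinations(range(rows), k):
            for ci in combinations(range(cols), k):
                if det([[a[i][j] for j in ci] for i in ri]) not in (-1, 0, 1):
                    return False
    return True

# Vertex-edge incidence matrix of the bipartite graph K_{2,2}
# (rows: u1, u2, v1, v2; columns: u1v1, u1v2, u2v1, u2v2).
inc = [[1, 1, 0, 0],
       [0, 0, 1, 1],
       [1, 0, 1, 0],
       [0, 1, 0, 1]]
```

<p><code>is_tum(inc)</code> returns True, while the incidence matrix of an odd cycle (e.g. a triangle) fails: its full <span class="math inline">\(3\times 3\)</span> determinant is <span class="math inline">\(-2\)</span>, which is why the bipartite assumption matters in Example 1.</p>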
]]></content>
<categories>
<category>课程笔记</category>
<category>算法设计</category>
</categories>
<tags>
<tag>博弈论-纳什均衡</tag>
<tag>博弈论-纳什均衡-PoA</tag>
<tag>博弈论-纳什均衡-PoS</tag>
<tag>算法-斯坦纳树</tag>
<tag>博弈论-Shapley网络设计游戏</tag>
<tag>算法-势能分析</tag>
<tag>线性规划-TUM</tag>
<tag>线性规划-整数线性规划</tag>
<tag>线性规划-网络矩阵</tag>
</tags>
</entry>
<entry>
<title>Algorithm Design 13</title>
<url>/2023/12/29/Algorithm-Design-13/</url>
<content><![CDATA[<h2 id="chapter-9-randomized-algorithm">Chapter 9 Randomized
Algorithm</h2>
<h3 id="approximate-counting">9.4 Approximate Counting</h3>
<ol type="1">
<li><p>Strategy : Rejection Sampling</p>
<p>Goal : estimate <span class="math inline">\(|G|\)</span></p>
<p>Method : take <span class="math inline">\(U\)</span> s.t. <span
class="math inline">\(G\subseteq U\)</span></p>
<ul>
<li>We can take uniform samples from <span
class="math inline">\(U\)</span> efficiently</li>
<li>There is a membership oracle <span class="math inline">\(\mathcal
O\)</span> for <span class="math inline">\(G\)</span> : <span
class="math inline">\(\forall x\in U,\mathcal O(x)=\mathbf 1(x\in
G)\)</span></li>
<li>We can easily compute/estimate <span
class="math inline">\(|U|\)</span></li>
</ul>
<p>Theorem : Let <span
class="math inline">\(\alpha=\frac{|G|}{|U|}\)</span>, so we need <span
class="math inline">\(N\ge
c\frac{1}{\epsilon^2\alpha}\log(\frac{1}{\delta})\)</span> samples, s.t.
<span class="math display">\[
\Pr\left\{\frac{\text{\# samples in }G}{N}\cdot |U|\in (1\pm
\epsilon)|G|\right\}\ge 1-\delta
\]</span> Proof : Chernoff Bound</p></li>
<li><p>Counting #DNF solutions</p>
<ol type="1">
<li><p>Description</p>
<p>DNF : <span class="math inline">\(\lor_{i=1}^m C_i\)</span>, <span
class="math inline">\(C_i=a_{i,1}\land \cdots\land a_{i,l_i}\)</span> ,
<span class="math inline">\(a_{i,j}\in \{x_t,\bar x_t:1\le t\le
n\}\)</span></p>
<p>Goal : Estimate the number of satisfying assignments ( out of <span
class="math inline">\(2^n\)</span> possibilities )</p>
<p>Exact counting is #P-hard</p></li>
<li><p>Karp-Luby-Madras Method</p>
<p>If the DNF has a clause of size at most <span
class="math inline">\(3\)</span>:</p>
<p><span class="math inline">\(U\)</span> : all <span
class="math inline">\(2^n\)</span> assignments, <span
class="math inline">\(G\)</span> : satisfying assignments</p>
<p><span class="math inline">\(|G|\ge \frac{|U|}{8}\)</span>, so <span
class="math inline">\(\alpha\ge \frac{1}{8}\)</span>, and we only need
<span class="math inline">\(\mathcal O(\frac{1}{\epsilon^2}\log
\frac{1}{\delta})\)</span> samples</p>
<p>For general case:</p>
<p>Consider a <span class="math inline">\(2^n\times
m\)</span> table: <span class="math inline">\(t_{i,j}=*\)</span> iff
assignment <span class="math inline">\(i\)</span> can satisfy clause
<span class="math inline">\(j\)</span>. <span
class="math inline">\(t_{i,j}=\otimes\)</span> if clause <span
class="math inline">\(j\)</span> is the first clause satisfied by
assignment <span class="math inline">\(i\)</span></p>
<p><span class="math inline">\(U\)</span> : All <span
class="math inline">\(*\)</span> , <span
class="math inline">\(G\)</span> : All <span
class="math inline">\(\otimes\)</span></p>
<ol type="1">
<li><p>How to sample from <span class="math inline">\(U\)</span> :</p>
<p><span class="math inline">\(|U|=\sum_{i=1}^m 2^{n-l_i}\)</span> .</p>
<ul>
<li>First sample column <span class="math inline">\(i\)</span> w.p.
<span class="math inline">\(\frac{2^{n-l_i}}{|U|}\)</span> ( <span
class="math inline">\(\mathcal O(m)\)</span> per sample naively, <span
class="math inline">\(\mathcal O(1)\)</span> with preprocessing )</li>
<li>Then, from column <span class="math inline">\(i\)</span>, choose
<span class="math inline">\(*\)</span></li>
</ul></li>
<li><p>Check whether <span class="math inline">\(*\in G\)</span> :
easily in <span class="math inline">\(\mathcal O(m)\)</span> , check all
previous clauses</p></li>
<li><p><span class="math inline">\(\alpha=\frac{|G|}{|U|}\ge
\frac{1}{m}\)</span> : each satisfying assignment contributes exactly
one <span class="math inline">\(\otimes\)</span> and at most <span
class="math inline">\(m\)</span> <span
class="math inline">\(*\)</span>'s</p></li>
</ol></li>
</ol></li>
</ol>
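<p>The KLM method above translates into a short Monte-Carlo routine. The sketch below is an illustration (the clause representation and parameter names are my own): a clause is stored as a partial assignment, a column is sampled proportionally to <span class="math inline">\(2^{n-l_i}\)</span>, and a cell counts toward <span class="math inline">\(G\)</span> only if its clause is the first one satisfied.</p>

```python
import random

def klm_count(clauses, n, samples=200000, seed=0):
    # Karp-Luby-Madras estimator for the number of satisfying assignments
    # of a DNF formula over n variables. A clause is a dict {var: bool}
    # listing the literals it contains (a partial assignment).
    rng = random.Random(seed)
    sizes = [2 ** (n - len(c)) for c in clauses]   # column i holds 2^(n-l_i) cells
    u = sum(sizes)                                 # |U| = total number of '*' cells
    hits = 0
    for _ in range(samples):
        # Sample a column i w.p. 2^(n-l_i)/|U|, then a uniform assignment
        # satisfying clause i (free variables set uniformly at random).
        i = rng.choices(range(len(clauses)), weights=sizes)[0]
        assign = {v: rng.random() < 0.5 for v in range(n)}
        assign.update(clauses[i])
        # The cell is in G iff clause i is the FIRST clause this
        # assignment satisfies: check all earlier clauses.
        if not any(all(assign[v] == val for v, val in clauses[j].items())
                   for j in range(i)):
            hits += 1
    return hits / samples * u                      # |G| ~ (hit fraction) * |U|
```

<p>For example, on <span class="math inline">\(x_0\lor x_1\)</span> over <span class="math inline">\(3\)</span> variables (exactly <span class="math inline">\(6\)</span> of <span class="math inline">\(8\)</span> assignments satisfy it), the estimate concentrates around <span class="math inline">\(6\)</span>, with <span class="math inline">\(\alpha=6/8\)</span> well above the <span class="math inline">\(1/m\)</span> guarantee.</p>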
<h3 id="streaming-algorithms">9.5 Streaming Algorithms</h3>
<ol type="1">
<li><p>Definition</p>
<p>Input arrives in stream fashion, length of stream is <span
class="math inline">\(n\)</span>.</p>
<p>Goal: estimate some statistics of the stream.</p>
<p>Setting: Space <span class="math inline">\(\ll n\)</span> . Typical
size : <span class="math inline">\(\mathcal O(1),poly(\log n),\mathcal
O(\sqrt n)\)</span> ( in any case sublinear )</p>
<p>Usually, we cannot give a precise solution due to space
constraint.</p>
<p><span class="math inline">\((\epsilon,\delta)\)</span>-solution:
<span class="math inline">\(\Pr\{\mathtt{SOL}\in (1\pm \epsilon)\texttt{
True Value}\}\ge 1-\delta\)</span>.</p></li>
<li><p>Counting the distinct elements ( <span
class="math inline">\(F_0\)</span>-estimation )</p>
<ol type="1">
<li><p>Description</p>
<p><span class="math inline">\(n\)</span> elements, each element <span
class="math inline">\(\in [m]\)</span> , <span
class="math inline">\(n,m\)</span> are large</p>
<p>Goal: output the number of distinct elements</p></li>
<li><p>Used in practice: HyperLogLog (e.g. at Google)</p></li>
<li><p>A simpler (idealized) algorithm with Hash function</p>
<ol type="1">
<li><p>an idealized random hash function <span class="math inline">\(h:[m]\to
[0,1]\)</span> , fixed throughout the stream</p></li>
<li><p>Take representative: maintain <span
class="math inline">\(k=\mathcal O(\frac{1}{\epsilon^2})\)</span>-th
smallest <span class="math inline">\(h\)</span> value ( suppose its
value is <span class="math inline">\(t\)</span> )</p>
<blockquote>
<p>Not taking the smallest value: variance is large</p>
</blockquote></li>
<li><p>Estimation: <span
class="math inline">\(\frac{k}{t}\)</span></p></li>
</ol></li>
<li><p>Can achieve <span
class="math inline">\((\epsilon,\delta)\)</span>-approximation (Proof:
Chernoff Bound)</p></li>
<li><p>Real Implementation: <span class="math inline">\(h:[m]\to
[m^2]\)</span>, only store smallest <span
class="math inline">\(k\)</span> values</p></li>
</ol></li>
</ol>
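<p>The idealized algorithm above can be sketched as follows (an illustration with assumed parameter names): maintain the <span class="math inline">\(k\)</span> smallest hash values, and if the <span class="math inline">\(k\)</span>-th smallest is <span class="math inline">\(t\)</span>, return <span class="math inline">\(k/t\)</span> as the estimate. The idealized hash <span class="math inline">\(h:[m]\to[0,1)\)</span> is simulated by memoizing one random value per element.</p>

```python
import random

def kmv_distinct(stream, k=256, seed=1):
    # k-minimum-values estimator for the number of distinct elements:
    # repeats of an element reuse its hash, so they cannot inflate the count.
    rng = random.Random(seed)
    h = {}
    smallest = []                      # k smallest distinct hash values, sorted
    for x in stream:
        if x not in h:
            h[x] = rng.random()        # idealized hash: uniform in [0, 1)
        v = h[x]
        if v in smallest:
            continue                   # this element is already accounted for
        if len(smallest) < k:
            smallest.append(v)
            smallest.sort()
        elif v < smallest[-1]:
            smallest[-1] = v           # replace the current maximum
            smallest.sort()
    if len(smallest) < k:
        return len(smallest)           # fewer than k distinct: exact count
    return k / smallest[-1]            # estimate k/t
```

<p>With <span class="math inline">\(k=\mathcal O(\frac{1}{\epsilon^2})\)</span> the relative error is about <span class="math inline">\(1/\sqrt k\)</span>, which is why keeping only the single smallest value (large variance) is avoided.</p>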
<h2 id="chapter-10-local-search">Chapter 10 Local Search</h2>
<h3 id="metropolis-algorithm">10.1 Metropolis Algorithm</h3>
<ol type="1">
<li><p>From statistical mechanics</p>
<p><span class="math inline">\(\Pr\{\text{a state with energy }E\}\sim
\exp(-\frac{E}{kT})\)</span></p>
<p>[Figure: Boltzmann distribution at different temperatures. Purple:
small <span class="math inline">\(T\)</span>, red: large <span
class="math inline">\(T\)</span>]</p></li>
<li><p>Markov Chain</p>
<p>Directed weighted graph <span class="math inline">\(G\)</span>, <span
class="math inline">\(\forall u\in V,\sum_{(u,v)\in
E}p(u,v)=1\)</span></p>
<p>Transition matrix: <span
class="math inline">\(P_{i,j}=p(i,j)=P(j|i)\)</span></p>
<p><span class="math inline">\((P^k)_{i,j}\)</span>: probability that
start from <span class="math inline">\(i\)</span> and end at <span
class="math inline">\(j\)</span> with exactly <span
class="math inline">\(k\)</span> steps</p>
<p>Stationary distribution <span class="math inline">\(\bar x\)</span>:
<span class="math inline">\(P^T\bar x=\bar x\)</span></p>
<p>Finite space: must have stationary distribution</p></li>
<li><p>Count # perfect matching in bipartite graph</p>
<p><span class="math inline">\(\iff\)</span> Sampling problem <span
class="math inline">\(\iff\)</span> Markov Chain ( matching <span
class="math inline">\(\to\)</span> matching )</p>
<p>Design MC s.t. stationary distribution of MC is uniformly distributed
over all states</p>
<p>Starting from one state, stop after <span class="math inline">\(k\)</span>
steps and treat the final state as a sample from the stationary
distribution</p>
<p>Under certain condition (Ergodic), a random walk will converge to a
<strong>unique</strong> stationary distribution</p>
<p><span class="math inline">\(\forall
x_0,\lim\limits_{k\to\infty}(P^T)^k x_0=\bar x\)</span></p>
<p>Mixing time: the maximum over all starting points <span
class="math inline">\(x_0\)</span> of the time to reach <span
class="math inline">\(\bar x\)</span></p></li>
<li><p>Metropolis Algorithm</p>
<p><span class="math inline">\(\mathcal C\)</span>: set of all states,
<span class="math inline">\(S\)</span>: current state, <span
class="math inline">\(S'\)</span>: a uniformly chosen neighbor of
<span class="math inline">\(S\)</span>.</p>
<p>If <span class="math inline">\(E(S')<E(S)\)</span>, <span
class="math inline">\(S\gets S'\)</span>. Otherwise, <span
class="math inline">\(S\gets S'\)</span> with probability <span
class="math inline">\(\exp(-\frac{E(S')-E(S)}{kT})\)</span>.</p>
<p>Theorem: Let <span class="math inline">\(Z=\sum_{S\in \mathcal
C}\exp(-\frac{E(S)}{kT})\)</span>, then the stationary distribution
<span class="math inline">\(\pi\)</span> satisfies <span
class="math display">\[
\pi(S)=\frac{1}{Z}\exp\left(-\frac{E(S)}{kT}\right)
\]</span> Application:</p>
<ul>
<li>Sampling: Bayesian Inference/ Graphical Models</li>
<li>Optimization: minimize <span class="math inline">\(f(x)\)</span>:
Let <span class="math inline">\(E(S)=f(S)\)</span> <span
class="math inline">\(\to\)</span> Simulated Annealing</li>
</ul></li>
<li><p>Simulated Annealing</p>
<p>Goal: minimize <span class="math inline">\(f\)</span></p>
<ul>
<li>High temperature: close to uniform; the Markov Chain converges
faster for larger <span class="math inline">\(T\)</span> (since the
chain has good conductance)</li>
<li>Low temperature: many local minima, hard to sample (converges
slower)</li>
</ul>
<p>Cooling scheduling <span class="math inline">\(T=T(i)\)</span>: <span
class="math inline">\(i=1,2,\cdots\)</span></p>
<ul>
<li><span class="math inline">\(x_i\gets\)</span> run the Metropolis
Algorithm at temperature <span class="math inline">\(T(i)\)</span> with
initial state <span class="math inline">\(x_{i-1}\)</span></li>
</ul></li>
<li><p>Mixing Time</p>
<p>Total variation distance: <span class="math inline">\(p,q\)</span>
are two distributions, <span class="math display">\[
d_{TV}(p,q)=\sum_{i}|p_i-q_i|=\left\|\vec p-\vec q\right\|_1
\]</span> Let <span class="math inline">\(p_x^k=(P^T)^kx\)</span>, <span
class="math inline">\(\pi\)</span> be the stationary distribution.
Define mixing time as <span class="math display">\[
\tau_x(\epsilon)=\min\{k:d_{TV}(p_x^k,\pi)\le \epsilon\}
\]</span> Mixing time <span class="math inline">\(\sim\)</span>
conductance</p>
<ul>
<li>MinCut very small <span class="math inline">\(\to\)</span> mixing
time large</li>
<li>Good conductance <span class="math inline">\(\to\)</span> mixing time
small</li>
</ul>
<p>Theorem: Let <span class="math inline">\(P\)</span> be the transition
matrix of the Markov Chain, and suppose <span
class="math inline">\(P\)</span> is symmetric. Let <span
class="math inline">\(\lambda_1\ge \lambda_2\ge \cdots\ge
\lambda_N\)</span> be the eigenvalues of <span
class="math inline">\(P\)</span>; it is known that <span
class="math inline">\(\lambda_1=1\)</span> with <span
class="math inline">\(v_1=(\frac{1}{\sqrt N},\cdots,\frac{1}{\sqrt
N})\)</span>. Then, letting <span
class="math inline">\(\lambda_{\max}=\max\{|\lambda_2|,|\lambda_N|\}\)</span>,
<span class="math display">\[
\tau(\epsilon)\le \mathcal O\left(\frac{\log N+\log
\frac{1}{\epsilon}}{1-\lambda_{\max}}\right)
\]</span> i.e. as <span class="math inline">\(\lambda_2\to 1\)</span>,
the mixing time becomes large</p></li>
</ol>
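<p>The Metropolis acceptance rule and the cooling schedule fit together in a few lines. The sketch below is a toy illustration (the instance, temperatures, step counts, and the absorption of the constant <span class="math inline">\(k\)</span> into <span class="math inline">\(T\)</span> are my own choices): minimize <span class="math inline">\(f(x)=(x-7)^2\)</span> over the integers in <span class="math inline">\([0,100]\)</span>.</p>

```python
import math, random

def metropolis(energy, neighbors, s0, t, steps, seed=0):
    # One Metropolis chain at fixed temperature t (Boltzmann constant
    # absorbed into t): downhill moves are always accepted, uphill moves
    # with probability exp(-dE / t).
    rng = random.Random(seed)
    s = s0
    for _ in range(steps):
        s2 = rng.choice(neighbors(s))
        d = energy(s2) - energy(s)
        if d < 0 or rng.random() < math.exp(-d / t):
            s = s2
    return s

# Toy instance: minimize f(x) = (x - 7)^2 over the integers in [0, 100],
# with neighbors x - 1 and x + 1.
f = lambda x: (x - 7) ** 2
nbrs = lambda x: [y for y in (x - 1, x + 1) if 0 <= y <= 100]

# Cooling schedule T(i): run the chain at decreasing temperatures,
# feeding each stage's final state into the next (simulated annealing).
x = 50
for t in (100.0, 10.0, 1.0, 0.1):
    x = metropolis(f, nbrs, x, t, steps=2000, seed=int(t * 10))
```

<p>At high temperature the walk is nearly free; as <span class="math inline">\(T\)</span> drops, the Boltzmann weights concentrate on low-energy states and the final stage pins the chain near the minimum.</p>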
]]></content>
<categories>
<category>课程笔记</category>
<category>算法设计</category>
</categories>
<tags>
<tag>算法-随机算法</tag>
<tag>算法-随机算法-近似计数问题</tag>
<tag>算法-随机算法-采样与计数</tag>
<tag>算法-随机算法-近似计数问题-KLM方法</tag>
<tag>算法-流算法-不同元素个数</tag>
<tag>算法-局部搜索-Metropolis算法</tag>
<tag>机器学习-马尔科夫链</tag>
<tag>算法-局部搜索-模拟退火</tag>
</tags>
</entry>
<entry>
<title>Algorithm Design 12</title>
<url>/2023/12/29/Algorithm-Design-12/</url>
<content><![CDATA[<h2 id="chapter-9-randomized-algorithm">Chapter 9 Randomized
Algorithm</h2>
<h3 id="hashing">9.3 Hashing</h3>
<ol type="1">
<li><p>universal hashing function</p>
<p>Intuition : we want the function to look "random"</p>
<ol type="1">
<li><p>Def [ <strong><em>universal hashing function</em></strong> ] A
family <span class="math inline">\(\mathcal H\)</span> of functions
<span class="math inline">\(h:U\to\{0,\cdots,n-1\}\)</span> is
<strong><em>universal</em></strong> if <span
class="math inline">\(\forall u,v\in U,u\neq v\)</span> , <span
class="math display">\[
\Pr\{h\gets \mathcal H:h(u)=h(v)\}\le \frac{1}{n}
\]</span></p></li>
<li><p>Design : Let <span class="math inline">\(p\approx n\)</span> be a
prime number ( <span class="math inline">\(n\)</span> is the size of
hash table , for simplicity let <span class="math inline">\(p=n\)</span>
)</p>
<p>For all <span class="math inline">\(x\in U\)</span> , write <span
class="math inline">\(x\)</span> in base-<span
class="math inline">\(p\)</span> : <span
class="math inline">\(x=\sum_{i=0}^{r-1} x_ip^i\)</span> , where <span
class="math inline">\(x_i\in \{0,\cdots,p-1\}\)</span> . <span
class="math display">\[
\mathcal H:=\left\{h_{\vec
a}(x)=\left(\sum_{i=0}^{r-1}a_ix_i\right)\bmod p:\vec
a=(a_0,\cdots,a_{r-1}),a_i\in \{0,\cdots,p-1\}\right\}
\]</span> Goal : prove <span class="math inline">\(\forall x,y\in
U,x\neq y\)</span> , <span class="math display">\[
\Pr\{a_i\gets \{0,\cdots,p-1\}:h_{\vec a}(x)=h_{\vec a}(y)\}\le
\frac{1}{n}
\]</span> Proof :</p>
<p>Suppose <span class="math inline">\(x=\sum_{i=0}^{r-1}x_ip^i\)</span>
, <span class="math inline">\(y=\sum_{i=0}^{r-1}y_ip^i\)</span> , so
there exists <span class="math inline">\(j\)</span> such that <span
class="math inline">\(x_j\neq y_j\)</span> . <span
class="math display">\[
\begin{aligned}
&\quad \Pr\left\{h_{\vec a}(x)=h_{\vec a}(y)\right\}\\
&=\Pr\left\{\sum_{i=0}^{r-1}a_ix_i\equiv \sum_{i=0}^{r-1}a_iy_i\pmod
p\right\}\\
&=\Pr\left\{a_j(x_j-y_j)\equiv \sum_{i=0,i\neq
j}^{r-1}a_i(y_i-x_i)\pmod p\right\}\\
&=\frac{1}{p}
\end{aligned}
\]</span> This is because <span class="math inline">\(a_j\gets
\{0,\cdots,p-1\}\)</span> , and for all possible <span
class="math inline">\(\sum_{i=0,i\neq j}^{r-1}a_i(y_i-x_i)\)</span> ,
there exists exactly one <span class="math inline">\(a_j\)</span> that
<span class="math inline">\(a_j(x_j-y_j)\equiv \sum_{i=0,i\neq
j}^{r-1}a_i(y_i-x_i)\pmod p\)</span></p></li>
</ol></li>
<li><p>Perfect Hashing</p>
<ol type="1">
<li><p>Def : static dictionary with no collision , <span
class="math inline">\(|U|=n\)</span> , space <span
class="math inline">\(\mathcal O(n)\)</span> , query time <span
class="math inline">\(\mathcal O(1)\)</span> ( with <span
class="math inline">\(h(x)\)</span> oracle , performing <span
class="math inline">\(h(x)\)</span> costs <span
class="math inline">\(\mathcal O(1)\)</span> time )</p>
<p>static : only consider lookup operation , no insertion/deletion
.</p></li>
<li><p>FKS Perfect Hashing [1984]</p>
<p>Intuition : <span class="math inline">\(2\)</span>-level data
structure</p>
<ul>
<li>Firstly sample <span class="math inline">\(h\gets\mathcal H\)</span>
, get a hash table , space <span class="math inline">\(\mathcal
O(n)\)</span></li>
<li>Secondly , if index <span class="math inline">\(i\)</span>
encounters collision (suppose the collision set is <span
class="math inline">\(B_i=\{u\in U:h(u)=i\}\)</span> ), then sample
<span class="math inline">\(h_i\gets \mathcal H'\)</span> and get a
second-level hash table , space <span class="math inline">\(\mathcal
O(|B_i|^2)\)</span> . If collision still happens , then resample <span
class="math inline">\(h_i\)</span> , until no collision happens .</li>
</ul></li>
<li><p>Collision Analysis</p>
<p>Claim : Let <span class="math inline">\(\mathcal H\)</span> be
universal and <span class="math inline">\(h\gets \mathcal H\)</span> ,
<span class="math inline">\(h:[n]\to [m]\)</span> . If <span
class="math inline">\(m=\mathcal O(n^2)\)</span> , then with probability
<span class="math inline">\(\ge 0.9\)</span> there is no collision .</p>
<p>Proof : <span class="math inline">\(U=\{x_1,\cdots,x_n\}\)</span> ,
define random variable <span
class="math inline">\(X_{i,j}=\begin{cases}1&h(x_i)=h(x_j)\\0&otherwise\end{cases}\)</span>
, define <span class="math inline">\(X=\sum_{1\le i<j\le n}
X_{i,j}\)</span> <span class="math display">\[
E[X]=\sum_{i<j}E[X_{i,j}]=\sum_{i<j}\Pr\{h(x_i)=h(x_j)\}\le
\frac{\binom{n}{2}}{m}
\]</span> By Markov's Inequality , <span class="math display">\[
\Pr\{X\ge 1\}\le E[X]\le \frac{n(n-1)}{2m}\le \frac{n^2}{2m}
\]</span> Therefore , let <span class="math inline">\(m=5n^2\)</span> ,
so <span class="math inline">\(\Pr\{X\ge 1\}\le \frac{1}{10}\)</span> .
<span class="math inline">\(\square\)</span></p></li>
<li><p>Space Analysis</p>
<p>Claim : <span class="math inline">\(E[\sum |B_i|^2]=\mathcal
O(n)\)</span></p>
<p>Proof : Let <span class="math inline">\(Y_i=|B_i|\)</span> , <span
class="math inline">\(Y=\sum |B_i|^2\)</span> , so in the first level ,
<span class="math display">\[
X=\sum \binom{Y_i}{2}=\frac{1}{2}\left(\sum Y_i^2-\sum
Y_i\right)=\frac{Y-n}{2}
\]</span> Similarly , <span class="math inline">\(E[X]\le
\frac{\binom{n}{2}}{n}=\frac{n-1}{2}\)</span> , so <span
class="math inline">\(E[Y]=E[2X+n]\le 2n-1=\mathcal O(n)\)</span>
.</p></li>
</ol></li>
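<p>The collision claim above is easy to check empirically. A minimal Python sketch ( not from the notes ) : it assumes the standard Carter–Wegman universal family <span class="math inline">\(h(x)=((ax+b)\bmod p)\bmod M\)</span> and measures how often <span class="math inline">\(n\)</span> keys collide in a table of size <span class="math inline">\(M=5n^2\)</span> ; by the analysis this should happen with probability at most <span class="math inline">\(\frac{1}{10}\)</span> .</p>

```python
import random

def make_universal_hash(M, p=(1 << 61) - 1):
    # Sample h from the Carter-Wegman family: h(x) = ((a*x + b) mod p) mod M.
    # (A standard universal family, used here as an assumption; the notes do
    # not fix a concrete family.)
    a = random.randrange(1, p)
    b = random.randrange(0, p)
    return lambda x: ((a * x + b) % p) % M

def has_collision(keys, h):
    seen = set()
    for x in keys:
        i = h(x)
        if i in seen:
            return True
        seen.add(i)
    return False

random.seed(0)
n = 200
keys = random.sample(range(10**9), n)
M = 5 * n * n          # table size M = 5n^2, as in the claim
trials = 200
fails = sum(has_collision(keys, make_universal_hash(M)) for _ in range(trials))
print(fails / trials)  # empirical collision rate; the bound says <= 1/10
```

<p>The same <code>make_universal_hash</code> can serve both levels of the FKS construction : a first-level table of size <span class="math inline">\(n\)</span> , then a collision-free second-level table of size <span class="math inline">\(|B_i|^2\)</span> per bucket .</p>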
<li><p>Balls-Bins Game</p>
<ol type="1">
<li><p><span class="math inline">\(n\)</span> bins , <span
class="math inline">\(n\)</span> balls , <span
class="math inline">\(E[\text{max load}]=\Theta(\frac{\log n}{\log\log
n})\)</span></p>
<p><span class="math inline">\(X_i=\#\text{ balls in bin }i\)</span> ,
<span class="math inline">\(Y_{i,j}=\begin{cases}1&\text{ball }j\to
\text{bin }i\\0&otherwise\end{cases}\)</span> , so <span
class="math inline">\(X_i=\sum Y_{i,j}\)</span></p>
<p>Let <span class="math inline">\(c=\Theta(\frac{\log n}{\log\log
n})\)</span> and write <span class="math inline">\(c=1+\delta\)</span> .
Since <span class="math inline">\(E[X_i]=1\)</span> , the Chernoff bound
gives <span class="math display">\[
\Pr\{X_i\ge c\}\le \frac{e^{c-1}}{c^c}\le \frac{1}{n^2}
\]</span> A union bound over the <span class="math inline">\(n\)</span>
bins then shows the maximum load is <span
class="math inline">\(\mathcal O(\frac{\log n}{\log\log n})\)</span>
w.h.p.</p></li>
<li><p>Coupon Collector Problem</p>
<p><span class="math inline">\(n\ln n\)</span> balls , <span
class="math inline">\(n\)</span> bins , w.h.p. every bin is
non-empty</p></li>
<li><p>Load Balancing Problem</p>
<p><span class="math inline">\(kn\log n\)</span> balls , <span
class="math inline">\(n\)</span> bins ; w.h.p. the load of every bin
lies in <span class="math inline">\(((k-1)\log n,(k+1)\log n)\)</span> (
for <span class="math inline">\(k\)</span> not too small )</p>
<p>Application : load balancing in network</p></li>
<li><p>Power of two choices</p>
<p><span class="math inline">\(n\)</span> bins , <span
class="math inline">\(n\)</span> balls</p>
<p>For every ball , choose <span class="math inline">\(2\)</span> random
bins , and place the ball in the bin with smaller load .</p>
<p><span class="math inline">\(E[\text{max load}]=\Theta(\log \log
n)\)</span></p></li>
</ol></li>
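<p>The gap between one random choice and two choices shows up clearly in simulation. A short Python sketch ( the parameters are illustrative assumptions , not from the notes ) :</p>

```python
import random

def max_load_one_choice(n, rng):
    # Throw n balls into n bins uniformly at random; return the maximum load.
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return max(loads)

def max_load_two_choices(n, rng):
    # For each ball, sample two bins and place it in the less loaded one.
    loads = [0] * n
    for _ in range(n):
        i, j = rng.randrange(n), rng.randrange(n)
        loads[i if loads[i] <= loads[j] else j] += 1
    return max(loads)

n = 100_000
print(max_load_one_choice(n, random.Random(42)))   # Theta(log n / log log n)
print(max_load_two_choices(n, random.Random(42)))  # Theta(log log n), noticeably smaller
```

<p>For <span class="math inline">\(n=10^5\)</span> the first value is typically around twice the second , matching the exponential improvement from <span class="math inline">\(\frac{\log n}{\log\log n}\)</span> to <span class="math inline">\(\log\log n\)</span> .</p>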
<li><p>(*) Cuckoo Hashing</p>
<ol type="1">
<li><p>worst case : <span class="math inline">\(\mathcal O(1)\)</span>
lookup , dynamic dictionary</p></li>
<li><p>Maintain <span class="math inline">\(2\)</span> hash tables ,
<span class="math inline">\(h_1,h_2\gets \mathcal H\)</span></p>
<p><code>Insert(x)</code></p>
<ol type="1">
<li><p>If <span class="math inline">\(h_1(x)\)</span> in <span
class="math inline">\(T_1\)</span> is empty , insert <span
class="math inline">\(x\)</span> into <span
class="math inline">\(T_1\)</span></p></li>
<li><p>If <span class="math inline">\(T_1(h_1(x))\)</span> contains
<span class="math inline">\(x'\)</span> , then insert <span
class="math inline">\(x'\)</span> into <span
class="math inline">\(T_2\)</span> by <span
class="math inline">\(h_2(x')\)</span> , and insert <span
class="math inline">\(x\)</span> into <span
class="math inline">\(T_1\)</span> by <span
class="math inline">\(h_1(x)\)</span></p>
<p>Repeat this displacement procedure until an empty slot is found
.</p></li>
<li><p>If #iterations <span class="math inline">\(\ge\)</span> threshold
<span class="math inline">\(t\)</span> , rehash all elements</p></li>
</ol>
<p>If load <span class="math inline">\(\le 50\%\)</span> , <span
class="math inline">\(E[\texttt{insertion time}]=\mathcal
O(1)\)</span></p>
<p><code>Lookup(x)</code> : check at most <span
class="math inline">\(2\)</span> positions , one in each table .</p>
<p><code>Delete(x)</code> : Use lookup then delete .</p></li>
</ol></li>
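<p>A minimal Python sketch of the <code>Insert</code> / <code>Lookup</code> / <code>Delete</code> procedures above . The table sizes , the threshold , and the <span class="math inline">\(ax+b\)</span>-style hash parameters are illustrative assumptions , not part of the notes ; exceeding the threshold triggers a full rehash into tables twice as large , which keeps the load below <span class="math inline">\(50\%\)</span> .</p>

```python
import random

class CuckooHash:
    """Sketch of cuckoo hashing with two tables T1, T2 (illustrative parameters)."""
    P = (1 << 31) - 1  # a Mersenne prime for the (assumed) ax+b hash family

    def __init__(self, size=16, threshold=32):
        self.size, self.threshold = size, threshold
        self._pick_hashes()
        self.t1 = [None] * size
        self.t2 = [None] * size

    def _pick_hashes(self):
        self.params = [(random.randrange(1, self.P), random.randrange(self.P))
                       for _ in range(2)]

    def _h(self, which, x):
        a, b = self.params[which]
        return ((a * hash(x) + b) % self.P) % self.size

    def lookup(self, x):
        # Worst case O(1): x can only live at h1(x) in T1 or h2(x) in T2.
        return self.t1[self._h(0, x)] == x or self.t2[self._h(1, x)] == x

    def _try_insert(self, x):
        # Kick-out loop; returns the still-homeless element, or None on success.
        for _ in range(self.threshold):
            i = self._h(0, x)
            x, self.t1[i] = self.t1[i], x   # place x in T1, evict old occupant
            if x is None:
                return None
            j = self._h(1, x)
            x, self.t2[j] = self.t2[j], x   # place evictee in T2, evict again
            if x is None:
                return None
        return x

    def insert(self, x):
        if self.lookup(x):
            return
        left = self._try_insert(x)
        if left is not None:                # threshold exceeded: rehash everything
            self._rehash(left)

    def _rehash(self, pending):
        items = [y for y in self.t1 + self.t2 if y is not None] + [pending]
        self.size *= 2                      # grow to keep the load below 50%
        while True:                         # resample hashes until all items fit
            self._pick_hashes()
            self.t1 = [None] * self.size
            self.t2 = [None] * self.size
            if all(self._try_insert(y) is None for y in items):
                return

    def delete(self, x):
        # Lookup, then clear the slot.
        i = self._h(0, x)
        if self.t1[i] == x:
            self.t1[i] = None
        j = self._h(1, x)
        if self.t2[j] == x:
            self.t2[j] = None
```

<p>Note that <code>lookup</code> probes exactly the two possible positions , which is the worst-case <span class="math inline">\(\mathcal O(1)\)</span> guarantee that distinguishes cuckoo hashing from chaining .</p>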
<li><p>Application : Closest Pair Problem</p>
<p>using D&C , <span class="math inline">\(\mathcal O(n\log
n)\)</span></p>
<p>using Hashing , <span class="math inline">\(\mathcal O(n)\)</span> in
expectation</p>
<ol type="1">
<li><p>Hash table computation model :</p>
<ul>
<li>Insertion : <span class="math inline">\(\mathcal O(1)\)</span> in
expectation</li>
<li>Deletion : <span class="math inline">\(\mathcal O(1)\)</span> in
expectation</li>
<li>Lookup : <span class="math inline">\(\mathcal O(1)\)</span> in
expectation</li>
</ul></li>
<li><p>Algorithm :</p>
<ol type="1">
<li><p>Order points randomly <span
class="math inline">\(P_1,\cdots,P_n\)</span> , let <span
class="math inline">\(\delta=d(P_1,P_2)\)</span></p></li>
<li><p>When we process <span class="math inline">\(P_i\)</span> , we
will find whether <span class="math inline">\(\exists j<i\)</span>
s.t. <span class="math inline">\(d(P_j,P_i)<\delta\)</span> .</p>
<p>If YES , we start a new phase with new <span
class="math inline">\(\delta'=d(P_j,P_i)\)</span></p></li>
<li><p>In each phase , maintain a grid with edge length <span
class="math inline">\(\frac{\delta}{2}\)</span> , each cell contains
<span class="math inline">\(\le 1\)</span> point in <span
class="math inline">\(\{P_1,\cdots,P_{i-1}\}\)</span></p>
<p>maintain all nonempty cells in a dictionary ( <span
class="math inline">\(U\)</span> : all cells )</p>
<p>Suppose we are processing <span class="math inline">\(P_i\)</span> ,
we only need to look up <span class="math inline">\(25\)</span> cells in
the dictionary to find smaller distance .</p>
<p>Whenever we start a new phase , we reconstruct the whole dictionary
.</p></li>
</ol></li>
<li><p>Analysis :</p>
<p><span class="math inline">\(X_i=\begin{cases}1&P_i\text{ causes a
new phase}\\0&otherwise\end{cases}\)</span> , running time <span
class="math inline">\(T=\sum_{i=1}^n \left(X_i \mathcal O(i)+25\mathcal
O(1)\right)\)</span> .</p>
<p>Therefore , <span class="math inline">\(E[T]=\left(\sum_{i=1}^n
\mathcal O(i)\Pr\{X_i=1\}\right)+\mathcal O(n)\)</span> .</p>
<p>Claim : <span class="math inline">\(\Pr\{X_i=1\}\le
\frac{2}{i}\)</span> ( by the initial random permutation )</p>
<p>Proof : Suppose <span class="math inline">\((P_a,P_b)\)</span> has
the minimum distance among <span
class="math inline">\(P_1,\cdots,P_i\)</span> , so <span
class="math inline">\(\Pr\{X_i=1\}=\Pr\{a=i\lor b=i\}\le
\Pr\{a=i\}+\Pr\{b=i\}\)</span> .</p>
<p><span
class="math inline">\(\Pr\{a=i\}=\Pr\{b=i\}=\frac{1}{i}\)</span> , so
<span class="math inline">\(\Pr\{X_i=1\}\le \frac{2}{i}\)</span> . <span
class="math inline">\(\square\)</span></p></li>
</ol></li>
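<p>The phase-based algorithm above can be sketched directly in Python , using a plain <code>dict</code> as the cell dictionary . This sketch assumes the input points are distinct ( so <span class="math inline">\(\delta>0\)</span> throughout ) ; the grid cell size <span class="math inline">\(\frac{\delta}{2}\)</span> and the <span class="math inline">\(25\)</span>-cell neighborhood follow the description above .</p>

```python
import math
import random

def closest_pair(points):
    """Randomized closest pair: expected O(n) dictionary operations.
    Assumes at least 2 points, all distinct."""
    pts = points[:]
    random.shuffle(pts)            # random order: P_i rarely starts a new phase
    d = math.dist(pts[0], pts[1])

    def cell(p, d):
        return (int(p[0] // (d / 2)), int(p[1] // (d / 2)))

    def build(i, d):
        # Grid with edge length d/2: each cell holds <= 1 of the first i points,
        # since any two of them are at distance >= d apart.
        return {cell(p, d): p for p in pts[:i]}

    grid = build(2, d)
    for i in range(2, len(pts)):
        p = pts[i]
        cx, cy = cell(p, d)
        best = d
        # Any point within distance d of p lies in the 5x5 block of cells
        # around p's cell, so 25 lookups suffice.
        for dx in range(-2, 3):
            for dy in range(-2, 3):
                q = grid.get((cx + dx, cy + dy))
                if q is not None:
                    best = min(best, math.dist(p, q))
        if best < d:               # new phase: rebuild the whole dictionary
            d = best
            grid = build(i + 1, d)
        else:
            grid[(cx, cy)] = p
    return d
```

<p>The rebuild on a new phase costs <span class="math inline">\(\mathcal O(i)\)</span> , but by the analysis above it happens with probability <span class="math inline">\(\le \frac{2}{i}\)</span> at step <span class="math inline">\(i\)</span> , giving expected <span class="math inline">\(\mathcal O(n)\)</span> total work .</p>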
</ol>
<h3 id="approximate-counting">9.4 Approximate Counting</h3>
<ol type="1">
<li><p>Approximate Counting Problem</p>
<ol type="1">
<li><p>#P : the class of counting problems associated with NP decision
problems</p></li>
<li><p>#P-complete</p>
<p><span class="math inline">\(L\in \texttt{NP-Complete}\to L\in
\texttt{\#P-Complete}\)</span></p>
<p><span class="math inline">\(L\in \texttt{\#P-Complete}\not\to L\in
\texttt{NP-Complete}\)</span> : e.g. counting perfect matching</p></li>
<li><p>FPRAS : Fully-Polynomial Randomized Approximation Scheme</p>
<p>An algorithm running in time <span
class="math inline">\(poly(n,\frac{1}{\epsilon})\)</span> such that
<span class="math inline">\(\Pr\{(1-\epsilon)ANS\le SOL\le
(1+\epsilon)ANS\}\ge 0.99\)</span></p></li>
<li><p>Scheme : Rejection Sampling</p>
<p>Example:</p>
<p><img src="/images/posts/AD12_fig1.png" /></p>
<p>General : to estimate <span class="math inline">\(|G|\)</span> ,
construct <span class="math inline">\(U\)</span> s.t. <span
class="math inline">\(G\subseteq U\)</span></p>
<ul>
<li>Can uniformly sample from <span
class="math inline">\(U\)</span></li>
<li>Membership Oracle : <span class="math inline">\(\forall x\in
U\)</span> , decide whether <span class="math inline">\(x\in G\)</span>
or not</li>
<li>We know <span class="math inline">\(|U|\)</span></li>
</ul></li>
</ol></li>
</ol>
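<p>The rejection-sampling scheme above , as a short Python sketch on a toy instance where <span class="math inline">\(|G|\)</span> is known exactly so the estimate can be checked ( the instance is an assumption for illustration ) . It uses exactly the three ingredients listed : uniform samples from <span class="math inline">\(U\)</span> , a membership oracle for <span class="math inline">\(G\)</span> , and knowledge of <span class="math inline">\(|U|\)</span> .</p>

```python
import random

def estimate_size(U_size, sample_from_U, in_G, trials=100_000):
    # |G| ~= |U| * (fraction of uniform U-samples that land in G)
    hits = sum(in_G(sample_from_U()) for _ in range(trials))
    return U_size * hits / trials

# Toy instance: U = {0, ..., N-1}, G = multiples of 3 in U,
# so the true answer is |G| = N/3.
random.seed(1)
N = 90_000
est = estimate_size(N, lambda: random.randrange(N), lambda x: x % 3 == 0)
print(est)  # close to |G| = 30000
```

<p>The method degrades when <span class="math inline">\(\frac{|G|}{|U|}\)</span> is very small ( almost every sample is rejected ) , which is why the choice of the enclosing universe <span class="math inline">\(U\)</span> matters .</p>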
]]></content>
<categories>
<category>课程笔记</category>
<category>算法设计</category>
</categories>
<tags>
<tag>算法-随机算法</tag>
<tag>算法-随机算法-哈希</tag>
<tag>算法-随机算法-近似计数问题</tag>
<tag>算法-随机算法-采样与计数</tag>
<tag>算法-随机算法-哈希-通用哈希</tag>
<tag>算法-随机算法-哈希-FKS完美哈希</tag>
<tag>算法-随机算法-球盒模型</tag>
<tag>算法-随机算法-Power of 2 choices</tag>
<tag>算法-随机算法-哈希-Cuckoo哈希</tag>
<tag>算法-随机算法-哈希-平面最近点对</tag>
</tags>
</entry>
<entry>
<title>Algorithm Design 11</title>
<url>/2023/12/29/Algorithm-Design-11/</url>
<content><![CDATA[<h2 id="chapter-8-network-flow">Chapter 8 Network Flow</h2>
<h3 id="application">8.3 Application</h3>
<h3 id="project-selection-problem">8.3.7 Project Selection Problem</h3>
<ol type="1">
<li><p>Description</p>
<p>A set of projects , project <span class="math inline">\(i\)</span>
has profit <span class="math inline">\(p_i\)</span> ( <span
class="math inline">\(p_i\)</span> can be positive or negative)</p>
<p>A set of procedure constraints : given by DAG , <span
class="math inline">\(e=(i,j)\)</span> means <span
class="math inline">\(i\)</span> can be selected only if <span
class="math inline">\(j\)</span> is selected .</p>
<p>Goal : Select a subset of projects <span
class="math inline">\(A\)</span> , s.t. <span
class="math inline">\(\sum_{i\in A}p_i\)</span> is maximized .</p></li>
<li><p>Flow Construction</p>
<p>Min-Cut Model</p>
<ul>
<li><span class="math inline">\((i,j)\)</span> : give <span
class="math inline">\(\infty\)</span> capacity , so this edge can never
cross a finite cut ; hence <span class="math inline">\(i\)</span> cannot
be on the <span class="math inline">\(s\)</span> side while <span
class="math inline">\(j\)</span> is on the <span
class="math inline">\(t\)</span> side .</li>
<li>Define in <span class="math inline">\(s\)</span> part : selected ,
in <span class="math inline">\(t\)</span> part : not selected .</li>
<li>If <span class="math inline">\(p_i\ge 0\)</span> , then have edge
<span class="math inline">\((s,i,p_i)\)</span> ; If <span
class="math inline">\(p_i<0\)</span> , then have edge <span
class="math inline">\((i,t,-p_i)\)</span> .</li>
<li>Answer is <span class="math inline">\(\sum_{p_i\ge
0}p_i-\mathtt{MinCut}\)</span></li>
</ul>
<p>Formal construction :</p>
<ul>
<li><p>Vertices : <span class="math inline">\(V'=V\cup
\{s,t\}\)</span></p></li>
<li><p>Edges :</p>
<ul>
<li><p><span class="math inline">\(\forall (i,j)\in E\)</span> , <span
class="math inline">\((i,j,\infty)\)</span></p></li>
<li><p><span class="math inline">\(\forall i\in V,p_i\ge 0\)</span> ,
<span class="math inline">\((s,i,p_i)\)</span></p></li>
<li><p><span class="math inline">\(\forall i\in V,p_i<0\)</span> ,
<span class="math inline">\((i,t,-p_i)\)</span></p></li>
</ul></li>
<li><p>Find a Min-Cut in <span class="math inline">\(G'\)</span> :
<span class="math inline">\((A,V'-A)\)</span></p>
<p>The minimum value of cut <span
class="math inline">\((A,V'-A)\)</span> is <span
class="math display">\[
\begin{aligned}
&\quad \sum_{p_i\ge 0,i\notin A}p_i+\sum_{p_i<0,i\in A}(-p_i)\\
&=\sum_{p_i\ge 0}p_i-\left(\sum_{p_i\ge 0,i\in
A}p_i+\sum_{p_i<0,i\in A}p_i\right)
\end{aligned}
\]</span> Which is equivalent to maximize <span class="math display">\[
\sum_{p_i\ge 0,i\in A}p_i+\sum_{p_i<0,i\in A}p_i=\sum_{i\in A}p_i
\]</span></p></li>
</ul></li>
</ol>
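<p>A runnable sketch of the construction above : build the min-cut network and return <span class="math inline">\(\sum_{p_i\ge 0}p_i-\mathtt{MinCut}\)</span> . The max-flow routine is a plain Edmonds–Karp implementation chosen here for brevity ( any max-flow algorithm works ) .</p>

```python
from collections import deque

def max_flow(n, edges, s, t):
    # Edmonds-Karp: BFS augmenting paths on an adjacency-matrix residual graph.
    cap = [[0.0] * n for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c
    flow = 0.0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow
        bottleneck, v = float('inf'), t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:                       # push the bottleneck along the path
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

def project_selection(profits, deps):
    # deps: (i, j) means project i can be selected only if j is selected.
    # Answer = (sum of positive profits) - MinCut, per the construction above.
    n = len(profits)
    s, t = n, n + 1
    edges = [(i, j, float('inf')) for i, j in deps]
    for i, p in enumerate(profits):
        edges.append((s, i, p) if p >= 0 else (i, t, -p))
    return sum(p for p in profits if p > 0) - max_flow(n + 2, edges, s, t)

print(project_selection([6, -2, -3], [(0, 1), (0, 2)]))  # prints 1.0: select all three
```

<p>In the example , selecting all three projects yields <span class="math inline">\(6-2-3=1\)</span> , which matches <span class="math inline">\(\sum_{p_i\ge 0}p_i-\mathtt{MinCut}=6-5\)</span> .</p>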
<h4 id="densest-subgraph-problem">8.3.8 Densest Subgraph Problem</h4>
<ol type="1">
<li><p>Description</p>
<p>Given an undirected graph <span
class="math inline">\(G=(V,E)\)</span> . The density of the induced
subgraph <span class="math inline">\(G[S]\)</span> , <span
class="math inline">\(S\subseteq V\)</span> , is defined as <span
class="math inline">\(\frac{|E(G[S])|}{|S|}\)</span></p>
<p>Goal : find a subgraph of maximum density .</p></li>
<li><p>A related ( but easier ) problem : Baseball Elimination</p>
<p>Input :</p>
<ul>
<li>A set <span class="math inline">\(S\)</span> of teams . For each
<span class="math inline">\(x\in S\)</span> , its current score <span
class="math inline">\(w_x\)</span> .</li>
<li>For <span class="math inline">\(x,y\in S\)</span> , they will need
to play <span class="math inline">\(g_{x,y}\)</span> more games .</li>
<li>For each game , winner gains <span class="math inline">\(1\)</span>
point , loser gains <span class="math inline">\(0\)</span> point . Not
consider draw case .</li>
</ul>
<p>Consider a special team <span class="math inline">\(z\)</span> ,
determine theoretically whether <span class="math inline">\(z\)</span>
can win the league ( have the highest score ) (not necessarily
unique)</p>
<p>WLOG , we can let <span class="math inline">\(z\)</span> win all
games it still needs to play ; let <span
class="math inline">\(m\)</span> be the resulting score of <span
class="math inline">\(z\)</span> .</p>
<p><span class="math inline">\(z\)</span> cannot win <span
class="math inline">\(\iff\)</span> <span class="math inline">\(\exists
T\subseteq S\)</span> , <span class="math inline">\(\sum\limits_{x\in
T}w_x+\sum\limits_{x,y\in T}g_{x,y}>m|T|\)</span></p></li>
<li><p>Flow Construction</p>
<p><span class="math inline">\(\iff\)</span> <span
class="math inline">\(\exists T\subseteq S\)</span> , <span
class="math inline">\(\sum\limits_{x\in T}(w_x-m)+\sum\limits_{x,y\in
T}g_{x,y}>0\)</span></p>
<p>Min-Cut Model</p>
<p>Construction :</p>
<ul>
<li>Vertices : <span class="math inline">\(V'=V\cup E\cup
\{s,t\}\)</span></li>
<li>Edges :
<ul>
<li><span class="math inline">\(\forall (u,v)\in E\)</span> , <span
class="math inline">\((s,(u,v),g_{u,v})\)</span></li>
<li><span class="math inline">\(\forall (u,v)\in E\)</span> , <span
class="math inline">\(((u,v),u,\infty)\)</span> , <span
class="math inline">\(((u,v),v,\infty)\)</span></li>
<li><span class="math inline">\(\forall v\in V\)</span> , <span
class="math inline">\((v,t,m-w_v)\)</span></li>
</ul></li>
</ul></li>
<li><p>Analysis</p>
<ol type="1">
<li><p><span class="math inline">\(\max\limits_{T\subseteq
S}\sum\limits_{x\in T}(w_x-m)+\sum\limits_{x,y\in
T}g_{x,y}>0\)</span> <span class="math inline">\(\iff\)</span> <span
class="math inline">\(\sum_{x,y}g_{x,y}-\mathtt{MinCut}>0\)</span></p>
<p>Min-Cut in <span class="math inline">\(G'\)</span> : <span
class="math inline">\((A,V'-A)\)</span> , let <span
class="math inline">\(T=A\cap V\)</span> . The min cut value is : <span
class="math display">\[
\begin{aligned}
&\quad \sum_{x\in T} (m-w_x)+\sum_{x\notin T\lor y\notin T}
g_{x,y}\\
&=\sum_{x,y}g_{x,y}-\left(\sum_{x,y\in T}g_{x,y}+\sum_{x\in
T}(w_x-m)\right)
\end{aligned}
\]</span></p></li>
<li><p><span
class="math inline">\(\sum_{x,y}g_{x,y}-\mathtt{MinCut}>0\)</span>
<span class="math inline">\(\iff\)</span> <span
class="math inline">\(z\)</span> cannot win</p>
<p>View this as a MaxFlow instance : we try to allocate the results of
all remaining games , so MaxFlow is obviously <span
class="math inline">\(\le \sum_{x,y}g_{x,y}\)</span> .</p>
<p>If <span class="math inline">\(<\)</span> , there is no way to
allocate all games while keeping <span class="math inline">\(f_v+w_v\le
m\)</span> for every team <span class="math inline">\(v\)</span> (
where <span class="math inline">\(f_v\)</span> is the flow into <span
class="math inline">\(v\)</span> , i.e. the wins allocated to <span
class="math inline">\(v\)</span> ) , so <span
class="math inline">\(z\)</span> cannot win under any outcome .</p>
<p>If <span class="math inline">\(=\)</span> , we can find a valid
allocation of game results , so <span class="math inline">\(z\)</span>
still has a chance to win .</p></li>
</ol></li>
<li><p>Find maximum density</p>
<ul>
<li><p>Binary search : OK but slow</p></li>
<li><p>parametric diagram</p>
<p><img src="/images/posts/AD11_fig1.png" /></p></li>
<li><p>Densest Subgraph problem can be solved by parametric flow
algorithm in poly-time .</p></li>
</ul></li>
</ol>
<h3 id="maxflow-and-lp">8.4 MaxFlow and LP</h3>
<ol type="1">
<li><p>LP form of MaxFlow</p>
<p>Goal : maximize <span
class="math inline">\(\sum_{v}f_{s,v}\)</span></p>
<p>Constraints :</p>
<ul>
<li><span class="math inline">\(\forall v\in V\backslash\{s,t\}\)</span>
, <span
class="math inline">\(\sum_{u}f_{u,v}=\sum_{u}f_{v,u}\)</span></li>
<li><span class="math inline">\(\forall f_e\in E\)</span> , <span
class="math inline">\(0\le f_e\le c_e\)</span></li>
</ul></li>
<li><p>Integral Polytope</p>
<p>Claim : If <span class="math inline">\(c_e\in \mathbb Z_+\)</span> ,
<span class="math inline">\(P\)</span> is integral ( vertices of <span
class="math inline">\(P\)</span> are integral vector )</p>
<p>Proof : By FF algorithm</p>
<p>General Result :</p>
<ul>
<li>LP constraints : <span class="math inline">\(Ax\le b\)</span></li>
<li><span class="math inline">\(A\)</span> is totally unimodular matrix
<span class="math inline">\(\to\)</span> Integral Polytope</li>
</ul></li>
</ol>
<h2 id="chapter-9-randomized-algorithm">Chapter 9 Randomized
Algorithm</h2>
<ol type="1">
<li><p>Complexity</p>
<p>P , RP , ZPP , BPP</p>
<p>Open question : P = RP ? ( the commonly believed answer is YES
)</p></li>
</ol>
<h3 id="maxcut-problem">9.1 MaxCut Problem</h3>
<ol type="1">
<li><p>Description</p>
<p>Given undirected <span class="math inline">\(G=(V,E)\)</span></p>
<p>Find a cut <span class="math inline">\((A,V-A)\)</span> , s.t. <span
class="math inline">\(|\{(u,v)\in E:u\in A,v\in V-A\}|\)</span> is
maximized</p></li>
<li><p>Randomized <span class="math inline">\(2\)</span>-approximation
algorithm</p>
<p><span class="math inline">\(\forall v\in V\)</span> , <span
class="math inline">\(\begin{cases}v\in A&w.p.\frac{1}{2}\\ v\notin
A&w.p.\frac{1}{2}\end{cases}\)</span></p>
<p>Analysis : <span class="math display">\[
\begin{aligned}
&\quad E[cut(A,V-A)]\\
&=\sum_{e\in E} \Pr\{e\in cut(A,V-A)\}\\
&=\sum_{(u,v)\in E} \Pr\{(u\in A,v\notin A)\lor (u\notin A,v\in
A)\}\\