Conjugate Gradient Part 3b

02 May 2019

The source code for all algorithms and plots can be found here.

Now we’re ready to rigorously prove the following:

Proposition PLE.4 Let $V$ be an inner product space with dimension $n$ and let $A$ be a positive definite $n\times n$ matrix on $V$. Let $A$ have distinct eigenvalues and let $R^T$ be the matrix whose columns are the orthonormal eigenvectors of $A$.

Let $f$ be the quadratic form defined by TP.2 for some vector $b$ and some scalar $c$. Let $x_{*}$ be the minimum point of $f$ and let $k$ be some arbitrary scalar such that $k\gt f(x_{*})$.

Define $m$ by ELC.15, define $\phi_k$ by ELC.12b, define $d_{k,1},\dots,d_{k,n}$ by ELC.17, and define $D_k$ to be the diagonal matrix with entries $d_{k,1},\dots,d_{k,n}$.

Let $x$ be a vector in $V$. Then the following are equivalent:

(a) $x$ is a point on the level $k$ ellipsoid of $f$
(b) There exists a unit point $y\in V$ such that $x=R^TD_kRy+m$
(c) For any $q\gt f(x_{*})$, there exists a point $z\in V$ with $\dnorm{z}=\sqrt{\frac{\phi_k}{\phi_q}}$ such that $x=R^TD_qRz+m$

Proof First let’s show that (c)$\implies$(a). Let $q$ be some scalar with $q\gt f(x_{*})$ and suppose there exists a vector $z\in V$ with $\dnorm{z}=\sqrt{\frac{\phi_k}{\phi_q}}$ such that $x=R^TD_qRz+m$. Then

$$\align{ \alpha\sum_{j=1}^n\innprd{x-m}{v_j}^2\lambda_j &= \frac12\sum_{j=1}^n\innprd{R^TD_qRz+m-m}{v_j}^2\lambda_j \\ &= \frac12\sum_{j=1}^n\innprd{R^TD_qRz}{v_j}^2\lambda_j \\ &= \frac12\sum_{j=1}^n\innprdBgg{R^TD_qR\sum_{i=1}^n\innprd{z}{v_i}v_i}{v_j}^2\lambda_j \\ &= \frac12\sum_{j=1}^n\innprdBgg{\sum_{i=1}^n\innprd{z}{v_i}R^TD_qRv_i}{v_j}^2\lambda_j \\ &= \frac12\sum_{j=1}^n\innprdBgg{\sum_{i=1}^n\innprd{z}{v_i}d_{q, i}v_i}{v_j}^2\lambda_j \\ &= \frac12\sum_{j=1}^n\Prngg{\sum_{i=1}^n\innprdbg{\innprd{z}{v_i}d_{q, i}v_i}{v_j}}^2\lambda_j \\ &= \frac12\sum_{j=1}^n\Prngg{\sum_{i=1}^n\innprd{z}{v_i}d_{q, i}\innprd{v_i}{v_j}}^2\lambda_j \\ &= \frac12\sum_{j=1}^n\prn{\innprd{z}{v_j}d_{q, j}}^2\lambda_j \\ &= \frac12\sum_{j=1}^n\innprd{z}{v_j}^2\frac{2\phi_q}{\lambda_j}\lambda_j \\ &= \frac12\sum_{j=1}^n\innprd{z}{v_j}^22\phi_q \\ &= \phi_q\sum_{j=1}^n\innprd{z}{v_j}^2 \\ &= \phi_k }$$

since

$$ \sum_{j=1}^n\innprd{z}{v_j}^2 = \dnorm{\sum_{j=1}^n\innprd{z}{v_j}v_j}^2 = \dnorm{z}^2 = \frac{\phi_k}{\phi_q} $$

Hence we have shown that ELC.13 holds and $x$ is on the level $k$ ellipsoid.

Next let’s show that (a)$\implies$(b). Suppose that $x$ is on the level $k$ ellipsoid. Notice that $D_k$ is a diagonal matrix with no diagonal entries equal to zero (by ELC.18). Hence $D_k$ has no eigenvalues equal to zero (by Axler 5.32) and $D_k$ is nonsingular (by Axler 5.30). Now put $y\equiv R^T\inv{D_k}R(x-m)$. Then

$$ R^TD_kRy+m = R^TD_kRR^T\inv{D_k}R(x-m)+m = R^TD_k\inv{D_k}R(x-m)+m = R^TR(x-m)+m = x-m+m = x $$

and

$$\align{ \dnorm{y}^2 &= \dnorm{R^T\inv{D_k}R(x-m)}^2 \\ &= \dnorm{R^T\inv{D_k}R\sum_{j=1}^n\innprd{x-m}{v_j}v_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{x-m}{v_j}R^T\inv{D_k}Rv_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{x-m}{v_j}R^T\inv{D_k}e_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{x-m}{v_j}R^T\sqrt{\frac{\alpha\lambda_j}{\phi_k}}e_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{x-m}{v_j}\sqrt{\frac{\alpha\lambda_j}{\phi_k}}v_j}^2 \\ &= \sum_{j=1}^n\Prngg{\innprd{x-m}{v_j}\sqrt{\frac{\alpha\lambda_j}{\phi_k}}}^2\tag{by Axler 6.25} \\ &= \sum_{j=1}^n\innprd{x-m}{v_j}^2\frac{\alpha\lambda_j}{\phi_k} \\ &= \frac{\alpha}{\phi_k}\sum_{j=1}^n\innprd{x-m}{v_j}^2\lambda_j \\ &= \frac{\alpha}{\phi_k}\frac{\phi_k}{\alpha}\tag{by ELC.13} \\ &= 1 }$$

And finally let’s show that (b)$\implies$(c). Suppose there exists a unit point $y\in V$ such that $x=R^TD_kRy+m$. Let $q$ be some scalar with $q\gt f(x_{*})$. Now put $z\equiv R^T\inv{D_q}R(x-m)$. Then

$$ R^TD_qRz+m = R^TD_qRR^T\inv{D_q}R(x-m)+m = R^TD_q\inv{D_q}R(x-m)+m = R^TR(x-m)+m = x-m+m = x $$

and

$$\align{ \dnorm{z}^2 &= \dnorm{R^T\inv{D_q}R(x-m)}^2 \\ &= \dnorm{R^T\inv{D_q}R(R^TD_kRy+m-m)}^2 \\ &= \dnorm{R^T\inv{D_q}RR^TD_kRy}^2 \\ &= \dnorm{R^T\inv{D_q}D_kRy}^2 \\ &= \dnorm{R^T\inv{D_q}D_kR\sum_{j=1}^n\innprd{y}{v_j}v_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{y}{v_j}R^T\inv{D_q}D_kRv_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{y}{v_j}R^T\inv{D_q}D_ke_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{y}{v_j}R^T\inv{D_q}\sqrt{\frac{\phi_k}{\alpha\lambda_j}}e_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{y}{v_j}\sqrt{\frac{\phi_k}{\alpha\lambda_j}}R^T\inv{D_q}e_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{y}{v_j}\sqrt{\frac{\phi_k}{\alpha\lambda_j}}R^T\sqrt{\frac{\alpha\lambda_j}{\phi_q}}e_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{y}{v_j}\sqrt{\frac{\phi_k}{\alpha\lambda_j}}\sqrt{\frac{\alpha\lambda_j}{\phi_q}}R^Te_j}^2 \\ &= \dnorm{\sum_{j=1}^n\innprd{y}{v_j}\sqrt{\frac{\phi_k}{\alpha\lambda_j}\frac{\alpha\lambda_j}{\phi_q}}v_j}^2 \\ &= \sum_{j=1}^n\Prngg{\innprd{y}{v_j}\sqrt{\frac{\phi_k}{\phi_q}}}^2 \\ &= \sum_{j=1}^n\innprd{y}{v_j}^2\frac{\phi_k}{\phi_q} \\ &= \frac{\phi_k}{\phi_q}\sum_{j=1}^n\innprd{y}{v_j}^2 \\ &= \frac{\phi_k}{\phi_q} }$$

$\wes$

The equivalence of parts (a) and (c) from the above proposition means that we can dilate the level $k$ ellipsoid to a ball with radius $\sqrt{\frac{\phi_k}{\phi_q}}$ and center $m$ by simply computing $z=R^T\inv{D_q}R(x-m)+m$ for every $x$ on the level $k$ ellipsoid. This is depicted in $\wR^3$ below:

We complete this section by succintly summarizing the equivalence of parts (a) and (b) from proposition PLE.4: the map $R^T\inv{D_k}R$ is an isomorphism from the level $k$ ellipsoid to the unit ball centered at $m$. And we similarly summarize the equivalence of parts (a) and (c) from proposition PLE.4: the map $R^T\inv{D_q}R$ is an isomorphism from the level $k$ ellipsoid to the ball with radius $\sqrt{\frac{\phi_k}{\phi_q}}$ centered at $m$.