Exercise 3.13 (Matrix inversion lemma)

Question

This exercise shows that our efficient procedures for updating a tableau can be derived from a useful fact in numerical linear algebra.

\begin{enumerate}
    \item[(a)] \textbf{(Matrix inversion lemma)} Let $\mathbf{C}$ be an $m \times m$ invertible matrix and let $\mathbf{w}, \mathbf{v}$ be vectors in $\mathbb{R}^m$. Show that
    $$
        (\mathbf{C} + \mathbf{w}\mathbf{v}')^{-1} = \mathbf{C}^{-1} - \frac{\mathbf{C}^{-1}\mathbf{w}\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}.
    $$
    (Note that $\mathbf{w}\mathbf{v}'$ is an $m \times m$ matrix). \textit{Hint: Multiply both sides by $(\mathbf{C} + \mathbf{w}\mathbf{v}')$.}
    
    \item[(b)] Assuming that $\mathbf{C}^{-1}$ is available, explain how to obtain $(\mathbf{C} + \mathbf{w}\mathbf{v}')^{-1}$ using only $O(m^2)$ arithmetic operations.
    
    \item[(c)] Let $\mathbf{B}$ and $\bar{\mathbf{B}}$ be the basis matrices before and after an iteration of the simplex method. Let $\mathbf{A}_{B(l)}$ and $\mathbf{A}_{\bar{B}(l)}$ be the exiting and entering columns, respectively. Show that
    $$
        \bar{\mathbf{B}} - \mathbf{B} = (\mathbf{A}_{\bar{B}(l)} - \mathbf{A}_{B(l)})\mathbf{e}'_l,
    $$
    where $\mathbf{e}_l$ is the $l$th unit vector.
    
    \item[(d)] Note that $\mathbf{e}'_i\mathbf{B}^{-1}$ is the $i$th row of $\mathbf{B}^{-1}$ and $\mathbf{e}'_l\mathbf{B}^{-1}$ is the pivot row. Show that
    $$
        \mathbf{e}'_i\bar{\mathbf{B}}^{-1} = \mathbf{e}'_i\mathbf{B}^{-1} - g_i\mathbf{e}'_l\mathbf{B}^{-1}, \quad i=1, \dots, m,
    $$
    for suitable scalars $g_i$. Provide a formula for $g_i$. Interpret the above equation in terms of the mechanics for pivoting in the revised simplex method.
    
    \item[(e)] Multiply both sides of the equation in part (d) by $[\mathbf{b} \mid \mathbf{A}]$ and obtain an interpretation of the mechanics for pivoting in the full tableau implementation.
\end{enumerate}

Ahmed · Accepted Answer

(a) Following the hint, we will prove the identity by showing that the product of $(\mathbf{C} + \mathbf{w}\mathbf{v}')$ and the expression on the right-hand side is the identity matrix. Let the right-hand side be denoted by $\mathbf{B}$:
        $$
            \mathbf{B} = \mathbf{C}^{-1} - \frac{\mathbf{C}^{-1}\mathbf{w}\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}.
        $$
        We want to show that $(\mathbf{C} + \mathbf{w}\mathbf{v}')\mathbf{B} = \mathbf{I}$. We expand the product:
        $$
            (\mathbf{C} + \mathbf{w}\mathbf{v}')\mathbf{B} = \mathbf{C}\mathbf{B} + \mathbf{w}\mathbf{v}'\mathbf{B}.
        $$
        First, we compute the term $\mathbf{C}\mathbf{B}$:
        $$
            \mathbf{C}\mathbf{B} = \mathbf{C}\left(\mathbf{C}^{-1} - \frac{\mathbf{C}^{-1}\mathbf{w}\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}\right) = \mathbf{I} - \frac{\mathbf{w}\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}.
        $$
        Next, we compute the term $\mathbf{w}\mathbf{v}'\mathbf{B}$. To simplify this, we first evaluate $\mathbf{v}'\mathbf{B}$, noting that $\mathbf{v}'\mathbf{C}^{-1}\mathbf{w}$ is a scalar:
        \begin{align*}
            \mathbf{v}'\mathbf{B} &= \mathbf{v}'\left(\mathbf{C}^{-1} - \frac{\mathbf{C}^{-1}\mathbf{w}\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}\right) \\
            &= \mathbf{v}'\mathbf{C}^{-1} - \frac{(\mathbf{v}'\mathbf{C}^{-1}\mathbf{w})\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}} \\
            &= \mathbf{v}'\mathbf{C}^{-1} \left(1 - \frac{\mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}\right) \\
            &= \mathbf{v}'\mathbf{C}^{-1} \left(\frac{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w} - \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}\right) \\
            &= \frac{\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}.
        \end{align*}
        Using this result, we can now find $\mathbf{w}\mathbf{v}'\mathbf{B}$:
        $$
            \mathbf{w}\mathbf{v}'\mathbf{B} = \mathbf{w}(\mathbf{v}'\mathbf{B}) = \frac{\mathbf{w}\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}.
        $$
        Finally, we add the two computed parts together:
        $$
            (\mathbf{C} + \mathbf{w}\mathbf{v}')\mathbf{B} = \left(\mathbf{I} - \frac{\mathbf{w}\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}\right) + \left(\frac{\mathbf{w}\mathbf{v}'\mathbf{C}^{-1}}{1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}}\right) = \mathbf{I}.
        $$
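        This shows that the right-hand side is a right inverse of $\mathbf{C} + \mathbf{w}\mathbf{v}'$, provided that the denominator $1 + \mathbf{v}'\mathbf{C}^{-1}\mathbf{w}$ is nonzero. Since both matrices are square, a right inverse is also a left inverse, which confirms the identity. \qed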
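
        As a quick sanity check (an illustration added here, not part of the proof), consider the scalar case $m = 1$ with $\mathbf{C} = c \neq 0$ and $c + wv \neq 0$. The right-hand side of the lemma becomes
        $$
            \frac{1}{c} - \frac{wv/c^2}{1 + wv/c} = \frac{1}{c} - \frac{wv}{c(c + wv)} = \frac{(c + wv) - wv}{c(c + wv)} = \frac{1}{c + wv},
        $$
        which is indeed $(c + wv)^{-1}$.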

\vspace{1em}
        \noindent (b) We use the formula from part (a) and compute the right-hand side efficiently by \textbf{avoiding matrix-matrix multiplications}. The procedure is as follows:

\begin{enumerate}
            \item Compute the vector $\mathbf{u} = \mathbf{C}^{-1}\mathbf{w}$. This is a matrix-vector product, which requires $O(m^2)$ operations.
            
            \item Compute the scalar $\alpha = 1 + \mathbf{v}'\mathbf{u}$. This is a vector dot product plus an addition, requiring $O(m)$ operations.
            
            \item Compute the row vector $\mathbf{r}' = \mathbf{v}'\mathbf{C}^{-1}$. This is a vector-matrix product, which also requires $O(m^2)$ operations.
            
            \item The expression to be subtracted is now $\dfrac{\mathbf{u}\mathbf{r}'}{\alpha}$. Forming the rank-one matrix $\mathbf{u}\mathbf{r}'$ costs $O(m^2)$, and the subsequent scalar division costs another $O(m^2)$.
            
            \item Finally, subtract this result from $\mathbf{C}^{-1}$. This matrix subtraction requires $O(m^2)$ operations.
        \end{enumerate}
        
        Every step costs at most $O(m^2)$ operations, and no matrix-matrix product or fresh matrix inversion is ever performed. The overall complexity is therefore $O(m^2)$. \qed
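
The five steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function name `sherman_morrison_update`, the list-of-rows matrix representation, and the use of `fractions.Fraction` for exact arithmetic are choices made for this example, and the numerical data is made up.

```python
from fractions import Fraction


def sherman_morrison_update(C_inv, w, v):
    """Return (C + w v')^{-1} given C^{-1}, using O(m^2) arithmetic."""
    m = len(C_inv)
    # Step 1: u = C^{-1} w (matrix-vector product, O(m^2)).
    u = [sum(C_inv[i][k] * w[k] for k in range(m)) for i in range(m)]
    # Step 2: alpha = 1 + v'u (dot product, O(m)).
    alpha = 1 + sum(v[k] * u[k] for k in range(m))
    if alpha == 0:
        raise ZeroDivisionError("1 + v' C^{-1} w = 0, so the formula does not apply")
    # Step 3: r' = v' C^{-1} (vector-matrix product, O(m^2)).
    r = [sum(v[k] * C_inv[k][j] for k in range(m)) for j in range(m)]
    # Steps 4 and 5: subtract (u r') / alpha from C^{-1}, entry by entry (O(m^2)).
    return [[C_inv[i][j] - u[i] * r[j] / alpha for j in range(m)] for i in range(m)]


# Made-up example: C = diag(2, 4), w = (1, 2)', v = (3, 1)'.
C_inv = [[Fraction(1, 2), Fraction(0)], [Fraction(0), Fraction(1, 4)]]
result = sherman_morrison_update(C_inv, [1, 2], [3, 1])
# result is [[1/4, -1/24], [-1/4, 5/24]], the inverse of [[5, 1], [6, 6]]
```

Each comprehension touches every entry of an $m \times m$ array at most a constant number of times, which is where the $O(m^2)$ bound shows up in the code.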

\vspace{1em}
        \noindent (c) The matrix $\bar{\mathbf{B}}$ is identical to the matrix $\mathbf{B}$, except that its $l$th column, $\mathbf{A}_{B(l)}$, has been replaced by the entering column, $\mathbf{A}_{\bar{B}(l)}$.

The difference $\bar{\mathbf{B}} - \mathbf{B}$ is therefore a matrix that consists of zero vectors in every column, except for the $l$th column. The $l$th column is the difference between the new and old $l$th columns, which is $(\mathbf{A}_{\bar{B}(l)} - \mathbf{A}_{B(l)})$.
        
        The expression on the right-hand side is the outer product of the column vector $(\mathbf{A}_{\bar{B}(l)} - \mathbf{A}_{B(l)})$ and the row vector $\mathbf{e}'_l$. This product results in a matrix whose $k$th column is $(\mathbf{A}_{\bar{B}(l)} - \mathbf{A}_{B(l)})$ multiplied by the $k$th component of $\mathbf{e}'_l$. Since the $k$th component of $\mathbf{e}'_l$ is 1 if $k=l$ and 0 otherwise, the resulting matrix is zero everywhere except for the $l$th column, which is precisely $(\mathbf{A}_{\bar{B}(l)} - \mathbf{A}_{B(l)})$. The two sides are therefore equal. \qed
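
        As a small illustration, with values chosen here purely as an example, take $m = 2$, $\mathbf{B} = \mathbf{I}$, $l = 2$, and entering column $\mathbf{A}_{\bar{B}(2)} = (2, 3)'$. Then $\mathbf{A}_{B(2)} = (0, 1)'$ and
        $$
            \bar{\mathbf{B}} - \mathbf{B} = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 2 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix} = (\mathbf{A}_{\bar{B}(2)} - \mathbf{A}_{B(2)})\mathbf{e}'_2.
        $$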

\vspace{1em}
        \noindent (d) From part (c), we have $\bar{\mathbf{B}} = \mathbf{B} + (\mathbf{A}_{\bar{B}(l)} - \mathbf{A}_{B(l)})\mathbf{e}'_l$. Let $\mathbf{w} = \mathbf{A}_{\bar{B}(l)} - \mathbf{A}_{B(l)}$ and $\mathbf{v} = \mathbf{e}_l$. We can now apply the matrix inversion lemma from part (a) with $\mathbf{C} = \mathbf{B}$:
        $$
            \bar{\mathbf{B}}^{-1} = \mathbf{B}^{-1} - \frac{\mathbf{B}^{-1}\mathbf{w}\mathbf{e}'_l\mathbf{B}^{-1}}{1 + \mathbf{e}'_l\mathbf{B}^{-1}\mathbf{w}}.
        $$
        To find the $i$th row of $\bar{\mathbf{B}}^{-1}$, we left-multiply by $\mathbf{e}'_i$:
        $$
            \mathbf{e}'_i \bar{\mathbf{B}}^{-1} = \mathbf{e}'_i \mathbf{B}^{-1} - \frac{(\mathbf{e}'_i \mathbf{B}^{-1}\mathbf{w})(\mathbf{e}'_l\mathbf{B}^{-1})}{1 + \mathbf{e}'_l\mathbf{B}^{-1}\mathbf{w}}.
        $$
        This is the desired form, where the scalar $g_i$ is given by
        $$
            g_i = \frac{\mathbf{e}'_i \mathbf{B}^{-1}\mathbf{w}}{1 + \mathbf{e}'_l\mathbf{B}^{-1}\mathbf{w}}.
        $$
        To find a more practical form for $g_i$, let $\mathbf{A}_j = \mathbf{A}_{\bar{B}(l)}$ denote the entering column and let $\mathbf{u} = \mathbf{B}^{-1}\mathbf{A}_{j}$ be the pivot column. Since $\mathbf{A}_{B(l)}$ is the $l$th column of $\mathbf{B}$, we have $\mathbf{B}^{-1}\mathbf{A}_{B(l)} = \mathbf{e}_l$. We can then simplify the terms involving $\mathbf{w}$:
        \begin{align*}
            \mathbf{e}'_i \mathbf{B}^{-1}\mathbf{w} &= \mathbf{e}'_i \mathbf{B}^{-1}(\mathbf{A}_{j} - \mathbf{A}_{B(l)}) = \mathbf{e}'_i (\mathbf{u} - \mathbf{e}_l) = u_i - (\mathbf{e}'_i\mathbf{e}_l), \\
            1 + \mathbf{e}'_l \mathbf{B}^{-1}\mathbf{w} &= 1 + \mathbf{e}'_l (\mathbf{u} - \mathbf{e}_l) = 1 + u_l - 1 = u_l.
        \end{align*}
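        The term $\mathbf{e}'_i\mathbf{e}_l$ is 1 if $i=l$ and 0 otherwise. The pivot element $u_l$ is positive by the choice of the exiting column, so the denominator $1 + \mathbf{e}'_l\mathbf{B}^{-1}\mathbf{w} = u_l$ is nonzero and the lemma applies. Substituting these back gives the formula: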
        $$
            g_i = \frac{u_i}{u_l} \text{ for } i \neq l, \quad \text{and} \quad g_l = \frac{u_l - 1}{u_l}.
        $$
        \textbf{Interpretation:} The identity $\mathbf{e}'_i \bar{\mathbf{B}}^{-1} = \mathbf{e}'_i \mathbf{B}^{-1} - g_i \mathbf{e}'_l \mathbf{B}^{-1}$ shows that the $i$th row of the new inverse $\bar{\mathbf{B}}^{-1}$ is obtained by taking the $i$th row of $\mathbf{B}^{-1}$ and subtracting $g_i$ times the pivot row $\mathbf{e}'_l\mathbf{B}^{-1}$. This is precisely the elementary row operation used to update $\mathbf{B}^{-1}$ in the revised simplex method.
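
The row update can be sketched in Python as follows. This is a minimal illustration, not a full revised simplex implementation: the function name `update_basis_inverse`, the 0-based index `l`, and the use of `fractions.Fraction` for exact arithmetic are assumptions made for this example, and the data is made up.

```python
from fractions import Fraction


def update_basis_inverse(B_inv, u, l):
    """Return the inverse of the new basis matrix via row operations.

    B_inv : current B^{-1} as a list of rows
    u     : pivot column B^{-1} A_j
    l     : 0-based index of the exiting basic column (convention assumed here)
    """
    if u[l] == 0:
        raise ZeroDivisionError("pivot element u_l must be nonzero")
    pivot_row = B_inv[l]
    new_inv = []
    for i, row in enumerate(B_inv):
        # g_i = u_i / u_l for i != l, and g_l = (u_l - 1) / u_l.
        g = (u[i] - (1 if i == l else 0)) / u[l]
        # New row i = old row i minus g_i times the old pivot row.
        new_inv.append([x - g * p for x, p in zip(row, pivot_row)])
    return new_inv


# Made-up example: B = I, entering column (2, 3)', exiting index l = 1.
B_inv = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
u = [Fraction(2), Fraction(3)]
new_B_inv = update_basis_inverse(B_inv, u, 1)
# new_B_inv is [[1, -2/3], [0, 1/3]], the inverse of [[1, 2], [0, 3]]
```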

\vspace{1em}
        \noindent (e) The full simplex tableau (excluding the zeroth row) can be represented by the matrix $\mathbf{T} = \mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{A}]$. The $i$th row of this tableau is therefore $\mathbf{e}'_i \mathbf{T} = \mathbf{e}'_i \mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{A}]$.

\noindent We right-multiply the row update formula from part (d) by the matrix $[\mathbf{b} \mid \mathbf{A}]$:
        $$
            \mathbf{e}'_i \bar{\mathbf{B}}^{-1} [\mathbf{b} \mid \mathbf{A}] = (\mathbf{e}'_i \mathbf{B}^{-1} - g_i \mathbf{e}'_l \mathbf{B}^{-1}) [\mathbf{b} \mid \mathbf{A}].
        $$
        Distributing the product on the right-hand side gives:
        $$
            \mathbf{e}'_i \bar{\mathbf{B}}^{-1} [\mathbf{b} \mid \mathbf{A}] = \mathbf{e}'_i \mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{A}] - g_i \mathbf{e}'_l \mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{A}].
        $$
        This can be rewritten in terms of tableau rows as:
        $$
            \text{new\_row}_i = \text{old\_row}_i - g_i \cdot \text{old\_row}_l.
        $$
        \textbf{Interpretation:} This equation shows that each row of the new simplex tableau is obtained by subtracting a multiple ($g_i$) of the old pivot row from the corresponding old row. This is exactly the standard procedure for pivoting in the full tableau implementation:
        \begin{itemize}
            \item For the pivot row ($i=l$), we have $g_l = \frac{u_l - 1}{u_l}$. The update is $\text{new\_row}_l = \text{old\_row}_l - \frac{u_l - 1}{u_l}\text{old\_row}_l = \frac{1}{u_l}\text{old\_row}_l$. This is equivalent to scaling the pivot row to make the pivot element equal to 1.
            \item For any other row ($i \neq l$), we have $g_i = u_i/u_l$. The update is $\text{new\_row}_i = \text{old\_row}_i - \frac{u_i}{u_l}\text{old\_row}_l$. This is equivalent to using the scaled pivot row to eliminate the other non-zero entries in the pivot column. \qed
        \end{itemize}
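
The same row operation applied to the tableau rows can be sketched in Python. As before, this is only an illustration under stated assumptions: the function name `pivot_tableau`, the 0-based indices, the omission of the zeroth row, and the sample tableau are all hypothetical choices for this example.

```python
from fractions import Fraction


def pivot_tableau(T, l, j):
    """Pivot the rows of T = B^{-1}[b | A] on row l and column j (0-based)."""
    u = [row[j] for row in T]  # pivot column of the current tableau
    if u[l] == 0:
        raise ZeroDivisionError("pivot element must be nonzero")
    old_pivot_row = T[l]
    new_T = []
    for i, row in enumerate(T):
        # Same multipliers g_i as in part (d).
        g = (u[i] - (1 if i == l else 0)) / u[l]
        new_T.append([x - g * p for x, p in zip(row, old_pivot_row)])
    return new_T


# Made-up tableau rows [b | A] with B = I: b = (5, 6)', A = [I | (2, 3)'].
T = [[Fraction(x) for x in row] for row in ([5, 1, 0, 2], [6, 0, 1, 3])]
new_T = pivot_tableau(T, l=1, j=3)
# new_T is [[1, 1, -2/3, 0], [2, 0, 1/3, 1]]; column 3 is now the unit vector e_l
```

In this example the pivot row is scaled by $1/u_l = 1/3$, and the other row loses its entry in the pivot column, matching the two cases listed above.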
