Exercise 3.15 (Perturbation approach to lexicography)

Question

Consider the standard form problem and assume that the rows of the matrix $\mathbf{A}$ are linearly independent. For every $\epsilon > 0$, we define the \textit{$\epsilon$-perturbed problem} to be the linear programming problem obtained by replacing $\mathbf{b}$ with $\mathbf{b}(\epsilon)$, where
$$ \mathbf{b}(\epsilon) = \mathbf{b} + \begin{bmatrix} \epsilon \\ \epsilon^2 \\ \vdots \\ \epsilon^m \end{bmatrix}. $$
\begin{enumerate}
\item[(a)] Given a basis matrix $\mathbf{B}$, show that the corresponding basic solution $\mathbf{x}_B(\epsilon)$ in the $\epsilon$-perturbed problem is equal to
$$ \mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{I}] \begin{bmatrix} 1 \\ \epsilon \\ \vdots \\ \epsilon^m \end{bmatrix}. $$
\item[(b)] Show that there exists some $\epsilon^* > 0$ such that all basic solutions to the $\epsilon$-perturbed problem are nondegenerate, for $0 < \epsilon < \epsilon^*$.
\item[(c)] Suppose that all rows of $\mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{I}]$ are lexicographically positive. Show that $\mathbf{x}_B(\epsilon)$ is a basic feasible solution to the $\epsilon$-perturbed problem for $\epsilon$ positive and sufficiently small.
\item[(d)] Consider a feasible basis for the original problem, and assume that all rows of $\mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{I}]$ are lexicographically positive. Let some nonbasic variable $x_j$ enter the basis, and define $\mathbf{u} = \mathbf{B}^{-1}\mathbf{A}_j$. Let the exiting variable be determined as follows. For every row $i$ such that $u_i$ is positive, divide the $i$th row of $\mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{I}]$ by $u_i$, compare the results lexicographically, and choose the exiting variable to be the one corresponding to the lexicographically smallest row. Show that this is the same choice of exiting variable as in the original simplex method applied to the $\epsilon$-perturbed problem, when $\epsilon$ is sufficiently small.
\item[(e)] Explain why the revised simplex method, with the lexicographic rule described in part (d), is guaranteed to terminate even in the face of degeneracy.
\end{enumerate}

Ahmed · Accepted Answer

(a) For the $\epsilon$-perturbed problem, the vector of basic variables, denoted by $\mathbf{x}_B(\epsilon)$, is determined by the system $\mathbf{B}\mathbf{x}_B(\epsilon) = \mathbf{b}(\epsilon)$. Since $\mathbf{B}$ is invertible, the solution is given by:
$$ \mathbf{x}_B(\epsilon) = \mathbf{B}^{-1}\mathbf{b}(\epsilon). $$
The perturbed vector $\mathbf{b}(\epsilon)$ is defined as:
$$ \mathbf{b}(\epsilon) = \mathbf{b} + \begin{bmatrix} \epsilon \\ \epsilon^2 \\ \vdots \\ \epsilon^m \end{bmatrix}. $$
We can express this vector as a matrix-vector product. Consider the $m \times (m+1)$ matrix $[\mathbf{b} \mid \mathbf{I}]$ and the vector $[1, \epsilon, \dots, \epsilon^m]'$. Their product is:
$$ [\mathbf{b} \mid \mathbf{I}] \begin{bmatrix} 1 \\ \epsilon \\ \vdots \\ \epsilon^m \end{bmatrix} = \mathbf{b} \cdot 1 + \mathbf{I} \cdot \begin{bmatrix} \epsilon \\ \epsilon^2 \\ \vdots \\ \epsilon^m \end{bmatrix} = \mathbf{b}(\epsilon). $$
Substituting this back into the expression for $\mathbf{x}_B(\epsilon)$, we obtain the desired result:
$$ \mathbf{x}_B(\epsilon) = \mathbf{B}^{-1} \left( [\mathbf{b} \mid \mathbf{I}] \begin{bmatrix} 1 \\ \epsilon \\ \vdots \\ \epsilon^m \end{bmatrix} \right) = \mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{I}] \begin{bmatrix} 1 \\ \epsilon \\ \vdots \\ \epsilon^m \end{bmatrix}. \qed $$
\vspace{1em} \noindent(b) A basic solution $\mathbf{x}_B(\epsilon)$ is degenerate if at least one of its components is zero. Let the $i$-th row of the matrix $\mathbf{B}^{-1}$ be denoted by $\mathbf{v}'_i = [v_{i1}, \dots, v_{im}]$. The $i$-th component of $\mathbf{x}_B(\epsilon)$ is given by:
$$ (\mathbf{x}_B(\epsilon))_i = \mathbf{v}'_i \mathbf{b}(\epsilon) = \mathbf{v}'_i \left( \mathbf{b} + \begin{bmatrix} \epsilon \\ \epsilon^2 \\ \vdots \\ \epsilon^m \end{bmatrix} \right) = \mathbf{v}'_i\mathbf{b} + \sum_{j=1}^{m} v_{ij}\epsilon^j. $$
Let us define this expression as a polynomial in $\epsilon$, $P_{B,i}(\epsilon)$.
Since $\mathbf{B}$ is a basis matrix, $\mathbf{B}^{-1}$ is invertible and none of its rows $\mathbf{v}'_i$ can be the zero vector. Hence at least one of the coefficients $v_{i1}, \dots, v_{im}$ is non-zero, so $P_{B,i}(\epsilon)$ is a non-zero polynomial of degree at most $m$ and therefore has at most $m$ roots. This holds for a single basis matrix $\mathbf{B}$. To ensure that \textit{all} basic solutions are nondegenerate, we must consider every possible basis matrix. The number of ways to choose $m$ basic columns from the $n$ columns of $\mathbf{A}$ is finite: it is at most $\binom{n}{m}$. Let $\mathcal{B}$ be the set of all possible basis matrices. For each $\mathbf{B} \in \mathcal{B}$ and each row $i \in \{1, \dots, m\}$, we have a polynomial $P_{B,i}(\epsilon)$. Let $R$ be the set of all positive roots of all these polynomials. Since $R$ is the union of a finite number of finite sets, it is itself a finite set. We can now define $\epsilon^*$.
\begin{itemize}
\item If the set $R$ of positive roots is empty, any $\epsilon > 0$ will work, so we can arbitrarily set $\epsilon^* = 1$.
\item If $R$ is not empty, let $\epsilon^* = \min R$. Since $R$ is a finite set of positive numbers, this minimum is well-defined and strictly positive.
\end{itemize}
By choosing any $\epsilon$ in the range $0 < \epsilon < \epsilon^*$, we ensure that $\epsilon$ is not a root of any polynomial $P_{B,i}(\epsilon)$, for any basis $\mathbf{B}$ and any component $i$. Consequently, for any such $\epsilon$, no basic variable of any basic solution can be zero. This proves that all basic solutions to the $\epsilon$-perturbed problem, feasible or not, are nondegenerate. \qed
\vspace{1em} \noindent(c) A basic solution $\mathbf{x}_B(\epsilon)$ is feasible if all of its components are nonnegative. It suffices to show that, for $\epsilon$ positive and sufficiently small, every component is in fact strictly positive.
Let $\mathbf{w}'_i$ be the $i$-th row of the matrix $\mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{I}]$. The components of this vector are the coefficients of the polynomial $P_{B,i}(\epsilon) = (\mathbf{x}_B(\epsilon))_i$. Specifically, if we write $\mathbf{w}'_i = [w_{i0}, w_{i1}, \dots, w_{im}]$, then
$$ P_{B,i}(\epsilon) = w_{i0} + w_{i1}\epsilon + w_{i2}\epsilon^2 + \dots + w_{im}\epsilon^m. $$
The hypothesis is that each row vector $\mathbf{w}'_i$ is lexicographically positive. By definition, this means that for each $i \in \{1, \dots, m\}$, the first nonzero component of $\mathbf{w}'_i$ is positive. Let $k_i \in \{0, 1, \dots, m\}$ be the index of the first nonzero component of $\mathbf{w}'_i$. The hypothesis implies $w_{ik_i} > 0$. We can now rewrite the polynomial by factoring out the lowest power of $\epsilon$:
$$ P_{B,i}(\epsilon) = \epsilon^{k_i} \left(w_{ik_i} + w_{i,k_i+1}\epsilon + w_{i,k_i+2}\epsilon^2 + \dots + w_{im}\epsilon^{m-k_i}\right). $$
We analyze the sign of this expression for a sufficiently small, positive $\epsilon$:
\begin{itemize}
\item The term $\epsilon^{k_i}$ is strictly positive since $\epsilon > 0$.
\item For the term in parentheses, we consider its limit as $\epsilon \to 0^+$:
$$ \lim_{\epsilon \to 0^+} \left(w_{ik_i} + w_{i,k_i+1}\epsilon + \dots + w_{im}\epsilon^{m-k_i}\right) = w_{ik_i}. $$
Since $w_{ik_i} > 0$, by the definition of a limit, there exists some $\delta_i > 0$ such that the entire parenthetical expression is positive for all $0 < \epsilon < \delta_i$.
\end{itemize}
The product of two positive numbers is positive, so $P_{B,i}(\epsilon) > 0$ for all $0 < \epsilon < \delta_i$. This argument holds for each component $i=1, \dots, m$. To ensure all components are positive simultaneously, we can choose $\epsilon$ to be in the range $0 < \epsilon < \min\{\delta_1, \dots, \delta_m\}$. This minimum is a well-defined positive number. For such an $\epsilon$, every component of $\mathbf{x}_B(\epsilon)$ is positive, which proves that it is a basic feasible solution.
\qed \vspace{1em} \noindent (d) The core of the proof is to show that for any two rows $i$ and $k$ (with $u_i, u_k > 0$), the lexicographic comparison of their corresponding vectors is equivalent to the numerical comparison of their ratios in the perturbed problem, provided $\epsilon$ is sufficiently small. Let $\mathbf{w}'_i$ and $\mathbf{w}'_k$ be the $i$-th and $k$-th rows of the matrix $\mathbf{B}^{-1}[\mathbf{b} \mid \mathbf{I}]$, respectively. \underline{The Lexicographic Rule} chooses the exiting variable by finding the index $l$ that lexicographically minimizes the vector $\mathbf{w}'_i / u_i$ over the rows $i$ with $u_i > 0$. A comparison between row $i$ and row $k$ is determined by the sign of the first nonzero component of the vector difference:
$$ \frac{\mathbf{w}'_i}{u_i} - \frac{\mathbf{w}'_k}{u_k}. $$
\underline{The Standard Simplex Rule} applied to the $\epsilon$-perturbed problem chooses the exiting variable by finding the index $l$ that minimizes the scalar ratio $(\mathbf{x}_B(\epsilon))_i / u_i$ over the same rows. A comparison between row $i$ and row $k$ is determined by the sign of the scalar difference:
$$ \frac{(\mathbf{x}_B(\epsilon))_i}{u_i} - \frac{(\mathbf{x}_B(\epsilon))_k}{u_k}. $$
As established in part (a), $(\mathbf{x}_B(\epsilon))_i$ is a polynomial in $\epsilon$ whose coefficients are the components of the vector $\mathbf{w}'_i$. Let $\mathbf{w}'_i = [w_{i0}, w_{i1}, \dots, w_{im}]$. Then:
$$ (\mathbf{x}_B(\epsilon))_i = w_{i0} + w_{i1}\epsilon + w_{i2}\epsilon^2 + \dots + w_{im}\epsilon^m. $$
Consider the scalar difference, which is itself a polynomial in $\epsilon$:
\begin{align*}
P(\epsilon) &= \frac{(\mathbf{x}_B(\epsilon))_i}{u_i} - \frac{(\mathbf{x}_B(\epsilon))_k}{u_k} \\
&= \left(\frac{w_{i0}}{u_i} - \frac{w_{k0}}{u_k}\right) + \left(\frac{w_{i1}}{u_i} - \frac{w_{k1}}{u_k}\right)\epsilon + \dots + \left(\frac{w_{im}}{u_i} - \frac{w_{km}}{u_k}\right)\epsilon^m.
\end{align*}
For $i \neq k$, the polynomial $P(\epsilon)$ is not identically zero: otherwise rows $i$ and $k$ of $\mathbf{B}^{-1}$ would be proportional, contradicting the invertibility of $\mathbf{B}^{-1}$. For a sufficiently small $\epsilon > 0$, the sign of a non-zero polynomial is determined by the sign of its lowest-order non-zero coefficient. Let $j^*$ be the index of the first non-zero coefficient of $P(\epsilon)$. Then for small $\epsilon$, $\text{sign}(P(\epsilon)) = \text{sign}\left(\frac{w_{ij^*}}{u_i} - \frac{w_{kj^*}}{u_k}\right)$. \vspace{7pt} \noindent This demonstrates the following equivalence:
\begin{itemize}
\item The coefficients of the polynomial $P(\epsilon)$ are the components of the vector difference $\dfrac{\mathbf{w}'_i}{u_i} - \dfrac{\mathbf{w}'_k}{u_k}$.
\item The lexicographic comparison depends on the sign of the first non-zero component of this vector difference.
\item The numerical comparison in the perturbed problem depends on the sign of the polynomial $P(\epsilon)$, which for small $\epsilon$ is determined by the sign of its first non-zero coefficient.
\end{itemize}
Therefore, the outcome of the lexicographic comparison between the vectors is identical to the outcome of the numerical comparison between the scalar ratios for a sufficiently small $\epsilon$. Since only finitely many pairs of rows are compared, a single sufficiently small $\epsilon$ works for all of them. This proves that both rules result in the same choice of exiting variable. \hfill \qed
\vspace{1em} \noindent (e) The revised simplex method, when applied to a nondegenerate problem, is guaranteed to terminate: the cost strictly decreases at every iteration, so no basis is ever repeated, and there are only finitely many bases. Cycling, and hence non-termination, is only possible in the presence of degeneracy. As shown in part (b), for a sufficiently small $\epsilon > 0$, the $\epsilon$-perturbed problem is guaranteed to be nondegenerate. Therefore, the standard simplex method is guaranteed to terminate when applied to the $\epsilon$-perturbed problem. Part (d) establishes the equivalence of the lexicographic pivoting rule applied to the original problem and the standard simplex method applied to the nondegenerate $\epsilon$-perturbed problem.
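Because there are only finitely many bases, and for each basis only finitely many row comparisons, a single threshold on $\epsilon$ can be chosen that works at every iteration. Therefore, the revised simplex method with the lexicographic rule executes the same pivots as an algorithm that is known to terminate, and it inherits the termination guarantee of the $\epsilon$-perturbation method. \qed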
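
As an illustration of part (d), here is a minimal Python sketch (not part of the exercise) that compares the lexicographic ratio test with the ordinary minimum-ratio test on the $\epsilon$-perturbed problem. The function names `lex_ratio_test` and `perturbed_ratio_test`, and the small degenerate example with $\mathbf{B} = \mathbf{I}$, are assumptions made for illustration only. Exact rational arithmetic is used so that tiny powers of $\epsilon$ are not lost to rounding.

```python
from fractions import Fraction


def lex_ratio_test(rows, u):
    """Lexicographic rule: rows are the rows of B^{-1}[b | I], u = B^{-1} A_j.

    Returns the index of the exiting row, or None if no u_i is positive.
    """
    best_index, best_row = None, None
    for i, (row, u_i) in enumerate(zip(rows, u)):
        if u_i <= 0:
            continue  # only rows with u_i > 0 take part in the test
        scaled = tuple(Fraction(x) / Fraction(u_i) for x in row)
        # Python compares tuples lexicographically, entry by entry
        if best_row is None or scaled < best_row:
            best_index, best_row = i, scaled
    return best_index


def perturbed_ratio_test(rows, u, eps):
    """Ordinary minimum-ratio test applied to x_B(eps) = rows @ (1, eps, ..., eps^m)."""
    best_index, best_ratio = None, None
    for i, (row, u_i) in enumerate(zip(rows, u)):
        if u_i <= 0:
            continue
        # i-th basic variable of the perturbed problem, a polynomial in eps
        x_i = sum(Fraction(c) * eps**k for k, c in enumerate(row))
        ratio = x_i / Fraction(u_i)
        if best_ratio is None or ratio < best_ratio:
            best_index, best_ratio = i, ratio
    return best_index


# Hypothetical degenerate example: B = I and b = (0, 0, 1), so the rows of
# B^{-1}[b | I] are lexicographically positive but two basic variables are zero.
rows = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
]
u = [1, 2, 1]

print(lex_ratio_test(rows, u))                         # 1
print(perturbed_ratio_test(rows, u, Fraction(1, 10)))  # 1
```

In this example the unperturbed ratios for rows 0 and 1 are both zero, so the usual ratio test is ambiguous, while both rules above pick row 1. The value $\epsilon = 1/10$ happens to be small enough here; in general the threshold depends on the data.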
