Pivoting and Simplex Algorithm

I would like to understand exactly how pivoting works geometrically in the simplex algorithm. What is meant geometrically by moving a vector into the BFS and moving one out? Also, what is the geometric implication of new variables introduced into the linear program?

It would be really helpful if anybody could point me to a source where I can read about this in detail, with geometric diagrams for understanding these ideas.

Thanks in advance.


>Also, what is the geometric implication of new variables introduced into the linear program?

Well, if by that you mean the changes made to the linear programming problem to put it in standard form, the point is that the presence of $m$ independent equations reduces the dimension of the feasible set to at most $n-m$ (exactly $n-m$ if there exists $x$ such that $Ax=b$). Equations force solutions to be on the frontier (precisely, the feasible set of an L.P. problem is necessarily a polyhedron $P$, since it is determined by a finite intersection of half-spaces; it can also easily be proved that at least one optimal solution is a vertex of $P$).
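
For concreteness, here is a minimal sketch (my own illustration; NumPy assumed, the helper name `to_standard_form` and the data are made up) of how inequalities become equalities by appending slack variables. Geometrically, each slack measures how far the point is from the corresponding constraint hyperplane; a zero slack means the point lies on that facet.

```python
# A minimal sketch (illustrative only) of the standard-form step:
# min c.x s.t. A_ub x <= b_ub, x >= 0  becomes
# min c'.x' s.t. A x' = b, x' >= 0  by appending one slack per inequality.
import numpy as np

def to_standard_form(c, A_ub, b_ub):
    m, n = A_ub.shape
    A = np.hstack([A_ub, np.eye(m)])          # slack variables get identity columns
    c_std = np.concatenate([c, np.zeros(m)])  # slacks have zero cost
    return c_std, A, b_ub

# Two inequalities in two variables -> a 2x4 equality system.
c = np.array([-1.0, -1.0])
A_ub = np.array([[1.0, 2.0], [3.0, 1.0]])
b_ub = np.array([4.0, 6.0])
c_std, A, b = to_standard_form(c, A_ub, b_ub)
```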

As for the rest, having set up the starting problem (taking a minimization as the objective),

$\min\, cx$

$Ax = b$

$x \ge 0$

where $A$ has $m$ linearly independent rows and $n$ columns, with $n\ge m$, we consider a subset $\beta \subset \{1,\dots,n\}$ of $m$ elements (called the basis) such that the corresponding $m$ columns are linearly independent, so that they form a nonsingular square submatrix of $A$, call it $B$; moreover we consider the complement of $\beta$ in $\{1,\dots,n\}$, call it $\eta$, and its corresponding matrix, call it $N$. That said, the initial system of equations can be rewritten as

$\begin{pmatrix} B & N \end{pmatrix}\begin{pmatrix} x_{\beta} \\ x_{\eta} \end{pmatrix} = b$

from which we obtain an expression for the variables in the basis as a function of those outside the basis.

$ Bx_{\beta}+Nx_{\eta}=b \implies x_{\beta}=B^{-1}b-B^{-1}Nx_{\eta}$

For each basis $\beta$, a solution of the following type

$ x_{\beta}:= B^{-1}b$

$ x_{\eta}:=0$

is called a basic solution. If $x_{\beta}\ge 0$ we say that the basic solution, the basis matrix, and the basis are feasible. If $x_{\beta}> 0$ we say they are feasible and nondegenerate. If instead some component of $x_{\beta}$ is equal to $0$, they are called degenerate. (I do not know whether you are interested in the issue of degeneracy.)
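
As a purely illustrative numerical sketch of these definitions (made-up data), one picks a basis $\beta$, solves $Bx_{\beta}=b$, and sets the nonbasic variables to zero:

```python
# Illustrative sketch of a basic solution: pick m linearly independent
# columns, solve B x_beta = b, set the remaining variables to zero.
import numpy as np

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])   # m = 2, n = 4
b = np.array([4.0, 6.0])

beta = [2, 3]                           # basis: here, the two slack columns
B = A[:, beta]
x_beta = np.linalg.solve(B, b)          # x_beta = B^{-1} b
x = np.zeros(A.shape[1])
x[beta] = x_beta                        # nonbasic components stay at zero

# Feasible iff x_beta >= 0; nondegenerate if strictly positive.
print(x, np.all(x_beta >= 0))           # [0. 0. 4. 6.] True
```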

Briefly, there is a necessary and sufficient condition for optimality. Sufficiency is obtained starting from a bfs, whose value is

$ f(\overline{x})=c_{\beta}\overline{x}_{\beta}+c_{\eta}\overline{x}_{\eta}$

and from the value of a generic feasible solution $(x_{\beta},x_{\eta})$,

$f(x) = c_{\beta}x_{\beta}+c_{\eta}x_{\eta}=f(\overline{x})+\overline{c}_{\eta}x_{\eta}$

where we set $\overline{c}_{\eta}:=c_{\eta}-c_{\beta}B^{-1}N$. Now, the feasible set $F$ may also be represented as

$F=\{x=(x_{\beta},x_{\eta}): x_{\beta}=\overline{x}_{\beta}-B^{-1}Nx_{\eta},x_{\beta}\geq 0,x_{\eta}\geq 0\}$

so that the L.P. problem under consideration is equivalent to

$\min \overline{c}_{\eta}x_{\eta}$

$ B^{-1}Nx_{\eta}\leq \overline{x}_{\beta} \ \ \ \ \ (1)$

$x_{\eta}\ge 0$

In $(1)$ the solution $x_{\eta}=0$ is always feasible; therefore if $\overline{c}_{\eta}\geq 0$, the optimal value of $(1)$ is $0$, and then, in that case, $x_{\eta}=0$ is optimal in $(1)$ and consequently $(\overline{x}_{\beta},\overline{x}_{\eta})$ is optimal in the problem under consideration. So the sufficiency condition says that if $\overline{c}_{\eta}\geq 0$, the bfs is optimal, and the basis is called an optimal basis. The vector $\overline{c}_{\eta}$ is referred to as the reduced cost. It can be seen that, since $c_{\beta}-c_{\beta}B^{-1}B=0$, the stated condition can be rewritten as $c - c_{\beta}B^{-1}A\geq 0$ and the reduced cost redefined as

$\overline{c}:=c-c_{\beta}B^{-1}A$.
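
A small sketch of this optimality test on the same made-up data as before; here `y` stands for $c_{\beta}B^{-1}$:

```python
# Sketch of the optimality test via reduced costs (illustrative data).
import numpy as np

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
c = np.array([-1.0, -1.0, 0.0, 0.0])
beta = [2, 3]

B = A[:, beta]
y = np.linalg.solve(B.T, c[beta])   # solves B^t y = c_beta, i.e. y = (c_beta B^{-1})^t
c_bar = c - y @ A                   # reduced costs; zero on the basic columns

optimal = bool(np.all(c_bar >= -1e-12))  # c_bar >= 0  <=>  the bfs is optimal
p = int(np.argmin(c_bar))                # otherwise: an entering index with c_bar_p < 0
print(c_bar, optimal, p)                 # [-1. -1.  0.  0.] False 0
```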

If the basis is nondegenerate and there is an index $p$ such that $\overline{c}_{p}<0$, then, as said, the basic solution in question is not optimal. So we have to find a way to switch to another feasible basis with a lower objective value. To make such a change of basis we consider the following problem (where $\overline{a}:=B^{-1}A^{p}$, with $A^{p}$ the $p$-th column of $A$):

$\min \overline{c}_{p}x_{p}$

$\overline{a}x_{p}\leq \overline{x}_{\beta} \ \ \ \ \ \ \ \ \ \ \ \ (2)$

$x_{p}\ge 0$

This is obtained from $(1)$ by setting all components of $x_{\eta}$, except the $p$-th, equal to zero. So $(2)$ has the same objective function as $(1)$ but a smaller feasible set, and the optimal solutions of $(2)$ are feasible solutions of $(1)$. If in particular the optimal value of $(2)$ is negative, the optimal solution of $(2)$ is a feasible solution of $(1)$ with lower value. The optimum is then computed as

$ x_{p}^{\ast}:=\min \left\{\dfrac{\overline{x}_{\beta (i)}}{\overline{a}_{i}}: 1\leq i \leq m, \overline{a}_{i}> 0 \right\}=:\dfrac{\overline{x}_{\beta (q)}}{\overline{a}_{q}}$
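
A sketch of this minimum-ratio computation (illustrative data again); the branch where no $\overline{a}_{i}>0$ is the unbounded case discussed next:

```python
# Sketch of the minimum-ratio test.
import numpy as np

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
beta, p = [2, 3], 0                     # current basis and entering index

B = A[:, beta]
x_bar = np.linalg.solve(B, b)           # current basic values B^{-1} b
a_bar = np.linalg.solve(B, A[:, p])     # a_bar = B^{-1} A^p

pos = a_bar > 1e-12                     # only rows with a_bar_i > 0 bound the step
if not pos.any():
    print("unbounded along this direction")
else:
    ratios = np.where(pos, x_bar / np.where(pos, a_bar, 1.0), np.inf)
    q = int(np.argmin(ratios))          # leaving position beta(q)
    x_p_star = float(ratios[q])         # the step length x_p^*
    print(q, x_p_star)                  # 1 2.0
```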

If, however, $\overline{a}_{i}\leq 0$ for all $i$, then $(2)$ is an unbounded problem, and a fortiori so are $(1)$ and the original problem. If instead there is at least one index $i$ such that $\overline{a}_{i}> 0$, the new solution obtained is given by

$x'_{\beta(i)}:=\overline{x}_{\beta(i)}-\overline{a}_{i}x_{p}^{\ast}=\overline{x}_{\beta(i)}-\overline{a}_{i}\dfrac{\overline{x}_{\beta(q)}}{\overline{a}_{q}}\geq 0, \quad i=1,\dots,m$

$x'_{p}:=x_{p}^{\ast}=\dfrac{\overline{x}_{\beta(q)}}{\overline{a}_{q}}\geq 0 \qquad\qquad\qquad\qquad\qquad\qquad (3)$

$x'_{j}:=\overline{x}_{j}=0, \quad j\in\eta\setminus\{p\}$

Since $x'_{\beta(q)}=0$, we obtain a solution with $n-m$ components equal to zero and the others greater than or equal to zero. If the columns $A^{j}$ with $j \in (\beta \setminus \{\beta(q)\}) \cup \{p\}$ are linearly independent, then the new solution is basic. (To see this, note that $B^{-1}(B \ \vdots \ A^{p})=(I \ \vdots \ \overline{a})$, and since $\overline{a}_{q}\neq 0$, replacing the $q$-th column of the identity matrix by $\overline{a}$ still leaves a nonsingular matrix.) The solution of $(2)$ is then a bfs, obtained through $(3)$. At this point one checks whether the new basic solution is optimal and, if not, performs a further change of basis. If no basis is degenerate, it follows that in a finite number of steps we either reach an optimal basis or determine that the problem is unbounded. Indeed, if no basis is degenerate, the value of the objective function decreases by a strictly positive amount at each basis change (see the definition of the new solution above and what was said previously), and therefore no basis appears twice.

We have thus outlined the simplex algorithm (assuming no degeneracy). The simplex method consists in implementing the algorithm just described as efficiently as possible. There are essentially two criteria for choosing the entering index. The first is to choose $p$ such that

$ \overline{c}_{p}=\min_{j\in \eta} \overline{c}_{j} $

However, this criterion does not necessarily yield the greatest decrease in the cost, since the decrease is given by $\overline{c}_{p}x_{p}^{\ast}$, and there may therefore be an index $j$ such that $\overline{c}_{p}x_{p}^{\ast}>\overline{c}_{j}x_{j}^{\ast}$.

The second consists in choosing the first index for which the reduced cost turns out to be negative. In this case the decrease of the cost at each iteration may not be significant, but the computational saving within each iteration can be considerable, particularly in the presence of many variables.
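
On a reduced-cost vector, the two criteria might look like this (illustrative numbers; note that the second criterion, taken with the smallest such index, is also the heart of Bland's anti-cycling rule):

```python
# The two entering-index criteria as one-liners on a reduced-cost vector
# (illustrative numbers; assumes at least one negative entry exists).
import numpy as np

c_bar = np.array([0.0, -0.1, 0.5, -2.0])
tol = 1e-12

p1 = int(np.argmin(c_bar))            # criterion 1: most negative  -> 3
p2 = int(np.argmax(c_bar < -tol))     # criterion 2: first negative -> 1
```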

Once the element entering the basis has been chosen, the element leaving the basis is automatically determined by the algorithm, provided there is no degeneracy. In addition, every change of basis requires computing the inverse of the new basis matrix.
If done directly, this would cost $O(n^3)$. It can, however, be reduced to $O(n^2)$ by using the inverse of the preceding basis matrix. Let $B$ and $\overline{B}$ be two basis matrices which differ by one column, say the $q$-th. Write $W=B^{-1}$ and $\overline{W}=\overline{B}^{\,-1}$.
Denote by $e_{q}$ the column vector that is all zeros except for a $1$ in the $q$-th position, and by $e_{q}^{t}$ its transpose. So if $a$ is a generic column vector, $ae_{q}^{t}$ is the zero matrix except for its $q$-th column, which equals the column vector $a$. Denoting by $B^{q}$ the $q$-th column of $B$, we have

$\overline{B}=B-B^{q}e_{q}^{t}+A^{p}e_{q}^{t}$

which expresses the fact that the $q$-th column of the basis matrix has been replaced by $A^{p}$. Also $B^{q}=Be_{q}$, and then

$\overline{B}=B(I-e_{q}e_{q}^{t}+B^{-1}A^{p}e_{q}^{t})=B(I-(e_{q}-\overline{a})e_{q}^{t})$

Making use of the formula (valid for generic column vectors $a$ and $b$ with $b^{t}a\neq 1$)

$( I-ab^{t})^{-1}=I+\dfrac{ab^{t}}{1-b^{t}a}$

we obtain

$\overline{W}:= \left(I+\dfrac{(e_{q}-\overline{a})e_{q}^{t}}{1-e_{q}^{t}(e_{q}-\overline{a})}\right)W$

Denoting by $W_{q}$ the $q$-th row of $W$ and noting that $1-e_{q}^{t}(e_{q}-\overline{a})=\overline{a}_{q}$, we have

$\overline{W}:=W+\dfrac{1}{\overline{a}_{q}}(e_{q}-\overline{a})W_{q}$

that is, componentwise,

$\overline{W}_{qj}=\dfrac{W_{qj}}{\overline{a}_{q}}, \qquad \overline{W}_{ij}=W_{ij}-\dfrac{\overline{a}_{i}}{\overline{a}_{q}}W_{qj} \ \ (i\neq q) \qquad\qquad (4)$

The update of the inverse of the basis matrix is therefore obtained directly from the inverse of the current basis matrix via the formulas $(4)$. This operation is called pivoting because each computation involves four elements of the array, with indices $(i,j)$, $(q,j)$, $(i,p)$, $(q,p)$, arranged in a rectangle in the matrix. The last of these, $(q,p)$, remains fixed, as if it were a pivot.
Furthermore, one easily recovers the formulas $(3)$ directly from

$ x^{\ ‘}_{\beta}:=\overline{B}^{\ -1}b= \left(I+\dfrac{(e_{q}-\overline{a})e_{q}^{t}}{1-e_{q}^{t}(e_{q}-\overline{a})}\right)x_{\beta}$

As the iterations proceed, the accumulation of rounding errors, inevitable in these formulas, can make $W$ differ from $B^{-1}$ enough to qualitatively alter the optimality test and the computation of $\overline{a}$. $W$ should therefore be recomputed periodically, directly as the inverse of $B$.
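
Here is a quick numerical check (random illustrative data) of the $O(n^{2})$ update $(4)$ against the $O(n^{3})$ recomputation it replaces:

```python
# Numerical check of the rank-one inverse update against a direct inverse.
import numpy as np

m = 3
rng = np.random.default_rng(0)
B = rng.standard_normal((m, m))         # current basis matrix
W = np.linalg.inv(B)                    # W = B^{-1}

Ap = rng.standard_normal(m)             # entering column A^p
a_bar = W @ Ap                          # a_bar = B^{-1} A^p
q = int(np.argmax(np.abs(a_bar)))       # leaving position; pivot a_bar_q != 0

# W_new = W + (1/a_bar_q) (e_q - a_bar) W_q, with W_q the q-th row of W
e_q = np.zeros(m); e_q[q] = 1.0
W_new = W + np.outer(e_q - a_bar, W[q]) / a_bar[q]

B_new = B.copy(); B_new[:, q] = Ap      # basis with q-th column replaced by A^p
print(np.allclose(W_new, np.linalg.inv(B_new)))   # True
```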

Finally, here is a quick overview clarifying the link between the algebraic and the geometric aspects. We state the following results without proof. Let therefore

$ F:=\{x\in \mathbb{R}^{n}:Ax=b,x\geq \ 0\}$

This is a polyhedron containing no lines, so if a finite optimal solution exists, at least one vertex must be optimal. It remains, therefore, to characterize the vertices of $F$.

(theorem) $\overline{x}$ is a vertex of $F$ iff the columns of $A$ corresponding to the strictly positive components of $\overline{x}$ are linearly independent.

(corollary) If $\overline{x}$ is a vertex, at least $n-m$ components of $\overline{x}$ must be zero.

(corollary) Let the rows of $A$ be linearly independent. Then $\overline{x}$ is a vertex of $F$ iff $\overline{x}$ is a basic feasible solution.

(thm) If there is no degeneracy, the correspondence between vertices of $F$ and basic feasible solutions is bijective.

(lemma) Let the rows of $A$ be linearly independent. If $\beta^{1}$ and $\beta^{2}$ are two bases, then

$\qquad\qquad \dim \left[ \ker(A(\beta^{1}\cup \beta^{2} ))\right]=\left | \beta^{1} \cup \beta^{2} \right | - m$

and in particular if $\beta^{1}$ and $\beta^{2}$ are adjacent (i.e. they differ by exactly one element, that is, $|\beta^{1} \cup \beta^{2}|=m+1$) we have $\dim [\ker (A(\beta^{1} \cup \beta^{2}))]=1$.
[$A(\alpha)$, $\alpha \subset \{1,\dots,n\}$, denotes the matrix formed by the columns of $A$ whose indices lie in $\alpha$, ordered arbitrarily.]

(lemma) Let $\beta^{1}$ and $\beta^{2}$ be adjacent feasible bases with nondegenerate basic solutions $x^1$ and $x^2$. Then $x$ satisfies $Ax=b$, $x\geq 0$, and $x_j =0$ for $j\notin \beta^1 \cup \beta^2$ iff $x$ is a convex combination of $x^1$ and $x^2$.

(thm.) Let $\beta^{1}$ and $\beta^{2}$ be two nondegenerate feasible bases. The two bases are adjacent iff the corresponding vertices $x^1$ and $x^2$ are adjacent.

From these results we obtain the geometric interpretation of the simplex method: the optimal solution is sought by passing from one vertex to an adjacent one, 'moving' along the edges of the polyhedron.
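
For instance, in the toy problem used in the sketches above ($\min\, -x_{1}-x_{2}$ subject to $x_{1}+2x_{2}\leq 4$, $3x_{1}+x_{2}\leq 6$, $x\geq 0$), the iterations visit the vertices $(0,0)\to(2,0)\to(8/5,\,6/5)$ of the feasible polygon, and each basis change (one variable entering the bfs, one leaving) is exactly one move along an edge: this is the geometric content of pivoting.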