Say we have a multivariate function $f(x, y, z)$ to optimise under some constraints, say $g(x, y, z)$ and $h(x, y, z)$, both to equal zero.

The common method is that of Lagrange multipliers.

  1. Add a variable for each constraint function — here, we'll use $\lambda$ and $\mu$.
  2. Declare the set of equations $\nabla f = \lambda \nabla g + \mu \nabla h$.
  3. Bring in the equations $g = 0$ and $h = 0$ (etc., if there are more constraints).
  4. Solve for $\lambda$ and $\mu$ and, more importantly, the inputs $x$, $y$, $z$. (A SymPy sketch of these steps follows this list.)
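
Here is a minimal SymPy sketch of those four steps. The objective and constraints in it ($f = x^2 + y^2 + z^2$, $g = x + y + z - 1$, $h = x - y$) are stand-ins chosen just for illustration.

```python
import sympy as sp

x, y, z, lam, mu = sp.symbols('x y z lambda mu', real=True)

f = x**2 + y**2 + z**2   # function to optimise (a stand-in)
g = x + y + z - 1        # first constraint, g = 0 (a stand-in)
h = x - y                # second constraint, h = 0 (a stand-in)

def grad(F):
    return sp.Matrix([sp.diff(F, v) for v in (x, y, z)])

# Step 2: grad f = lambda * grad g + mu * grad h  (three scalar equations).
multiplier_eqs = list(grad(f) - lam * grad(g) - mu * grad(h))
# Step 3: the constraints themselves.
constraint_eqs = [g, h]

# Step 4: solve for lambda, mu and, more importantly, x, y, z.
print(sp.solve(multiplier_eqs + constraint_eqs, [x, y, z, lam, mu], dict=True))
# -> x = y = z = 1/3, with lambda = 2/3 and mu = 0
```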

Lagrange multipliers annoy me, insofar as they introduce extra variables. There is another way — arguably more direct, if perhaps more tedious in calculation and less often taught. I found it alone, tho surely someone else did first — probably Euler.

Lagrange, anyway

For the sake of a standard answer to check against, let's use Lagrange multipliers.

Write down the gradients $\nabla f$, $\nabla g$, and $\nabla h$; step 2 then gives these equations:

It readily follows that  or .

If , then , and . By the second constraint, , find that . By the first constraint, , find that , which is a contradiction for real inputs.

If , then, by the first constraint, , and, by the second constraint, , so  and .

Determinants

With one constraint, the method of Lagrange multipliers reduces to $\nabla f = \lambda \nabla g$. Now $\nabla f$ and $\nabla g$ are vectors, which differ by a scalar factor iff they point in the same (or directly opposite) directions, iff (for three dimensions) the cross product $\nabla f \times \nabla g = 0$, iff (for two dimensions) the two-by-two determinant $\det\!\left(\nabla f \;\; \nabla g\right) = 0$.
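
As an illustration of the two-dimensional version, here is a SymPy sketch on a stand-in problem (optimise $f = xy$ subject to $g = x + y - 2 = 0$, chosen just for illustration): the only equations are the constraint and the two-by-two determinant.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x * y       # objective (a stand-in)
g = x + y - 2   # constraint g = 0 (a stand-in)

grad_f = sp.Matrix([sp.diff(f, v) for v in (x, y)])
grad_g = sp.Matrix([sp.diff(g, v) for v in (x, y)])

# No multiplier: just require det(grad f | grad g) = 0 alongside g = 0.
det_condition = sp.Matrix.hstack(grad_f, grad_g).det()   # y - x
print(sp.solve([det_condition, g], [x, y], dict=True))   # -> [{x: 1, y: 1}]
```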

With two constraints, the method asks when $\nabla f = \lambda \nabla g + \mu \nabla h$. That would mean $\nabla f$ is a linear combination of $\nabla g$ and $\nabla h$, which it is iff $\nabla f$, $\nabla g$, and $\nabla h$ are all coplanar, iff (for three dimensions) the three-by-three determinant $\det\!\left(\nabla f \;\; \nabla g \;\; \nabla h\right) = 0$.
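
And the three-dimensional, two-constraint version, on the same stand-in problem as the sketch above: the three equations are the determinant, $g = 0$, and $h = 0$, with no multipliers anywhere.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
f = x**2 + y**2 + z**2   # objective (same stand-in as before)
g = x + y + z - 1        # first constraint (stand-in)
h = x - y                # second constraint (stand-in)

def grad(F):
    return sp.Matrix([sp.diff(F, v) for v in (x, y, z)])

# The single scalar condition: the three gradients are coplanar.
det_condition = sp.Matrix.hstack(grad(f), grad(g), grad(h)).det()
print(sp.expand(det_condition))                        # 2*x + 2*y - 4*z
print(sp.solve([det_condition, g, h], [x, y, z], dict=True))
# -> x = y = z = 1/3, the same point the multipliers found
```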

As it happens, the cross product is a wolf that can wear determinant's clothing. Just fill one column with basis vectors: $\nabla f \times \nabla g = \det\!\begin{pmatrix} \hat e_1 & f_x & g_x \\ \hat e_2 & f_y & g_y \\ \hat e_3 & f_z & g_z \end{pmatrix}$, where $f_x$, $f_y$, $f_z$ are the partial derivatives of $f$, and likewise for $g$.
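
A quick SymPy check of that identity, with plain commuting symbols standing in for the basis vectors (harmless here, since no two basis vectors are ever multiplied together in this determinant):

```python
import sympy as sp

e1, e2, e3 = sp.symbols('e1 e2 e3')              # placeholders for basis vectors
a1, a2, a3, b1, b2, b3 = sp.symbols('a1:4 b1:4')

# One column of basis vectors, then the two vectors being crossed.
M = sp.Matrix([[e1, a1, b1],
               [e2, a2, b2],
               [e3, a3, b3]])
det = sp.expand(M.det())

# Compare against the ordinary cross product, component by component.
cross = sp.Matrix([a1, a2, a3]).cross(sp.Matrix([b1, b2, b3]))
as_symbols = cross[0] * e1 + cross[1] * e2 + cross[2] * e3
print(sp.simplify(det - as_symbols))             # 0
```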

Likewise, with zero constraints, the "method of Lagrange multipliers" — really, the first-derivative test — asks when $\nabla f = 0$. Fill a three-by-three matrix with two columns of basis vectors: $\det\!\begin{pmatrix} f_x & \hat e_1 & \hat e_1 \\ f_y & \hat e_2 & \hat e_2 \\ f_z & \hat e_3 & \hat e_3 \end{pmatrix}$. Suppose the basis vectors multiply like the cross product, as in geometric algebra. Then the determinant, rather than the usual 0 for a matrix with two equal columns, turns out to equal that ordinary column vector $\nabla f$ (up to a scalar constant).
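
Here is one way to check that claim with SymPy, expanding the determinant by hand with the basis vectors multiplying via the cross product. The expansion below takes each term's factors column-first (the scalar entry, then the two basis vectors); that is one ordering convention, assumed here, under which the claim comes out, with the scalar constant being 2.

```python
import sympy as sp
from sympy.combinatorics import Permutation

fx, fy, fz = sp.symbols('f_x f_y f_z')
e = [sp.Matrix([1, 0, 0]), sp.Matrix([0, 1, 0]), sp.Matrix([0, 0, 1])]
f_col = [fx, fy, fz]   # first column: grad f; columns 2 and 3: basis vectors

# Leibniz expansion of det(grad f | e | e), products taken column by column,
# with the two basis-vector entries multiplied via the cross product.
total = sp.zeros(3, 1)
for perm in ([0, 1, 2], [1, 2, 0], [2, 0, 1], [0, 2, 1], [2, 1, 0], [1, 0, 2]):
    sign = Permutation(perm).signature()   # +1 or -1
    r1, r2, r3 = perm                      # rows picked in columns 1, 2, 3
    total += sign * f_col[r1] * e[r2].cross(e[r3])

print(total.T)   # -> [[2*f_x, 2*f_y, 2*f_z]], i.e. 2 times grad f
```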

In every scenario so far — and I claim this holds for higher dimensions and more constraints — the core equations to optimise under constraints are the actual constraint equations, along with a single determinant set equal to zero. The matrix has its columns filled with the gradient of the function to optimise, each constraint gradient, and copies of the basis vectors, in order, to make it square.
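
As a sketch of that recipe in the fully determined case ($n$ variables and $n - 1$ constraints, so the matrix is square without any basis-vector columns), here is a small SymPy helper; the usage example reuses the stand-in problem from earlier.

```python
import sympy as sp

def constrained_critical_points(f, constraints, variables):
    """Critical points of f subject to each constraint equalling zero.

    Assumes len(constraints) == len(variables) - 1, so that the matrix
    of gradients is square and a single determinant suffices.
    """
    grads = [sp.Matrix([sp.diff(F, v) for v in variables])
             for F in (f, *constraints)]
    det_eq = sp.Matrix.hstack(*grads).det()
    return sp.solve([det_eq, *constraints], list(variables), dict=True)

# Stand-in usage, same assumed example as earlier:
x, y, z = sp.symbols('x y z', real=True)
print(constrained_critical_points(x**2 + y**2 + z**2,
                                  [x + y + z - 1, x - y],
                                  (x, y, z)))
# -> [{x: 1/3, y: 1/3, z: 1/3}]
```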

§ Example

Fill a matrix with those gradients given above. We'll take its determinant.

The determinant, when simplified, is . The equations to consider are just

The first tells us that  or . If , so , so , and . If , then  and  is imaginary. These are the same results as above; the method works, using only the variables given in the problem.
