Commit 8105147

Small changes in admonitions

1 parent 95a8c8d commit 8105147

5 files changed: +290 −193 lines changed

docs/src/lecture_07/constrained.md

Lines changed: 47 additions & 44 deletions
```julia
g(x) = [cos(x[1] + x[2]) - 2*cos(x[1])*sin(x[1]); cos(x[1] + x[2])]
f(x1,x2) = f([x1;x2])
```

# [Constrained optimization](@id lagrangian)

The usual formulation of constrained optimization is

```math
\tag{P}
\begin{aligned}
\text{minimize}\qquad &f(x) \\
\text{subject to}\qquad &g_i(x) \le 0,\ i=1,\dots,I, \\
&h_j(x) = 0,\ j=1,\dots,J.
\end{aligned}
```
Functions ``g_i`` generate inequality constraints, while functions ``h_j`` generate equality constraints. Box constraints such as ``x\in[0,1]`` are the simplest case of the former. This optimization problem is also called the primal formulation. It is closely connected with the Lagrangian

```math
L(x;\lambda,\mu) = f(x) + \sum_{i=1}^I \lambda_i g_i(x) + \sum_{j=1}^J \mu_j h_j(x).
```

Namely, it is simple to show that the primal formulation (P) is equivalent to

```math
\operatorname*{minimize}_x\quad \operatorname*{maximize}_{\lambda\ge 0,\mu}\quad L(x;\lambda,\mu).
```

Indeed, the inner maximum equals ``f(x)`` whenever ``x`` is feasible and ``+\infty`` otherwise, so the outer minimization recovers (P). The dual problem then switches the minimization and maximization to arrive at

```math
\tag{D} \operatorname*{maximize}_{\lambda\ge 0,\mu} \quad\operatorname*{minimize}_x\quad L(x;\lambda,\mu).
```

Even though the primal and dual formulations are not generally equivalent, they are often used interchangeably.

!!! info "Linear programming"
    The linear program

    ```math
    \begin{aligned}
    \text{minimize}\qquad &c^\top x \\
    \text{subject to}\qquad &Ax=b, \\
    &x\ge 0
    \end{aligned}
    ```

    is equivalent to

    ```math
    \begin{aligned}
    \text{maximize}\qquad &b^\top \mu \\
    \text{subject to}\qquad &A^\top \mu\le c.
    \end{aligned}
    ```

    We can observe several things:
    1. Primal and dual problems switch minimization and maximization.
    2. Primal and dual problems switch variables and constraints.
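Weak duality makes this relationship easy to check numerically: for any primal-feasible ``x`` and dual-feasible ``\mu``, we have ``b^\top \mu \le c^\top x``. A minimal sketch (the data `A`, `b`, `c` and the two feasible points below are made-up illustrations, not from the lecture):

```julia
# Tiny illustrative LP: minimize 2x₁ + 3x₂  s.t.  x₁ + x₂ = 1, x ≥ 0.
A = [1.0 1.0]
b = [1.0]
c = [2.0, 3.0]

x = [1.0, 0.0]    # primal feasible: A*x == b and x .>= 0
μ = [2.0]         # dual feasible:   A'*μ .<= c

# Weak duality: the dual objective never exceeds the primal objective.
@assert A*x == b && all(x .>= 0)
@assert all(A'*μ .<= c)
@assert b'*μ <= c'*x
```

Here both objectives equal 2, so the duality gap is zero, as strong duality for linear programs predicts.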
For unconstrained optimization, we showed that each local minimum satisfies the optimality condition ``\nabla f(x)=0``. This condition does not need to hold for constrained optimization, where the optimality conditions take a more complex form.

!!! theorem "Theorem: Karush-Kuhn-Tucker conditions"
    Let ``f``, ``g_i`` and ``h_j`` be differentiable functions and let a constraint qualification hold. If ``x`` is a local minimum of the primal problem (P), then there are ``\lambda\ge 0`` and ``\mu`` such that

    ```math
    \begin{aligned}
    &\text{Optimality:} && \nabla_x L(x;\lambda,\mu) = 0, \\
    &\text{Feasibility:} && \nabla_\lambda L(x;\lambda,\mu)\le 0,\ \nabla_\mu L(x;\lambda,\mu) = 0, \\
    &\text{Complementarity:} && \lambda^\top g(x) = 0.
    \end{aligned}
    ```

    If ``f`` and ``g`` are convex and ``h`` is linear, then every stationary point is a global minimum of (P).
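The three conditions can be verified by hand on a one-dimensional toy problem (an illustration, not from the lecture): for minimizing ``(x-2)^2`` subject to ``x-1\le 0``, the constrained minimum ``x=1`` satisfies the KKT system with multiplier ``\lambda=2``:

```julia
# KKT check for the toy problem: minimize (x-2)²  s.t.  g(x) = x - 1 ≤ 0.
f(x) = (x - 2)^2
g(x) = x - 1
∇f(x) = 2*(x - 2)
∇g(x) = 1.0

x, λ = 1.0, 2.0    # candidate minimum and its multiplier

@assert ∇f(x) + λ*∇g(x) == 0    # optimality: ∇ₓL(x; λ) = 0
@assert g(x) <= 0 && λ >= 0     # feasibility
@assert λ*g(x) == 0             # complementarity (constraint is active)
```

Since ``f`` and ``g`` are convex, the theorem guarantees this stationary point is the global minimum.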
When there are no constraints, the Lagrangian ``L`` reduces to the objective ``f``, and the optimality conditions are equivalent. Therefore, the optimality conditions for constrained optimization generalize those for unconstrained optimization.
## Numerical method

We present only the simplest method for constrained optimization. Projected gradients

```math
\begin{aligned}
y^{k+1} &= x^k - \alpha^k\nabla f(x^k), \\
x^{k+1} &= P_X(y^{k+1})
\end{aligned}
```

compute the gradient as for standard gradient descent, and then project the point onto the feasible set. Since the projection needs to be simple to calculate, projected gradients are used for simple ``X`` such as boxes or balls.
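For a box such as the ``[-1,1]^2`` used below, the projection reduces to componentwise clipping. A possible one-line sketch (the name `P` matches the projection input mentioned later; this is an assumption, not necessarily the lecture's exact definition):

```julia
# Projection onto the box [-1,1]²: clamp each coordinate into [-1,1].
P(x) = clamp.(x, -1, 1)

P([2.0, -0.3])   # → [1.0, -0.3]
```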
We will use projected gradients to solve

```math
\begin{aligned}
\text{minimize}\qquad &\sin(x_1 + x_2) + \cos(x_1)^2 \\
\text{subject to}\qquad &x_1, x_2\in [-1,1].
\end{aligned}
```

The implementation of projected gradients is the same as for gradient descent, but it needs the projection function `P` as an additional input. For plotting purposes, it returns both ``x`` and ``y``.

```@example optim
function optim(f, g, P, x, α; max_iter=100)
xs = zeros(length(x), max_iter+1)
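    # The rest of this function is cut off in the diff view; the following
    # completion is an assumption based on the projected-gradient scheme
    # described above (gradient step, then projection by P).
    ys = zeros(length(x), max_iter)
    xs[:,1] = x
    for i in 1:max_iter
        ys[:,i] = xs[:,i] - α*g(xs[:,i])   # gradient step
        xs[:,i+1] = P(ys[:,i])             # projection onto the feasible set
    end
    return xs, ys
end
```

This completion is a sketch; the original implementation may differ in details such as the exact history it stores.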
