2018/07/31 21:54

# CVX notes

[TOC]

## Preliminaries

### 1. PSD

M is positive semidefinite matrix $\iff$ all principal submatrices $P$ of $M$ are PSD

Note: This follows by considering the quadratic form $x^T Mx$ and looking at the components of $x$ corresponding to the defining subset of principal submatrix. The converse is trivially true.

M is PSD $\iff$ all principal minors are non-negative (所有主子式非负)

### 2. Matrix norm

General definition of a norm:![2018-06-05 20-39-34 的屏幕截图](/home/yychi/Pictures/2018-06-05 20-39-34 的屏幕截图.png)

Matrix norm:![2018-06-05 20-44-07 的屏幕截图](/home/yychi/Pictures/2018-06-05 20-44-07 的屏幕截图.png)

• Frobenius norm: $|A|_F := \sqrt{\langle A,A\rangle_F} = \sqrt{\mathbf{tr}(A^*A)}$
• Induced norm: $|A|p := \sup\limits{|x|_p = 1} |Ax|_p$
• Nuclear norm: $|A|_{nuclear} := \sum \sigma_i(A)$ (奇异值之和)
• Spectral norm: $|A|_{spectral} := \lambda_1$ (最大特征值)

![2018-06-05 21-01-46 的屏幕截图](/home/yychi/Pictures/2018-06-05 21-01-46 的屏幕截图.png)

![2018-06-05 21-02-47 的屏幕截图](/home/yychi/Pictures/2018-06-05 21-02-47 的屏幕截图.png)

### 3. Duality

Two ==equivalent== ways to represent a convex set:

• standard representation: The family of points in the set
• dual representation: The set of halfspaces containing the set (半平面的交集)

A closed convex set $S$ is the intersection of all closed halfspaces $H$ containing it.![2018-06-05 21-27-43 的屏幕截图](/home/yychi/Pictures/2018-06-05 21-27-43 的屏幕截图.png)

![2018-06-05 21-29-20 的屏幕截图](/home/yychi/Pictures/2018-06-05 21-29-20 的屏幕截图.png)

![2018-06-05 21-31-13 的屏幕截图](/home/yychi/Pictures/2018-06-05 21-31-13 的屏幕截图.png)

Polar Let $S \subseteq \mathbb{R}^n$ be a convex set containing the origin. The polar of $S$ is defined as follows $$S^{\circ} := {y ~|~ y^Tx \le 1, ~\forall x \in S}$$

<u>Note</u>

• polar is one way of representing the all halfspaces containing a convex set
• every halfspace $a^Tx \le b$ with $b \neq 0$ can be written as a “normalized” inequality $y^T x \le 1$, by dividing by $b$
• $S^{\circ}$ can be thought of as the normalized representations of halfspaces containing $S$

Properties of the polar:

1. $S^{\circ\circ} = S$
2. $S^{\circ}$ is a closed convex set containing the origin
3. When 0 is in the interior of $S$, then $S^{\circ}$ is bounded
4. When $S$ is non-convex, $S^{\circ} = (\mathbf{conv}(S))^{\circ}$, and $S^{\circ\circ} = \mathbf{conv}(S)$

![2018-06-05 22-03-51 的屏幕截图](/home/yychi/Pictures/2018-06-05 22-03-51 的屏幕截图.png)

![2018-06-05 22-04-27 的屏幕截图](/home/yychi/Pictures/2018-06-05 22-04-27 的屏幕截图.png)

Polar duality of convex cones![2018-06-05 22-17-25 的屏幕截图](/home/yychi/Pictures/2018-06-05 22-17-25 的屏幕截图.png)

![2018-06-05 22-18-54 的屏幕截图](/home/yychi/Pictures/2018-06-05 22-18-54 的屏幕截图.png)

<u>Notes</u>

1. $K^{\circ\circ} = K$
2. $K^{\circ}$ is closed and convex

Conjugation of convex functions Let $f: \mathbb{R}^n \mapsto \mathbb{R}\cup{\infty}$ be a convex function. The ==conjugation== of $f$ is $$f^*(y) := \sup_\limits{x}(y^Tx - f(x))$$ ![2018-06-05 23-57-38 的屏幕截图](/home/yychi/Pictures/2018-06-05 23-57-38 的屏幕截图.png)

Properties of the conjugate

1. $f^{**} = f$
2. $f^*$ is convex (supremum of affine functions of $y$)

![2018-06-06 00-03-01 的屏幕截图](/home/yychi/Pictures/2018-06-06 00-03-01 的屏幕截图.png)

## Convex functions

![2018-05-14 09-26-39 的屏幕截图](/home/yychi/Pictures/2018-05-14 09-26-39 的屏幕截图.png)

1. <u>affine</u> is convex: $f(x) = a^T x+b$

affine 既凸也凹

2. 任何_<u>范数</u>_是凸的

Proof: let $\pi(x)$ be a norm of $x$, then![2018-05-14 09-32-58 的屏幕截图](/home/yychi/Pictures/2018-05-14 09-32-58 的屏幕截图.png)

3. $f$ is convex $\iff$ epi($f$) is convex

### 1. Closed convex

A convex function $f$ is called closed if its epigraph is a closed set.

1. $f$ which is convex and continuous on a closed domain is a closed function. (norms)
2. all differentiable convex functions are closed with dom$f = \mathbb{R}^n$.
3. 当考虑一个凸函数时，通常认为在dom$f$外取值为$\infty$
4. <u>Jensen's inequality</u>:![2018-05-14 09-52-18 的屏幕截图](/home/yychi/Pictures/2018-05-14 09-52-18 的屏幕截图.png)
5. Corollary:![2018-06-04 18-35-09 的屏幕截图](/home/yychi/Pictures/2018-06-04 18-35-09 的屏幕截图.png)

pf: $f(x) = f(\sum\alpha_i x_i) \le \sum \alpha_i f(x_i) \le \max_\limits{i} f(x_i)$

### 2. Level sets

![2018-05-14 09-57-20 的屏幕截图](/home/yychi/Pictures/2018-05-14 09-57-20 的屏幕截图.png)

Note: the convexity of level sets does not characterize convex functions, but quasiconvex functions.

1. convex $f$ is closed $\implies$ all its level sets are closed

Some convex sets

1. norm ball (${x\in \mathbb{R}^n | |x| \le 1}$) is convex and closed
2. 椭球(${x | (x-a)^T Q (x-a) \le r^2}$) is convex and closed

pf: $x^TQy := \langle x, y \rangle$ 满足内积的三条性质

• bilinearity
• symmetry
• positivity 上述三条性质 $\iff$ Q is PSD
3. $\epsilon$-neighborhood: ![2018-06-04 20-56-40 的屏幕截图](/home/yychi/Pictures/2018-06-04 20-56-40 的屏幕截图.png)

### 3. Operations perserving convexity of functions

1. stability under taking weighted sums: $f,g \mapsto \lambda f + \mu g, ; \lambda,\mu \ge 0$
2. stability under affine substitutions of the argument: $x \mapsto Ax+b$ or $f(x) \mapsto \phi(x) = f(Ax+b)$
3. stability under taking pointwise sup: ${f_i}{i \in \mathcal{I}} \mapsto g(x) := \sup\limits{i \in \mathcal{I}}f_i(x)$, 凸函数族 ${f_i}_{i \in \mathcal{I}}$ 逐点取上确界而成的函数也是凸的
4. stability under partial minimization: $f(x,y)$ jointly convex in $(x,y)$, then $g(x) = \inf_\limits{y} f(x,y)$ is convex (suppose g is proper, i.e., > -$\infty$ everywhere and is finite at least at one point)
5. stability under perspective: $f(x) \mapsto g(x,t) = tf(x/t), \mathbf{dom}g = {(x,t) | x/t \in \mathbf{dom}f, t > 0}$

### 4. Detect convexity

<u>Necessary and Sufficient Convexity Condition for smooth function:</u>

• 一阶可微的光滑函数 $f$ 是凸的 $\iff$ $f'(x)$ 单调非减
• 二阶可微的光滑函数 $f$ 是凸的 $\iff$ $f''(x) \ge 0$

<u>subgradient property is characteristic of convex functions:</u>![2018-06-05 08-38-57 的屏幕截图](/home/yychi/Pictures/2018-06-05 08-38-57 的屏幕截图.png)

Examples

• ![2018-06-05 10-08-50 的屏幕截图](/home/yychi/Pictures/2018-06-05 10-08-50 的屏幕截图.png)
• ![2018-06-05 10-09-54 的屏幕截图](/home/yychi/Pictures/2018-06-05 10-09-54 的屏幕截图.png)
• ![2018-06-05 10-10-54 的屏幕截图](/home/yychi/Pictures/2018-06-05 10-10-54 的屏幕截图.png)
• ![2018-06-05 10-08-04 的屏幕截图](/home/yychi/Pictures/2018-06-05 10-08-04 的屏幕截图.png)

### 6. Optimality conditions

<u>第一充要条件（凸函数）</u> $x^* \in \mathbf{dom}f​$ is the minimizer $\iff​$ $0 \in \partial f(x^*)​$

### 7. Strong convexity

A differentiable function f is strongly convex if $$f(y) \ge f(x) + \nabla f(x)^T(y-x) + \frac{\mu}{2} |y-x|^2$$

<u>Note</u>

1. $f$ is not necessarily differentiable, (see the equivalent definition)
2. if $f$ is non-smooth, gradient -> subgradient
3. strong convexity $\implies$ strict convexity

Note: Intuitively speaking, strong convexity means that there exists a quartic lower bound on the growth of the function.

Equivalent definition \begin{align} &(i)~f(y)\ge f(x)+\nabla f(x)^T(y-x)+\frac{\mu}{2}\lVert y-x \rVert^2,~\forall x, y.\ &(ii)~g(x) = f(x)-\frac{\mu}{2}\lVert x \rVert^2~\text{is convex},~\forall x.\ &(iii)~\langle \nabla f(x) - \nabla f(y),x-y \rangle \ge \mu \lVert x-y\rVert^2,~\forall x, y.\ &(iv)~f(\alpha x+ (1-\alpha) y) \le \alpha f(x) + (1-\alpha) f(y) - \frac{\alpha (1-\alpha)\mu}{2}\Vert x-y\rVert^2,~\alpha \in [0,1].\ &(v)~\nabla^2 f(x) \succeq \mu \boldsymbol{I} \end{align}

## Lagrange Duality

Consider an optimization problem in standard form (not necessarily convex) $$\begin{array}{ll} \underset{x}{\text{minimize}} & f_0(x) \ \text{subject to} & f_i(x) \le 0, ~i=1,\cdots,m \ ~ & h_i(x) = 0, ~i=1,\cdots,p \end{array}$$

The Lagrangian is $$L(x,\boldsymbol{\lambda},\boldsymbol{\mu}) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p \mu_i h_i(x)$$

The Lagrange dual function is defined as $$g(\lambda, \mu) = \inf_{x} L(x,\lambda,\mu)$$

![2018-06-11 12-48-10 的屏幕截图](/home/yychi/Pictures/2018-06-11 12-48-10 的屏幕截图.png)

Lagrange dual problem $$\begin{array}{ll} \underset{\lambda, \mu}{\text{maximize}} & g(\lambda, \mu) \ \text{subject to} & \boldsymbol{\lambda} \succeq \mathbf{0} \end{array}$$

Weak duality $$d^* \le p^*$$

• $d^*$: optima of dual problem
• $p^*$: optima of primal problem
• duality gap: $p^* - d^*$
• always hold

Strong dualiy $$d^* = p^*$$

• constraint qualifications $\implies$ strong duality
• Slater’s Constraint Qualification: a convex problem is strictly feasible (i.e., $\exists ~x \in \mathbf{int} \mathcal{D}: x \in \Omega$)

Complementary slackness ![2018-06-11 13-05-01 的屏幕截图](/home/yychi/Pictures/2018-06-11 13-05-01 的屏幕截图.png)

KKT conditions ![2018-06-11 13-06-37 的屏幕截图](/home/yychi/Pictures/2018-06-11 13-06-37 的屏幕截图.png)

![2018-06-11 13-08-39 的屏幕截图](/home/yychi/Pictures/2018-06-11 13-08-39 的屏幕截图.png)

## Cones

Tagent cone Let M be a (nonempty) convex set and $x^* \in M$, the tagent cone of $M$ at $x^*$ is the cone $$\begin{split} T_M(x^) &= {h \in \mathbb{R}^n | x^ + th \in M, ; \forall t > 0 } \ &= {y \in \mathbb{R}^n ~|~ y - x^* \in M} \end{split}$$

<u>Note:</u>

• Geometrically, this is the set of all directions leading from $x^*$ inside $M$
• convex but not necessarily closed
• fact: if $x^$ is a minimizer, then $\forall h \in T_M(x^) \implies h^T \nabla f(x^*) \ge 0$. （因为tangent cone里面都是可行解，所以必须不是下降方向）
• $T_M(x^) = \mathbb{R}^n \iff x^ \in \mathbf{int}M$

e.g. 多面体 $$M = {x | Ax \le b} = {x | a_i^Tx \le b_i, ; i = 1,\dots,m}$$ the tangent cone at $x^$ is $$T_M(x^) = {h~|~a_i^T h \le 0, ~\forall i, ~a_i^T x^* = b_i}$$

Normal cone: the polar cone of tangent cone $$N_M(x^) = {g \in \mathbb{R}^n ~|~ \langle g, y-x^\rangle \le 0, ~\forall y \in M}$$

<u>Note:</u>

• normal cone is the polar to tangent cone, i.e., $$\begin{split} T_M(x^) &= {g \in \mathbb{R}^n ~|~ \langle g, y-x^\rangle \ge 0, ~\forall y \in M} \ N_M(x^) &= {g \in \mathbb{R}^n ~|~ \langle g, y-x^\rangle \le 0, ~\forall y \in M} \end{split}$$
• fact: if $x^$ is a minimizer, then $-\nabla f(x^) \in N_M(x^*)$.
• ![2018-06-05 15-11-35 的屏幕截图](/home/yychi/Pictures/2018-06-05 15-11-35 的屏幕截图.png)

## Algorithm convergence

~ Stepsize Rule Convergence Rate Iteration Complexity
strongly convex & smooth $\eta_t = \frac{2}{\mu + L}$ $O\left(\frac{\kappa -1}{\kappa +1}\right)^t$ $O\left(\frac{\log\frac{1}{\epsilon}}{\log\frac{\kappa+1}{\kappa-1}}\right)$
convex & smooth $\eta_t = \frac{1}{L}$ $O(\frac{1}{\sqrt{t}})$ $O(\frac{1}{\epsilon})$
Frank-Wolfe
(strongly) convex & smooth $\eta_t = \frac{1}{t}$ $O(\frac{1}{\sqrt{t}})$ $O(\frac{1}{\epsilon})$
Projected GD
convex & smooth $\eta_t = \frac{1}{L}$ $O(\frac{1}{\sqrt{t}})$ $O(\frac{1}{\epsilon})$
strongly convex & smooth $\eta_t = \frac{1}{L}$ $O\left((1-\frac{1}{\kappa})^t\right)$ $O(\kappa\log\frac{1}{\epsilon})$
convex & Lipschitz $\eta_t = \frac{1}{\sqrt{t}}$ $O(\frac{1}{\sqrt{t}})$ $O(\frac{1}{\epsilon^2})$
strongly convex & Lipschitz $\eta_t = \frac{1}{t}$ $O\left(\frac{1}{t}\right)$ $O(\frac{1}{\epsilon})$
Proximal GD
convex & smooth (w.r.t. $f$) $\eta_t = \frac{1}{L}$ $O(\frac{1}{t})$ $O(\frac{1}{\epsilon})$
strongly convex & smooth (w.r.t. $f$) $\eta_t = \frac{1}{L}$ $O\left((1-\frac{1}{\kappa})^t\right)$ $O(\kappa\log\frac{1}{\epsilon})$

0
0 收藏

0 评论
0 收藏
0