# Structured Learning

2018/10/31 18:06

## I. Problem Statement

$({x_1},{y_1}),...,({x_n},{y_n}) \in X \times Y$

$f:X \to Y$

## II. Ranking SVM

### 2. Model Formulation

$D = \{ {D_{qi}}\} _{i = 1}^N,{D_{qi}} = \{ ({d_{ij}},{y_{ij}})\} _{j = 1}^{{M_i}}$

$f(q,d) = \left\langle {w,\phi (q,d)} \right\rangle$

$\left( {\phi (q,{d_i}) - \phi (q,{d_j}),z} \right),z = \left\{ {\matrix{ { + 1,{d_i} > {d_j}} \cr { - 1,{d_i} < {d_j}} \cr } } \right.$

### 3. Solving the Model

Any classification model can be used to solve the pairwise problem above. Choosing an SVM as the classifier gives Ranking SVM.
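The pairwise construction above can be sketched in a few lines. This is only an illustration: the function name `pairwise_transform`, the toy feature vectors, and the use of numeric relevance grades to decide whether $d_i > d_j$ are all assumptions, not part of the original notes.

```python
from itertools import combinations

def pairwise_transform(features, labels):
    """Turn one query's documents into (difference vector, z) classification examples."""
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        if labels[i] == labels[j]:
            continue  # tied documents express no preference, so they are skipped
        # phi(q, d_i) - phi(q, d_j)
        diff = [a - b for a, b in zip(features[i], features[j])]
        # z = +1 if d_i is preferred over d_j, otherwise -1
        z = 1 if labels[i] > labels[j] else -1
        pairs.append((diff, z))
    return pairs

# Hypothetical toy data: three documents for one query, two features each.
phi = [[1.0, 0.5], [0.2, 0.1], [0.6, 0.9]]
y = [2, 0, 1]  # assumed convention: a higher grade means more relevant
pairs = pairwise_transform(phi, y)
```

The resulting `(diff, z)` pairs can then be handed to any binary classifier, for example a linear SVM.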

## III. Model Formulation

${F_{dis}}:X \times Y \to R$

$F(x,y;w) = < w,\psi (x,y) >$

$\hat y = \mathop {\arg \max }\limits_{y \in Y} F(x,y;w)$

$\forall y \in Y\backslash {y_i},(\psi ({x_i},{y_i}) - \psi ({x_i},y), + 1)$

$\forall y \in Y\backslash {y_i},(\psi ({x_i},y) - \psi ({x_i},{y_i}), - 1)$
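As a rough sketch of the scoring function and the arg max prediction, the snippet below uses a small enumerable label set. The block-structured joint feature map `psi` (a common multiclass choice) and the example weights are assumptions made for illustration; the notes do not fix a particular $\psi$.

```python
def psi(x, y, num_classes):
    """Hypothetical joint feature map: copy x into the block that belongs to label y."""
    vec = [0.0] * (len(x) * num_classes)
    start = y * len(x)
    vec[start:start + len(x)] = x
    return vec

def score(w, x, y, num_classes):
    """F(x, y; w) = <w, psi(x, y)>."""
    return sum(wi * fi for wi, fi in zip(w, psi(x, y, num_classes)))

def predict(w, x, num_classes):
    """Arg max of the score over all labels (feasible only when Y is small)."""
    return max(range(num_classes), key=lambda y: score(w, x, y, num_classes))

# Assumed example: three labels, two features per label block.
w = [1.0, 0.0, 0.0, 1.0, -1.0, -1.0]
x = [0.2, 0.9]
y_hat = predict(w, x, num_classes=3)
```

For genuinely structured outputs the label set is exponentially large, so the brute-force `max` would be replaced by a problem-specific search such as dynamic programming.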

## IV. Building the Loss Function

$E = \mathop {\max }\limits_{y \in Y} < w,\psi ({x_i},y) > - < w,\psi ({x_i},{y_i}) >$

${E_{empirical}} = {1 \over n}\sum\limits_{i = 1}^n {\Delta ({{\tilde y}_i},{y_i})}$

## V. Solving with the Perceptron

${{\partial E} \over {\partial w}} = \psi ({x_i},\tilde y) - \psi ({x_i},{y_i})$

${w^{k + 1}} = {w^k} - \eta \nabla E = {w^k} - \eta (\psi ({x_i},\tilde y) - \psi ({x_i},{y_i}))$

${w^{k + 1}} = {w^k} - \psi ({x_i},\tilde y) + \psi ({x_i},{y_i})$

$k \le {{{R^2}} \over {{\sigma ^2}}}$
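The update rule with $\eta = 1$ can be tried out on a toy problem. The following is a minimal sketch under stated assumptions: the block-structured feature map `psi`, the function name `train_perceptron`, the early-stopping rule, and the three training points are all hypothetical.

```python
def psi(x, y, num_classes):
    """Hypothetical joint feature map: copy x into the block that belongs to label y."""
    vec = [0.0] * (len(x) * num_classes)
    vec[y * len(x):(y + 1) * len(x)] = x
    return vec

def predict(w, x, num_classes):
    """y_tilde = arg max over y of <w, psi(x, y)>."""
    def score(y):
        return sum(wi * fi for wi, fi in zip(w, psi(x, y, num_classes)))
    return max(range(num_classes), key=score)

def train_perceptron(data, num_classes, epochs=10):
    w = [0.0] * (len(data[0][0]) * num_classes)
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            y_tilde = predict(w, x, num_classes)
            if y_tilde != y:
                f_pred = psi(x, y_tilde, num_classes)
                f_true = psi(x, y, num_classes)
                # w <- w - psi(x_i, y_tilde) + psi(x_i, y_i)
                w = [wi - fp + ft for wi, fp, ft in zip(w, f_pred, f_true)]
                mistakes += 1
        if mistakes == 0:
            break  # every training example is predicted correctly
    return w

# Assumed toy data: (feature vector, label) pairs with three labels.
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([-1.0, -1.0], 2)]
w = train_perceptron(data, num_classes=3)
```

On this separable toy set the loop stops once a full pass makes no mistakes, after which `predict` reproduces every training label.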

## VI. Structured SVM

$\forall i \in \{ 1,2,3,...,n\} ,\forall y \in Y\backslash {y_i}$

$< w,\psi ({x_i},{y_i}) > - < w,\psi ({x_i},y) > \ge 1$

$\mathop {\min }\limits_w {1 \over 2}{\left\| w \right\|^2}$

$\mathop {\min }\limits_{w,{\xi _i}} {1 \over 2}{\left\| w \right\|^2} + {C \over n}\sum\limits_{i = 1}^n {{\xi _i}}$

s.t.

$< w,\psi ({x_i},{y_i}) > - < w,\psi ({x_i},y) > \ge 1 - {\xi _i},\forall i \in \{ 1,2,3,...,n\} ,\forall y \in Y\backslash {y_i}$

## VII. Re-scaling slacks or margin

### 1. slack re-scaling

$< w,\psi ({x_i},{y_i}) > - < w,\psi ({x_i},y) > \ge 1 - {{{\xi _i}} \over {\Delta ({y_i},y)}},\forall i \in \{ 1,2,3,...,n\} ,\forall y \in Y\backslash {y_i}$

### 2. margin re-scaling

$< w,\psi ({x_i},{y_i}) > - < w,\psi ({x_i},y) > \ge \Delta ({y_i},y) - {\xi _i},\forall i \in \{ 1,2,3,...,n\} ,\forall y \in Y\backslash {y_i}$
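To see how the two variants differ numerically, the sketch below computes the smallest slack $\xi_i \ge 0$ that satisfies each family of constraints for one example and a fixed $w$. The feature map `psi`, the loss `delta` (absolute label distance), and the numbers are assumptions chosen only for illustration.

```python
def psi(x, y, num_classes):
    """Hypothetical joint feature map: copy x into the block that belongs to label y."""
    vec = [0.0] * (len(x) * num_classes)
    vec[y * len(x):(y + 1) * len(x)] = x
    return vec

def score(w, x, y, num_classes):
    return sum(wi * fi for wi, fi in zip(w, psi(x, y, num_classes)))

def delta(y_true, y):
    """Assumed task loss: distance between label indices."""
    return abs(y_true - y)

def slack_rescaling(w, x, y_true, num_classes):
    """Smallest xi >= 0 with margin >= 1 - xi / Delta for every y != y_true."""
    s_true = score(w, x, y_true, num_classes)
    worst = max(delta(y_true, y) * (1 - (s_true - score(w, x, y, num_classes)))
                for y in range(num_classes) if y != y_true)
    return max(0.0, worst)

def margin_rescaling(w, x, y_true, num_classes):
    """Smallest xi >= 0 with margin >= Delta - xi for every y != y_true."""
    s_true = score(w, x, y_true, num_classes)
    worst = max(delta(y_true, y) - (s_true - score(w, x, y, num_classes))
                for y in range(num_classes) if y != y_true)
    return max(0.0, worst)

# Assumed example: the true label 2 has the lowest score, so both slacks are positive.
w = [1.0, 0.0, 0.0, 1.0, -1.0, -1.0]
x = [0.5, 0.4]
xi_slack = slack_rescaling(w, x, y_true=2, num_classes=3)
xi_margin = margin_rescaling(w, x, y_true=2, num_classes=3)
```

With these numbers the slack-rescaled value comes out near 4.8 and the margin-rescaled value near 3.4. Both are at least as large as the loss of the predicted label (here 2), but they penalize the same violation differently.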

## VIII. Why not the empirical loss?

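${\xi _i} = \max \{ 0,\mathop {\max }\limits_{y \ne {y_i}} \Delta ({y_i},y)(1 - ( < w,\psi ({x_i},{y_i}) > - < w,\psi ({x_i},y) > ))\}$

Let $\hat y = \mathop {\arg \max }\limits_{y \in Y} < w,\psi ({x_i},y) >$ be the prediction for ${x_i}$. There are two cases.

1.The prediction is an exact match, $\hat y = {y_i}$, so $< w,\psi ({x_i},\hat y) > = < w,\psi ({x_i},{y_i}) >$:

${\xi _i} \ge 0 = \Delta ({y_i},\hat y)$

2.They are not equal, i.e., we did not find an exact match. Because $\hat y$ maximizes the score, $< w,\psi ({x_i},{y_i}) > - < w,\psi ({x_i},\hat y) > \le 0$, so:

${\xi _i} \ge \Delta ({y_i},\hat y)(1 - ( < w,\psi ({x_i},{y_i}) > - < w,\psi ({x_i},\hat y) > )) \ge \Delta ({y_i},\hat y)$

In both cases ${\xi _i} \ge \Delta ({y_i},\hat y)$, so the average slack is an upper bound on the empirical loss.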
