# Derive hinge loss from SVM

## What is hinge loss

The hinge loss is a loss function used to train classifiers. It is defined as

$L(\hat{y}) = \max(0, 1 - y\hat{y})$      (1)

where $y \in \{-1, 1\}$ is the class label and $\hat{y}$ is the raw output (score) of the classifier.
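To make (1) concrete, here is a minimal sketch in plain Python. The function name `hinge_loss` and the example scores are my own choices for illustration, not part of any library.

```python
def hinge_loss(y, y_hat):
    """Hinge loss for one example: max(0, 1 - y * y_hat), with y in {-1, +1}."""
    return max(0.0, 1.0 - y * y_hat)

# Correct side of the boundary and outside the margin: no loss.
print(hinge_loss(+1, 2.5))   # 0.0
# Correct side, but inside the margin (y * y_hat < 1): small loss.
print(hinge_loss(+1, 0.25))  # 0.75
# Wrong side of the boundary: the loss grows linearly with the score.
print(hinge_loss(-1, 0.5))   # 1.5
```

Note that the loss is zero only when $y\hat{y} \geq 1$, not merely when the sign of $\hat{y}$ is correct.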

However, the SVM formulation I know is the soft-margin optimization problem

$\min_{W, b, \xi} \frac{1}{2}\lVert W \rVert^{2} + C\sum_{i=1}^{N}\xi_{i}$      (2)

s.t.    $\xi_{i} \geq 0, \quad y_{i}(x_{i}^{T}W + b) \geq 1-\xi_{i} \quad \forall i$
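For reference, this is how I read (2) in code: a small sketch that evaluates the objective and checks the constraints for a given $W$, $b$, and $\xi$. The helper names `svm_objective` and `is_feasible`, the tolerance `tol`, and the toy data are all assumptions made up for illustration.

```python
def svm_objective(W, xi, C):
    """Value of (2): 0.5 * ||W||^2 + C * sum_i xi_i."""
    return 0.5 * sum(w * w for w in W) + C * sum(xi)

def is_feasible(X, y, W, b, xi, tol=1e-12):
    """Check xi_i >= 0 and y_i * (x_i^T W + b) >= 1 - xi_i for every i."""
    for x_i, y_i, xi_i in zip(X, y, xi):
        score = sum(xj * wj for xj, wj in zip(x_i, W)) + b
        if xi_i < -tol or y_i * score < 1.0 - xi_i - tol:
            return False
    return True

# Hypothetical toy data: three 2-D points with labels in {-1, +1}.
X = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]]
y = [1, 1, -1]
W, b, C = [1.0, 0.0], 0.0, 1.0

# The second point lies inside the margin, so it needs some slack.
xi = [0.0, 0.5, 0.0]
print(is_feasible(X, y, W, b, xi))               # True
print(svm_objective(W, xi, C))                   # 1.0
print(is_feasible(X, y, W, b, [0.0, 0.0, 0.0]))  # False
```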

So what is the relation between (1) and (2)? Are they just two ways of looking at the same model?