MIS 302 Support Vector Machines (SVM)

I. Ozkan

Spring 2025

Preliminary Readings

Textbook

Some Slides are Adapted from Additional Book Chapter

Hands-On Machine Learning with R

Some Slides are Adapted from this Tutorial

Freie Universität Berlin Learning Module

Classification

Learning Objectives

Keywords:

SVM: Interpretability/Flexibility

SVM

Hyperplane

Hyperplane: In \(p\)-dimensional space, a hyperplane is a flat affine subspace of dimension \(p - 1\), defined by a linear equation:

\(f\left(X\right) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p = 0\)

For example, \(p=2\) gives a two-dimensional space and \(p=3\) a three-dimensional space (adapted from the Hands-On Machine Learning with R book):

\(p=2 \implies \beta_0 + \beta_1 X_1 + \beta_2 X_2 = 0\) (a line) and,

\(p=3 \implies \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 = 0\) (a plane)

Hyperplane

A point \(X\) that does not lie on the hyperplane falls on one side of it, where

\(f\left(X\right) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p > 0\)

or on the other side of the hyperplane, where

\(f\left(X\right) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p < 0\)
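
The sign rule above can be sketched in a few lines of Python. This is only an illustration: the function names `f` and `side`, the coefficients, and the three test points are hypothetical and not taken from the slides.

```python
from typing import Sequence


def f(beta0: float, beta: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate f(X) = beta_0 + beta_1 X_1 + ... + beta_p X_p."""
    if len(beta) != len(x):
        raise ValueError("beta and x must both have length p")
    return beta0 + sum(b * xj for b, xj in zip(beta, x))


def side(beta0: float, beta: Sequence[float], x: Sequence[float]) -> int:
    """Return +1 if f(X) > 0, -1 if f(X) < 0, and 0 if X lies on the hyperplane."""
    value = f(beta0, beta, x)
    return (value > 0) - (value < 0)


# Hypothetical hyperplane for p = 2 (a line): 1 + 2*X1 - X2 = 0
beta0, beta = 1.0, [2.0, -1.0]
points = [[0.0, 0.0], [0.0, 3.0], [1.0, 3.0]]  # made-up points

sides = [side(beta0, beta, x) for x in points]
print(sides)  # [1, -1, 0]
```

The third point satisfies the equation exactly, so it lies on the line itself rather than on either side.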

Another Synthetic Example

Another Synthetic Example

Optimal Separating Hyperplane

Geometrically:

\[\begin{align} &\underset{\beta_0, \beta_1, \dots, \beta_p}{\text{maximize}} \quad M \\ &\text{subject to} \quad \begin{cases} \sum_{j = 1}^p \beta_j^2 = 1,\\ y_i\left(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}\right) \ge M,\quad i = 1, 2, \dots, n \end{cases} \end{align}\]
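
To make the role of \(M\) concrete, the sketch below computes the margin of a given candidate hyperplane: after rescaling so that \(\sum_j \beta_j^2 = 1\), the margin is the smallest value of \(y_i f(x_i)\) over the observations. It does not solve the optimization problem; it only compares two hand-picked candidates. The data, the helper names `normalize` and `margin`, and the candidate lines are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def normalize(beta0: float, beta: Sequence[float]) -> Tuple[float, List[float]]:
    """Rescale the coefficients so that sum_j beta_j^2 = 1."""
    norm = math.sqrt(sum(b * b for b in beta))
    if norm == 0:
        raise ValueError("at least one beta_j must be nonzero")
    return beta0 / norm, [b / norm for b in beta]


def margin(beta0: float, beta: Sequence[float],
           X: Sequence[Sequence[float]], y: Sequence[int]) -> float:
    """Return M = min_i y_i * f(x_i) for the unit-norm version of the hyperplane.

    A negative value means at least one observation is misclassified.
    """
    beta0, beta = normalize(beta0, beta)
    return min(
        yi * (beta0 + sum(b * xj for b, xj in zip(beta, xi)))
        for xi, yi in zip(X, y)
    )


# Made-up, linearly separable data with labels coded as -1 / +1
X = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
y = [-1, -1, 1, 1]

m_a = margin(-1.5, [1.0, 0.0], X, y)  # candidate A: the line X1 = 1.5
m_b = margin(-1.0, [1.0, 0.0], X, y)  # candidate B: the line X1 = 1.0
print(m_a, m_b)  # 1.5 1.0
```

Both candidates separate the classes, but candidate A has the larger margin, so the maximization would prefer it to candidate B.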

The soft margin classifier

Soft Margin Classifier

The maximization problem is similar to the hard margin case, with two additions: slack variables \(\epsilon_i\), which allow individual observations to be on the wrong side of the margin or the hyperplane, and a budget \(C\), a nonnegative tunable hyperparameter that bounds the total slack:

\[\begin{align} &\underset{\beta_0, \beta_1, \dots, \beta_p}{\text{maximize}} \quad M \\ &\text{subject to} \quad \begin{cases} \sum_{j = 1}^p \beta_j^2 = 1,\\ y_i\left(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}\right) \ge M\left(1 - \epsilon_i\right), \quad i = 1, 2, \dots, n\\ \epsilon_i \ge 0, \\ \sum_{i = 1}^n \epsilon_i \le C\end{cases} \end{align}\]
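
The constraints above can be read off numerically. For a fixed hyperplane and margin \(M\), the smallest slack satisfying them is \(\epsilon_i = \max\left(0,\, 1 - y_i f(x_i)/M\right)\): \(\epsilon_i = 0\) means the observation is on the correct side of the margin, \(0 < \epsilon_i \le 1\) means it violates the margin, and \(\epsilon_i > 1\) means it is on the wrong side of the hyperplane. The sketch below assumes the coefficients already satisfy \(\sum_j \beta_j^2 = 1\). The data, the helper name `slacks`, and the values of \(M\) and \(C\) are hypothetical.

```python
from typing import List, Sequence


def slacks(beta0: float, beta: Sequence[float],
           X: Sequence[Sequence[float]], y: Sequence[int], M: float) -> List[float]:
    """Smallest eps_i >= 0 with y_i * f(x_i) >= M * (1 - eps_i), for M > 0."""
    if M <= 0:
        raise ValueError("M must be positive")
    eps = []
    for xi, yi in zip(X, y):
        fx = beta0 + sum(b * xj for b, xj in zip(beta, xi))
        eps.append(max(0.0, 1.0 - yi * fx / M))
    return eps


# Made-up one-feature-wide data; the hyperplane is the line X1 = 2 (unit-norm beta)
X = [[0.0, 0.0], [1.5, 0.0], [2.5, 0.0], [4.0, 0.0], [3.0, 0.0]]
y = [-1, -1, -1, 1, 1]

eps = slacks(-2.0, [1.0, 0.0], X, y, M=1.0)
total = sum(eps)
print(eps)    # [0.0, 0.5, 1.5, 0.0, 0.0]
print(total)  # 2.0

# Check the budget constraint sum(eps_i) <= C for two hypothetical budgets
for C in (1.0, 2.0):
    print(C, total <= C)
```

Here the second observation sits inside the margin and the third is misclassified. A budget of \(C = 1\) rules out this hyperplane with \(M = 1\), while \(C = 2\) allows it.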

Wrong Side of the Margin and Hyperplane

Synthetic Data



Another Synthetic Data Example

Soft Margins

Support Vector Machines

Example

Kernel Functions

Popular Kernel Functions

Example

More Examples

SVMs with More than Two Classes