Sitemap

A list of all the posts and pages found on the site. For the robots out there, an XML version is available for digesting as well.

Pages

Posts

EM Algorithm and Gaussian Mixture Model for Clustering

13 minute read

Published:

In the last post on the EM algorithm, we introduced the derivation of the EM algorithm and used it to solve the MLE of the heads probability of two coins. In this post, we apply the EM algorithm to a more practical and useful problem, the Gaussian Mixture Model (GMM), and discuss using GMM for clustering.
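As a quick preview of what GMM clustering looks like in practice, here is a minimal sketch (my own illustration, not taken from the post) that fits a two-component GMM to synthetic 2-D data with scikit-learn's `GaussianMixture`, which runs EM internally.

```python
# Minimal GMM clustering sketch on synthetic data (illustration only,
# not the post's implementation). EM runs inside GaussianMixture.fit().
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two synthetic clusters drawn from different Gaussians.
X = np.vstack([rng.normal(loc=[0, 0], scale=0.5, size=(100, 2)),
               rng.normal(loc=[3, 3], scale=0.5, size=(100, 2))])

gmm = GaussianMixture(n_components=2, random_state=0)
labels = gmm.fit_predict(X)   # cluster assignment for each point
print(gmm.means_)             # estimated component means, near [0, 0] and [3, 3]
```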

An Introduction to Expectation-Maximization (EM) Algorithm

15 minute read

Published:

The Expectation-Maximization (EM) algorithm is a special case of MLE where the observations (data samples \(\mathbf{x}\)) are inherently related to some hidden variables (\(\mathbf{z}\)). First of all, we need to review the basics of MLE.

Maximum Likelihood Estimation (MLE)

Let \(\{\mathbf{x}^{(i)}\},\ i=1,\dots,n\) be a set of independent and identically distributed observations, and let \(\mathbf{\theta}\) be the parameters of the data distribution, which are unknown to us. The maximum likelihood estimate of the parameters \(\theta\) is the value that maximizes the joint distribution \(p_\theta(\mathbf{x}^{(1)},\dots,\mathbf{x}^{(n)})= \prod_{i=1}^{n}p_\theta(\mathbf{x}^{(i)})\):

$$ \hat{\theta}_\text{MLE} = \arg\ \max_{\theta}{\prod_{i=1}^{n}p_\theta(\mathbf{x}^{(i)})} $$

More commonly, we choose to maximize the joint log-likelihood:

$$ \hat{\theta}_\text{MLE} = \arg\ \max_{\theta}{\sum_{i=1}^{n}\log p_\theta(\mathbf{x}^{(i)})} $$
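As a concrete toy example (my own, not from the post): for i.i.d. samples from a 1-D Gaussian with known unit variance, the log-likelihood is maximized at the sample mean, which a direct numerical search over candidate \(\theta\) values recovers.

```python
# Numerical check of MLE for the mean of a unit-variance Gaussian
# (toy illustration; the grid search stands in for a proper optimizer).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=500)      # observations x^(i), true mean 2.0

# Gaussian log-density with unit variance: log p_theta(x) = -0.5*(x - theta)^2 - 0.5*log(2*pi)
def log_likelihood(theta):
    return np.sum(-0.5 * (x - theta) ** 2 - 0.5 * np.log(2 * np.pi))

thetas = np.linspace(0.0, 4.0, 2001)              # candidate values of theta (the mean)
theta_mle = thetas[np.argmax([log_likelihood(t) for t in thetas])]
print(theta_mle, x.mean())                        # both approximately 2.0
```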

An Introduction to Support Vector Machines (SVM): Dual problem solution using Gradient Descent

4 minute read

Published:

Recall the SVM primal problem and dual problem:
Primal Problem

$$ \begin{align} \min_{\mathbf{w},b}\ \ & \frac{1}{2}\|\mathbf{w}\|^2\\ \text{s.t.}\ \ & y_i(\mathbf{w}^T\mathbf{x}_i+b)\geq 1,\ i=1,\dots,n \end{align} $$

Dual Problem

$$ \begin{align} \max_{\lambda}\ \ & \sum_{i=1}^{n}\lambda_i - \frac{1}{2}\sum_{i,j}\lambda_i \lambda_j y_i y_j \mathbf{x}_i^T\mathbf{x}_j\\ \text{s.t.}\ \ & \lambda_i \geq 0,\ i=1,\dots,n\\ & \sum_{i=1}^{n}\lambda_i y_i = 0 \end{align} $$

In the last post we introduced how to apply Lagrangian duality to SVM and how to recover the primal optimum once we have the dual optimum. In this post we mainly discuss how to solve the dual problem and obtain the dual optimum.
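As a rough illustration of one such approach (my own sketch, not necessarily the post's exact scheme), the dual can be maximized by projected gradient ascent; for simplicity the bias term \(b\) is dropped here, which removes the equality constraint \(\sum_i \lambda_i y_i = 0\), so the only projection needed is clipping \(\lambda\) at zero.

```python
# Projected gradient ascent on the (bias-free) SVM dual -- a sketch, not the
# post's exact algorithm. Dual objective: sum_i lambda_i - 0.5 * lambda^T Q lambda,
# with Q_ij = y_i y_j x_i^T x_j and lambda_i >= 0.
import numpy as np

rng = np.random.default_rng(0)
# Linearly separable toy data: class +1 around (2, 2), class -1 around (-2, -2).
X = np.vstack([rng.normal(2, 0.5, size=(50, 2)), rng.normal(-2, 0.5, size=(50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])

K = X @ X.T                                  # Gram matrix x_i^T x_j
Q = (y[:, None] * y[None, :]) * K
lam = np.zeros(len(y))
lr = 1e-3

for _ in range(5000):
    grad = 1.0 - Q @ lam                     # gradient of the dual objective
    lam = np.maximum(lam + lr * grad, 0.0)   # ascent step + projection onto lambda >= 0

w = (lam * y) @ X                            # primal weights recovered from the dual optimum
print(w, np.sum(lam > 1e-6))                 # separating direction, number of support vectors
```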

An Introduction to Support Vector Machines (SVM): Gradient Descent Solution

5 minute read

Published:

In the last post, we discussed that the SVM optimization problem is:

$$\min_{\mathbf{w},b}\ \frac{1}{2}\|\mathbf{w}\|^2,\ \ \text{s.t.}\ \ y_i(\mathbf{w}^T\mathbf{x}_i+b)\geq 1,\ \ i=1,\dots,n$$

There are multiple ways to solve this optimization problem. One is to treat it as a standard optimization problem and use the gradient descent algorithm to compute the optimal parameters. Another is to formulate the Lagrangian dual of the primal problem, transforming the original optimization problem into an easier one. Here we mainly discuss the first method.
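One common way to make the constrained primal amenable to gradient descent is to replace the constraints with a hinge-loss penalty; the following is a minimal sketch under that assumption (my own illustration, not necessarily the post's exact formulation).

```python
# (Sub)gradient descent on the hinge-loss surrogate of the SVM primal:
#   0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i (w^T x_i + b))
# Sketch only; the post may use a different formulation.
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2, 0.5, size=(50, 2)), rng.normal(-2, 0.5, size=(50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])

w, b, C, lr = np.zeros(2), 0.0, 1.0, 1e-3
for _ in range(2000):
    margins = y * (X @ w + b)
    viol = margins < 1                                   # points violating y_i(w^T x_i + b) >= 1
    grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
    grad_b = -C * y[viol].sum()
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)   # parameters of the learned separating hyperplane
```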

An Introduction to Support Vector Machines (SVM): Basics

4 minute read

Published:

The Support Vector Machine (SVM) is a method for classification (and can also be used for regression). Here we mainly discuss the most common application: the binary classification problem.

Given a training dataset with binary labels {+1, -1}, SVM aims to find a separating hyperplane that maximizes the margin.
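To make "margin" concrete, here is a tiny sketch (my own, with made-up points): the geometric margin of a candidate hyperplane \((\mathbf{w}, b)\) is \(\min_i y_i(\mathbf{w}^T\mathbf{x}_i+b)/\|\mathbf{w}\|\), and SVM searches for the hyperplane that maximizes it.

```python
# Geometric margin of candidate hyperplanes on a few made-up points
# (illustration only; the labels and coordinates are arbitrary).
import numpy as np

X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -1.0], [-1.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def geometric_margin(w, b):
    # Smallest signed distance of any point to the hyperplane w^T x + b = 0.
    return np.min(y * (X @ w + b)) / np.linalg.norm(w)

print(geometric_margin(np.array([1.0, 1.0]), 0.0))   # ~2.12, a good candidate
print(geometric_margin(np.array([1.0, 3.0]), 0.0))   # ~1.58, a more tilted, worse candidate
```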

portfolio

publications

talks

teaching