Tutorial on supervised learning using pmtk3

Overview

To create a model of type 'foo', use one of the following:
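(A sketch, assuming pmtk3's convention of naming the fitting function fooFit and the constructor fooCreate; the exact optional arguments vary by model.)

{{{
model = fooFit(X, y, ...)   % estimate the parameters from training data
model = fooCreate(...)      % specify the parameters by hand
}}}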

where '...' refers to optional arguments, and 'foo' is the name of the model type, such as

  * 'linreg' (linear regression)
  * 'logreg' (logistic regression, binary and multiclass)
  * 'mlp' (multilayer perceptron, aka feedforward neural network)
  * 'naiveBayesBer' (NB with Bernoulli features)
  * 'naiveBayesGauss' (NB with Gaussian features)
  * 'discrimAnalysis' (LDA or QDA)
  * 'RDA' (regularized LDA)

Here X is an N*D design matrix, where N is the number of training cases and D is the number of features. y is an N*1 response vector, which can be real-valued (regression), 0/1 or -1/+1 (binary classification), or 1:C (multi-class).

Once the model has been fit, you can make predictions on test data (using a plug-in estimate of the parameters) as follows:
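(Again a sketch, following the same fooPredict naming convention.)

{{{
[yhat, py] = fooPredict(model, Xtest)   % plug-in prediction on test inputs
}}}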

Here yhat is an Ntest*1 vector of predicted responses of the same type as ytrain, where Ntest is the number of rows in Xtest. For regression, this is the predicted mean; for classification, it is the predicted mode. The meaning of py depends on the model, as follows:

   * For regression, py is an Ntest*1 vector of predicted variances.
   * For binary classification, py is an Ntest*1 vector of the probability of being in class 1.
   * For multi-class, py is an Ntest*C matrix, where py(i,c) = p(y=c|Xtest(i,:),params)

Below we consider some specific examples.

Linear regression

We can fit a linear regression model and use it to make predictions on a test set as follows:
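(A minimal sketch using pmtk3's linregFit and linregPredict functions.)

{{{
model = linregFit(Xtrain, ytrain);        % maximum likelihood (least squares) fit
[yhat, py] = linregPredict(model, Xtest); % predictive mean and variance
}}}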

See the following demos for examples of this code in action:

 * [http://pmtk3.googlecode.com/svn/trunk/docs/demoOutput/Linear_regression/linregDemo1.html linregDemo1] Fit a linear regression model in 1d.

Regularization

Using maximum likelihood to train a model often results in overfitting, so it is very common to use regularization, or MAP estimation, instead. We can use various kinds of prior; the simplest and most widely used is the spherical Gaussian prior, which corresponds to L2 regularization. We can fit an L2-regularized linear regression model as follows (where lambda is the strength of the regularizer, or equivalently, the precision of the Gaussian prior):
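(A sketch; it assumes linregFit accepts 'regType' and 'lambda' as name/value options, as used in the pmtk3 demos.)

{{{
lambda = 0.1;  % strength of the L2 regularizer (example value)
model = linregFit(Xtrain, ytrain, 'regType', 'L2', 'lambda', lambda);
[yhat, py] = linregPredict(model, Xtest);
}}}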

This technique is known as ridge regression. See the following demos for examples of this code in action:

 * [http://pmtk3.googlecode.com/svn/trunk/docs/demoOutput/Introduction/linregPolyVsReg.html linregPolyVsReg] Show the effect of changing the regularization parameter in ridge regression.

Cross validation

To be written

Kernels

To be written
