supervised
Tutorial on supervised learning using pmtk3
To create a model of type 'foo', use one of the following
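(a sketch of the generic calling convention; the fooFit and fooFitBayes names follow pmtk3's standard naming pattern, with 'foo' a placeholder):

{{{
model = fooFit(X, y, ...)        % compute an ML or MAP estimate of the parameters
model = fooFitBayes(X, y, ...)   % compute a posterior over the parameters
}}}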
where '...' refers to optional arguments, and 'foo' is a string such as
* 'linreg' (linear regression)
* 'logreg' (logistic regression, binary and multiclass)
* 'mlp' (multilayer perceptron, aka feedforward neural network)
* 'naiveBayesBer' (naive Bayes with Bernoulli features)
* 'naiveBayesGauss' (naive Bayes with Gaussian features)
* 'discrimAnalysis' (LDA or QDA)
* 'RDA' (regularized LDA)
Here X is an N*D design matrix, where N is the number of training cases and D is the number of features. y is an N*1 response vector, which can be real-valued (regression), 0/1 or -1/+1 (binary classification), or 1:C (multi-class).
Once the model has been fit, you can make predictions on test data (using a plug-in estimate of the parameters) as follows
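(a sketch; fooPredict is assumed to follow pmtk3's standard naming convention):

{{{
[yhat, py] = fooPredict(model, Xtest)   % plug-in prediction on the test inputs
}}}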
Here yhat is an Ntest*1 vector of predicted responses of the same type as the training response y, where Ntest is the number of rows in Xtest. For regression this is the predicted mean; for classification it is the predicted mode (the most probable class label). The meaning of py depends on the model, as follows:
* For regression, py is an Ntest*1 vector of predicted variances.
* For binary classification, py is an Ntest*1 vector of the probability of being in class 1.
* For multi-class, py is an Ntest*C matrix, where py(i,c) = p(y=c|Xtest(i,:),params).
Below we consider some specific examples.
We can fit a linear regression model and use it to make predictions on a test set as follows
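(a minimal sketch using pmtk3's linregFit and linregPredict; Xtrain, ytrain, and Xtest are assumed to be predefined):

{{{
model = linregFit(Xtrain, ytrain);        % maximum likelihood fit (least squares)
[yhat, py] = linregPredict(model, Xtest); % predicted means and variances
}}}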
See the following demos for examples of this code in action:
* [http://pmtk3.googlecode.com/svn/trunk/docs/demoOutput/Linear_regression/linregDemo1.html linregDemo1] Fit a linear regression model in 1d.
* linregPolyVsDegree Fit a polynomial regression model in 1d.
Using maximum likelihood to train a model often results in overfitting, so it is very common to use regularization, or MAP estimation, instead. We can use various kinds of priors; the simplest and most widely used is the spherical Gaussian prior, which corresponds to L2 regularization. We can fit an L2-regularized linear regression model as follows, where lambda is the strength of the regularizer or, equivalently, the precision of the Gaussian prior.
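The sketch below assumes linregFit accepts 'regType' and 'lambda' as name/value options, in pmtk3's usual style:

{{{
lambda = 0.1;  % illustrative value for the regularizer strength
model = linregFit(Xtrain, ytrain, 'regType', 'L2', 'lambda', lambda);
[yhat, py] = linregPredict(model, Xtest);
}}}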
This technique is known as ridge regression. See the following demos for examples of this code in action:
* [http://pmtk3.googlecode.com/svn/trunk/docs/demoOutput/Introduction/linregPolyVsReg.html linregPolyVsReg] Show the effect of changing the regularization parameter in ridge regression.
To be written