Goal: Reduce the dimensionality of the data while retaining as much information as possible
Data: https://www.kaggle.com/c/digit-recognizer
The training set has 785 columns: one label column plus 784 pixel columns (one per pixel of a 28x28 image).
PCA (Principal Component Analysis) steps:
1.) Standardization
2.) Covariance matrix computation
3.) Compute the eigenvectors and eigenvalues of the covariance matrix to identify the principal components
4.) Feature vector: keep the eigenvectors with the largest eigenvalues as its columns
5.) Recast the data along the principal components axes
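The five steps above can be sketched with NumPy as follows. This is a minimal illustration, not the code used for the Kaggle data: the function name `pca`, the parameter `k`, and the synthetic input matrix are assumptions made for the example.

```python
import numpy as np

def pca(X, k):
    """Project X (n_samples x n_features) onto its top-k principal components."""
    # 1.) Standardize each column to zero mean and unit variance.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns (e.g. always-blank pixels) would divide by zero
    Z = (X - mean) / std

    # 2.) Covariance matrix of the standardized features.
    cov = np.cov(Z, rowvar=False)

    # 3.) Eigen-decomposition. eigh is meant for symmetric matrices and
    #     returns eigenvalues in ascending order, so reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # 4.) Feature vector: the k eigenvectors with the largest eigenvalues.
    W = eigvecs[:, :k]

    # 5.) Recast the data along the principal component axes.
    return Z @ W, eigvals

# Synthetic stand-in for the real data (assumption): 200 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2 * X[:, 0] + 0.1 * rng.normal(size=200)  # make two columns correlated

scores, eigvals = pca(X, k=2)
explained = eigvals[:2].sum() / eigvals.sum()  # share of variance kept by 2 components
```

The ratio `explained` is one way to judge how much information survives the reduction: each eigenvalue is the variance captured along its component.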
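t-SNE (t-distributed Stochastic Neighbor Embedding) steps:
1.) Compute probabilities proportional to the similarity of objects in the high-dimensional space
2.) Measure similarities between the corresponding low-dimensional points
3.) Minimize the Kullback–Leibler divergence between the two sets of similarities, so that the low-dimensional map reflects the similarities between the high-dimensional inputs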
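The three steps above can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions: it uses one fixed Gaussian bandwidth `sigma` for every point (the full algorithm tunes a bandwidth per point to match a target perplexity), plain gradient descent with no momentum or early exaggeration, and synthetic two-cluster data. All function names here are hypothetical.

```python
import numpy as np

def squared_distances(A):
    """Pairwise squared Euclidean distances between the rows of A."""
    sq = np.sum(A**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2 * A @ A.T
    return np.maximum(D, 0.0)  # clip tiny negatives caused by rounding

def high_dim_affinities(X, sigma=1.0):
    # 1.) Gaussian similarities, normalized per row into conditional
    #     probabilities, then symmetrized so that all entries sum to 1.
    K = np.exp(-squared_distances(X) / (2 * sigma**2))
    np.fill_diagonal(K, 0.0)
    P_cond = K / K.sum(axis=1, keepdims=True)
    return (P_cond + P_cond.T) / (2 * X.shape[0])

def low_dim_affinities(Y):
    # 2.) Student-t (one degree of freedom) similarities in the map.
    W = 1.0 / (1.0 + squared_distances(Y))
    np.fill_diagonal(W, 0.0)
    return W / W.sum(), W

def kl_divergence(P, Q):
    mask = P > 0  # skips the zero diagonal
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))

# Synthetic stand-in for the real data (assumption): two clusters in 5-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (15, 5)), rng.normal(4, 1, (15, 5))])
P = high_dim_affinities(X)

# 3.) Minimize KL(P || Q) over the 2-D map Y by gradient descent.
Y = 1e-2 * rng.normal(size=(30, 2))
kl_start = kl_divergence(P, low_dim_affinities(Y)[0])
for _ in range(200):
    Q, W = low_dim_affinities(Y)
    PQ = (P - Q) * W
    # Gradient: 4 * sum_j (p_ij - q_ij) * w_ij * (y_i - y_j)
    grad = 4 * (PQ.sum(axis=1)[:, None] * Y - PQ @ Y)
    Y -= 1.0 * grad  # learning rate 1.0 is an arbitrary choice for this toy case
kl_end = kl_divergence(P, low_dim_affinities(Y)[0])
```

For real data such as the digit images, a library implementation is the more practical choice; the sketch only shows how the three quantities relate.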