GitHub - nitesh2104/Unsupervised-Learning

Unsupervised Learning and Dimensionality Reduction

In this project, we explore unsupervised learning algorithms and apply dimensionality reduction to obtain a smaller set of features that retains most of the information.

---How to Run---

Each requirement is contained in its own directory (KMeans_and_EM, PCA, ICA, etc.).
Run this command from the root directory: jupyter-lab
After the browser opens, open the file for the requirement of interest and run all cells.
Note: np.random.seed(0) is already set so that output is consistent across runs.
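The seeding note above can be illustrated with a minimal sketch. The helper name `sample_after_seed` is hypothetical and not part of this repository; it only shows that re-seeding NumPy's global generator replays the same random draws.

```python
import numpy as np


def sample_after_seed(seed: int, n: int = 5) -> np.ndarray:
    """Seed NumPy's global generator, then draw n uniform values."""
    np.random.seed(seed)
    return np.random.rand(n)


first = sample_after_seed(0)
second = sample_after_seed(0)

# Re-seeding with the same value replays the same sequence of draws.
print(np.array_equal(first, second))
```

scikit-learn estimators fall back to NumPy's global random state when `random_state` is left as `None`, so a single `np.random.seed(0)` at the top of a notebook gives repeatable results as long as the cells run in the same order. Passing `random_state=0` to each estimator is a more robust alternative when cells may be re-run out of order.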

Algorithms

Dataset

Phone Price Prediction
Salary Prediction

STEPS and Guidelines

Run the clustering algorithms on the datasets
Apply the dimensionality reduction algorithms to the two datasets
Reproduce the clustering experiments on the dimensionality-reduced data
Apply the dimensionality reduction algorithms and rerun the neural network learner on the newly projected data.
Apply the clustering algorithms to the same dataset to which we just applied the dimensionality reduction algorithms, treating the clusters as if they were new features. In other words, treat the clustering algorithms as if they were dimensionality reduction algorithms. Again, rerun the neural network learner on the newly projected data.
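The steps above can be sketched end to end as follows. This is a minimal illustration, not the code in the notebooks: it assumes scikit-learn, uses a synthetic dataset from `make_classification` in place of the phone price and salary data, and picks PCA, k-means, the number of components, the number of clusters, and the network size arbitrarily.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for one of the datasets (assumption, not the repo's data).
X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Standardize using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Dimensionality reduction (PCA shown; ICA or RP would slot in the same way).
pca = PCA(n_components=6, random_state=0).fit(X_train_s)
Z_train, Z_test = pca.transform(X_train_s), pca.transform(X_test_s)

# Clustering as dimensionality reduction: the distance from each sample to
# each k-means centroid becomes one new feature.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(Z_train)
C_train, C_test = kmeans.transform(Z_train), kmeans.transform(Z_test)


def fit_and_score(train, test):
    """Train a small neural network and return its test accuracy."""
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
    clf.fit(train, y_train)
    return clf.score(test, y_test)


scores = {
    "original": fit_and_score(X_train_s, X_test_s),
    "pca": fit_and_score(Z_train, Z_test),
    "cluster_features": fit_and_score(C_train, C_test),
}
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Using centroid distances rather than hard cluster labels is one possible design choice; one-hot cluster labels, or the posterior probabilities from a Gaussian mixture fitted with EM, are equally valid ways to turn clusters into features.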

Requirements

A discussion of the datasets and why they are interesting. If we are using the same datasets as before, briefly remind the reader what they are so that the old assignment write-up does not have to be revisited. If we are not, the work from assignment 1 has to be recreated for the new datasets.
Explanations of the methods: for example, how did we choose k?
A description of the kind of clusters that we got.
Analyses of the results.
A description of how the data looks in the new spaces created by the various algorithms. For PCA, what is the distribution of eigenvalues? For ICA, how kurtotic are the distributions? Do the projection axes for ICA seem to capture anything "meaningful"? Assuming we only generate k projections (i.e., we do dimensionality reduction), how well is the data reconstructed by the randomized projections? By PCA? How much variation did we get when we re-ran RP several times (RP should be run many times to see what happens)?
When we reproduced the clustering experiments on the datasets projected onto the new spaces created by ICA, PCA, and RP, did we get the same clusters as before? Different clusters? Why or why not?
When we re-ran the neural network learner, were there any differences in performance? Speed? Anything at all?
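Several of the questions above map onto quantities that are easy to compute. The sketch below is one possible way to obtain them, assuming scikit-learn and SciPy and a synthetic `make_blobs` dataset; the silhouette criterion for choosing k, the value of `k_proj`, and the helper names are illustrative assumptions rather than the choices made in the notebooks.

```python
import numpy as np
from scipy.stats import kurtosis
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA, FastICA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler
from sklearn.random_projection import GaussianRandomProjection

# Synthetic stand-in data (assumption), standardized to zero mean, unit variance.
X, _ = make_blobs(n_samples=400, n_features=10, centers=4, random_state=0)
X = StandardScaler().fit_transform(X)
k_proj = 4  # number of projections kept by each method (arbitrary here)

# Choosing k for k-means: keep the k with the highest mean silhouette score.
silhouettes = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    silhouettes[k] = silhouette_score(X, labels)
best_k = max(silhouettes, key=silhouettes.get)


def reconstruction_mse(original, reconstructed):
    """Mean squared error between the data and its reconstruction."""
    return float(np.mean((original - reconstructed) ** 2))


# PCA: eigenvalues of the covariance matrix, sorted in decreasing order.
eigenvalues = PCA().fit(X).explained_variance_
pca = PCA(n_components=k_proj).fit(X)
pca_error = reconstruction_mse(X, pca.inverse_transform(pca.transform(X)))

# ICA: excess kurtosis of each recovered component (0 for a Gaussian).
ica = FastICA(n_components=k_proj, random_state=0, max_iter=1000)
ica_kurtosis = kurtosis(ica.fit_transform(X), axis=0)


def rp_error(seed):
    """Reconstruction error of one Gaussian random projection."""
    rp = GaussianRandomProjection(n_components=k_proj, random_state=seed).fit(X)
    # Least-squares reconstruction through the pseudo-inverse of the
    # projection matrix, whose shape is (k_proj, n_features).
    X_hat = rp.transform(X) @ np.linalg.pinv(rp.components_).T
    return reconstruction_mse(X, X_hat)


# RP: repeat with different seeds to see how much the result varies.
rp_errors = np.array([rp_error(seed) for seed in range(20)])

print("best k by silhouette:", best_k)
print("PCA eigenvalues:", np.round(eigenvalues, 3))
print("ICA excess kurtosis:", np.round(ica_kurtosis, 3))
print("PCA reconstruction MSE:", round(pca_error, 4))
print("RP reconstruction MSE: mean", rp_errors.mean(), "std", rp_errors.std())
```

For the same number of components, PCA gives the lowest squared reconstruction error of any linear projection of centered data, so the RP errors should never fall below the PCA error. The spread of `rp_errors` across seeds is a direct measure of the run-to-run variation asked about above.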

Folders and files:
Data
ICA
IPCA
KMeans_and_EM
PCA
RP
.gitattributes
.gitignore
README.md
common.py
