This is a repo of machine learning algorithm templates in Python and R.
Each template is heavily commented, so it can be used as an off-the-shelf solution with minimal effort.
-
Data Preprocessing
Cleaning datasets, creating categorical variables, and splitting the data into training and test sets.
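
As a rough illustration of these three steps, here is a minimal sketch using pandas and sklearn. The toy DataFrame and its column names (country, age, purchased) are made up for this example and are not taken from the repo's templates; mean imputation and one-hot encoding are just one reasonable choice for each step.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy dataset with one missing value and one categorical column
df = pd.DataFrame({
    "country": ["France", "Spain", "Germany", "Spain", "France", "Germany"],
    "age": [44.0, 27.0, None, 38.0, 35.0, 50.0],
    "purchased": [0, 1, 0, 0, 1, 1],
})

# Cleaning: fill the missing age with the column mean
df["age"] = df["age"].fillna(df["age"].mean())

# Categorical variables: one-hot encode the country column
X = pd.get_dummies(df[["country", "age"]], columns=["country"])
y = df["purchased"]

# Training/test split: hold out 2 of the 6 rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=2, random_state=0
)
```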
-
Regression Algorithms
Simple Linear, Multiple Linear, Polynomial, Support Vector, Decision Trees, Random Forest.
-
Classification Algorithms
Logistic Regression, KNN, SVM, Kernel SVM, Naive Bayes, Decision Trees, Random Forest.
-
Clustering
K-means and Hierarchical clustering.
-
Association Rule Learning
Apriori and Eclat.
-
Reinforcement Learning
Upper Confidence Bound (UCB) and Thompson Sampling.
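
To give a feel for how UCB works, here is a minimal pure-Python sketch on simulated Bernoulli arms. The click rates, the number of rounds, and the exploration constant 1.5 are assumptions chosen for illustration; the repo's own template may differ in its details.

```python
import math
import random


def ucb_select(counts, sums, round_index):
    """Return the index of the arm with the highest upper confidence bound."""
    best_arm, best_bound = 0, -1.0
    for arm, (n, total) in enumerate(zip(counts, sums)):
        if n == 0:
            return arm  # play every arm once before comparing bounds
        # average reward so far plus an exploration bonus that shrinks with n
        bound = total / n + math.sqrt(1.5 * math.log(round_index + 1) / n)
        if bound > best_bound:
            best_arm, best_bound = arm, bound
    return best_arm


def run_ucb(click_rates, n_rounds=5000, seed=0):
    """Simulate Bernoulli arms (hypothetical click rates); return pull counts."""
    rng = random.Random(seed)
    counts = [0] * len(click_rates)
    sums = [0.0] * len(click_rates)
    for t in range(n_rounds):
        arm = ucb_select(counts, sums, t)
        reward = 1.0 if rng.random() < click_rates[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


counts = run_ucb([0.05, 0.10, 0.30])
print(counts)  # the arm with the highest click rate should be pulled most often
```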
-
Natural Language Processing
NLP template.
-
Deep Learning
Neural Networks: ANNs and CNNs (image recognition) with Keras and TensorFlow.
-
Dimensionality Reduction
Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Kernel PCA.
-
Model Selection + XGBoost
Model selection techniques (grid search and k-fold cross-validation) and the XGBoost algorithm.
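
The two model selection techniques can be sketched with sklearn as follows. The iris dataset, the SVC model, and the parameter grid are assumptions made for this example only; they are not necessarily what the repo's templates use.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# k-fold cross-validation: one accuracy score per fold (k = 5) for a single model
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=5)

# Grid search: evaluate every combination in the grid, each with 5-fold CV
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(scores.mean(), search.best_params_, search.best_score_)
```

The best parameter combination found is available as search.best_params_, and search.best_estimator_ is refit on the full dataset by default.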
-
Multi-Output Models
Examples of modeling techniques for multiple dependent (output) variables, including examples of the sklearn MultiOutputRegressor class.
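
A minimal sketch of MultiOutputRegressor, which fits one copy of a base regressor per output column. The synthetic data and the choice of GradientBoostingRegressor as the base model are assumptions for illustration, not taken from the repo.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

# Synthetic data: 3 input features, 2 dependent variables
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = np.column_stack([X[:, 0] + 2 * X[:, 1], X[:, 2] ** 2])

# Wrap a single-output regressor so that it handles both targets
model = MultiOutputRegressor(GradientBoostingRegressor(random_state=0))
model.fit(X, Y)

pred = model.predict(X[:5])
print(pred.shape)  # one row per sample, one column per output
```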
Various additional modeling scripts, such as a scorecard.
-
Reading Material
Folders that include interesting reading and coding material.
The main Python libraries used in this repo are pandas, numpy, scipy, sklearn, matplotlib, keras, and tensorflow (for ANNs). Instructions for Jupyter are also included.