💡 A detailed repo containing optimised explanations of the major libraries used for implementing Machine Learning.

thenorthkun/ML-foundations

ML-foundations

  • This repository covers detailed explanations of the Python libraries that are essential for Machine Learning.
  • Each notebook contains the most commonly used functions and practices required for assessing data sets.
  • Exploratory Data Analysis (EDA) gives hands-on experience of how data is pre-processed before it is used to train an ML model.
  • Probability concepts, which are an integral part of the math behind Machine Learning.

Build your concepts 📜

All these libraries form the ground upon which everything in Machine Learning is built. Follow this order to build the foundation step by step:

1. NumPy 🔒

NumPy is a Python library for working with arrays. It provides multi-dimensional array and matrix data structures, and it can perform a wide range of mathematical operations on arrays, such as trigonometric, statistical, and algebraic operations.
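As a minimal sketch of the three kinds of operation mentioned above, the snippet below builds a small matrix and applies one trigonometric, one statistical, and one algebraic operation to it. The array values are made up for illustration.

```python
import numpy as np

# A 2-D array (matrix) with 2 rows and 3 columns; the values are made up.
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(a.shape)           # dimensions of the array
print(np.sin(a))         # trigonometric: element-wise sine
print(a.mean(axis=0))    # statistical: mean of each column
print(a @ a.T)           # algebraic: matrix product with the transpose
```

Operations such as `np.sin` work element-wise on the whole array, so no explicit Python loop is needed.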

2. Pandas 🐼

Pandas is mainly used for data analysis and is built on top of NumPy. It can import data from various formats such as CSV (comma-separated values), JSON, SQL, and Microsoft Excel. It supports data manipulation operations such as merging, reshaping, and selecting, as well as data cleaning and data wrangling.
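The operations listed above can be sketched on a tiny, made-up data set. The CSV text, column names, and city table below are hypothetical, and the CSV is read from memory rather than from a file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV content, read from memory instead of a file on disk.
csv_text = """name,city_id,score
Asha,1,82
Ben,2,
Chen,1,91
"""
people = pd.read_csv(io.StringIO(csv_text))
cities = pd.DataFrame({"city_id": [1, 2], "city": ["Pune", "Oslo"]})

# Data cleaning: fill the missing score with the column mean.
people["score"] = people["score"].fillna(people["score"].mean())

# Merging: attach the city name to each person.
merged = people.merge(cities, on="city_id", how="left")

# Selecting: keep only the rows with a score above 85.
high = merged[merged["score"] > 85]

# Reshaping: one row per city, holding the mean score for that city.
by_city = merged.pivot_table(index="city", values="score", aggfunc="mean")
print(by_city)
```

Filling a missing value with the column mean is only one possible cleaning choice; dropping the row with `dropna()` is another.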

3. Matplotlib 📊

Matplotlib is a plotting library used primarily for data visualization. It also builds on NumPy. It is used to plot various graphs such as line plots, bar graphs, pie charts, and other figures.
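A small sketch of the three plot types named above, drawn side by side. The categories and counts are hypothetical, and the `Agg` backend is selected only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
labels = ["A", "B", "C"]   # hypothetical categories
counts = [5, 3, 8]         # hypothetical values

fig, (ax_line, ax_bar, ax_pie) = plt.subplots(1, 3, figsize=(12, 3.5))

ax_line.plot(x, np.sin(x))
ax_line.set_title("Line plot")

ax_bar.bar(labels, counts)
ax_bar.set_title("Bar graph")

ax_pie.pie(counts, labels=labels, autopct="%1.0f%%")
ax_pie.set_title("Pie chart")

fig.tight_layout()
# fig.savefig("basic_plots.png")  # uncomment to write the figure to disk
```

In a notebook, the backend line can be dropped and the figure is displayed inline.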

4. Seaborn 📈

Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. Matplotlib mostly offers basic plots, whereas Seaborn provides a wider variety of visualization patterns, needs less code, and ships with attractive default themes. It makes plots such as heatmaps, box plots, histograms, scatter plots, and many more straightforward to draw.

5. Scikit-learn 💻

Scikit-learn is a machine learning library for Python that provides ready-made implementations of various algorithms such as support vector machines, random forests, and k-nearest neighbours.
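As a rough sketch of what "ready-made" means in practice, the snippet below fits the three algorithms named above on the Iris data set that ships with scikit-learn. The choice of data set, the split ratio, and the hyperparameters are assumptions made for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Every estimator exposes the same fit / score interface.
models = {
    "support vector machine": SVC(),
    "random forest": RandomForestClassifier(random_state=0),
    "k-nearest neighbours": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out split
    print(f"{name}: {scores[name]:.2f}")
```

Because the estimators share one interface, swapping one algorithm for another usually means changing a single line.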

6. Data Analysis (EDA) 📝

All of the libraries above are used in this phase.
Exploratory Data Analysis refers to analyzing data sets to summarize their main characteristics, often with visual methods. It is a critical process of performing initial investigations on data in order to discover patterns, spot anomalies, test hypotheses, find outliers, look for trends, and check assumptions with the help of summary statistics and graphical representations.
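A first EDA pass often looks something like the sketch below: summary statistics, a missing-value count, a simple outlier check, and pairwise correlations. The measurements are hypothetical, and the 1.5 × IQR rule is just one common convention for flagging outliers.

```python
import pandas as pd

# Hypothetical measurements; 95.0 is planted as an obvious outlier.
df = pd.DataFrame({
    "height_cm": [160.0, 165.0, 170.0, 172.0, 168.0, 95.0, 174.0, 169.0],
    "weight_kg": [55.0, 60.0, 68.0, 70.0, 64.0, 62.0, 72.0, None],
})

print(df.describe())     # summary statistics for each column
print(df.isna().sum())   # number of missing values per column

# Flag outliers in height with the 1.5 * IQR rule.
q1, q3 = df["height_cm"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["height_cm"] < q1 - 1.5 * iqr) | (df["height_cm"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]
print(outliers)

print(df.corr())         # pairwise correlation between the columns
```

The visual side of EDA would typically follow, for example a box plot or histogram of each column.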

7. Probability (Extra Content) ⏳

Probability underlies much of Machine Learning. It is used to judge how likely outcomes are, perform hypothesis testing, and categorize data on the basis of their mathematical distributions.
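As a small illustration of judging how likely an outcome is, the sketch below asks how probable it is to see at least 8 heads in 10 flips of a fair coin. It computes the answer exactly from the binomial distribution and then estimates it by simulation. The coin example, the helper `binomial_pmf`, and the number of trials are assumptions made for illustration.

```python
import math
import random

random.seed(0)  # fixed seed so the simulation is repeatable

n_flips = 10
n_trials = 20_000


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Exact probability of at least 8 heads in 10 fair flips: 56/1024, about 0.0547.
exact = sum(binomial_pmf(k, n_flips, 0.5) for k in range(8, n_flips + 1))

# Monte Carlo estimate of the same probability.
hits = 0
for _ in range(n_trials):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    if heads >= 8:
        hits += 1
estimate = hits / n_trials

print(f"exact = {exact:.4f}, simulated = {estimate:.4f}")
```

In hypothesis-testing terms, this value is the one-sided p-value for observing 8 or more heads under the assumption that the coin is fair.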


References:

https://numpy.org/devdocs/reference/index.html
https://pandas.pydata.org/docs/reference/index.html#api
https://matplotlib.org/3.3.1/tutorials/index.html
https://www.datacamp.com/community/data-science-cheatsheets
https://elitedatascience.com/python-seaborn-tutorial
https://www.youtube.com/watch?v=vmEHCJofslg&list=PLFCB5Dp81iNVmuoGIqcT5oF4K-7kTI5vp
https://www.youtube.com/watch?v=Pkvdc2Z6eBg
