A project for the Artificial Intelligence course, part of the Master's Degree in Computer Science at the University of Bologna.
The project analyses data collected by sensors on industrial machines in order to build a Machine Learning application. The application exploits the correlations between the features found in the dataset to provide a reliable model for predicting whether a machine is likely to fail.
Dataset used: “AI4I Predictive Maintenance Dataset”.
Link: https://archive.ics.uci.edu/dataset/601/ai4i+2020+predictive+maintenance+dataset
The version of the dataset we used is available in this GitHub repository under the name "predictive_maintenance.csv".
- Refinement and pre-processing of the dataset used in order to keep the useful data for our purpose and discard everything else;
- Exploratory Data Analysis (EDA), an in-depth study of the dataset aimed at discovering its main characteristics and searching for possible patterns, using statistical analysis tools;
- Application of several classification algorithms to determine which one predicts failures most accurately.
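
The first two steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: the column names (`UDI`, `Product ID`, `Type`, `Torque [Nm]`, `Tool wear [min]`, `Target`), the ordinal encoding of `Type`, and the three sample rows are all assumptions made for the example.

```python
import csv
import io
from statistics import mean

# Hypothetical sample in the assumed layout of predictive_maintenance.csv.
SAMPLE_CSV = """UDI,Product ID,Type,Torque [Nm],Tool wear [min],Target
1,M00001,M,42.8,0,0
2,L00002,L,46.3,3,0
3,L00003,L,68.9,210,1
"""

ID_COLUMNS = {"UDI", "Product ID"}     # identifiers, discarded as not useful
TYPE_CODES = {"L": 0, "M": 1, "H": 2}  # assumed encoding of the product type


def preprocess(row):
    """Drop identifier columns and convert the remaining values to numbers."""
    clean = {}
    for name, value in row.items():
        if name in ID_COLUMNS:
            continue
        clean[name] = TYPE_CODES[value] if name == "Type" else float(value)
    return clean


rows = [preprocess(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]

# A first EDA step: simple summary statistics for every remaining column.
summary = {
    name: {
        "min": min(r[name] for r in rows),
        "mean": mean(r[name] for r in rows),
        "max": max(r[name] for r in rows),
    }
    for name in rows[0]
}

for name, stats in summary.items():
    print(name, stats)
```

On the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle, and the set of columns to discard would depend on what the exploratory analysis shows.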
We can now observe how the probability of a failure relates to the values of the studied features: in particular, the probability increases as the feature values get higher.
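
One simple way to check a relationship of this kind is to group a feature into equal-width bins and compute the share of failed records in each bin. The sketch below does this on made-up data: the feature (tool wear), the observations, and the bin width are illustrative assumptions, not values taken from the dataset.

```python
from collections import defaultdict

# Hypothetical (tool wear in minutes, failure flag) observations.
observations = [
    (10, 0), (35, 0), (60, 0), (90, 0),
    (120, 0), (150, 1), (170, 0), (190, 0),
    (205, 1), (220, 1), (235, 0), (250, 1),
]

BIN_WIDTH = 100  # assumed bin width, chosen only for readability


def failure_rate_by_bin(pairs, width):
    """Return {bin_start: share of failed records} for equal-width bins."""
    counts = defaultdict(lambda: [0, 0])  # bin_start -> [failures, total]
    for value, failed in pairs:
        start = (value // width) * width
        counts[start][0] += failed
        counts[start][1] += 1
    return {start: fails / total for start, (fails, total) in sorted(counts.items())}


rates = failure_rate_by_bin(observations, BIN_WIDTH)
for start, rate in rates.items():
    print(f"{start:>3}-{start + BIN_WIDTH - 1}: {rate:.2f}")
```

A failure rate that rises from one bin to the next is the pattern described above.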
Here we can observe the distribution of the various failure types.
- Logistic regression
- KNN
- Support Vector Machine
- Random Forest
- AdaBoost
- XGBoost
- Naive Bayes
- Decision Tree Classifier
- Multilayer Perceptron
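
A comparison across the algorithms listed above could look roughly like the following sketch, which assumes scikit-learn is installed. It is not the project's code: the data is a synthetic stand-in generated with `make_classification`, the hyperparameters are arbitrary, and "execution time" is assumed here to mean fitting plus prediction. XGBoost is left out because it lives in the separate `xgboost` package; its `XGBClassifier` follows the same `fit`/`predict` interface and could be added to the dictionary.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the real dataset (assumption).
X, y = make_classification(
    n_samples=1000, n_features=6, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "Support Vector Machine": SVC(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Multilayer Perceptron": MLPClassifier(max_iter=500, random_state=0),
}

results = {}  # name -> (accuracy, seconds for fit + predict)
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    elapsed = time.perf_counter() - start
    results[name] = (accuracy_score(y_test, predictions), elapsed)

# Print the models from most to least accurate.
for name, (accuracy, elapsed) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:<24} accuracy={accuracy:.3f} time={elapsed:.2f}s")
```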
The XGBoost classifier ('XGB Classifier') returned an accuracy of 98% with an execution time of between 0.4 and 0.5 s. For a more in-depth study, we show below the scores XGBoost returned when classifying records on the target attributes (precision, recall, F1-score and support).
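
For readers unfamiliar with these four scores, the sketch below shows how they are computed for each class. The labels are hypothetical and exist only to make the arithmetic visible; they are not the project's results. scikit-learn's `classification_report` prints the same four columns.

```python
def classification_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1, tp + fn)  # support = true count
    return scores


# Hypothetical labels: 1 = failure, 0 = no failure.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

scores = classification_scores(y_true, y_pred)
for label, (precision, recall, f1, support) in scores.items():
    print(f"class {label}: precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1:.2f} support={support}")
```

Precision is the share of predicted members of a class that truly belong to it, recall is the share of true members that were found, the F1-score is their harmonic mean, and support is the number of true records in the class.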