This project uses sensor data from industrial machines to build a machine learning model that predicts equipment failure. The goal is to demonstrate a full data science workflow, from data exploration to model tuning, to solve a real-world business problem.
The data come from the AI4I 2020 Predictive Maintenance Dataset. The key challenge is class imbalance: machine failures are rare events, so a model can look accurate while missing most of them.
- Data Exploration: Analyzed sensor data distributions and identified key relationships.
- Feature Engineering: Created new features like `Power` and `temp_diff` to improve model performance.
- Modeling: Trained a `RandomForestClassifier`.
- Evaluation: Focused on precision and recall, identifying an initial weakness in detecting failures.
- Tuning: Improved the model by adjusting the prediction threshold to find a better balance between catching failures and avoiding false alarms.
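The feature engineering and threshold tuning steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's code: the column names follow the public AI4I 2020 CSV, the formulas for `Power` and `temp_diff` are assumptions (torque times angular speed, and process minus air temperature), the helper names `add_features` and `precision_recall_at` are hypothetical, and the labels and probabilities are toy values rather than project results.

```python
import math


def add_features(row):
    """Return a copy of one sensor record with Power and temp_diff added.

    Both formulas are assumptions made for illustration.
    """
    out = dict(row)
    # Mechanical power in watts: torque [Nm] * angular speed [rad/s].
    omega = row["Rotational speed [rpm]"] * 2 * math.pi / 60
    out["Power"] = row["Torque [Nm]"] * omega
    # Gap between process and air temperature, in kelvin.
    out["temp_diff"] = row["Process temperature [K]"] - row["Air temperature [K]"]
    return out


def precision_recall_at(y_true, failure_proba, threshold):
    """Precision and recall when failure is predicted for proba >= threshold."""
    preds = [p >= threshold for p in failure_proba]
    tp = sum(1 for y, hit in zip(y_true, preds) if y == 1 and hit)
    fp = sum(1 for y, hit in zip(y_true, preds) if y == 0 and hit)
    fn = sum(1 for y, hit in zip(y_true, preds) if y == 1 and not hit)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy values only, to show the mechanics. With a fitted scikit-learn
# classifier, the probabilities would typically come from
# model.predict_proba(X_test)[:, 1].
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
toy_proba = [0.05, 0.10, 0.20, 0.02, 0.35, 0.15, 0.45, 0.80, 0.30, 0.40]

for threshold in (0.5, 0.3):
    p, r = precision_recall_at(toy_labels, toy_proba, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

On the toy data, lowering the threshold from 0.5 to 0.3 raises recall and lowers precision, which is the trade-off the tuning step balances.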
The final model detects 66% of machine failures (recall) at a precision of 82%, meaning 82% of its failure alerts are correct. This demonstrates the ability to create a useful predictive tool and make data-driven decisions to optimize model performance for a specific business need.
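To make these two figures concrete, the short sketch below converts them into counts per 100 actual failures and into an F1 score. It uses only the reported recall and precision and the standard definitions (precision = TP / (TP + FP), so FP = TP * (1 - precision) / precision). The function name `summarize` and the choice of 100 failures as a reference count are assumptions for illustration.

```python
def summarize(recall, precision, actual_failures=100):
    """Translate recall and precision into counts and an F1 score."""
    caught = recall * actual_failures            # true positives
    missed = actual_failures - caught            # false negatives
    # From precision = TP / (TP + FP): FP = TP * (1 - precision) / precision
    false_alarms = caught * (1 - precision) / precision
    f1 = 2 * precision * recall / (precision + recall)
    return caught, missed, false_alarms, f1


caught, missed, false_alarms, f1 = summarize(recall=0.66, precision=0.82)
print(f"caught={caught:.0f}, missed={missed:.0f}, "
      f"false alarms={false_alarms:.1f}, F1={f1:.2f}")
```

Read this way, for every 100 real failures the model flags about 66 and misses about 34, while raising roughly 14 false alarms.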
The project is self-contained. The dataset is included in the repository.
- Clone the repository: `git clone <your-repo-link>`
- Navigate to the project directory: `cd <your-repo-name>`
- Install the required libraries: `pip install -r requirements.txt`
- Run the Jupyter Notebook `predictive_maintenance.ipynb`.