This project focuses on predicting the state of health of batteries using machine learning models. It involves processing and analyzing battery data to forecast their capacity and performance degradation over time. This is crucial for industries like electric vehicles and renewable energy storage, where battery reliability and lifespan are critical.
The script is part of a larger project that aims to predict the state of health of batteries. It uses various machine learning algorithms to analyze battery data and predict their future performance. The script includes the following key features:
-
Data Preprocessing: Reads and parses battery data, transforming it into a suitable format for machine learning models.
-
Feature Engineering: Extracts relevant features from the raw data, such as the max, min, average, and standard deviation of various measurements.
-
Model Training: Employs multiple regression models, including Linear Regression, Ridge, Lasso, RandomForestRegressor, CatBoostRegressor, XGBRegressor, SVR, and neural networks (MLP, LSTM, CNN), to predict battery capacity.
-
Hyperparameter Tuning: Utilizes GridSearchCV for hyperparameter tuning and model optimization.
-
Model Evaluation: Evaluates models based on Mean Squared Error (MSE) and R-squared (R2) metrics.
-
Result Serialization: Saves the trained models, preprocessing pipeline, and performance results for future use and analysis.
-
Performance Visualization: Generates charts to visualize model performance metrics for comparison.
To run the project, ensure you have Python installed along with the following libraries:
- pandas
- numpy
- matplotlib
- scikit-learn
- catboost
- xgboost
- tensorflow
- scipy
You can install these packages via pip:
pip install pandas numpy matplotlib scikit-learn catboost xgboost tensorflow scipy
Place your .mat
data files in the data
directory specified in the script. The project uses a specific dataset for training and evaluation, which can be downloaded from the provided link. After downloading, extract the .mat
files into the data
directory. Run the script to process the data, train the models, and evaluate their performance. The results, including the serialized models, will be saved in a timestamped subdirectory under results
.
To run the script:
python src/main/python/main.py
The data should be in MATLAB .mat
format, containing time-series measurements like voltage, current, and temperature. The script is tailored for data with specific structure, so it might require adjustments for differently formatted datasets.
The dataset used in this project is available for download here: Battery Data Set. This dataset, provided by NASA, includes various measurements crucial for assessing the health and capacity of batteries. It's instrumental for the development and validation of prognostic algorithms.
A detailed description of the dataset is available in the NASA publication: Battery Data Set Description.
The dataset comprises several parameters, each representing a specific aspect of the battery's state. Below is a table outlining the primary columns in the dataset:
Column Name | Description |
---|---|
Voltage_measured |
Voltage measured across the battery terminals. |
Current_measured |
Current measured going in or out of the battery. |
Temperature_measured |
Temperature of the battery. |
Voltage_load |
Voltage of the battery under load conditions. |
Current_load |
Current of the battery under load conditions. |
Time |
Time stamp for each measurement. |
Capacity |
Battery capacity, indicating the health of the battery. |
Each row in the dataset represents a different measurement point, providing a comprehensive view of the battery's performance over time.
Contributions are welcome. You can contribute by improving the existing code, adding new features, or fixing bugs. Please ensure to follow the existing coding style and add comments where necessary.
This project is open-sourced under the MIT License.
Note: Ensure to include a LICENSE.md
file in your project with the appropriate licensing details. This readme is a generic template and might require modifications to fit the specifics of your project.