This repository contains a pipeline for forecasting electrical power consumption using a Long Short-Term Memory (LSTM) neural network and an AutoRegressive Integrated Moving Average (ARIMA) model. The pipeline downloads and preprocesses the data, trains and evaluates the models, and uploads the forecast results to Firebase.
- pipeline.yml: A YAML file that defines the GitHub Actions workflow to automate the pipeline.
- train_evaluate.py: Python script that handles data preprocessing, model training and evaluation, and forecasting using the LSTM and ARIMA models.
- firebase_upload.py: Python script that uploads the prediction results to Firebase Realtime Database.
- requirements.txt: Lists the required Python packages to run the pipeline.
- Download the dataset from Kaggle using the Kaggle API.
- Preprocess the data, handling missing values and resampling it to hourly frequency.
- Normalize the data and split it into train and test sets.
- Build and train the LSTM model, and evaluate its performance.
- Forecast global active power using the ARIMA model.
- Upload the forecast results to Firebase Realtime Database.
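The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration and not the code in train_evaluate.py: the column name `Global_active_power`, the interpolation strategy for missing values, the 80/20 chronological split, the choice to fit the min-max scaling on the training portion only, and the helper names `preprocess` and `make_windows` are all assumptions made for the example. The input data is synthetic.

```python
import numpy as np
import pandas as pd


def preprocess(df, target="Global_active_power", train_frac=0.8):
    """Resample to hourly means, fill gaps, split chronologically, scale to [0, 1]."""
    # Hourly mean of the minute-level readings ("60min" is a one-hour bin).
    hourly = df[target].resample("60min").mean()
    # Assumed gap handling: interpolate, then fill any leading/trailing NaNs.
    hourly = hourly.interpolate().ffill().bfill()
    # Chronological split: no shuffling, so the test set lies in the future.
    split = int(len(hourly) * train_frac)
    train, test = hourly.iloc[:split], hourly.iloc[split:]
    # Min-max statistics come from the training portion only (an assumption).
    lo, hi = train.min(), train.max()

    def scale(series):
        return ((series - lo) / (hi - lo)).to_numpy()

    return scale(train), scale(test), (lo, hi)


def make_windows(values, lookback):
    """Turn a 1-D series into (samples, lookback, 1) inputs and next-step targets."""
    X = np.stack([values[i:i + lookback] for i in range(len(values) - lookback)])
    y = values[lookback:]
    return X[..., np.newaxis], y


# Synthetic minute-level data: 600 minutes, i.e. 10 hourly bins.
idx = pd.date_range("2024-01-01", periods=600, freq="min")
raw = pd.DataFrame({"Global_active_power": np.linspace(1.0, 4.0, 600)}, index=idx)
raw.iloc[120:180] = np.nan  # simulate one fully missing hour

train, test, (lo, hi) = preprocess(raw)
X_train, y_train = make_windows(train, lookback=3)
print(X_train.shape, y_train.shape)
```

The three-dimensional shape of `X_train` (samples, time steps, features) is the layout Keras LSTM layers expect. The `lo` and `hi` values are returned so that predictions can be mapped back to the original units.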
- Python 3.8
- TensorFlow 2.12.0
- NumPy
- pandas
- Kaggle
- scikit-learn 1.2.2
- matplotlib 3.7.1
- seaborn 0.12.2
- requests
- firebase-admin
- statsmodels 0.14.0
To run the pipeline, execute the following steps:
- Clone the repository.
- Set up the required environment variables:
  - KAGGLE_USERNAME: Your Kaggle username.
  - KAGGLE_KEY: Your Kaggle API key.
  - FORECAST_KEY: Your Firebase service account key (in JSON format).
- Install the required Python packages using the requirements.txt file.
- Run the train_evaluate.py script to train and evaluate the LSTM and ARIMA models.
- Run the firebase_upload.py script to upload the forecast results to Firebase Realtime Database.
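Before running the scripts, it can help to confirm that the three environment variables are set and that FORECAST_KEY parses as JSON. The sketch below is one hypothetical way to do that with the standard library only. The `load_settings` helper and the keys of the returned dictionary are illustrative names, not part of this repository, and the demo values are placeholders.

```python
import json
import os

REQUIRED_VARS = ("KAGGLE_USERNAME", "KAGGLE_KEY", "FORECAST_KEY")


def load_settings(env=None):
    """Return the settings, raising if one is missing or FORECAST_KEY is not JSON."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    try:
        # The service account key is expected to be a JSON document.
        service_account = json.loads(env["FORECAST_KEY"])
    except json.JSONDecodeError as exc:
        raise RuntimeError("FORECAST_KEY is not valid JSON") from exc
    return {
        "kaggle_username": env["KAGGLE_USERNAME"],
        "kaggle_key": env["KAGGLE_KEY"],
        "service_account": service_account,
    }


# Placeholder values for demonstration only.
demo_env = {
    "KAGGLE_USERNAME": "demo-user",
    "KAGGLE_KEY": "demo-key",
    "FORECAST_KEY": json.dumps({"type": "service_account", "project_id": "demo-project"}),
}
settings = load_settings(demo_env)
print(settings["service_account"]["project_id"])
```

Calling `load_settings()` with no argument reads from the actual environment, so a missing or malformed variable fails early with a readable message instead of partway through training or upload.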
This project is licensed under the MIT License.