Time Series Forecasting Analysis For Corporation Favorita ⏰

Welcome to the Time Series Forecasting Project. The goal is to predict store sales for Corporation Favorita, a major grocery retailer in Ecuador, using machine learning techniques.

Preview 🔍

Below is a preview showcasing some features of the notebook.

Top

Middle

Bottom

Project Description 📋

The primary focus of this project is to utilize time series regression analysis to forecast sales for Corporation Favorita, a prominent grocery retailer based in Ecuador.

The objective is to develop a robust model that accurately forecasts future sales from the time series data of thousands of products sold across Corporation Favorita's locations. The resulting forecasts will give the retailer's management the insight they need to plan inventory and sales.

I will utilize the CRISP-DM framework to execute the project.

Project Overview 🖼

The project includes the following stages:

1. 📥 Data Collection

The data was collected from three sources: Azubi Africa's SQL Server database, a GitHub repository and OneDrive.

N.B.: The datasets from the sources above were saved in the root folder of this repository for easy access.

2. 📚 Data Loading

Utilizing pyodbc for SQL data
Leveraging pandas for CSV and Excel files

3. 💡 Exploratory Data Analysis (EDA)

Merging
Duplicate checks
Handling missing values
Renaming Columns and Changing Datatypes
Creating new features
Visualizations
Hypothesis testing
ADF Testing

4. 📈 Answering Questions with Visualizations

Questions:

Is the train dataset complete (has all the required dates)?
Which dates have the lowest and highest sales for each year?
Did the earthquake impact sales?
Are certain groups of stores making more sales than others? (Cluster, city, state, type)
Are sales affected by promotions, oil prices and holidays?
What analysis can we get from the date and its extractable features?
What is the difference between RMSLE, RMSE, MSE (or why is the MAE greater than all of them?)
What is the total sales made each year by the corporation?
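The first question above (whether the train dataset has all the required dates) can be sketched without any third-party library. This is a minimal illustration, not the notebook's actual code: the function name `find_missing_dates` and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def find_missing_dates(observed_dates):
    """Return the calendar days between the first and last observed
    date that never appear in the data."""
    observed = set(observed_dates)  # duplicates (many rows per day) collapse here
    start, end = min(observed), max(observed)
    n_days = (end - start).days + 1
    full_range = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in full_range if d not in observed]


# Hypothetical sample: one day is absent from an otherwise daily series.
sample = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 4), date(2020, 1, 4)]
missing = find_missing_dates(sample)
print(missing)
```

With pandas, the same check is usually written by comparing `pd.date_range(start, end)` against the date column of the DataFrame; the column name depends on how the data was loaded.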

Visualization Tools:

Matplotlib
Seaborn

5. ⚙️ Feature Engineering

Data Splitting
Feature Encoding with OneHotEncoder
Feature Scaling with MinMaxScaler
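The three steps above can be illustrated with a dependency-free sketch. It assumes the rows are already sorted by date and shows the arithmetic that one-hot encoding and min-max scaling perform; it is not the notebook's code and does not replace scikit-learn's `OneHotEncoder` or `MinMaxScaler`. All function names and sample values are hypothetical.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split rows that are already sorted by date: the earliest part is
    used for training, the remainder for validation (no shuffling)."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def one_hot(value, categories):
    """Encode a value as a 0/1 vector over the known categories.
    A value not seen during fitting becomes an all-zero vector."""
    return [1 if value == c else 0 for c in categories]


def fit_min_max(values):
    """Learn the minimum and the range from the training values only."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # a constant column would otherwise divide by zero
    return lo, span


def apply_min_max(values, lo, span):
    """Rescale values with parameters learned on the training set."""
    return [(v - lo) / span for v in values]


# Hypothetical daily sales, oldest first.
sales = [10.0, 20.0, 30.0, 40.0, 50.0]
train, valid = chronological_split(sales, 0.8)

lo, span = fit_min_max(train)            # fit on the training part only
train_scaled = apply_min_max(train, lo, span)
valid_scaled = apply_min_max(valid, lo, span)

categories = sorted({"A", "B", "C"})     # hypothetical store types seen in training
encoded = one_hot("B", categories)
```

Fitting the scaler and the category list on the training part alone keeps information from the validation period out of the model inputs. As a consequence, validation values can fall outside the 0 to 1 range.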

6. ⌛ Model Training

Linear Regression
XGBoost
CatBoost
AutoReg
ARIMA
SARIMA
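To give a feel for what the autoregressive models in this list estimate, here is a minimal sketch of a lag-1 autoregression, y_t = c + phi * y_(t-1), fitted by ordinary least squares. It is a simplified illustration under stated assumptions, not the statsmodels implementation and not the notebook's code; `fit_ar1`, `forecast_ar1` and the synthetic series are hypothetical.

```python
def fit_ar1(series):
    """Estimate intercept c and coefficient phi of y_t = c + phi * y_(t-1)
    by ordinary least squares on consecutive pairs of the series."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    phi = cov_xy / var_x
    intercept = mean_y - phi * mean_x
    return intercept, phi


def forecast_ar1(last_value, intercept, phi, steps):
    """Roll the fitted equation forward, feeding each forecast back in."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = intercept + phi * value
        forecasts.append(value)
    return forecasts


# Synthetic, noise-free series generated with c = 2.0 and phi = 0.5.
series = [0.0]
for _ in range(10):
    series.append(2.0 + 0.5 * series[-1])

intercept, phi = fit_ar1(series)
next_three = forecast_ar1(series[-1], intercept, phi, steps=3)
```

ARIMA adds differencing and moving-average terms to this idea, and SARIMA adds seasonal counterparts of those terms.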

7. 📊 Model Evaluation

Mean Absolute Error (MAE)
Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
Mean Squared Logarithmic Error (MSLE)
Root Mean Squared Logarithmic Error (RMSLE)
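The five metrics above differ only in how each error is transformed before averaging. The sketch below implements them with the standard library so the relationships are easy to inspect; the function names and sample values are illustrative, and the logarithmic metrics assume non-negative values, as is the case for sales.

```python
import math


def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean of the squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the MSE, expressed in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def msle(y_true, y_pred):
    """Mean squared error computed on log(1 + value), which penalizes
    relative rather than absolute differences."""
    return sum(
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ) / len(y_true)


def rmsle(y_true, y_pred):
    """Square root of the MSLE."""
    return math.sqrt(msle(y_true, y_pred))


# Illustrative values with errors larger than 1 ...
y_true, y_pred = [3.0, 5.0, 2.0], [2.0, 5.0, 4.0]
# ... and with errors smaller than 1.
small_true, small_pred = [0.5, 0.2], [0.4, 0.4]
```

On the same data, RMSE is never smaller than MAE. MSE, however, can be smaller than MAE when the errors are below 1 in magnitude, because squaring shrinks them. This is one way the MAE can end up larger than several of the squared metrics, for example when the target has been scaled to a small range.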

8. 🎯 Hyperparameter Tuning

RandomizedSearchCV
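Randomized search samples a fixed number of parameter combinations instead of trying every one. The sketch below shows only that core loop, under simplifying assumptions: the objective is a made-up toy function standing in for a validation error, the parameter names and values are hypothetical, and there is no cross-validation, which scikit-learn's `RandomizedSearchCV` performs in addition.

```python
import random


def random_search(objective, param_space, n_iter=20, seed=0):
    """Sample n_iter random parameter combinations and keep the one
    with the lowest objective value."""
    rng = random.Random(seed)  # seeded for reproducible sampling
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        params = {name: rng.choice(options) for name, options in param_space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Hypothetical stand-in for a validation error; lower is better."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["depth"] - 6) ** 2


# Hypothetical search space.
space = {
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "depth": [4, 6, 8, 10],
}

best_params, best_score = random_search(toy_objective, space, n_iter=30, seed=42)
```

In a real run, the objective would train a model with the sampled parameters and return its error on held-out data.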

9. 🤔 Prediction on Validation Set

10. 📒 Prediction on Test Dataset

11. 💭 Exportation

os
pickle
save_model
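A minimal sketch of the export step with `os` and `pickle` follows. The helper names, the file name and the exported dictionary are hypothetical, and a temporary directory is used here so the example does not write into the repository.

```python
import os
import pickle
import tempfile


def export_object(obj, directory, filename):
    """Pickle obj into directory/filename, creating the directory if needed."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename)
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return path


def load_object(path):
    """Load a pickled object back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical preprocessing artifacts to ship alongside a trained model.
artifacts = {"scaler_min": 10.0, "scaler_span": 30.0, "categories": ["A", "B", "C"]}

export_dir = os.path.join(tempfile.mkdtemp(), "export")
saved_path = export_object(artifacts, export_dir, "artifacts.pkl")
restored = load_object(saved_path)
```

The `save_model` entry presumably refers to the method of that name that CatBoost and XGBoost models provide for writing a trained model to a file. Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.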

Installation and Setup 🔧

To get started with this project, you'll need to install the following Python packages using pip:

pip install pyodbc sqlalchemy lightgbm catboost python-dotenv pandas numpy matplotlib seaborn scipy pmdarima

Make sure to have these packages installed before running the project. Follow these steps for installation:

Clone this repository to your local machine.
Install the required Python packages using pip:
```
pip install -r requirements.txt
```

You're now ready to dive into this exciting data journey!

Author 👨‍💼

This project was built by Chidiebere David Ogbonna.

Here are his details:

| Detail | Link |
| --- | --- |
| Email | eberedavid326@gmail.com |
| LinkedIn | chidieberedavidogbonna |
| GitHub | iameberedavid |
| Medium | eberedavid |
| Twitter | iameberedavid |

Article Publication 📄

The workflow of this project is explained in more detail in an article published on Medium: TIME SERIES FORECASTING ANALYSIS FOR CORPORATION FAVORITA.

App Deployment 📲

The model was embedded into a Streamlit app and deployed for public use. The app lives in this GitHub repository: Embed-Corporation-Favorita-Timeseries-Model-To-Streamlit. The deployment process is discussed in this Medium article: BUILDING A USER-FRIENDLY SALES PREDICTION APP USING STREAMLIT AND HUGGINGFACE.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments 🙏

I would like to express my gratitude to the Azubi Africa Data Analyst Program for offering valuable projects as part of this program. I also thank my scrum masters, Rachel Appiah-Kubi and Emmanuel Koupoh, for their support throughout the program.

Contact

Feel free to send your reviews, suggestions, questions and collaboration requests to eberedavid326@gmail.com
