Rossmann Forecast & Promo Impact

End-to-end retail forecasting + promotion uplift with Python (LightGBM & Prophet) and Tableau

Live Dashboard
https://public.tableau.com/app/profile/emre.pelzer/viz/RossmannPerformanceForecastDashboard/RossmannForecastPromoImpact

Contact
Emre Pelzer · emrepel03@gmail.com · LinkedIn · GitHub · Portfolio

Highlights

Forecasting: Store-level daily sales using LightGBM (primary) and Prophet (baseline).
Accuracy: RMSPE 0.1315 (LightGBM) vs 0.1443 (Prophet) on the test window 2015‑06‑20 → 2015‑07‑31.
Promotion “what‑if” simulator: Counterfactual baseline vs promo at daily, store level.
Estimated impact: ~10.29% weighted uplift across the simulated window (top stores much higher).
Estimated annualized revenue gain: Model-driven promo allocation could yield ~€15.02M/year in additional sales over baseline, extrapolated from the test-window uplift (see "notebooks/bussiness_impact.ipynb").
Business outputs: Tableau dashboards (Forecast Viewer, Promo Impact, Model comparison) plus weekly promo action tables.
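
One way to read the "weighted uplift" figure above is total simulated lift divided by total baseline sales, so that high-volume stores count proportionally more than small ones. The sketch below assumes that definition; the exact weighting used in scripts/promo_simulator.py is not shown here, the function name is hypothetical, and the numbers are made up for illustration.

```python
def weighted_uplift(rows):
    """Return total (promo - baseline) sales divided by total baseline sales.

    rows: iterable of (baseline_sales, promo_sales) pairs, one per store-day.
    """
    total_baseline = 0.0
    total_lift = 0.0
    for baseline, promo in rows:
        total_baseline += baseline
        total_lift += promo - baseline
    if total_baseline == 0:
        raise ValueError("baseline sales sum to zero; uplift is undefined")
    return total_lift / total_baseline


# Illustrative store-day predictions (hypothetical values, not project results).
example_rows = [(5000.0, 5600.0), (12000.0, 13000.0), (3000.0, 3300.0)]
uplift = weighted_uplift(example_rows)
print(f"{uplift:.2%}")
```

Weighting by baseline sales keeps a large percentage lift at a tiny store from dominating the headline number.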

Color theme: Rossmann Red #E6001A (Actual) and Teal #007F7F (Predicted).

Screenshots

Repository Structure

reports/
  figures/                      # core PNGs for README & Tableau
  metrics/                      # metrics CSVs and Optuna trials
  portfolio/                    # action tables (promo, demand, competition risk)
  predictions/                  # lgbm_test_predictions.csv (test window preds)
  promo_sim/                    # what-if outputs (lift_by_store_day, summaries)
  tableau/                      # tidy CSVs for Tableau + workbook(s)

scripts/
  modeling.py                   # trains LightGBM with Optuna
  compare_models.py             # LGBM vs Prophet evaluation + plots
  evaluate_by_store.py          # per-store error breakdown & figs
  predict.py                    # batch prediction
  promo_simulator.py            # counterfactual promo uplift

Reproduce Locally

1. Environment

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2. Train & Evaluate

python scripts/modeling.py --wide_path data/processed/rossmann_features_wide.csv --n_trials 200 --test_days 42 --val_days 42
python scripts/compare_models.py
python scripts/evaluate_by_store.py --preds_path reports/predictions/lgbm_test_predictions.csv --wide_path data/processed/rossmann_features_wide.csv
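
The --val_days and --test_days flags presumably carve the last days of the series into two back-to-back holdout windows. The sketch below shows that arithmetic under the assumption that the data ends on 2015-07-31 (the end of the test window reported in Highlights); split_windows is a hypothetical helper, not a function from scripts/modeling.py.

```python
from datetime import date, timedelta


def split_windows(last_date, val_days, test_days):
    """Return the last training date plus inclusive validation and test windows.

    The test window is the final `test_days` days, the validation window is the
    `val_days` days immediately before it, and training ends the day before.
    """
    test_start = last_date - timedelta(days=test_days - 1)
    val_end = test_start - timedelta(days=1)
    val_start = val_end - timedelta(days=val_days - 1)
    train_end = val_start - timedelta(days=1)
    return {
        "train_end": train_end,
        "val": (val_start, val_end),
        "test": (test_start, last_date),
    }


windows = split_windows(date(2015, 7, 31), val_days=42, test_days=42)
print(windows["val"])
print(windows["test"])
```

With 42 test days the test window starts on 2015-06-20, which matches the window quoted in Highlights.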

3. Promo Simulator (what‑if)

python scripts/promo_simulator.py --wide_path data/processed/rossmann_features_wide.csv --model_path models/lightgbm_sales.pkl --start 2015-06-20 --end 2015-07-31 --stores_top_k 25 --out_dir reports/promo_sim

This produces:

reports/promo_sim/lift_by_store_day.csv
reports/promo_sim/lift_summary_by_store.csv
charts in reports/promo_sim/
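
The core idea behind a counterfactual promo simulation can be sketched as scoring every store-day twice, once with the promo flag forced off and once forced on, while holding every other feature fixed. The code below is an illustration only: toy_model stands in for the trained LightGBM model, its coefficients are invented, and the column names ("Promo", "dow") are assumptions based on the Kaggle schema and the feature list, not confirmed names from scripts/promo_simulator.py.

```python
def simulate_promo_lift(rows, predict):
    """Score each store-day twice, with Promo forced to 0 and to 1.

    rows: list of feature dicts containing at least "Store", "Date", "Promo".
    predict: callable mapping one feature dict to predicted sales.
    """
    results = []
    for row in rows:
        baseline = predict({**row, "Promo": 0})  # counterfactual: promo off
        promo = predict({**row, "Promo": 1})     # counterfactual: promo on
        results.append({
            "Store": row["Store"],
            "Date": row["Date"],
            "baseline": baseline,
            "promo": promo,
            "abs_lift": promo - baseline,
            "pct_lift": (promo - baseline) / baseline if baseline else None,
        })
    return results


def toy_model(features):
    """Stand-in for the trained model (hypothetical coefficients)."""
    return 4000.0 + 500.0 * features["Promo"] + 100.0 * features["dow"]


rows = [
    {"Store": 1, "Date": "2015-06-20", "Promo": 0, "dow": 6},
    {"Store": 1, "Date": "2015-06-22", "Promo": 1, "dow": 1},
]
lift = simulate_promo_lift(rows, toy_model)
```

Because lag and rolling features are held fixed in this sketch, it ignores any carry-over effect a promo might have on later days.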

Tableau – How to Open

In Tableau's Data Model, connect the tidy sources below using relationships rather than physical joins:

From reports/tableau/

test_predictions.csv (grain: Store, Date)
promo_lift_by_store_day.csv (relate on Store + Date)
by_store_compare.csv (relate on Store)
overall_summary.csv (use on Overview only)

From reports/portfolio/

promo_actions.csv (relate on Store + Week Start)
high_demand_weeks.csv (relate on Store + Week Start)
competition_hotspots.csv (relate on Store)

Two pages in the public workbook:

Forecast & Promo Impact – KPI tiles, Forecast Viewer (Actual vs Pred), Promo Top Stores, Model accuracy.
Profit & Runners‑Up – Profit uplift scatter (assumes margin parameter) and per‑store uplift runners‑up.

Modeling Notes (brief)

Features (50): calendar (dow, week, month), promo flags, holidays, lag/rolling stats, competition distance/recency, etc.
Tuning: Optuna over learning rate, number of leaves, max depth, subsampling, and regularization; pruning with early stopping; objective is RMSPE on the validation window.
Metric: RMSPE (scale-robust across stores) + RMSE & MAPE for stakeholder clarity.
Validation: Walk‑forward splits (val/test 6 weeks each), strict anti‑leakage on lags/rolls.
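
For reference, RMSPE is the square root of the mean squared relative error. The sketch below follows the Kaggle competition convention of ignoring rows with zero actual sales; whether the project scripts apply exactly the same filter is an assumption.

```python
import math


def rmspe(actual, predicted):
    """Root mean squared percentage error, skipping rows where actual == 0."""
    squared_pct_errors = [
        ((a - p) / a) ** 2
        for a, p in zip(actual, predicted)
        if a != 0
    ]
    if not squared_pct_errors:
        raise ValueError("no non-zero actuals to score")
    return math.sqrt(sum(squared_pct_errors) / len(squared_pct_errors))


# Toy values: two 10% misses, one zero-sales day (skipped), one exact hit.
score = rmspe([100.0, 200.0, 0.0, 400.0], [110.0, 180.0, 5.0, 400.0])
```

Since each error is divided by the actual value, a 10% miss counts the same at a small store as at a large one, which is what makes the metric comparable across stores.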

Business Outputs

Promo Actions: Weekly recommendations by store (expected absolute lift, avg % lift, priority) → reports/portfolio/promo_actions.csv.
High‑Demand Weeks: Elevated demand flags per store/week → reports/portfolio/high_demand_weeks.csv.
Competition Hotspots: Risk score blending proximity + recency → reports/portfolio/competition_hotspots.csv.
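
A weekly action table of this kind can be derived by rolling daily simulated lift up to one row per store and week. The sketch below is one possible approach, not the project's actual logic: it assumes weeks start on Monday, reports a sales-weighted weekly % lift (rather than a mean of daily percentages), and uses rank by absolute lift as a simple stand-in for priority. All field names other than Store and Week Start are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_promo_actions(daily_lift):
    """Aggregate daily lift rows to one row per (store, week start).

    daily_lift: list of dicts with "Store", "Date" (datetime.date),
    "baseline", and "promo" predicted sales.
    """
    totals = defaultdict(lambda: {"baseline": 0.0, "promo": 0.0})
    for row in daily_lift:
        # Week start = Monday of the week containing the date (assumption).
        week_start = row["Date"] - timedelta(days=row["Date"].weekday())
        key = (row["Store"], week_start)
        totals[key]["baseline"] += row["baseline"]
        totals[key]["promo"] += row["promo"]

    actions = []
    for (store, week_start), t in totals.items():
        abs_lift = t["promo"] - t["baseline"]
        actions.append({
            "Store": store,
            "Week Start": week_start,
            "expected_abs_lift": abs_lift,
            "pct_lift": abs_lift / t["baseline"] if t["baseline"] else None,
        })
    # Rank by expected absolute lift, largest first, as a simple priority.
    actions.sort(key=lambda a: a["expected_abs_lift"], reverse=True)
    for rank, action in enumerate(actions, start=1):
        action["priority"] = rank
    return actions


# Hypothetical daily rows for two stores in the week of 2015-06-22.
daily = [
    {"Store": 1, "Date": date(2015, 6, 22), "baseline": 4000.0, "promo": 4500.0},
    {"Store": 1, "Date": date(2015, 6, 23), "baseline": 4000.0, "promo": 4300.0},
    {"Store": 2, "Date": date(2015, 6, 24), "baseline": 10000.0, "promo": 11000.0},
]
actions = weekly_promo_actions(daily)
```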

Data

Source: Kaggle Rossmann Store Sales (public).
Processed feature table: data/processed/rossmann_features_wide.csv (not committed due to size).
Predictions, metrics, figures, and Tableau‑ready CSVs live under reports/.

What’s Next

Add weekly seasonality features per store type; test CatBoost/XGBoost ablations.
Parameterize promo simulator by budget and margin to surface profit‑optimal allocations.
Optional: schedule retraining with GitHub Actions + publish fresh extracts for Tableau Public.

License

MIT (see LICENSE).
If you reuse the Kaggle dataset, please comply with its license/terms.
