The goal of this project is to develop a simple, interpretable forecast model for airline bookings that minimizes the Mean Absolute Scaled Error (MASE) compared to a naïve baseline. We focus on analytics and business logic, not machine learning, to keep the approach transparent and practical.
We use two datasets:
- Training (sample): Cumulative bookings by departure date, with each departure tracked over the 60 days before it departs.
- Validation: Similar structure, used to test our forecasting logic.
- We start by checking for missing values and understanding the distribution of bookings.
- Visualizations show how bookings vary by day of week and by days prior to departure.
- We extract features like `day_of_week` and `DaysToDepart` (as a numeric value).
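- For each departure, we calculate the final demand (cumulative bookings on the day of departure), the remaining demand (final demand minus cumulative bookings to date), and the booking rate (cumulative bookings to date as a fraction of final demand).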
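The feature step above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names `departure_date` and `cum_bookings`, the helper `add_demand_features`, and the toy numbers are all hypothetical, and it assumes `day_of_week` refers to the departure date and that each departure has exactly one row with `DaysToDepart == 0`.

```python
import pandas as pd

# Toy data: two departures observed 2, 1 and 0 days before departure.
# `departure_date` and `cum_bookings` are hypothetical column names.
df = pd.DataFrame({
    "departure_date": pd.to_datetime(["2024-01-01"] * 3 + ["2024-01-02"] * 3),
    "DaysToDepart": [2, 1, 0, 2, 1, 0],
    "cum_bookings": [50, 80, 100, 30, 45, 60],
})

def add_demand_features(df):
    """Add day_of_week, final_demand, remaining_demand and booking_rate."""
    out = df.copy()
    # Assumption: day of week is taken from the departure date.
    out["day_of_week"] = out["departure_date"].dt.day_name()
    # Final demand: cumulative bookings on the day of departure.
    final = (
        out.loc[out["DaysToDepart"] == 0]
        .set_index("departure_date")["cum_bookings"]
    )
    out["final_demand"] = out["departure_date"].map(final)
    # Remaining demand: bookings still to come before departure.
    out["remaining_demand"] = out["final_demand"] - out["cum_bookings"]
    # Booking rate: share of final demand already booked.
    out["booking_rate"] = out["cum_bookings"] / out["final_demand"]
    return out

features = add_demand_features(df)
print(features)
```

On the departure day itself the remaining demand is 0 and the booking rate is 1, which is a quick sanity check on real data.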
- Additive Model: Forecast = cumulative bookings + mean remaining demand (from training data).
- Multiplicative Model: Forecast = cumulative bookings / mean booking rate (from training data).
- Both models are run in two ways: by days prior only, and by days prior + day of week.
- Weighted Average: We also try a weighted average of the two models, tuning the weight to minimize MASE.
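As a rough sketch of how the two models and their blend might be computed, the example below averages remaining demand and booking rate per `DaysToDepart` on a toy training set, then applies both formulas to a toy validation set. The column names `cum_bookings`, `final_demand`, `mean_remaining` and `mean_rate`, the fixed weight `w`, and all numbers are assumptions made for illustration.

```python
import pandas as pd

# Toy training data (hypothetical column names and values).
train = pd.DataFrame({
    "DaysToDepart": [2, 1, 2, 1],
    "cum_bookings": [50, 80, 30, 45],
    "final_demand": [100, 100, 60, 60],
})
train["remaining_demand"] = train["final_demand"] - train["cum_bookings"]
train["booking_rate"] = train["cum_bookings"] / train["final_demand"]

# Grouping keys: days prior only. For the second variant, add "day_of_week"
# here, provided both frames carry that column.
keys = ["DaysToDepart"]

# Mean remaining demand and mean booking rate per group, from training data.
averages = (
    train.groupby(keys)[["remaining_demand", "booking_rate"]]
    .mean()
    .rename(columns={"remaining_demand": "mean_remaining",
                     "booking_rate": "mean_rate"})
    .reset_index()
)

# Toy validation data: bookings so far, final demand unknown.
valid = pd.DataFrame({"DaysToDepart": [2, 1], "cum_bookings": [40, 70]})
valid = valid.merge(averages, on=keys, how="left")

# Additive model: bookings so far plus the average remaining demand.
valid["additive"] = valid["cum_bookings"] + valid["mean_remaining"]
# Multiplicative model: bookings so far divided by the average booking rate.
valid["multiplicative"] = valid["cum_bookings"] / valid["mean_rate"]

# Weighted average of the two; w = 0.5 is a placeholder, not a tuned value.
w = 0.5
valid["blend"] = w * valid["additive"] + (1 - w) * valid["multiplicative"]
print(valid)
```

A left merge keeps every validation row, so a group that never appears in training shows up as a missing forecast rather than being silently dropped.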
- We use MASE as our main metric, comparing each model's forecast to the naïve forecast provided in the validation set.
- The notebook prints out MASE scores for all approaches and highlights the best result.
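One way to express the metric and the weight search is sketched below. It assumes MASE is the model's total absolute error divided by the total absolute error of the supplied naïve forecast, and that the blend weight is picked from a simple grid; the notebook may define or tune these differently. The function names `mase` and `best_weight` and the sample numbers are hypothetical.

```python
import numpy as np

def mase(actual, forecast, naive):
    """Total absolute error of the forecast, scaled by that of the naive forecast."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    naive = np.asarray(naive, dtype=float)
    return float(np.abs(actual - forecast).sum() / np.abs(actual - naive).sum())

def best_weight(actual, additive, multiplicative, naive, weights=None):
    """Grid-search the additive weight w that minimizes MASE of the blend."""
    if weights is None:
        weights = np.linspace(0.0, 1.0, 11)  # 0.0, 0.1, ..., 1.0
    additive = np.asarray(additive, dtype=float)
    multiplicative = np.asarray(multiplicative, dtype=float)
    scores = [
        mase(actual, w * additive + (1 - w) * multiplicative, naive)
        for w in weights
    ]
    i = int(np.argmin(scores))
    return float(weights[i]), scores[i]

# Made-up numbers, chosen so the two models err in opposite directions.
actual = [100, 60, 90]
naive = [90, 70, 80]
additive = [96, 64, 92]
multiplicative = [104, 56, 88]

print("additive MASE:", mase(actual, additive, naive))
print("multiplicative MASE:", mase(actual, multiplicative, naive))
print("best (weight, MASE):", best_weight(actual, additive, multiplicative, naive))
```

A score below 1 means the model beats the naïve forecast. Tuning the weight on the same data used to report the score tends to flatter the blend, so a held-out split is worth considering.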
- The notebook includes plots for EDA and for comparing actual vs. forecasted demand for a sample departure date.
- Make sure you have Python 3 installed.
- Install all dependencies with:
pip install -r requirements.txt
- Place `airline_data_training.csv` and `airline_data_validation.csv` in the project directory.
- Open `ForecastFlightBookings.ipynb` in Jupyter Notebook or VS Code.
- Run the notebook cells in order. You'll see EDA, model results, and visualizations.
- This project is intentionally simple and transparent: it uses no machine learning, only analytics and business logic.
- The code is well-commented and organized for clarity.
- If you want to extend the analysis, you can add more features, try other error metrics, or experiment with different weighting schemes.