This repository contains two machine learning projects:
- Flight Price Prediction (Regression)
- Customer Satisfaction Prediction (Classification)
Both projects focus on building machine learning models using Python, Streamlit, and MLflow for tracking experiments. The models are deployed in interactive web apps that allow users to input relevant information and get real-time predictions.
This project aims to predict flight ticket prices based on factors like departure time, source, destination, and airline type. The goal is to develop a regression model and deploy it as a Streamlit app that allows users to input filters and get a predicted flight price.
- Cleaned the dataset by removing duplicates and handling missing values.
- Converted date and time columns into standard formats.
- Engineered new features such as flight duration and price per minute.
- Performed Exploratory Data Analysis (EDA) to identify trends and correlations.
- Trained regression models like Linear Regression, Random Forest, and Gradient Boosting.
- The tuned Gradient Boosting Regressor achieved:
  - RMSE: 1809.13
  - R2: 0.8454
- Logged experiments, hyperparameters, and metrics (e.g., RMSE, R2).
- Tracked all models in MLflow's model registry for easy access and comparison.
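The derived features and the regression metrics listed above can be illustrated with a minimal, standard-library sketch. It is not the project's actual code: the function names are hypothetical, and the duration format (`"2h 50m"`) is an assumption about how the raw column is written.

```python
import math


def duration_to_minutes(text):
    """Convert a duration string such as '2h 50m' to total minutes.

    The 'Xh Ym' format is an assumption, not confirmed by the dataset.
    """
    minutes = 0
    for part in text.split():
        if part.endswith("h"):
            minutes += int(part[:-1]) * 60
        elif part.endswith("m"):
            minutes += int(part[:-1])
    return minutes


def price_per_minute(price, duration_text):
    """Derived feature: ticket price divided by flight duration in minutes."""
    minutes = duration_to_minutes(duration_text)
    # Return NaN instead of raising when the duration is zero or unparsed.
    return price / minutes if minutes else float("nan")


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted prices."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the true values are not all identical (SS_tot > 0).
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    print(duration_to_minutes("2h 50m"))     # 170
    print(price_per_minute(3400, "2h 50m"))  # 20.0
```

In practice these metrics would more likely come from a library such as scikit-learn; the hand-written versions are shown only to make the definitions explicit.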
The Streamlit app allows users to:
- Filter flights by route, time, and airline.
- Get a real-time prediction of flight prices.
This project focuses on predicting customer satisfaction levels based on features such as demographics, flight services, and feedback ratings. The goal is to build a classification model and deploy it as a Streamlit app that predicts whether a customer is satisfied or dissatisfied.
- Cleaned the dataset by handling missing values and duplicates.
- Encoded categorical variables and standardized numerical features.
- Performed EDA to understand feature relationships and trends.
- Trained classification models like Logistic Regression, Random Forest, and Gradient Boosting.
- The best Random Forest model achieved:
  - Accuracy: 0.9626
  - F1-Score: 0.9564
- Logged experiments, metrics (e.g., accuracy, F1-score), and confusion matrices.
- Tracked all models in MLflow for versioning and performance tracking.
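As a rough illustration of how the accuracy, F1-score, and confusion matrix mentioned above relate to each other, here is a minimal standard-library sketch. The function names are hypothetical, and treating "satisfied" as the positive class is an assumption; the project itself may compute these differently.

```python
def confusion_counts(y_true, y_pred, positive="satisfied"):
    """Return (tp, fp, fn, tn) for a binary classification task.

    Treating 'satisfied' as the positive class is an assumption.
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    # Made-up labels for illustration only, not project data.
    y_true = ["satisfied", "satisfied", "dissatisfied", "dissatisfied", "satisfied"]
    y_pred = ["satisfied", "dissatisfied", "dissatisfied", "satisfied", "satisfied"]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    print("confusion counts (tp, fp, fn, tn):", (tp, fp, fn, tn))
    print("accuracy:", accuracy(tp, fp, fn, tn))
    print("F1:", f1_score(tp, fp, fn))
```

Reporting F1 alongside accuracy is useful when the two classes are imbalanced, since accuracy alone can look high even if one class is predicted poorly.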
The Streamlit app allows users to:
- Input customer demographics, travel preferences, and service ratings.
- Get a prediction of customer satisfaction levels.
- Python: Data cleaning, feature engineering, and machine learning implementation.
- Streamlit: Developed interactive web applications for real-time predictions.
- MLflow: Tracked and logged model performance, hyperparameters, and artifacts.
- Machine Learning: Regression models for flight prices, classification models for customer satisfaction.
The flight price dataset includes:
- Airline
- Date of Journey
- Source and Destination
- Route
- Departure and Arrival Time
- Duration
- Number of Stops
- Additional Information
The customer satisfaction dataset includes:
- Gender, Age, and Customer Type
- Flight Distance
- Service Ratings (e.g., Inflight Wi-Fi, Seat Comfort)
- Delay Information
- Overall Satisfaction
- Clone the repository.
- Install the required dependencies:
pip install -r requirements.txt
- Run the Streamlit app:
streamlit run app.py
- Improved UI: Add more visual elements and filters to enhance user experience.
- Model Optimization: Experiment with ensemble methods to further improve prediction accuracy.
- Deployment: Explore deploying the app on cloud platforms for wider accessibility.