This project predicts wine quality from chemical attributes using machine learning models. The dataset's features include acidity, alcohol content, pH, and sulfur dioxide levels, and the target is the wine quality score (ranging from 1 to 10).
Data Preprocessing
- Data cleaning and feature engineering were applied, including handling missing values and transforming variables with Box-Cox and logarithmic transformations.
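The preprocessing steps above can be sketched as follows. This is a minimal illustration rather than the project's actual pipeline: the data is synthetic, the two columns are hypothetical stand-ins for skewed features, and median imputation is an assumed choice for handling missing values.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)

# Synthetic stand-in for two strictly positive features
# (hypothetical columns, not the project's real data).
X = rng.lognormal(mean=1.0, sigma=0.8, size=(200, 2))
X[rng.integers(0, 200, size=10), 0] = np.nan  # inject some missing values

# 1. Fill missing values with the column median (assumed strategy).
X_imputed = SimpleImputer(strategy="median").fit_transform(X)

# 2a. Logarithmic transform; log1p computes log(1 + x), so it is safe at x = 0.
X_log = np.log1p(X_imputed)

# 2b. Box-Cox transform; it requires strictly positive inputs.
boxcox = PowerTransformer(method="box-cox", standardize=False)
X_boxcox = boxcox.fit_transform(X_imputed)

print("fitted Box-Cox lambdas:", boxcox.lambdas_)
```

Box-Cox only accepts strictly positive values, so a feature that can be zero would need a shift or the logarithmic variant instead.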
Feature Selection
- Relevant features were selected using Recursive Feature Elimination (RFE) to enhance model performance and reduce overfitting.
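A small sketch of Recursive Feature Elimination with scikit-learn is shown here for orientation. The base estimator (Linear Regression), the number of features to keep (5), and the synthetic data are all assumptions made for the example; the project's actual settings are not stated.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic regression data with 11 features (hypothetical stand-in
# for the chemical attributes), of which 5 carry signal.
X, y = make_regression(
    n_samples=300, n_features=11, n_informative=5, noise=10.0, random_state=0
)

# RFE repeatedly fits the estimator and drops the weakest feature
# (step=1) until n_features_to_select remain.
selector = RFE(estimator=LinearRegression(), n_features_to_select=5, step=1)
selector.fit(X, y)

selected = selector.support_        # boolean mask of the kept features
X_selected = selector.transform(X)  # data restricted to the kept features

print("kept feature indices:", [i for i, keep in enumerate(selected) if keep])
print("elimination ranking (1 = kept):", selector.ranking_)
```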
Modeling
- Several machine learning models were trained: Linear Regression, Random Forest Regressor, Random Forest Classifier, and LightGBM Classifier.
- Hyperparameters were tuned using K-Fold cross-validation to improve each model's generalization.
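One way to combine hyperparameter tuning with K-Fold cross-validation is a grid search, sketched below for a Random Forest Classifier. The use of `GridSearchCV`, the parameter grid, the five folds, and the synthetic data are assumptions for illustration; the source does not say which search strategy or values were used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic 3-class data with 11 features (hypothetical stand-in
# for quality classes and chemical attributes).
X, y = make_classification(
    n_samples=300, n_features=11, n_informative=6, n_classes=3, random_state=0
)

# Hypothetical grid; real values would depend on the dataset.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, None]}

# 5-fold cross-validation with shuffling for reproducible splits.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=cv,
    scoring="accuracy",
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("mean cross-validated accuracy:", search.best_score_)
```

The same pattern applies to the regressors by swapping the estimator and using a regression metric such as `scoring="neg_root_mean_squared_error"`.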
Evaluation
- Models were evaluated using metrics such as Root Mean Squared Error (RMSE), R-squared (R²), and classification accuracy, depending on whether regression or classification was performed.
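The metrics named above can be computed with scikit-learn as in this short sketch. The true scores and both sets of predictions are made-up values chosen only to show the calls; they are not results from this project.

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score

# Hypothetical true quality scores and model outputs.
y_true = np.array([5, 6, 5, 7, 6, 4])
y_pred_reg = np.array([5.2, 5.8, 5.4, 6.4, 6.1, 4.6])  # regressor output
y_pred_cls = np.array([5, 6, 5, 6, 6, 5])              # classifier output

# Regression metrics: RMSE is the square root of the mean squared error.
rmse = np.sqrt(mean_squared_error(y_true, y_pred_reg))
r2 = r2_score(y_true, y_pred_reg)

# Classification metric: fraction of exactly matching labels.
accuracy = accuracy_score(y_true, y_pred_cls)

print(f"RMSE: {rmse:.3f}, R²: {r2:.3f}, accuracy: {accuracy:.3f}")
```

RMSE and R² apply to the regressors, while accuracy applies to the classifiers, so the two groups of models are not directly comparable on a single metric.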
Results
- The best-performing model was selected based on the evaluation metrics; it predicts wine quality from the input chemical features.
Requirements
- pandas
- seaborn
- matplotlib
- numpy
- scikit-learn
- Clone this repository:
git clone https://github.com/your-username/wine-quality-prediction.git