This project implements a Random Forest Classifier to solve a multi-class classification problem: predicting a student's stress level (0: Low, 1: Medium, or 2: High) based on 20 features. These features span psychological, academic, and environmental factors crucial to student well-being.
The project demonstrates a complete machine learning workflow: meticulous data cleaning, robust outlier treatment, model training, and rigorous evaluation.
Dataset link: https://www.kaggle.com/code/mdsultanulislamovi/student-stress-factors-dataset-analysis
The model is trained on the StressLevelDataset.csv (1100 records). The target variable, stress_level, is well-balanced across its three classes, which is key for reliable model training.
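As a quick illustration of loading the data and checking the class balance described above, here is a minimal sketch (the file path and the exact commands are assumptions, not the notebook's verbatim code):

```python
import pandas as pd

# Load the 1100-record dataset and inspect the target distribution.
df = pd.read_csv("StressLevelDataset.csv")
print(df.shape)                            # expected: (1100, 21) -- 20 features + stress_level
print(df["stress_level"].value_counts())   # counts should be roughly even across classes 0, 1, 2
```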
| Feature Type | Key Features |
|---|---|
| Psychological/Health | `anxiety_level`, `depression`, `self_esteem`, `sleep_quality` |
| Academic/Career | `study_load`, `academic_performance`, `future_career_concerns` |
| Environmental/Social | `noise_level`, `social_support`, `bullying` |
The pipeline was executed in Python using Scikit-learn, with attention to data quality and model fairness (a code sketch of the main steps follows this list):

- **Data Quality Check:** Confirmed no missing values (`Non-Null Count = 1100`) and uniform `int64` data types, so the dataset was immediately ready for numerical preprocessing.
- **Outlier Treatment:** Applied the Interquartile Range (IQR) method to cap outliers in `noise_level`, `living_conditions`, and `study_load`. This step keeps extreme values from skewing the Random Forest's splits.
- **Model Training (Random Forest):**
  - `n_estimators=500` for high predictive stability.
  - `class_weight='balanced'` to adjust for any slight class imbalance, so the model treats all three stress levels fairly.
- **Split:** 80/20 train-test split (`random_state=42`) for reproducible results.
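Below is a minimal sketch of the preprocessing and training steps listed above, assuming the variable names shown here (the actual notebook code may differ in details):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("StressLevelDataset.csv")

# Cap outliers with the 1.5 * IQR rule in the three affected columns.
for col in ["noise_level", "living_conditions", "study_load"]:
    q1, q3 = df[col].quantile([0.25, 0.75])
    iqr = q3 - q1
    df[col] = df[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# 80/20 train-test split with a fixed seed for reproducibility.
X = df.drop(columns="stress_level")
y = df["stress_level"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 500 trees with balanced class weights, as described above.
model = RandomForestClassifier(n_estimators=500, class_weight="balanced")
model.fit(X_train, y_train)
```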
The Random Forest Classifier achieved excellent performance on the test set:
| Metric | Score |
|---|---|
| Overall Accuracy | 0.87 (87%) |

The balanced F1-scores of 0.86 to 0.88 across the three classes show that the model predicts Low, Medium, and High stress with similar reliability:
```
              precision    recall  f1-score   support

           0       0.87      0.86      0.86        76
           1       0.90      0.86      0.88        73
           2       0.84      0.89      0.86        71

    accuracy                           0.87       220
   macro avg       0.87      0.87      0.87       220
weighted avg       0.87      0.87      0.87       220
```
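For reference, a report like the one above can be produced with Scikit-learn's standard metrics; this short sketch assumes the `model`, `X_test`, and `y_test` objects from the training sketch earlier:

```python
from sklearn.metrics import accuracy_score, classification_report

# Score the held-out 20% and print per-class precision, recall, and F1.
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
```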
The project was built with:

- Python
- Jupyter Notebook / Google Colab
- Pandas & NumPy
- Scikit-learn
- Matplotlib & Seaborn