- Rina Kushwaha
- Varsha Dubey
- Aman Khan
- Akash Kumar Verma
[STEP 1] – Problem Definition
Define the goal: predict whether a person is likely to face mental health issues using survey data.
[STEP 2] – Data Collection
Obtain the dataset (e.g., from Kaggle – “Mental Health in Tech Survey”).
[STEP 3] – Data Preprocessing
Handle missing values
Encode categorical variables (Label Encoding / One-Hot Encoding)
Scale numerical data (e.g., StandardScaler)
Remove outliers if necessary
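
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy DataFrame, its column names (`age`, `gender`, `family_history`) and the 18 to 75 age range used as an outlier rule are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the survey; column names are assumptions.
df = pd.DataFrame({
    "age": [25, 31, np.nan, 42, 29, 120],
    "gender": ["F", "M", "M", None, "F", "M"],
    "family_history": ["yes", "no", "no", "yes", "no", "yes"],
})

# Handle missing values: median for numeric data, most frequent value for categorical data.
df["age"] = df["age"].fillna(df["age"].median())
df["gender"] = df["gender"].fillna(df["gender"].mode()[0])

# Remove outliers: keep ages in a plausible working range (the range is an assumption).
df = df[df["age"].between(18, 75)].reset_index(drop=True)

# Encode categorical variables with one-hot encoding.
encoded = pd.get_dummies(df, columns=["gender", "family_history"], dtype=int)

# Scale the numeric column to zero mean and unit variance.
encoded["age"] = StandardScaler().fit_transform(encoded[["age"]]).ravel()

print(encoded)
```

In a real pipeline the scaler should be fitted on the training split only, so that statistics from the test data do not leak into training.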
[STEP 4] – Exploratory Data Analysis (EDA)
Analyze distributions and relationships between features
Use visualizations (histograms, heatmaps, boxplots)
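
A possible EDA pass, sketched on synthetic data. The columns (`age`, `family_history`, `treatment`) are placeholders chosen for the example, and the heatmap is drawn with plain matplotlib `imshow` rather than any particular plotting library the project may have used.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data; the column names are assumptions.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(20, 60, size=200),
    "family_history": rng.integers(0, 2, size=200),
    "treatment": rng.integers(0, 2, size=200),
})

summary = df.describe()  # per-column distribution statistics
corr = df.corr()         # pairwise Pearson correlations between features

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# Histogram: distribution of a single feature.
axes[0].hist(df["age"], bins=10)
axes[0].set_title("Age distribution")

# Heatmap: correlation matrix.
im = axes[1].imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
axes[1].set_xticks(range(len(corr)))
axes[1].set_xticklabels(corr.columns, rotation=45, ha="right")
axes[1].set_yticks(range(len(corr)))
axes[1].set_yticklabels(corr.columns)
fig.colorbar(im, ax=axes[1])

# Boxplot: a feature split by the target class.
df.boxplot(column="age", by="treatment", ax=axes[2])

fig.tight_layout()
buf = io.BytesIO()
fig.savefig(buf, format="png")  # write to a file path instead to keep the figure
```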
[STEP 5] – Feature Selection
Choose relevant columns (e.g., age, gender, family history, workplace support)
Optionally use feature importance from models like Random Forest
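
As an illustration of model-based feature selection, the sketch below ranks features by Random Forest importance. The data is synthetic and the target is deliberately constructed to depend on `family_history`, so the ranking here demonstrates the mechanism only and says nothing about the real survey.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic features; names are placeholders, not the survey's real columns.
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "age": rng.integers(20, 60, size=n),
    "family_history": rng.integers(0, 2, size=n),
    "workplace_support": rng.integers(0, 2, size=n),
    "noise": rng.normal(size=n),
})

# Demo-only target: mostly determined by family_history, plus some noise.
y = ((X["family_history"] + 0.3 * rng.normal(size=n)) > 0.5).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; higher means the feature was more useful for splits.
importances = pd.Series(forest.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
top_features = importances.head(2).index.tolist()

print(importances)
```

Impurity-based importances can favor high-cardinality features, so it is worth cross-checking the ranking against domain knowledge before dropping columns.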
[STEP 6] – Model Selection & Training
Choose machine learning algorithms (e.g., Logistic Regression, Random Forest, SVM)
Split the data with train_test_split, or use cross-validation
Train each model on the training split only
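
One way to set up the split and compare the three algorithms is sketched below. The data comes from `make_classification` as a stand-in for the preprocessed survey, and the hyperparameters are library defaults apart from `max_iter`, so treat the numbers it prints as illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the preprocessed survey features and labels.
X, y = make_classification(n_samples=400, n_features=8, n_informative=4, random_state=0)

# Hold out 20% as a test set; stratify keeps the class ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is refitted on each training fold.
models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC()),
}

cv_scores = {}
for name, model in models.items():
    # 5-fold cross-validated accuracy, computed on the training split only.
    cv_scores[name] = cross_val_score(model, X_train, y_train, cv=5).mean()
    model.fit(X_train, y_train)  # final fit on the full training split

for name, score in cv_scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```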
[STEP 7] – Model Evaluation
Evaluate using metrics: Accuracy, Precision, Recall, F1-score
Use Confusion Matrix to understand results better
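
The metrics above are easiest to read on a small hand-checkable example. The labels below are made up for illustration: 8 samples with 3 true positives, 3 true negatives, 1 false positive and 1 false negative.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Made-up labels for illustration (1 = likely to face issues, 0 = not).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / total = 6 / 8 = 0.75
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)    = 3 / 4 = 0.75
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)    = 3 / 4 = 0.75
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall = 0.75

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)        # [[3, 1], [1, 3]]

print(accuracy, precision, recall, f1)
print(cm)
```

For a screening task like this one, recall on the positive class is often the metric to watch, since a false negative means a person at risk is missed.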
[STEP 8] – Model Tuning (Optional)
Tune hyperparameters using Grid Search or Random Search for better performance
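
A small sketch of both search strategies, using a Random Forest on synthetic data. The parameter grid and the choice of F1 as the scoring metric are assumptions for the example; a real search would use a wider grid on the actual training split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Synthetic stand-in for the training split.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Illustrative search space (2 x 3 = 6 combinations).
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}

# Grid Search: evaluates every combination with 3-fold cross-validation.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="f1"
).fit(X, y)

# Random Search: samples only n_iter combinations, which is cheaper for large spaces.
rand = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_grid,
    n_iter=4,
    cv=3,
    scoring="f1",
    random_state=0,
).fit(X, y)

print("grid search:  ", grid.best_params_, round(grid.best_score_, 3))
print("random search:", rand.best_params_, round(rand.best_score_, 3))
```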
[STEP 9] – Model Testing
Test the final model on unseen test data
Check that it generalizes well: a training score far above the test score indicates overfitting
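
A simple generalization check is to compare accuracy on the training split with accuracy on the held-out test split. The sketch below uses a decision tree, which is not one of the project's listed models and is swapped in here only because an unpruned tree overfits visibly; the data is synthetic with 20% label noise.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic noisy data (flip_y randomly reassigns 20% of the labels).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=3, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def train_test_gap(model):
    """Fit on the training split and return (train accuracy, test accuracy, gap)."""
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    return train_acc, test_acc, train_acc - test_acc


# An unpruned tree memorizes the training data; a shallow tree is constrained.
deep = train_test_gap(DecisionTreeClassifier(random_state=0))
shallow = train_test_gap(DecisionTreeClassifier(max_depth=3, random_state=0))

print("unpruned tree: train=%.2f test=%.2f gap=%.2f" % deep)
print("max_depth=3:   train=%.2f test=%.2f gap=%.2f" % shallow)
```

The test split should be used once, on the final model; using it repeatedly to pick models turns it into a second validation set.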
[STEP 10] – Model Deployment (Optional)
Create a web interface using Streamlit or Flask
Deploy on platforms like Heroku or Render
Let users input values and get predictions
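
A minimal Flask sketch of the prediction endpoint, under several assumptions: the feature names, the `/predict` route and the JSON response shape are hypothetical, and the model is trained on synthetic data at import time, whereas a real deployment would load the saved final model (for example with joblib).

```python
from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical input fields; the real app would use the selected survey features.
FEATURES = ["age", "family_history", "workplace_support"]

# Placeholder model trained on synthetic data so the sketch is self-contained.
X, y = make_classification(
    n_samples=200, n_features=3, n_informative=3, n_redundant=0, random_state=0
)
model = LogisticRegression().fit(X, y)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"age": 30, "family_history": 1, "workplace_support": 0}.
    payload = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400

    row = [[float(payload[name]) for name in FEATURES]]
    probability = float(model.predict_proba(row)[0, 1])
    return jsonify({"prediction": int(probability >= 0.5), "probability": probability})

# To try it locally, save as app.py and run:  flask --app app run
```

A Streamlit front end would replace the route with input widgets and a button, but the predict step stays the same.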
[STEP 11] – Documentation & Report Preparation
Document each step (code, logic, and results)
Prepare project report and presentation for submission/viva
- Final Project Report
- Certificate VII Semester (Dated: December 2024)
- Certificate VIII Semester (Dated: May 2025)
- Synopsis
- Final Presentation
- Source Code
- Database dump (.sql file)
- Dockerfile for deployment (if this is a web project)
- README (this file)