This repository contains my solution for the GO DATA SCIENCE 4.0 Hackathon hosted on Zindi, where I achieved 42nd place out of 194 participants. The challenge focused on classifying mental health-related text discussions into predefined categories using Natural Language Processing (NLP) techniques.
- Rank: 42 out of 194 participants
- Validation Accuracy: 74.4%
- Public Leaderboard Score: 0.7686
- Private Leaderboard Score: 0.7528
| Rank | Team Name | Public Score | Private Score |
|---|---|---|---|
| 1 | Recursive Duo | 0.8189 | 0.7996 |
| ... | ... | ... | ... |
| 9 | one crew | 0.7786 | 0.7792 |
| 25 | Llama | 0.7610 | 0.7648 |
| 43 | SamehAissa (Me) | 0.7686 | 0.7528 |
| 44 | ... | ... | ... |
- Top Scores:
  - The winning team achieved a public score of 0.8189 and a private score of 0.7996.
- My Performance:
  - Achieved a public score of 0.7686 and a private score of 0.7528.
  - Ranked 42nd, placing in the top 22% of participants.
- Leaderboard Insights:
  - A small gap between public and private scores indicates robust models.
  - The competition was highly competitive, with close scores among top teams.
The goal was to develop a model that accurately classifies text entries (titles and content) from online discussions into categories representing mental health issues. Each entry in the dataset included:
- `id`: Unique identifier
- `title`: Discussion title
- `content`: Main body of the text
- `target`: Mental health category (only in training data)
| id | title | content | target |
|---|---|---|---|
| 101 | Feeling Hopeless and Lost | I've been struggling with depression for a while... | Depression |
| 102 | Panic Attacks Are Getting Worse | Lately, my panic attacks have been more frequent... | Anxiety |
The model's performance was evaluated using Private Accuracy as the primary metric.
- Data Preprocessing:
  - Combined `title` and `content` into a single text feature.
  - Handled missing values and cleaned text data.
  - Encoded target labels into numerical format (see the sketch below).
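A minimal preprocessing sketch, assuming the training data ships as a `Train.csv` with the columns listed above (the file name and exact cleaning steps are placeholders, not the competition pipeline verbatim):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

train = pd.read_csv("Train.csv")  # hypothetical path

# Combine title and content into one text feature, treating missing values as empty strings.
train["text"] = (
    train["title"].fillna("") + " " + train["content"].fillna("")
).str.strip()

# Encode the target categories (e.g. "Depression", "Anxiety") as integers.
label_encoder = LabelEncoder()
train["label"] = label_encoder.fit_transform(train["target"])
```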
- Modeling:
  - Experimented with BERT and RoBERTa architectures.
  - Implemented class weighting to handle imbalanced data (see the sketch below).
  - Used Text Augmentation (EDA) to improve generalization.
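A sketch of the class-weighting idea using scikit-learn's `compute_class_weight`; the resulting tensor would typically be passed to the loss function during fine-tuning (variable names follow the preprocessing sketch above and are assumptions):

```python
import numpy as np
import torch
from sklearn.utils.class_weight import compute_class_weight

labels = train["label"].to_numpy()  # integer labels from the preprocessing step

# "balanced" gives each class a weight inversely proportional to its frequency.
class_weights = compute_class_weight(
    class_weight="balanced", classes=np.unique(labels), y=labels
)
class_weights = torch.tensor(class_weights, dtype=torch.float)
```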
- Training:
  - Fine-tuned transformer models using Hugging Face's `Trainer` API.
  - Applied Focal Loss to focus on hard-to-classify examples (see the sketch below).
  - Used Test-Time Augmentation (TTA) for robust predictions.
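A sketch of how Focal Loss can be plugged into Hugging Face's `Trainer` by overriding `compute_loss`; the `gamma` value and the optional class weights are assumptions, not necessarily the exact settings used here:

```python
import torch
import torch.nn.functional as F
from transformers import Trainer

class FocalLossTrainer(Trainer):
    def __init__(self, *args, gamma=2.0, class_weights=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.gamma = gamma
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        # Per-example cross-entropy, optionally weighted by class frequency.
        ce = F.cross_entropy(
            logits, labels, weight=self.class_weights, reduction="none"
        )
        # Focal term down-weights easy examples (high p_t) so training
        # concentrates on hard-to-classify posts.
        pt = torch.exp(-ce)
        loss = ((1 - pt) ** self.gamma * ce).mean()
        return (loss, outputs) if return_outputs else loss
```

`FocalLossTrainer` is then used like the stock `Trainer`, passing the model, training arguments, and datasets as usual.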
- Evaluation:
  - Achieved ~76.8% accuracy on the public leaderboard.
  - Secured 42nd place on the final leaderboard.
- Deployment:
  - Build a Streamlit/Gradio app for real-time predictions.
  - Deploy the model using FastAPI or Flask (see the sketch below).
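A minimal FastAPI serving sketch wrapping a `text-classification` pipeline; the model path and request schema are placeholders, since deployment is still a planned step:

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
classifier = pipeline("text-classification", model="./fine_tuned_model")  # hypothetical path

class Post(BaseModel):
    title: str
    content: str

@app.post("/predict")
def predict(post: Post):
    # Mirror the training-time preprocessing: concatenate title and content.
    text = f"{post.title} {post.content}".strip()
    result = classifier(text, truncation=True)[0]
    return {"category": result["label"], "score": result["score"]}
```

Run locally with `uvicorn app:app --reload` (assuming the file is named `app.py`).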
- Explainability:
  - Use SHAP or LIME to explain model predictions (see the sketch below).
  - Visualize attention weights for transformer models.
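A sketch of explaining predictions with SHAP's text explainer on top of a transformers pipeline, following the pattern from SHAP's documentation; the model path and sample text are placeholders:

```python
import shap
from transformers import pipeline

# top_k=None makes the pipeline return scores for every class, which SHAP expects.
classifier = pipeline("text-classification", model="./fine_tuned_model", top_k=None)
explainer = shap.Explainer(classifier)

sample = ["Lately, my panic attacks have been more frequent and harder to control."]
shap_values = explainer(sample)
shap.plots.text(shap_values)  # highlights the tokens driving the predicted category
```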
- Advanced Models:
  - Experiment with DeBERTa, GPT-based models, or ensemble methods (see the sketch below).
  - Use knowledge distillation to combine multiple models.
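A sketch of a simple probability-averaging ensemble across fine-tuned checkpoints; the checkpoint paths are placeholders and this is an illustration of the idea rather than part of the submitted pipeline:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoints = ["./bert_finetuned", "./roberta_finetuned"]  # hypothetical paths
text = "I've been struggling with depression for a while..."

probs = []
for ckpt in checkpoints:
    tokenizer = AutoTokenizer.from_pretrained(ckpt)
    model = AutoModelForSequenceClassification.from_pretrained(ckpt)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs.append(torch.softmax(logits, dim=-1))

# Average the class probabilities from all models and take the most likely class.
ensemble_probs = torch.stack(probs).mean(dim=0)
predicted_class = ensemble_probs.argmax(dim=-1).item()
```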