Mental-LLM

Todo

Organize code for prompt designing, model fine-tuning, and inference
Provide hyperparameters for the experiments
Release model weights to the Hugging Face Hub (upon acceptance)

Table of Contents

Overview
Inference Settings
Datasets
Models
Results
Fine-tuning Hyperparameters

Overview

This is the repository for the paper Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data. An updated version of this paper is under review.

In this work, we present the first comprehensive evaluation of multiple LLMs, including Alpaca, Alpaca-LoRA, FLAN-T5, GPT-3.5, and GPT-4, on various mental health prediction tasks via online text data. We conduct a broad range of experiments covering zero-shot prompting, few-shot prompting, and instruction fine-tuning. More importantly, our experiments show that instruction fine-tuning can significantly boost the performance of LLMs on all tasks simultaneously.

Our best fine-tuned models, Mental-Alpaca and Mental-FLAN-T5, outperform the best prompt design of GPT-3.5 by 10.9% and the best of GPT-4 by 4.8% on balanced accuracy, and perform on par with the state-of-the-art task-specific language model.

We have publicly released our fine-tuned model weights on the Hugging Face Hub. Both sets of model weights may be used for research purposes only:

Mental-Alpaca: https://huggingface.co/NEU-HAI/mental-alpaca
Mental-FLAN-T5: https://huggingface.co/NEU-HAI/mental-flan-t5-xxl

Sample code for loading both models is available directly in the repositories above. Details about the prompts, training process, and evaluations can be found in our paper. Loading Mental-Alpaca and Mental-FLAN-T5 requires 27 GB and 44 GB of GPU memory, respectively; inference requires additional GPU memory.
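As a rough sketch of what loading looks like, the helper below maps each model to its Hub repository and picks a model class from the `transformers` library. It assumes that `transformers` and PyTorch are installed, that both repositories ship tokenizer files usable through `AutoTokenizer`, and that Mental-Alpaca loads as a causal LM while Mental-FLAN-T5 loads as a sequence-to-sequence model. The names `MODEL_IDS` and `load_model` are illustrative and are not part of this repository; the sample code on the model pages remains the reference.

```python
# Hub repository IDs taken from the links above.
MODEL_IDS = {
    "mental-alpaca": "NEU-HAI/mental-alpaca",
    "mental-flan-t5": "NEU-HAI/mental-flan-t5-xxl",
}


def load_model(name):
    """Return (tokenizer, model) for one of the two released models.

    Hypothetical helper: the class choice per model is an assumption.
    """
    repo_id = MODEL_IDS[name]  # raises KeyError for unknown names

    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import (
        AutoModelForCausalLM,
        AutoModelForSeq2SeqLM,
        AutoTokenizer,
    )

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    if name == "mental-flan-t5":
        # FLAN-T5 is an encoder-decoder model.
        model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)
    else:
        # Alpaca is a decoder-only (LLaMA-based) model.
        model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model
```

Calling `load_model("mental-flan-t5")` downloads the full weights and places the model on the CPU; moving it to a GPU is subject to the memory requirements stated above.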

Contributions

Inference Settings

Zero-shot Prompting

$Prompt_{ZS} = TextData + Prompt_{Part1\text{-}S} + Prompt_{Part2\text{-}Q} + OutputConstraint$
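The zero-shot formula can be read as plain string concatenation of four components in a fixed order. The sketch below assumes that `+` means joining with a single space; the function name and the example wording of each component are hypothetical and are not the prompts used in the paper.

```python
def build_zero_shot_prompt(text_data, part1_s, part2_q, output_constraint):
    """Concatenate the four prompt components in the order given by the formula."""
    parts = [text_data, part1_s, part2_q, output_constraint]
    # Drop empty components so an omitted part does not leave a double space.
    return " ".join(part.strip() for part in parts if part and part.strip())


# Hypothetical component wording, for illustration only.
example_prompt = build_zero_shot_prompt(
    text_data='Post: "I have three exams next week and I cannot sleep."',
    part1_s="Consider this post on social media to answer the question.",
    part2_q="Is the poster stressed?",
    output_constraint="Only return yes or no.",
)
print(example_prompt)
```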

Few-shot Prompting

$Prompt_{FS} = [Sample\ Prompt_{ZS} - label]_{M} + Prompt_{ZS}$, where $M$ denotes the number of demonstrations
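A few-shot prompt is therefore M labeled demonstrations, each a zero-shot-style prompt paired with its label, followed by the unlabeled query prompt. The sketch below assumes that each demonstration is rendered as "prompt, space, label" and that blocks are separated by a blank line; both choices, the function name, and the example text are illustrative assumptions.

```python
def build_few_shot_prompt(demonstrations, query_prompt, separator="\n\n"):
    """Prepend M (sample prompt, label) pairs to the zero-shot query prompt."""
    blocks = [f"{sample_prompt} {label}" for sample_prompt, label in demonstrations]
    # The query comes last and carries no label: the model is expected to supply it.
    blocks.append(query_prompt)
    return separator.join(blocks)


# Hypothetical demonstrations (M = 2) and query, for illustration only.
question = "Is the poster stressed? Only return yes or no."
demos = [
    (f'Post: "Deadlines keep piling up and I feel overwhelmed." {question}', "yes"),
    (f'Post: "Had a relaxing weekend hiking with friends." {question}', "no"),
]
query = f'Post: "I have three exams next week and I cannot sleep." {question}'

few_shot_prompt = build_few_shot_prompt(demos, query)
print(few_shot_prompt)
```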

Prompt Designs

Datasets

Dreaddit
- This dataset contains Reddit posts collected from ten subreddits across five domains (abuse, social, anxiety, PTSD, and financial).
- We used this dataset for post-level binary stress prediction (Task 1).
DepSeverity
- This dataset uses the same posts collected in Dreaddit but focuses on depression.
- We used this dataset for two post-level tasks: binary depression prediction (i.e., whether a post showed at least mild depression, Task 2) and four-level depression prediction (Task 3).
SDCNL
- This dataset also contains Reddit posts, collected from r/SuicideWatch and r/Depression.
- We used this dataset for post-level binary suicide ideation prediction (Task 4).
CSSRS-Suicide
- This dataset contains posts from 15 mental health-related subreddits.
- We used this dataset for two user-level tasks: binary suicide risk prediction (i.e., whether a user showed at least a suicide indicator, Task 5) and five-level suicide risk prediction (Task 6).
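The six tasks described above can be summarized as a small registry. The sketch below is only a restatement of the list in code form; the `Task` class, its field names, and the `to_binary` helper are hypothetical. `to_binary` assumes that multi-level labels are indexed from 0 in ascending severity, so that "at least mild" or "at least indicator" corresponds to index 1 or higher.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: int
    dataset: str
    level: str        # "post" or "user"
    num_classes: int
    target: str


TASKS = [
    Task(1, "Dreaddit", "post", 2, "stress"),
    Task(2, "DepSeverity", "post", 2, "depression"),
    Task(3, "DepSeverity", "post", 4, "depression"),
    Task(4, "SDCNL", "post", 2, "suicide ideation"),
    Task(5, "CSSRS-Suicide", "user", 2, "suicide risk"),
    Task(6, "CSSRS-Suicide", "user", 5, "suicide risk"),
]


def to_binary(level_index, threshold=1):
    """Collapse an ordinal level (0 = lowest) to 1 if it reaches the threshold, else 0."""
    return int(level_index >= threshold)


for task in TASKS:
    print(f"Task {task.task_id}: {task.dataset}, {task.level}-level, "
          f"{task.num_classes} classes, {task.target}")
```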

Models

Alpaca-7b
Alpaca-LoRA
FLAN-T5-XXL
GPT-3.5
GPT-4

Results

More results can be found in the paper.
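The headline numbers in the Overview are reported in balanced accuracy. Under its standard definition this is the mean of per-class recall, which prevents a majority-class predictor from scoring well on imbalanced data. The sketch below implements that standard definition in plain Python; it is not taken from the paper's evaluation code, and the function name is illustrative.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    total = defaultdict(int)    # number of examples per true class
    correct = defaultdict(int)  # correctly predicted examples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[cls] / total[cls] for cls in total]
    return sum(recalls) / len(recalls)


# Always predicting the majority class: plain accuracy is 0.8,
# but recall is 1.0 for class 1 and 0.0 for class 0.
print(balanced_accuracy([1, 1, 1, 1, 0], [1, 1, 1, 1, 1]))  # 0.5
```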

Fine-tuning Hyperparameters

MentalRoBERTa (Baseline)
- For each dataset, we convert the original text labels into ascending numbers starting from 0
- num_train_epochs=3, per_device_train_batch_size=4, gradient_accumulation_steps=16, per_device_eval_batch_size=8, learning_rate=5e-5, warmup_steps=500, weight_decay=0.01, logging_steps=8, fp16=False
Mental-Alpaca
- We mostly use the same fine-tuning hyperparameters provided here, with minor changes to accommodate our computing resources
Mental-FLAN-T5
- max_len=1024, target_max_len=128, per_device_train_batch_size=2, per_device_eval_batch_size=1, gradient_accumulation_steps=2, learning_rate=1e-4, num_train_epochs=2
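Two details of the settings above lend themselves to a short illustration: converting text labels into ascending integers starting from 0, and the effective batch size implied by the per-device batch size and gradient accumulation. The sketch below collects the listed values in plain dictionaries. The MentalRoBERTa keys match parameter names of `transformers.TrainingArguments`, while `max_len` and `target_max_len` are kept as listed and are assumed to be script-level settings. The helper names, the example label names, and the single-device default are assumptions.

```python
MENTAL_ROBERTA_ARGS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 16,
    "per_device_eval_batch_size": 8,
    "learning_rate": 5e-5,
    "warmup_steps": 500,
    "weight_decay": 0.01,
    "logging_steps": 8,
    "fp16": False,
}

MENTAL_FLAN_T5_ARGS = {
    "max_len": 1024,
    "target_max_len": 128,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 1,
    "gradient_accumulation_steps": 2,
    "learning_rate": 1e-4,
    "num_train_epochs": 2,
}


def encode_labels(ordered_labels):
    """Map text labels to ascending integers starting from 0, in the given order."""
    return {label: index for index, label in enumerate(ordered_labels)}


def effective_batch_size(args, num_devices=1):
    """Examples contributing to one optimizer step (num_devices is an assumption)."""
    return (args["per_device_train_batch_size"]
            * args["gradient_accumulation_steps"]
            * num_devices)


# Hypothetical label names for a four-level task, lowest severity first.
print(encode_labels(["minimum", "mild", "moderate", "severe"]))
print(effective_batch_size(MENTAL_ROBERTA_ARGS))  # 4 * 16 = 64
print(effective_batch_size(MENTAL_FLAN_T5_ARGS))  # 2 * 2 = 4
```

With more than one GPU, the effective batch size scales by the number of devices, which is why the device count is exposed as a parameter.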

Citation

@article{xu2023mentalllm,
      title={Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data}, 
      author={Xuhai Xu and Bingshen Yao and Yuanzhe Dong and Saadia Gabriel and Hong Yu and James Hendler and Marzyeh Ghassemi and Anind K. Dey and Dakuo Wang},
      year={2023},
      eprint={2307.14385},
      archivePrefix={arXiv},
      primaryClass={cs.HC}
}
