Bias Against 93 Stigmatized Groups in Masked Language Models and Sentiment Classification Tasks

by Katelyn Mei, Sonia Fereidooni, Aylin Caliskan

This paper is accepted and published in ACM Fairness, Accountability, and Transparency 2023 . To read the full paper, please visit link

This paper investigates bias against socially stigmatized groups in masked language models via prompting and sentiment classification tasks.

Caption for the example figure with the main results.

Brief Abstract

This study extends the focus of bias evaluation in extant work by examining bias against social stigmas on a large scale. It focuses on 93 stigmatized groups in the United States, including a wide range of conditions related to disease, disability, drug use, mental illness, religion, sexuality, socioeconomic status, and other relevant factors. We investigate bias against these groups in English pre-trained Masked Language Models (MLMs) and their downstream sentiment classification tasks. To evaluate the presence of bias against 93 stigmatized conditions, we identify 29 non-stigmatized conditions to conduct a comparative analysis. Building upon a psychology scale of social rejection, the Social Distance Scale, we prompt six MLMs that are trained with different datasets: RoBERTa-base, RoBERTa-large, XLNet-large, BERTweet-base, BERTweet-large, and DistilBERT. We use human annotations to analyze the predicted words from these models, with which we measure the extent of bias against stigmatized groups.

Implementaiton

Code for running experiments in this study is from python. Generated words from each model is included in 'data' folder.

Getting the code

You can download a copy of all the files in this repository by cloning the https://git-scm.com/ repository:

git clone https://github.com/Mooniem/MLMs_bias_stigmas.git

Dependencies

You'll need a working Python environment to run the code. The recommended way to set up your environment is through the Anaconda Python distribution which provides the conda package manager. Anaconda can be installed in your user directory and does not interfere with the system Python installation. The required dependencies are specified in the file environment.yml.

We use conda virtual environments to manage the project dependencies in isolation. Thus, you can install our dependencies without causing conflicts with your setup (even with different Python versions).

Run the following command in the repository folder (where environment.yml is located) to create a separate environment and install all required dependencies in it:

conda env create

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
SentimentAnalysis		SentimentAnalysis
data		data
.DS_Store		.DS_Store
Experiment_MLM_models.ipynb		Experiment_MLM_models.ipynb
Experiment_sentiment_classification.ipynb		Experiment_sentiment_classification.ipynb
LICENSE		LICENSE
README.md		README.md
Social_Distance_visualization.Rmd		Social_Distance_visualization.Rmd
helper_functions.py		helper_functions.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Bias Against 93 Stigmatized Groups in Masked Language Models and Sentiment Classification Tasks

Brief Abstract

Implementaiton

Getting the code

Dependencies

Reproducing the results

About

Uh oh!

Releases

Packages

Languages

License

Mooniem/MLMs_bias_stigmas

Folders and files

Latest commit

History

Repository files navigation

Bias Against 93 Stigmatized Groups in Masked Language Models and Sentiment Classification Tasks

Brief Abstract

Implementaiton

Getting the code

Dependencies

Reproducing the results

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages