Pronoun Bias in Emotion Classification

This repository contains code for evaluating pronoun-induced drift in the emotion predictions of pretrained NLP models. It shows how changing only the subject pronoun of a sentence ("he", "she", or "they"), while keeping the sentence otherwise identical, can change the emotion predicted by widely used classifiers.

📊 What This Code Does

- Uses a CSV file of 1,000 sentence triplets, each differing only in its subject pronoun.
- Runs predictions using a Hugging Face model (e.g., distilbert-base-uncased-go-emotions-student).
- Measures label mismatches between:
  - he vs she
  - he vs they
  - she vs they
- Visualizes emotion drift using confusion matrix heatmaps.
- Prints example mismatches for qualitative inspection.
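The mismatch and confusion-matrix steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function names `mismatch_counts` and `confusion_counts` are hypothetical, and the toy labels are made up rather than taken from a real model run.

```python
from collections import Counter
from itertools import combinations

PRONOUNS = ("he", "she", "they")


def mismatch_counts(labels):
    """Count differing predictions for each pronoun pair.

    `labels` maps a pronoun to its list of predicted labels, with all
    lists aligned by sentence index.
    """
    counts = {}
    for a, b in combinations(PRONOUNS, 2):  # he/she, he/they, she/they
        counts[(a, b)] = sum(x != y for x, y in zip(labels[a], labels[b]))
    return counts


def confusion_counts(labels_a, labels_b):
    """Count (label_a, label_b) pairs; off-diagonal pairs are the drift."""
    return Counter(zip(labels_a, labels_b))


# Toy predictions, invented for illustration only.
toy = {
    "he": ["neutral", "joy", "anger"],
    "she": ["neutral", "joy", "sadness"],
    "they": ["neutral", "approval", "sadness"],
}
mismatches = mismatch_counts(toy)
he_she = confusion_counts(toy["he"], toy["she"])
```

The pair counts in `he_she` could then be arranged into a label-by-label matrix and passed to a plotting function such as `seaborn.heatmap` to produce the kind of heatmap described above.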

📁 Dataset Format

The dataset (gender_emotion_sentences_1000.csv) should have the following structure:

| he | she | they |
| --- | --- | --- |
| He stayed silent. | She stayed silent. | They stayed silent. |
| He helped a friend. | She helped a friend. | They helped a friend. |
| ... | ... | ... |
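As a rough sketch of how a file in this format might be loaded and sanity-checked, the example below uses only the standard library. It assumes a comma-delimited file with a `he,she,they` header row; the helper names `load_triplets` and `differs_only_in_pronoun` are hypothetical. The check is deliberately strict and would reject triplets that also change verb agreement (for example "He is" vs "They are").

```python
import csv
import io

PRONOUNS = ("he", "she", "they")


def load_triplets(handle):
    """Read rows from an open CSV handle and keep the three pronoun columns."""
    reader = csv.DictReader(handle)
    missing = [p for p in PRONOUNS if p not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [{p: row[p].strip() for p in PRONOUNS} for row in reader]


def differs_only_in_pronoun(triplet):
    """True if each sentence starts with its pronoun and the rest is identical."""
    rests = set()
    for p in PRONOUNS:
        first, _, rest = triplet[p].partition(" ")
        if first.lower() != p:
            return False
        rests.add(rest)
    return len(rests) == 1


# In-memory stand-in for the dataset file, using the rows shown above.
sample = io.StringIO(
    "he,she,they\n"
    "He stayed silent.,She stayed silent.,They stayed silent.\n"
    "He helped a friend.,She helped a friend.,They helped a friend.\n"
)
triplets = load_triplets(sample)
all_valid = all(differs_only_in_pronoun(t) for t in triplets)
```

For the real dataset, the `io.StringIO` object would be replaced with a file opened via `open("gender_emotion_sentences_1000.csv", newline="", encoding="utf-8")`.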

🚀 How to Run

Install dependencies:

pip install transformers pandas matplotlib seaborn
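Once the dependencies are installed, the prediction step might look like the sketch below. It is an illustration under stated assumptions rather than the notebook's code: the helper names are hypothetical, and the full model identifier `joeddav/distilbert-base-uncased-go-emotions-student` is an assumption (the text above gives only the short name). The classifier is passed in as a plain callable, so the flow can be tried with a stub before downloading any model.

```python
PRONOUNS = ("he", "she", "they")


def load_hf_classifier(model_name="joeddav/distilbert-base-uncased-go-emotions-student"):
    """Wrap a Hugging Face pipeline as a function: sentences -> top labels.

    The model identifier is an assumption; substitute the one you use.
    """
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import pipeline

    clf = pipeline("text-classification", model=model_name)

    def classify(sentences):
        # Each result is a dict with "label" and "score" for the top class.
        return [result["label"] for result in clf(list(sentences))]

    return classify


def predict_triplets(triplets, classify):
    """Return {pronoun: [label, ...]} with labels aligned by triplet index."""
    return {p: classify([t[p] for t in triplets]) for p in PRONOUNS}


def example_mismatches(triplets, labels, a, b, limit=5):
    """Collect up to `limit` sentence pairs whose predicted labels differ."""
    found = []
    for t, label_a, label_b in zip(triplets, labels[a], labels[b]):
        if label_a != label_b:
            found.append((t[a], label_a, t[b], label_b))
        if len(found) == limit:
            break
    return found


def stub_classify(sentences):
    """Stand-in classifier with invented behavior, for a dry run only."""
    return ["sadness" if s.startswith("She") else "neutral" for s in sentences]


triplets = [
    {"he": "He stayed silent.", "she": "She stayed silent.", "they": "They stayed silent."},
    {"he": "He helped a friend.", "she": "She helped a friend.", "they": "They helped a friend."},
]
labels = predict_triplets(triplets, stub_classify)
examples = example_mismatches(triplets, labels, "he", "she")
```

To run against a real model, call `predict_triplets(triplets, load_hf_classifier())` instead of using the stub. This downloads the model weights on first use and requires a deep learning backend such as PyTorch in addition to the packages listed above.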
