
Pronoun Bias in Emotion Classification

This repository contains code to evaluate pronoun-induced emotion prediction drift in pretrained NLP models. It shows how changing the subject pronoun in a sentence ("he", "she", or "they"), while keeping the rest of the sentence identical, can lead widely used classifiers to predict different emotions.


📊 What This Code Does

  • Uses a CSV file of 1,000 sentence triplets, each differing only in subject pronoun.
  • Runs predictions using a Hugging Face model (e.g., `distilbert-base-uncased-go-emotions-student`); see the sketch after this list.
  • Measures label mismatches between:
    • he vs she
    • he vs they
    • she vs they
  • Visualizes emotion drift using confusion matrix heatmaps.
  • Prints example mismatches for qualitative understanding.
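
As a rough illustration of the prediction and mismatch-measurement steps, the sketch below loads the triplet CSV, classifies each pronoun variant, and reports pairwise mismatch rates. The full Hub model ID (`joeddav/distilbert-base-uncased-go-emotions-student`) and the column names are assumptions inferred from the description above and the dataset format below; the notebook in this repository remains the authoritative version.

```python
# Minimal sketch of the prediction + mismatch steps (not the notebook's exact code).
# Assumed Hub model ID: joeddav/distilbert-base-uncased-go-emotions-student.
import pandas as pd
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="joeddav/distilbert-base-uncased-go-emotions-student",
)

# Columns follow the dataset format below: one sentence per pronoun variant.
df = pd.read_csv("gender_emotion_sentences_1000.csv")

# Top predicted emotion for each pronoun variant of every sentence.
for col in ["he", "she", "they"]:
    df[f"{col}_label"] = [r["label"] for r in classifier(df[col].tolist())]

# Pairwise mismatch rate: fraction of triplets whose predicted labels differ.
for a, b in [("he", "she"), ("he", "they"), ("she", "they")]:
    rate = (df[f"{a}_label"] != df[f"{b}_label"]).mean()
    print(f"{a} vs {b}: {rate:.1%} of sentences get different labels")
```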

📁 Dataset Format

The dataset (`gender_emotion_sentences_1000.csv`) should have the following structure:

| he | she | they |
| --- | --- | --- |
| He stayed silent. | She stayed silent. | They stayed silent. |
| He helped a friend. | She helped a friend. | They helped a friend. |
| ... | ... | ... |
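
A quick way to sanity-check that a CSV matches this layout (the file name and column names are as shown above):

```python
# Verify the CSV has exactly the three pronoun columns described above.
import pandas as pd

df = pd.read_csv("gender_emotion_sentences_1000.csv")
assert list(df.columns) == ["he", "she", "they"], f"Unexpected columns: {list(df.columns)}"
print(df.head())
```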

🚀 How to Run

  1. Install dependencies:

```bash
pip install transformers pandas matplotlib seaborn
```
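
The drift heatmaps are produced in the repository's notebook; as a hedged sketch of that step, a he-vs-she confusion matrix of predicted labels can be rendered with pandas and seaborn. It assumes the `he_label` / `she_label` columns created in the prediction sketch above.

```python
# Sketch of one drift heatmap: predicted emotion for "he" vs "she" sentences.
# Assumes df carries he_label / she_label columns from the prediction sketch above.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

confusion = pd.crosstab(df["he_label"], df["she_label"])
sns.heatmap(confusion, annot=True, fmt="d", cmap="Blues")
plt.xlabel('Predicted emotion ("she" sentence)')
plt.ylabel('Predicted emotion ("he" sentence)')
plt.title("Emotion drift: he vs she")
plt.tight_layout()
plt.show()
```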
