Update project intro #905

Open · wants to merge 16 commits into main
1 change: 1 addition & 0 deletions content/project/osre25/UCSC/FairFace/.gitkeep
@@ -0,0 +1 @@

Binary file added content/project/osre25/UCSC/FairFace/featured.png
64 changes: 64 additions & 0 deletions content/project/osre25/UCSC/FairFace/index.md
@@ -0,0 +1,64 @@
---
title: "Understanding Skin-Tone based Bias in Text-to-Image Models Using Stable Diffusion"
date: 2025-05-27
lastmod: 2025-05-27
authors: ["Marzia Binta Nizam", "James Davis"]
tags: ["osre25", "uc", "bias", "stable-diffusion"]
---

This project investigates **skin tone bias in text-to-image generation** by analyzing the output of **Stable Diffusion** models when prompted with socially and occupationally descriptive text. Despite the growing popularity of generative models like Stable Diffusion, little has been done to evaluate how these models reproduce or amplify visual bias—especially related to **skin tone, perceived race, and social class**—based solely on textual prompts.

This work builds on prior studies of bias in large language models (LLMs) and vision-language models (VLMs), and aims to explore how biases manifest visually, without explicitly specifying race or ethnicity in the input prompt. Our approach combines **systematic prompt generation**, **model-based image creation**, and **skin tone quantification** to assess disparities across generated samples.

The ultimate goal is to develop a **reproducible evaluation pipeline**, visualize disparities across demographic and occupational prompts, and explore strategies to mitigate representational harms in generative models.

Concretely, the pipeline covers three stages (a prompt-templating sketch follows the list):
- Generating images from prompts
- Annotating or analyzing them using computer vision tools
- Measuring bias across categories like skin tone, gender presentation, or status markers
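
As a rough illustration of the prompt-generation stage, the sketch below crosses a few assumed role descriptors with a portrait template. The categories, roles, and template strings are placeholders for this example, not the project's actual prompt set.

```python
from itertools import product

# Hypothetical categories and roles; the real study uses six social
# categories and a larger, curated role list.
CATEGORIES = {
    "profession": ["a doctor", "a professor", "a nurse"],
    "socioeconomic": ["a homeless person", "a wealthy investor"],
}
TEMPLATES = ["A portrait of {role}", "A photo of the face of {role}"]


def build_prompts():
    """Cross every role with every template, keeping the category label."""
    prompts = []
    for (category, roles), template in product(CATEGORIES.items(), TEMPLATES):
        for role in roles:
            prompts.append(
                {"category": category, "role": role, "prompt": template.format(role=role)}
            )
    return prompts


if __name__ == "__main__":
    for record in build_prompts()[:4]:
        print(record["category"], "|", record["prompt"])
```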

Project webpage: [https://github.com/marzianizam/ucsc-ospo.github.io/tree/main/content/project/osre25/UCSC/FairFace](https://github.com/marzianizam/ucsc-ospo.github.io/tree/main/content/project/osre25/UCSC/FairFace)

### Project Idea: Measuring Bias in AI-Generated Portraits

- **Topics**: Responsible AI, Generative Models, Ethics in AI
- **Skills**: Python, PyTorch, Stable Diffusion, Prompt Engineering, Data Analysis
- **Difficulty**: Medium
- **Size**: 350 hours
- **Mentors**:
- {{% mention "Marzia Binta Nizam" %}} (mailto:manizam@ucsc.edu)
- {{% mention "Professor James Davis" %}} (mailto:davisje@ucsc.edu)

### Background

Recent research has shown that text-to-image models can perpetuate racial and gender stereotypes through visual output. For instance, prompts like “CEO” or “nurse” often produce racially skewed results even when no explicit race or demographic cues are provided. This project examines whether similar disparities exist **along skin tone dimensions**, focusing on **subtle biases** rather than overt stereotypes.

The key challenge is that visual bias is not always easy to measure. This project addresses it with **melanin-level quantification**, a continuous and interpretable proxy for skin tone, combined with consistent prompt templating and multi-sample averaging for statistical rigor.
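
The melanin-level quantification itself relies on a dedicated pipeline; purely as an illustrative stand-in, a widely used CIELAB-based skin-tone proxy, the Individual Typology Angle (ITA), can be computed from a pre-segmented skin region as sketched below. The file name and the assumption that the crop contains only skin pixels are hypothetical.

```python
import numpy as np
from PIL import Image
from skimage import color


def individual_typology_angle(skin_rgb: np.ndarray) -> float:
    """ITA (degrees) over an RGB array assumed to contain only skin pixels.

    ITA = arctan((L* - 50) / b*) * 180 / pi; larger values correspond to
    lighter skin tones, so it serves as a rough, continuous tone proxy.
    """
    lab = color.rgb2lab(skin_rgb / 255.0)
    lightness = lab[..., 0].mean()    # mean L*
    yellow_blue = lab[..., 2].mean()  # mean b*
    return float(np.degrees(np.arctan2(lightness - 50.0, yellow_blue)))


if __name__ == "__main__":
    # "face_crop.png" is a placeholder for a pre-segmented skin region.
    crop = np.asarray(Image.open("face_crop.png").convert("RGB"), dtype=np.float64)
    print(f"ITA: {individual_typology_angle(crop):.1f} degrees")
```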

---

### Objectives

- Generate datasets using consistent prompts (e.g., "A portrait of a doctor", "A homeless person", etc.)
- Use Stable Diffusion (and optionally other models such as DALL·E or Midjourney) to generate diverse image sets (see the generation sketch after this list)
- Measure bias across demographic and occupational categories using image processing tools
- Visualize the distribution of melanin values and facial features across samples
- Explore prompt-level mitigation strategies to improve fairness in output
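
For the generation step, a minimal sketch using the Hugging Face `diffusers` library is shown below. The model ID, seed count, inference steps, and output file names are assumptions for illustration rather than the project's actual configuration.

```python
import torch
from diffusers import StableDiffusionPipeline

# Model ID, device choice, and output naming are illustrative assumptions.
MODEL_ID = "runwayml/stable-diffusion-v1-5"
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16 if DEVICE == "cuda" else torch.float32,
).to(DEVICE)

prompt = "A portrait of a doctor"
for seed in range(8):  # several seeds per prompt for multi-sample averaging
    generator = torch.Generator(device=DEVICE).manual_seed(seed)
    image = pipe(prompt, generator=generator, num_inference_steps=30).images[0]
    image.save(f"doctor_seed{seed}.png")
```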

---

### Deliverables

- Open-source codebase for prompt generation and image evaluation
- Statistical analysis of visual bias trends
- Blog post or visual explainer on findings
- Final report and recommendations on prompt engineering or model constraints

---

64 changes: 64 additions & 0 deletions content/report/osre25/ucsc/FairFace/07102025-Marzia/index.md
@@ -0,0 +1,64 @@
---
title: "Auditing Skin Tone Bias in Text-to-Image Models"
subtitle: "Uncovering representational harms in AI-generated imagery"
summary: A midterm blog reflecting on the progress and direction of my OSRE25 project, which investigates how models like Stable Diffusion may encode and reproduce skin tone biases in response to occupational or status-based prompts.
authors:
- marzia
tags: ["osre25"]
categories: []
date: 2025-07-09
lastmod: 2025-07-09
featured: false
draft: false

image:
  caption: "A collage of AI-generated faces showing subtle variation in skin tone"
  focal_point: "Center"
  preview_only: false
---

As part of the [Stable Diffusion Bias Project](/project/osre25/ucsc/sd-bias), my [proposal](https://github.com/ucsc-ospo/ucsc-ospo.github.io/blob/main/content/project/osre25/ucsc/fair-face/index.md) focuses on evaluating **bias in visual outputs of generative AI models**, particularly **skin tone bias** in Stable Diffusion.

The goal is to analyze how models render people based on prompts like “a doctor” or “a homeless person,” and whether certain prompts systematically result in lighter or darker skin tones—even when race isn’t explicitly mentioned.

---

### 🧪 What I’ve Done So Far

- Designed a prompt template covering six social categories (e.g., criminal justice, profession, socioeconomic)
- Generated image datasets using Stable Diffusion with varied seeds
- Built a preprocessing pipeline to estimate **melanin values** from generated faces
- Created early visualizations showing **distributional trends in skin tone**
- Identified early evidence of bias in prompts linked to status or wealth

---

### ⚒️ Tools and Methods

- **Stable Diffusion** for controlled image generation
- **BioSkin pipeline** to extract melanin metrics
- **Fitzpatrick skin type approximation** (in development as a validation method)
- Python-based data analysis and prompt auditing
- [openai/CLIP](https://github.com/openai/CLIP) and BLIP for optional image-text alignment scoring (a short CLIP scoring sketch follows this list)
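
For the optional alignment scoring, a minimal sketch with the linked openai/CLIP package might look like the following; the image path and prompt are placeholders, and the actual audit would iterate over the full generated dataset.

```python
import clip  # pip install git+https://github.com/openai/CLIP.git
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Placeholder inputs; swap in generated images and their source prompts.
image = preprocess(Image.open("doctor_seed0.png")).unsqueeze(0).to(device)
text = clip.tokenize(["a portrait of a doctor"]).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    alignment = (image_features @ text_features.T).item()

print(f"CLIP image-text cosine similarity: {alignment:.3f}")
```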

---

### 🔍 What I’m Seeing

Preliminary results show that even neutral prompts like “a portrait of a professor” tend to favor lighter skin tones, while prompts such as “a manual laborer” or “a homeless person” skew toward darker tones. These trends are **not always obvious to the human eye**, which is why quantitative skin tone analysis is essential.
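
One way to turn these observations into a testable claim is a nonparametric comparison of per-prompt skin-tone scores. The sketch below applies SciPy's Mann-Whitney U test to placeholder values, not actual project measurements; in practice each list would hold one melanin or tone estimate per generated image.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Placeholder scores only; real values come from the skin-tone pipeline,
# one estimate per generated image for each prompt condition.
professor_scores = np.array([41.2, 38.7, 44.0, 36.5, 40.1, 39.8])
laborer_scores = np.array([22.4, 18.9, 27.3, 15.0, 24.6, 20.2])

# Two-sided test: do the two prompt conditions draw from the same
# skin-tone distribution?
stat, p_value = mannwhitneyu(professor_scores, laborer_scores, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.4f}")

# Effect direction: gap in median tone score between the two prompts.
print(f"median gap = {np.median(professor_scores) - np.median(laborer_scores):.1f}")
```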

I'm now exploring whether prompt engineering (e.g., adding “fair,” “dark-skinned,” or “diverse” descriptors) can help mitigate these imbalances.

---

### 🚧 What’s Next

- Expand dataset to 60 prompts across 6 categories
- Incorporate alternate T2I models (Midjourney, DALL·E 3)
- Write a technical report and reproducible evaluation framework
- Submit a short paper or workshop proposal to a fairness or ethics venue

---