Evaluating Safety Over Time: An Analysis of Visual Datasets with LlavaGuard

In this master's thesis, I evaluated the safety of the images contained in several visual datasets, as well as the outputs of text-to-image models generated from a testbench of 10,000 prompts.

Datasets Evaluated

  • CIFAR-10
  • CIFAR-100
  • ImageNet
  • MS COCO
  • LSUN
  • CelebA
  • CC12M
  • LAION-5B (excluded because the cluster could not handle the download)
  • DataComp-1B (5% sample, stored on DFKI cluster)
  • Stylebreeder

Models Evaluated

  • stable-diffusion-v1-5/stable-diffusion-v1-5
  • stabilityai/stable-diffusion-2-1
  • DeepFloyd/IF
  • stabilityai/stable-diffusion-xl-base-1.0
  • stablediffusionapi/newrealityxl-global-nsfw
  • playgroundai/playground-v2.5-1024px-aesthetic
  • Tencent-Hunyuan/HunyuanDiT-v1.1-Diffusers-Distilled
  • black-forest-labs/FLUX.1-schnell
  • THUDM/CogView4-6B
  • HiDream-ai/HiDream-I1-Full
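
For orientation, the sketch below shows how a single checkpoint from this list could be loaded with Hugging Face diffusers. This is not part of the thesis code, and some checkpoints (e.g. the multi-stage DeepFloyd/IF) require their own dedicated pipelines and settings.

```python
# Minimal sketch (not thesis code): generate one image with a listed checkpoint.
import torch
from diffusers import AutoPipelineForText2Image

# Any of the single-stage model IDs from the list above can be substituted here.
pipe = AutoPipelineForText2Image.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a photo of a cat sitting on a windowsill").images[0]
image.save("sample.png")
```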

Prompt Testbench

Our prompt testbench consists of 1,000 safe and 9,000 unsafe prompts (1,000 per LlavaGuard category). It is based on ViSU-Text, filtered to exclude child abuse content, mapped to LlavaGuard categories, and augmented with prompts generated by Grok AI. The final prompts used in the thesis are stored in src/experiments/prompt_testbench/prompt_testbench_10000_v6.csv.
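
A minimal sketch for inspecting the testbench CSV; the column layout is not documented here, so the exact column names are an assumption to verify:

```python
# Minimal sketch: inspect the prompt testbench CSV.
import pandas as pd

df = pd.read_csv("src/experiments/prompt_testbench/prompt_testbench_10000_v6.csv")
print(len(df))               # expected: 10,000 rows (1,000 safe + 9,000 unsafe)
print(df.columns.tolist())   # check the actual column names (e.g. prompt text, category)
```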

Results

Because the safety annotations could be used adversarially to locate unsafe images, we do not publish them. They are stored on Cluster42 at TU Darmstadt and can be shared upon request. The paths to the result files are listed in src/experiments/datasets/datasets.json and src/experiments/safety_benchmark_models/overview.json.
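
To see which result paths are recorded, the two overview files can simply be loaded and printed; this sketch makes no assumption about their internal schema:

```python
# Minimal sketch: print the overview files that list the result paths.
import json

for path in (
    "src/experiments/datasets/datasets.json",
    "src/experiments/safety_benchmark_models/overview.json",
):
    with open(path) as f:
        print(f"--- {path} ---")
        print(json.dumps(json.load(f), indent=2))
```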
