T2IScoreScore is a framework for evaluating text-to-image model evaluation metrics. The framework provides tools to:

- Apply reference implementations of various classes of text-to-image metrics through a consistent API (see the Python sketch after this list):
  - Correlation-based metrics (CLIPScore)
  - Likelihood-based metrics
  - Visual question-answering metrics
- Run these metrics on the T2IScoreScore dataset of semantic error graphs:
  `ts2 evaluate CLIPScore --device cuda`
- Compute metametrics that characterize how well a T2I metric orders the images at each node along walks of increasing error count:
  `ts2 compute CLIPScore spearman kstest`
- Generate visualizations and reports to analyze metric performance across error types and image sources.
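Beyond the CLI, the metrics can presumably be called directly from Python. The sketch below shows the intended shape of that usage: the `calculate_score()` name comes from the contributing notes at the end of this README, while the constructor arguments and call signature are assumptions rather than confirmed API.

```python
# Hypothetical direct usage of a T2IMetrics metric from Python.
# calculate_score() is named in the contributing notes below; the exact
# constructor arguments and argument order here are assumptions.
from T2IMetrics import CLIPScore

metric = CLIPScore(device="cuda")
score = metric.calculate_score("images/example.png", "a photo of a dog on a beach")
print(score)
```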
Install the package from source:

```
git clone https://github.com/michaelsaxon/T2IScoreScore.git
cd T2IScoreScore
pip install -e .
```
The source tree is organized as follows:

```
src/
├── T2IMetrics/       # Collection of text-to-image evaluation metrics
└── T2IScoreScore/    # Framework for evaluating metrics
    ├── evaluators/   # Metametric implementations (Spearman, KS test, etc.)
    ├── figures/      # Visualization utilities
    └── run/          # CLI entry points
```
The package provides two main commands through the `ts2` CLI.
Run a metric on the T2IScoreScore dataset:

```
# Basic usage
ts2 evaluate CLIPScore

# With options
ts2 evaluate TIFAScore --device cuda --output results/
```
Analyze metric performance using metametrics:

```
# Run all default evaluators (spearman, kstest, delta)
ts2 compute CLIPScore

# Run specific evaluators
ts2 compute TIFAScore spearman kendall
```
Results are saved in the following structure:

```
output/
├── scores/                     # Raw metric scores
│   └── metric_scores.csv
└── metametrics/                # Metametric results
    ├── metric.csv              # Per-example results
    └── metric_averages.csv     # Partition averages
```
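A quick way to inspect these outputs is to load the CSVs with pandas. The file names below follow the tree above literally; in practice they may be named after the metric you ran, and no particular column schema is assumed here.

```python
# Sketch: load T2IScoreScore output CSVs for inspection.
# Paths mirror the output tree above; adjust them if your files are named
# after the metric you evaluated.
import pandas as pd

scores = pd.read_csv("output/scores/metric_scores.csv")           # raw per-image scores
per_example = pd.read_csv("output/metametrics/metric.csv")        # per-example metametric results
averages = pd.read_csv("output/metametrics/metric_averages.csv")  # partition averages

print(scores.head())
print(averages.head())
```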
To add a new metric (see the sketch below):

- Create a new class in `T2IMetrics` that inherits from `BaseMetric`
- Implement the `calculate_score()` method
- Register it in `T2IMetrics/__init__.py`
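A minimal sketch of such a metric class, under stated assumptions: `BaseMetric` and `calculate_score()` come from the steps above, but the import path, constructor, and method signature are guesses at the framework's interface.

```python
# Hypothetical new metric. BaseMetric and calculate_score() are named in the
# steps above; the import path and the (image, prompt) signature are assumptions.
import random

from T2IMetrics import BaseMetric


class RandomScore(BaseMetric):
    """Toy metric that assigns a random score to every image-prompt pair."""

    def calculate_score(self, image, prompt):
        # A real metric would compare the image against the prompt here.
        return random.random()
```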
To add a new metametric (see the sketch below):

- Create a new class in `T2IScoreScore/evaluators` that inherits from `MetricEvaluator`
- Implement the required evaluation methods
- Register it in `evaluators/__init__.py`
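The evaluator interface is only described as "required evaluation methods" above, so the following is purely illustrative: a Kendall-tau evaluator built on scipy, with a hypothetical `evaluate()` method standing in for whatever the real interface expects.

```python
# Illustrative metametric sketch. MetricEvaluator is named in the steps above;
# the evaluate() method and its (scores, error_counts) arguments are
# hypothetical stand-ins for the framework's actual interface.
from scipy.stats import kendalltau

from T2IScoreScore.evaluators import MetricEvaluator


class KendallEvaluator(MetricEvaluator):
    """Rank correlation between metric scores and error counts along a walk."""

    def evaluate(self, scores, error_counts):
        # A well-behaved metric should decrease as errors accumulate, so a
        # strongly negative tau indicates good ordering behavior.
        tau, p_value = kendalltau(scores, error_counts)
        return {"kendall_tau": tau, "p_value": p_value}
```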