Fairness Bench

This benchmark evaluates how well AI agents can carry out fair, data-driven decision-making.

The benchmark consists of several tasks.

Roles:

task-specific: environment files for a task, such as train.py
benchmarking infrastructure: code needed to run the benchmark as a whole, including scoring (eval-<type>.py)
agent: agent tools, agent prompts, etc.
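
As a rough illustration of how the roles relate to file names, the sketch below tags a file with a role by matching its name against patterns. Only train.py and the eval-<type>.py naming come from this README; the `ROLE_PATTERNS` mapping, the `role_of` helper, and the example name `eval-fairness.py` are hypothetical and are not part of the repository.

```python
import fnmatch

# Hypothetical mapping from role to filename patterns.
# Only train.py and eval-<type>.py are named in this README.
ROLE_PATTERNS = {
    "task-specific": ["train.py"],
    "benchmarking infrastructure": ["eval-*.py"],
}


def role_of(filename):
    """Return the first role whose pattern matches filename, else None."""
    for role, patterns in ROLE_PATTERNS.items():
        if any(fnmatch.fnmatch(filename, pattern) for pattern in patterns):
            return role
    return None


if __name__ == "__main__":
    # eval-fairness.py is an invented name used only to exercise the pattern.
    for name in ["train.py", "eval-fairness.py", "README.md"]:
        print(name, "->", role_of(name))
```

Files that match no pattern return None, so agent files and anything else would need their own patterns once the actual layout is known.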

file	description	role

