[benchmarks] overhaul benchmarks #11565


Merged
91 commits merged on Jul 4, 2025

Commits
24a46cc
start overhauling the benchmarking suite.
sayakpaul May 15, 2025
ab7f381
fixes
sayakpaul May 15, 2025
cc0a38a
fixes
sayakpaul May 15, 2025
169f831
checking.
sayakpaul May 15, 2025
ad18983
checking
sayakpaul May 15, 2025
31e34d5
fixes.
sayakpaul May 16, 2025
36afdea
error handling and logging.
sayakpaul May 16, 2025
0d3af90
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 16, 2025
fd85fbc
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 19, 2025
a2c03a4
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 20, 2025
4d83a47
add flops and params.
sayakpaul May 20, 2025
6815cae
add more models.
sayakpaul May 20, 2025
5635bf8
utility to fire execution of all benchmarking scripts.
sayakpaul May 20, 2025
cfbd21e
utility to push to the hub.
sayakpaul May 20, 2025
4ccfad0
push utility improvement
sayakpaul May 20, 2025
dff3144
seems to be working.
sayakpaul May 20, 2025
accd598
okay
sayakpaul May 20, 2025
41f79a0
add torchprofile dep.
sayakpaul May 20, 2025
befdd9e
remove total gpu memory
sayakpaul May 20, 2025
4784b8b
fixes
sayakpaul May 20, 2025
c19dc5b
fix
sayakpaul May 20, 2025
2da4fac
need a big gpu
sayakpaul May 20, 2025
7367bb1
better
sayakpaul May 20, 2025
1cd472f
what's happening.
sayakpaul May 20, 2025
214795d
okay
sayakpaul May 20, 2025
7d4f459
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 21, 2025
2d5b305
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 22, 2025
dd42244
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 27, 2025
9d28606
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 28, 2025
ffed3b3
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 29, 2025
a28c881
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 29, 2025
90b9b42
Merge branch 'main' into benchmarking-overhaul
sayakpaul May 31, 2025
64186b4
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 2, 2025
dfb20b0
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 2, 2025
1b0de9b
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 4, 2025
1122cad
separate requirements and make it nightly.
sayakpaul Jun 5, 2025
baa92c2
add db population script.
sayakpaul Jun 5, 2025
9e1f17f
update secret name
sayakpaul Jun 5, 2025
71200da
update secret.
sayakpaul Jun 5, 2025
1136f92
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 5, 2025
d10024c
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 6, 2025
e45e4eb
population db update
sayakpaul Jun 6, 2025
4a60155
disable db population for now.
sayakpaul Jun 6, 2025
8dd326f
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 6, 2025
e0ccb60
change to every monday
sayakpaul Jun 6, 2025
61dd029
Update .github/workflows/benchmark.yml
sayakpaul Jun 6, 2025
ee0fcd4
quality improvements.
sayakpaul Jun 6, 2025
e35ffe8
separate hub upload step.
sayakpaul Jun 6, 2025
d3c494a
repository
sayakpaul Jun 6, 2025
ce8d1ec
remove csv
sayakpaul Jun 6, 2025
fc69eb8
check
sayakpaul Jun 6, 2025
a43e8ef
update
sayakpaul Jun 6, 2025
2f5c8d0
update
sayakpaul Jun 6, 2025
1f7587e
threading.
sayakpaul Jun 6, 2025
7a935a4
update
sayakpaul Jun 6, 2025
a6c7359
update
sayakpaul Jun 6, 2025
1150cb0
update
sayakpaul Jun 6, 2025
6cc4707
update
sayakpaul Jun 6, 2025
f1ee631
update
sayakpaul Jun 6, 2025
73e07ba
update
sayakpaul Jun 6, 2025
2a65a89
remove peft dep
sayakpaul Jun 6, 2025
dc778b0
upgrade runner.
sayakpaul Jun 6, 2025
8ddf57c
fix
sayakpaul Jun 6, 2025
8161e36
fixes
sayakpaul Jun 6, 2025
807f511
fix merging csvs.
sayakpaul Jun 7, 2025
2dbdfe0
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 7, 2025
a09768f
push dataset to the Space repo for analysis.
sayakpaul Jun 7, 2025
1683c47
warm up.
sayakpaul Jun 8, 2025
d1fb620
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 10, 2025
858dc09
add a readme
sayakpaul Jun 10, 2025
6bfdae6
Apply suggestions from code review
sayakpaul Jun 10, 2025
6b11973
address feedback
sayakpaul Jun 10, 2025
ba7a89c
Apply suggestions from code review
sayakpaul Jun 10, 2025
f9285fd
disable db workflow.
sayakpaul Jun 10, 2025
9017a2c
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 11, 2025
f9d4345
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 12, 2025
5792608
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 14, 2025
3ae040c
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 18, 2025
d9950cd
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 23, 2025
9e235e8
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jun 28, 2025
aac27f0
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jul 2, 2025
6bf5b36
update to bi weekly.
sayakpaul Jul 2, 2025
736f22e
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jul 2, 2025
18c4361
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jul 2, 2025
4bdb865
enable population
sayakpaul Jul 3, 2025
ca81c6e
enable
sayakpaul Jul 3, 2025
1745e45
Merge branch 'main' into benchmarking-overhaul
sayakpaul Jul 3, 2025
26775e5
update
sayakpaul Jul 3, 2025
64331b2
update
sayakpaul Jul 3, 2025
01bd03e
metadata
sayakpaul Jul 3, 2025
a76a236
fix
sayakpaul Jul 4, 2025
41 changes: 31 additions & 10 deletions .github/workflows/benchmark.yml
@@ -11,17 +11,18 @@ env:
HF_HOME: /mnt/cache
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
BASE_PATH: benchmark_outputs

jobs:
torch_pipelines_cuda_benchmark_tests:
torch_models_cuda_benchmark_tests:
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_BENCHMARK }}
name: Torch Core Pipelines CUDA Benchmarking Tests
name: Torch Core Models CUDA Benchmarking Tests
strategy:
fail-fast: false
max-parallel: 1
runs-on:
group: aws-g6-4xlarge-plus
group: aws-g6e-4xlarge
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host --gpus 0
@@ -35,27 +36,47 @@ jobs:
nvidia-smi
- name: Install dependencies
run: |
apt update
apt install -y libpq-dev postgresql-client
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install pandas peft
python -m uv pip uninstall transformers && python -m uv pip install transformers==4.48.0
python -m uv pip install -r benchmarks/requirements.txt
- name: Environment
run: |
python utils/print_env.py
- name: Diffusers Benchmarking
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_BOT_TOKEN }}
BASE_PATH: benchmark_outputs
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
run: |
export TOTAL_GPU_MEMORY=$(python -c "import torch; print(torch.cuda.get_device_properties(0).total_memory / (1024**3))")
cd benchmarks && mkdir ${BASE_PATH} && python run_all.py && python push_results.py
cd benchmarks && python run_all.py

- name: Push results to the Hub
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_BOT_TOKEN }}
run: |
cd benchmarks && python push_results.py
mkdir $BASE_PATH && cp *.csv $BASE_PATH

- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: benchmark_test_reports
path: benchmarks/benchmark_outputs
path: benchmarks/${{ env.BASE_PATH }}

# TODO: enable this once the connection problem has been resolved.
- name: Update benchmarking results to DB
env:
PGDATABASE: metrics
PGHOST: ${{ secrets.DIFFUSERS_BENCHMARKS_PGHOST }}
PGUSER: transformers_benchmarks
PGPASSWORD: ${{ secrets.DIFFUSERS_BENCHMARKS_PGPASSWORD }}
BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
run: |
git config --global --add safe.directory /__w/diffusers/diffusers
commit_id=$GITHUB_SHA
commit_msg=$(git show -s --format=%s "$commit_id" | cut -c1-70)
cd benchmarks && python populate_into_db.py "$BRANCH_NAME" "$commit_id" "$commit_msg"

- name: Report success status
if: ${{ success() }}
69 changes: 69 additions & 0 deletions benchmarks/README.md
@@ -0,0 +1,69 @@
# Diffusers Benchmarks

Welcome to Diffusers Benchmarks. These benchmarks are used to obtain latency and memory information for the most popular models across different scenarios, such as:

* Base case i.e., when using `torch.bfloat16` and `torch.nn.functional.scaled_dot_product_attention`.
* Base + `torch.compile()`
* NF4 quantization
* Layerwise upcasting

Instead of full diffusion pipelines, only the forward pass of the respective model classes (such as `FluxTransformer2DModel`) is tested with the real checkpoints (such as `"black-forest-labs/FLUX.1-dev"`).
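
As an illustration, the snippet below is a minimal sketch of what the "base case" measures: a single `torch.bfloat16` forward pass of the model class (not the full pipeline) with the default `scaled_dot_product_attention` backend. The input shapes here are illustrative stand-ins for FLUX's packed latents; the actual benchmarks construct their inputs through the `get_input_dict` helpers in the individual `benchmarking_*.py` files.

```py
import torch
from diffusers import FluxTransformer2DModel

device, dtype = "cuda", torch.bfloat16

# Only the model class is loaded, not the full pipeline.
model = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="transformer", torch_dtype=dtype
).to(device)

# Illustrative dummy inputs approximating packed 1024x1024 FLUX latents.
inputs = {
    "hidden_states": torch.randn(1, 4096, 64, device=device, dtype=dtype),
    "encoder_hidden_states": torch.randn(1, 512, 4096, device=device, dtype=dtype),
    "pooled_projections": torch.randn(1, 768, device=device, dtype=dtype),
    "timestep": torch.tensor([1.0], device=device, dtype=dtype),
    "img_ids": torch.zeros(4096, 3, device=device, dtype=dtype),
    "txt_ids": torch.zeros(512, 3, device=device, dtype=dtype),
    "guidance": torch.tensor([3.5], device=device, dtype=dtype),
}

# The "Base + torch.compile()" scenario additionally wraps the model with torch.compile().
with torch.no_grad():
    _ = model(**inputs)
```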

The entrypoint for running all the currently available benchmarks is `run_all.py`. You can also run an individual benchmark, e.g., `python benchmarking_flux.py`. Each run produces a CSV file containing information about the benchmarks that were run.

The benchmarks are run on a weekly basis and the CI is defined in [benchmark.yml](../.github/workflows/benchmark.yml).

## Running the benchmarks manually

First, set up `torch` and install `diffusers` from the root of the repository:

```sh
pip install -e ".[quality,test]"
```

Then make sure the other dependencies are installed:

```sh
cd benchmarks/
pip install -r requirements.txt
```

We need to be authenticated to access some of the checkpoints used during benchmarking:

```sh
huggingface-cli login
```

The benchmark CI runs on an L40 GPU with 128GB of RAM, and the benchmarks are configured for NVIDIA GPUs. Make sure you have access to a similar machine (or modify the benchmarking scripts accordingly).

Then you can either launch the entire benchmarking suite by running:

```sh
python run_all.py
```

Or, you can run the individual benchmarks.
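
For example, to run just the Flux benchmark from inside `benchmarks/` (a minimal illustration, assuming the dependencies above are installed):

```sh
# Benchmarks only FluxTransformer2DModel and writes its results to a CSV file.
python benchmarking_flux.py
```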

## Customizing the benchmarks

We define "scenarios" to cover the most common ways in which these models are used. You can
define a new scenario, modifying an existing benchmark file:

```py
# Assumes this is added to an existing benchmark file (e.g., `benchmarking_flux.py`),
# which already defines `CKPT_ID`, `get_input_dict`, and `model_init_fn`, and imports
# `FluxTransformer2DModel`, `BitsAndBytesConfig`, `torch_device`, etc.
BenchmarkScenario(
    name=f"{CKPT_ID}-bnb-8bit",
    model_cls=FluxTransformer2DModel,
    model_init_kwargs={
        "pretrained_model_name_or_path": CKPT_ID,
        "torch_dtype": torch.bfloat16,
        "subfolder": "transformer",
        # Swap in a 4-bit NF4 config here to benchmark the NF4 scenario instead.
        "quantization_config": BitsAndBytesConfig(load_in_8bit=True),
    },
    get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
    model_init_fn=model_init_fn,
)
```

You can also add a new model-level benchmark to the existing suite. To do so, defining a valid benchmarking file along the lines of `benchmarking_flux.py` should be enough.

Happy benchmarking 🧨
Empty file added: benchmarks/__init__.py