GHA that adds Flash Attention Benchmarking on B200 #49

jduprat · 2025-07-17T19:50:28Z

The workflow is triggered on a schedule (every 2 hours for now), by hand, or by a remote trigger.
It checks out the FA repo and runs its benchmark.

.github/workflows/flash_attention.yml

huydhn · 2025-07-17T21:02:54Z

.github/workflows/flash_attention.yml

+  push:
+    paths:
+      - .github/workflows/flash_attention.yml
+  repository_dispatch:


Oh, this is new to me. What does this repository_dispatch do? :)

This allows triggering the action by running:
$ curl -L -X POST -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" -H "Authorization: Bearer $TOKEN" https://api.github.com/repos/pytorch/pytorch-integration-testing/dispatches -d '{"event_type": "benchmark_flash_attention"}'

The intent is to trigger this action whenever there is a commit in the FA repo, but the FA repo needs to be updated first.
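For anyone who would rather script the trigger than paste the curl command, here is a minimal Python sketch using only the standard library. The URL, headers, and event type are copied from the curl command above; the helper name `build_dispatch_request` is hypothetical, and the token is assumed to have permission to send dispatch events to the repository.

```python
import json
import urllib.request

# Same endpoint as the curl command above.
DISPATCH_URL = (
    "https://api.github.com/repos/pytorch/pytorch-integration-testing/dispatches"
)


def build_dispatch_request(
    token: str, event_type: str = "benchmark_flash_attention"
) -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl command issues.

    `build_dispatch_request` is a hypothetical helper, not part of this PR.
    """
    payload = json.dumps({"event_type": event_type}).encode("utf-8")
    return urllib.request.Request(
        DISPATCH_URL,
        data=payload,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "X-GitHub-Api-Version": "2022-11-28",
            "Authorization": f"Bearer {token}",
        },
    )


# Usage (sends a real request, so it is left commented out here):
#   import os
#   request = build_dispatch_request(os.environ["TOKEN"])
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```

Building the request separately from sending it keeps the helper easy to inspect or test without touching the network.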

Checkout the FA repo and run benchmark as part of this action

huydhn · 2025-07-18T16:18:04Z

.github/workflows/flash_attention.yml

+          echo '<h1>B200 1000W</h1>' >> $GITHUB_STEP_SUMMARY
+          nvidia-smi
+          export PYTHONPATH=$(pwd)
+          python benchmarks/benchmark_attn.py >> $GITHUB_STEP_SUMMARY


If you are interested in building a dashboard out of it on https://hud.pytorch.org later, feel free to take a look at https://github.com/pytorch/pytorch/wiki/PyTorch-OSS-benchmark-infra#benchmark-results-format to see if we could save the attention benchmark output here in the same format. That is all that needs to be done.
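As a rough illustration of what saving results in that format could look like, here is a hedged Python sketch. The `benchmark` / `model` / `metric` layout reflects my reading of the linked wiki page and should be checked against it before use. The helper names `make_record` and `write_records` are hypothetical, and the sketch assumes the benchmark numbers have already been parsed out of the script output, since the output format of `benchmarks/benchmark_attn.py` is not shown here.

```python
import json
from typing import Any, Iterable, Optional


def make_record(
    benchmark_name: str,
    model_name: str,
    metric_name: str,
    values: Iterable[float],
    dtype: Optional[str] = None,
    extra_info: Optional[dict] = None,
) -> dict:
    """Build one result record (hypothetical helper).

    Assumed layout: top-level "benchmark", "model", and "metric" objects,
    with the measured values stored as a list of floats.
    """
    benchmark: dict[str, Any] = {
        "name": benchmark_name,
        "extra_info": extra_info or {},
    }
    if dtype is not None:
        benchmark["dtype"] = dtype
    return {
        "benchmark": benchmark,
        "model": {"name": model_name},
        "metric": {
            "name": metric_name,
            # A list, because a benchmark is usually run more than once.
            "benchmark_values": [float(v) for v in values],
        },
    }


def write_records(records: list, path: str) -> None:
    """Write all records to a single JSON file as a list."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)
```

The workflow step could then write this file alongside the `$GITHUB_STEP_SUMMARY` output, once the benchmark numbers are available in a parseable form.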

jduprat self-assigned this Jul 17, 2025

meta-cla bot added the cla signed label Jul 17, 2025

jduprat requested review from seemethere and huydhn July 17, 2025 19:56

jduprat changed the title ~~This action adds Flash Attention Benchmarking on B200~~ GHA that adds Flash Attention Benchmarking on B200 Jul 17, 2025

seemethere reviewed Jul 17, 2025

huydhn reviewed Jul 17, 2025

huydhn reviewed Jul 17, 2025

jduprat force-pushed the flash-attention-benchmarking branch 2 times, most recently from e2a6508 to 4b6416c Compare July 17, 2025 20:44

huydhn reviewed Jul 17, 2025

jduprat force-pushed the flash-attention-benchmarking branch from 4b6416c to 7536545 Compare July 17, 2025 21:47

Flash Attention Benchmarking on B200

3b8c5bd

Checkout the FA repo and run benchmark as part of this action

jduprat force-pushed the flash-attention-benchmarking branch from 7536545 to 3b8c5bd Compare July 17, 2025 21:49

huydhn approved these changes Jul 18, 2025

huydhn marked this pull request as ready for review July 18, 2025 16:12

jduprat merged commit c5b4d30 into main Jul 18, 2025
3 checks passed

huydhn reviewed Jul 18, 2025