GHA that adds Flash Attention Benchmarking on B200 #49


Merged
merged 1 commit into main from flash-attention-benchmarking on Jul 18, 2025

Conversation

jduprat (Contributor) commented Jul 17, 2025

Triggered on a schedule (every 2 hours for now), by hand, or by remote trigger.
Check out the FA repo and run its benchmark.
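
A minimal sketch of what the trigger configuration could look like (the cron expression and exact layout are assumptions; the repository_dispatch event type and path filter match the discussion below, and this is not the merged file verbatim):

    on:
      schedule:
        - cron: '0 */2 * * *'      # every 2 hours (assumed cron expression)
      workflow_dispatch:            # manual trigger from the Actions tab
      repository_dispatch:
        types: [benchmark_flash_attention]   # remote trigger, see the curl command below
      push:
        paths:
          - .github/workflows/flash_attention.yml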

@jduprat jduprat self-assigned this Jul 17, 2025
@meta-cla meta-cla bot added the cla signed label Jul 17, 2025
@jduprat jduprat requested review from seemethere and huydhn July 17, 2025 19:56
@jduprat jduprat changed the title This action adds Flash Attention Benchmarking on B200 GHA that adds Flash Attention Benchmarking on B200 Jul 17, 2025
@jduprat jduprat force-pushed the flash-attention-benchmarking branch 2 times, most recently from e2a6508 to 4b6416c Compare July 17, 2025 20:44
push:
  paths:
    - .github/workflows/flash_attention.yml
repository_dispatch:
huydhn (Contributor) commented Jul 17, 2025
Oh, this is new to me. What does this repository_dispatch do? :)

jduprat (Contributor, Author) replied:
This allows triggering the action by running:

$ curl -L -X POST \
    -H "Accept: application/vnd.github+json" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    -H "Authorization: Bearer $TOKEN" \
    https://api.github.com/repos/pytorch/pytorch-integration-testing/dispatches \
    -d '{"event_type": "benchmark_flash_attention"}'

The intent is to trigger this action whenever there is a commit in the FA repo, but the FA repo needs to be updated first.
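
For reference, the FA-side change could be as small as a workflow that replays the same API call on push. A sketch, assuming a workflow name, branch filter, and secret name that are purely illustrative (the endpoint and event type come from the curl command above):

    # Hypothetical workflow in the flash-attention repo; names and secret are illustrative.
    name: Notify pytorch-integration-testing
    on:
      push:
        branches: [main]
    jobs:
      dispatch:
        runs-on: ubuntu-latest
        steps:
          - name: Send repository_dispatch
            run: |
              curl -L -X POST \
                -H "Accept: application/vnd.github+json" \
                -H "X-GitHub-Api-Version: 2022-11-28" \
                -H "Authorization: Bearer ${{ secrets.DISPATCH_TOKEN }}" \
                https://api.github.com/repos/pytorch/pytorch-integration-testing/dispatches \
                -d '{"event_type": "benchmark_flash_attention"}'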

@jduprat jduprat force-pushed the flash-attention-benchmarking branch from 4b6416c to 7536545 Compare July 17, 2025 21:47
Checkout the FA repo and run benchmark as part of this action
@jduprat jduprat force-pushed the flash-attention-benchmarking branch from 7536545 to 3b8c5bd Compare July 17, 2025 21:49
@huydhn huydhn marked this pull request as ready for review July 18, 2025 16:12
@jduprat jduprat merged commit c5b4d30 into main Jul 18, 2025
3 checks passed
# Header for the GitHub Actions job summary page
echo '<h1>B200 1000W</h1>' >> $GITHUB_STEP_SUMMARY
# Log GPU details for the B200 runner
nvidia-smi
# Make the checked-out flash-attention repo importable
export PYTHONPATH=$(pwd)
# Run the attention benchmark and append its output to the job summary
python benchmarks/benchmark_attn.py >> $GITHUB_STEP_SUMMARY
A reviewer (Contributor) commented:
If you are interested in building a dashboard out of it on https://hud.pytorch.org later, feel free to take a look at https://github.com/pytorch/pytorch/wiki/PyTorch-OSS-benchmark-infra#benchmark-results-format to see if we could save the attention benchmark output in the same format. That's all that needs to be done.

3 participants