Setup vLLM benchmark CI for H100 #32

Merged
merged 41 commits into from
Jun 2, 2025

@huydhn huydhn commented May 29, 2025

The new workflow runs on a schedule every 2 hours, or on demand for a specified commit from the vLLM main branch. It works as follows:

  1. When scheduled, the workflow checks the latest commits on the vLLM main branch chronologically until it finds the latest commit whose vLLM CI Docker image has already been built and which has not been benchmarked yet. The Docker image name is public.ecr.aws/q9t5s3a7/vllm-ci-postmerge-repo:<SHA>
  2. When run on demand, it only checks for the requested Docker image and returns early if that image doesn't exist yet
  3. The workflow uses the official benchmark script from vLLM at https://github.com/vllm-project/vllm/blob/main/.buildkite/nightly-benchmarks/scripts/run-performance-benchmarks.sh
  4. Instead of using the list of models from vLLM, we use the ones from https://github.com/pytorch/pytorch-integration-testing/tree/main/vllm-benchmarks/benchmarks so that we can control exactly what to benchmark
  5. A 4xH100 runner currently takes 45 minutes to finish all benchmarks
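
The selection logic in steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration and not the workflow's actual code: `pick_commit`, `image_name`, `image_exists`, and `already_benchmarked` are hypothetical names, the two checks are passed in as callables because the description does not say how they are performed, and the commit list is assumed to be ordered newest first.

```python
from typing import Callable, Iterable, Optional

# Image repository named in the PR description; the tag is the commit SHA.
IMAGE_REPO = "public.ecr.aws/q9t5s3a7/vllm-ci-postmerge-repo"


def image_name(sha: str) -> str:
    """Build the CI Docker image name for a vLLM commit."""
    return f"{IMAGE_REPO}:{sha}"


def pick_commit(
    commits: Iterable[str],
    image_exists: Callable[[str], bool],
    already_benchmarked: Callable[[str], bool],
    requested: Optional[str] = None,
) -> Optional[str]:
    """Return the commit to benchmark, or None if there is nothing to do.

    Both callables are assumptions standing in for the real checks
    (for example a registry lookup and a results-database query).
    """
    if requested is not None:
        # On demand: only the requested commit is considered, and the
        # caller should return early when its image is not built yet.
        return requested if image_exists(image_name(requested)) else None

    # Scheduled: walk commits newest first and take the first one that
    # has a built image and no benchmark results yet.
    for sha in commits:
        if image_exists(image_name(sha)) and not already_benchmarked(sha):
            return sha
    return None


# Made-up SHAs for illustration, newest first.
commits = ["aaa", "bbb", "ccc"]
built = {image_name("bbb"), image_name("ccc")}
done = {"ccc"}

scheduled = pick_commit(commits, built.__contains__, done.__contains__)
on_demand = pick_commit(commits, built.__contains__, done.__contains__, requested="aaa")
```

In this toy run, `scheduled` is `"bbb"` (the newest commit with a built image and no results), and `on_demand` is `None` because the image for `"aaa"` has not been built.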

Some more PRs are coming after this:

Testing

The results are showing up on the dashboard now https://hud.pytorch.org/benchmark/llms?startTime=Fri%2C%2023%20May%202025%2019%3A19%3A35%20GMT&stopTime=Fri%2C%2030%20May%202025%2019%3A19%3A35%20GMT&granularity=day&lBranch=main&lCommit=7f21e8052b5f3948c8a59514a8dc1e9c5eef70d6&rBranch=main&rCommit=7f21e8052b5f3948c8a59514a8dc1e9c5eef70d6&repoName=vllm-project%2Fvllm&benchmarkName=&modelName=All%20Models&backendName=All%20Backends&modeName=All%20Modes&dtypeName=All%20DType&deviceName=All%20Devices&archName=All%20Platforms
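
To see what the dashboard link above selects, its query string can be decoded with the Python standard library. This is only an illustrative sketch: the parameter names come from the URL itself, and only a subset of the parameters is reproduced here.

```python
from email.utils import parsedate_to_datetime
from urllib.parse import parse_qs

# Subset of the query string from the dashboard URL in the PR description.
QUERY = (
    "startTime=Fri%2C%2023%20May%202025%2019%3A19%3A35%20GMT"
    "&stopTime=Fri%2C%2030%20May%202025%2019%3A19%3A35%20GMT"
    "&granularity=day&lBranch=main"
    "&lCommit=7f21e8052b5f3948c8a59514a8dc1e9c5eef70d6"
    "&repoName=vllm-project%2Fvllm"
)

# parse_qs percent-decodes values and returns a list per key; keep the first.
params = {key: values[0] for key, values in parse_qs(QUERY).items()}

# The timestamps use the RFC 2822 style date format.
start = parsedate_to_datetime(params["startTime"])
stop = parsedate_to_datetime(params["stopTime"])
window = stop - start

print(params["repoName"], params["lCommit"][:7], f"{window.days}-day window")
```

The decoded parameters show a 7-day window on the `main` branch of `vllm-project/vllm`, pinned to commit `7f21e80`.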

Signed-off-by: Huy Do <huydhn@gmail.com>
@huydhn huydhn temporarily deployed to pytorch-x-vllm May 29, 2025 01:29 — with GitHub Actions Inactive
@huydhn huydhn temporarily deployed to pytorch-x-vllm May 29, 2025 01:56 — with GitHub Actions Inactive
@huydhn huydhn temporarily deployed to pytorch-x-vllm May 29, 2025 08:11 — with GitHub Actions Inactive
huydhn added 4 commits May 29, 2025 13:21
@huydhn huydhn temporarily deployed to pytorch-x-vllm May 30, 2025 09:20 — with GitHub Actions Inactive
@huydhn huydhn marked this pull request as ready for review May 30, 2025 18:01
@huydhn huydhn temporarily deployed to pytorch-x-vllm May 30, 2025 18:40 — with GitHub Actions Inactive
@huydhn huydhn temporarily deployed to pytorch-x-vllm May 30, 2025 19:24 — with GitHub Actions Inactive
@huydhn huydhn requested review from yangw-dev, seemethere and malfet May 30, 2025 19:24
@yangw-dev yangw-dev left a comment

LGTM!

jobs:
  benchmark-h100:
    name: Run vLLM benchmarks
    runs-on: linux.aws.h100.4
@yangw-dev yangw-dev Jun 2, 2025

For my own knowledge, does this mean an instance with 4 H100s?

How many of those do we have now?

Contributor Author

We have 4 of them atm. Also, FYI, there is one 8xH100 runner too.

@huydhn huydhn merged commit 4a7fc56 into main Jun 2, 2025
2 checks passed