feat(vllm-tensorizer): Upgrade vLLM version and Resolve Related Build Compatibility Issues #98

Conversation
This commit updates the default `VLLM_COMMIT_HASH` used in the GitHub Actions workflow for the vllm-tensorizer image. This change points the build to a more recent commit of the vLLM project.
Commits:
- …ch 2.7.0 compatibility
- …Attention compilation
- …balancing OOM risk
- …A dev install; Set MAX_JOBS to 10
- …ision, and torchaudio versions
- …ion with PyTorch 2.7.1; uppercase all `as` keywords
- …orch 2.7.1 and CUDA 12.8.1
- …nd CUDA dev tools
- …uild flags for PyTorch compatibility; up `MAX_JOBS` to 16
- …1 variant; revert xformers constraint removal
How does the result of this build compare to the official vLLM image? Do we have all the same libraries, etc.? https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile
The CI here needs updates to build new versions without manually specifying the versions of the base images to use.
I would also like to rename this to vllm-openai to align with the vLLM project's upstream image.
Co-authored-by: Eta <24918963+Eta0@users.noreply.github.com>
@JustinPerlman I think the way to go here, to better gather the different workflow variables (vLLM version, base image version, etc.) in one place, is to move to a matrix build strategy, even if the matrix only has a single element. We can work on this together; there is a template I put together for doing this.
@arsenetar, about this:
What would you see as not manually specifying it? Automatically pulling the most recent image from
I would expect the CI to set the ARG for the base image; there are likely cases where we will want to build a matrix of versions, similar to the other torch images. Minimally, that is something that should be changed to set things up for the CI to define or pass in. With respect to upstream images breaking things and requiring updates, that is somewhat expected to happen. There is a difference between development builds and builds that should be considered validated releases, and this repo's package structure does not really help delineate that difference.
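A matrix build along the lines discussed here might look roughly like the following sketch. The job layout, input names, base-image tag, and commit value are all hypothetical placeholders, not this repo's actual workflow:

```yaml
# Hypothetical single-element matrix for the vllm-tensorizer build.
# Adding another entry under `include` would build a second version combination.
jobs:
  build:
    strategy:
      matrix:
        include:
          - base-image: ghcr.io/coreweave/ml-containers/torch-extras:example-tag  # placeholder tag
            vllm-commit: main  # placeholder ref; normally a pinned commit hash
    uses: ./.github/workflows/build.yml  # assumed reusable workflow
    with:
      image-name: vllm-tensorizer
      build-args: |
        BASE_IMAGE=${{ matrix.base-image }}
        VLLM_COMMIT=${{ matrix.vllm-commit }}
```

This keeps all the version knobs in one `matrix` block, so the CI defines the base image and vLLM commit rather than the Dockerfile hard-coding them.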
Co-authored-by: Eta <esyra@coreweave.com>
Co-authored-by: Eta <esyra@coreweave.com>
@JustinPerlman Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/15714183890
Co-authored-by: Eta <esyra@coreweave.com>
This commit additionally renames `VLLM_COMMIT_HASH` to `VLLM_COMMIT`, makes git clones more efficient, and reformats some YAML lists.
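One common way to make such clones cheaper is to fetch only the pinned commit instead of the repository's full history. A hedged Dockerfile sketch (the paths are illustrative, and the commit value is supplied by the workflow, not shown here):

```dockerfile
# Hypothetical fragment: shallow-fetch only the pinned vLLM commit
# rather than cloning the full history of the repository.
ARG VLLM_COMMIT
RUN git init /workspace/vllm && \
    cd /workspace/vllm && \
    git remote add origin https://github.com/vllm-project/vllm.git && \
    git fetch --depth=1 origin "${VLLM_COMMIT}" && \
    git checkout --detach FETCH_HEAD
```

Fetching with `--depth=1` on a single ref downloads only the objects reachable from that commit, which is substantially less data than a full clone.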
As far as I can tell, the only libraries we don't install that they do are
Out of all of that, I could only think of ffmpeg as being of value, if it's used by libraries for multimodal support.
This would create a new package listing alongside
@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/15718383514 |
Tested; LGTM.
This PR focuses on updating the vLLM version used within the `vllm-tensorizer` Docker image and resolving the various bugs and compatibility issues that arose from this upgrade.

Key Changes:
- Updated the base image to `ghcr.io/coreweave/ml-containers/torch-extras:es-compute-12.0-67208ca-nccl-cuda12.9.0-ubuntu22.04-nccl2.27.3-1-torch2.7.1-vision0.22.1-audio2.7.1-abi1`. This moves from `torch:base` to `torch:nccl` for CUDA development tools and aligns with PyTorch 2.7.1 and CUDA 12.9.0.
- Fixed `FileNotFoundError: .../nvcc` and `CUDA::nvToolsExt not found` errors by ensuring the correct `torch:nccl` base image is used.
- Resolved `PyTorch version expected` errors by integrating `use_existing_torch.py` to handle vLLM's PyTorch version expectations.
- Set `MAX_JOBS=16`.
- Improved `Dockerfile` clarity by consistently uppercasing `AS` keywords.
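The `use_existing_torch.py` integration and `MAX_JOBS` setting described above could be combined in a Dockerfile step along these lines. The working directory and install flags are assumptions for illustration, not the exact contents of this repo's Dockerfile:

```dockerfile
# Hypothetical fragment: strip vLLM's pinned torch requirements so the
# build reuses the PyTorch 2.7.1 already present in the torch:nccl base
# image, then build with a bounded number of parallel compile jobs.
ARG MAX_JOBS=16
RUN cd /workspace/vllm && \
    python use_existing_torch.py && \
    MAX_JOBS=${MAX_JOBS} pip install --no-build-isolation -e .
```

`use_existing_torch.py` rewrites vLLM's requirements so the build does not try to reinstall a different PyTorch, and `--no-build-isolation` keeps the compile step pointed at the torch already in the image; capping `MAX_JOBS` trades build speed against the OOM risk mentioned in the commit history.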