feat(vllm-tensorizer): Upgrade vLLM version and Resolve Related Build Compatibility Issues #98
Merged
+100 −54
Commits (42)
56eed9d (JustinPerlman) ci(vllm-tensorizer): Update vLLM source commit in build pipeline
1b8b7bb (JustinPerlman) build(vllm-tensorizer): Update `torch-extras` base image
face617 (JustinPerlman) chore: Add .idea/ to .gitignore
0ca8228 (JustinPerlman) fix(vllm-tensorizer): Remove redundant CUDA dev package installation
1512fdf (JustinPerlman) fix(vllm-tensorizer): install setuptools_scm and cmake for vLLM build
1ccf357 (JustinPerlman) fix(vllm-tensorizer): update triton version to 2.58.0
b32bae5 (JustinPerlman) fix(vllm-tensorizer): update triton version to 3.3.1
86181c3 (JustinPerlman) fix(vllm-tensorizer): update triton version to 2.3.1
3228eb7 (JustinPerlman) fix(vllm-tensorizer): remove explicit triton versioning
5c52d8e (JustinPerlman) feat(vllm-tensorizer): implement custom triton build and install for …
3a78a56 (JustinPerlman) fix(vllm-tensorizer): reorder build stages to resolve circular depend…
a7e3e19 (JustinPerlman) fix(vllm-tensorizer): Remove accidental backslashes
6dcafb5 (JustinPerlman) feat(vllm-tensorizer): Add MAX_JOBS unset logic; remove custom triton…
9a1a9c1 (JustinPerlman) fix(vllm-builder): Configure CUDA environment variables for vLLM comp…
af19873 (JustinPerlman) fix(vllm-tensorizer): Update vLLM commit to a newer version for PyTor…
93db31b (JustinPerlman) fix(vllm-tensorizer): Install missing `regex` module for vLLM build m…
f865d51 (JustinPerlman) feat(vllm-tensorizer): Switch to upstream vLLM for PyTorch 2.7.0 comp…
ea0074b (JustinPerlman) feat(vllm-tensorizer): Downgrade vLLM to v0.9.0 for PyTorch 2.7.0 com…
7631031 (JustinPerlman) fix(vllm-tensorizer): Apply CMake patch for nvToolsExt linking issue
c0b2d0c (JustinPerlman) fix(vllm-builder): Simplify find_library call in nvToolsExt CMake patch
17d917b (JustinPerlman) fix(vllm-tensorizer): Add missing `)`
3bf996b (JustinPerlman) fix(vllm-tensorizer): Remove Cmake patch
ac48891 (JustinPerlman) fix(vllm-tensorizer): Update base image to CUDA 12.8.1 to resolve bui…
ef1ebfc (JustinPerlman) fix(vllm-tensorizer): Set `MAX_JOBS` to 2 to prevent OOM during Flash…
5bf13cc (JustinPerlman) fix(vllm-tensorizer): Increase MAX_JOBS to 8 for faster compilation, …
c0f6a04 (JustinPerlman) fix(vllm-tensorizer): Remove xformers constraint to resolve vLLM depe…
a932752 (JustinPerlman) fix(vllm-tensorizer): Remove `fschat` installation to resolve `pydant…
b872b3e (JustinPerlman) feat(vllm-tensorizer): Upgrade to PyTorch 2.7.1; Remove commented CUD…
c645bd5 (JustinPerlman) fix(vllm-tensorizer): Correct base image tag to align PyTorch, torchv…
9f0eaf6 (JustinPerlman) fix(vllm-tensorizer): Correct base image tag to align torchaudio vers…
ae288f5 (JustinPerlman) fix(vllm-tensorizer): Use correct and existing base image tag for PyT…
a1a8a22 (JustinPerlman) fix(vllm-tensorizer): Use correct and existing base image tag for PyT…
aea4d46 (JustinPerlman) fix(vllm-tensorizer): Use 'nccl' compute base image to provide nvcc a…
ced54a1 (JustinPerlman) feat(vllm-tensorizer): Add use_existing_torch.py helper and related b…
9effc1b (JustinPerlman) feat(vllm-tensorizer): Update base image to CUDA 12.9.0, PyTorch 2.7.…
1057644 (JustinPerlman) fix(vllm-tensorizer): Undelete `FROM scratch AS freezer`
092946c (JustinPerlman) fix(vllm-tensorizer): Remove leftover `TRITON_COMMIT`
1b60e61 (JustinPerlman) fix(vllm-tensorizer): Improve Dockerfile ARG passing
26479bb (JustinPerlman) style(vllm-tensorizer): Rename build stage
ee71e13 (JustinPerlman) feat(vllm-tensorizer): Install OpenAI-compatible server dependencies
6ec3cc6 (Eta0) feat(vllm-tensorizer): Add `flashinfer` build, plus misc. minor changes
2193567 (Eta0) fix(vllm-tensorizer): Use POSIX `sh`-safe string substitution
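Several of the commits above circle around the same few Dockerfile mechanics: passing the vLLM revision and `MAX_JOBS` in as build args, capping `MAX_JOBS` so the flash-attention compile does not exhaust builder memory, installing the build-time Python dependencies (`setuptools_scm`, `cmake`, `regex`), reusing the already-installed PyTorch via vLLM's `use_existing_torch.py` helper, and keeping `RUN` lines POSIX `sh`-safe. A minimal, hypothetical sketch of those patterns follows; it is not the Dockerfile touched by this PR, and the base image tag, vLLM revision, and job count are placeholders.

```dockerfile
# Hypothetical sketch of the build patterns referenced in the commits above.
# NOT the Dockerfile changed in this PR; the base image tag, vLLM revision,
# and job count are placeholders.

# Placeholder base, assumed to ship CUDA, Python, and a matching PyTorch 2.7.x.
ARG BASE_IMAGE=torch-extras-base:placeholder
FROM ${BASE_IMAGE} AS vllm-builder

# Source revision and compile parallelism come in as build args so CI can override them.
ARG VLLM_COMMIT=v0.9.0
ARG MAX_JOBS=8
# PyTorch extension builds (vLLM, flash-attention) read MAX_JOBS to cap parallel
# compile jobs; lowering it trades build speed for lower peak memory use.
ENV MAX_JOBS=${MAX_JOBS}

# Build-time Python dependencies called out in the commit messages.
RUN python3 -m pip install --no-cache-dir setuptools_scm cmake regex

# RUN lines execute under /bin/sh (dash on Debian-based images), so only POSIX
# parameter expansions are safe, e.g. "${CUDA_VERSION%.*}" to drop the patch
# component, rather than bash-only forms such as "${CUDA_VERSION//./}".
RUN git clone https://github.com/vllm-project/vllm.git /src/vllm \
    && cd /src/vllm \
    && git checkout "${VLLM_COMMIT}" \
    && python3 use_existing_torch.py \
    && python3 -m pip install -v --no-build-isolation .
```

Running `use_existing_torch.py` and then installing with `--no-build-isolation` is vLLM's documented route for building against a PyTorch that is already present in the image, which appears to be why so many of the commits above adjust the base image tag to keep the CUDA and PyTorch versions aligned.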
.gitignore
@@ -162,3 +162,6 @@ flycheck_*.el
 .env*
 .environment
 .environment*
+
+# JetBrains Idea files
+.idea/