
Commit efe73d0

[doc] update doc format (#20673)
Signed-off-by: reidliu41 <reid201711@gmail.com>
1 parent 853487b commit efe73d0


docs/contributing/ci/update_pytorch_version.md

Lines changed: 51 additions & 27 deletions
@@ -16,46 +16,67 @@ by waiting for the next release or by implementing hacky workarounds in vLLM.
The better solution is to test vLLM with PyTorch release candidates (RC) to ensure
compatibility before each release.

PyTorch release candidates can be downloaded from the [PyTorch test index](https://download.pytorch.org/whl/test).
For example, the `torch2.7.0+cu128` RC can be installed using the following command:

```bash
uv pip install torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/test/cu128
```

When the final RC is ready for testing, it will be announced to the community
on the [PyTorch dev-discuss forum](https://dev-discuss.pytorch.org/c/release-announcements).
After this announcement, we can begin testing vLLM integration by drafting a pull request
following this 3-step process:

1. Update the [requirements files](https://github.com/vllm-project/vllm/tree/main/requirements)
    to point to the new releases for `torch`, `torchvision`, and `torchaudio`.

2. Use the following option to get the final release candidates' wheels. Some common platforms are
    `cpu`, `cu128`, and `rocm6.2.4` (a combined example with step 3 follows this list).

    ```bash
    --extra-index-url https://download.pytorch.org/whl/test/<PLATFORM>
    ```

3. Since vLLM uses `uv`, ensure the `unsafe-best-match` index strategy is applied, either:

    - via the environment variable:

        ```bash
        export UV_INDEX_STRATEGY=unsafe-best-match
        ```

    - or via the CLI flag:

        ```bash
        --index-strategy unsafe-best-match
        ```
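
Putting steps 2 and 3 together, a local test install might look like the sketch below. This is illustrative only: `requirements/cuda.txt` and the `cu128` platform are assumptions, and the actual pins come from the updated requirements files.

```bash
# Sketch: resolve the updated requirements against the PyTorch test index, roughly the way CI would.
export UV_INDEX_STRATEGY=unsafe-best-match   # allow mixing PyPI with the PyTorch test index
uv pip install -r requirements/cuda.txt \
    --extra-index-url https://download.pytorch.org/whl/test/cu128
```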

If failures are found in the pull request, raise them as issues on vLLM and
cc the PyTorch release team to initiate discussion on how to address them.

## Update CUDA version

The PyTorch release matrix includes both stable and experimental [CUDA versions](https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix). Due to limitations, only the latest stable CUDA version (for example,
`torch2.7.0+cu126`) is uploaded to PyPI. However, vLLM may require a different CUDA version,
such as 12.8 for Blackwell support.
This complicates the process as we cannot use the out-of-the-box
`pip install torch torchvision torchaudio` command. The solution is to use
`--extra-index-url` in vLLM's Dockerfiles.
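
For illustration, installing a CUDA 12.8 build from the matching index looks roughly like the sketch below (the real Dockerfiles pin exact versions from the requirements files):

```bash
# Sketch: pull torch wheels built against CUDA 12.8 instead of the default PyPI build.
uv pip install torch torchvision torchaudio \
    --extra-index-url https://download.pytorch.org/whl/cu128
```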

- Important indexes at the moment include:

    | Platform  | `--extra-index-url` |
    |-----------|---------------------|
    | CUDA 12.8 | [https://download.pytorch.org/whl/cu128](https://download.pytorch.org/whl/cu128) |
    | CPU       | [https://download.pytorch.org/whl/cpu](https://download.pytorch.org/whl/cpu) |
    | ROCm 6.2  | [https://download.pytorch.org/whl/rocm6.2.4](https://download.pytorch.org/whl/rocm6.2.4) |
    | ROCm 6.3  | [https://download.pytorch.org/whl/rocm6.3](https://download.pytorch.org/whl/rocm6.3) |
    | XPU       | [https://download.pytorch.org/whl/xpu](https://download.pytorch.org/whl/xpu) |

- Update the files below to match the CUDA version chosen above (for example, `cu128`); a search sketch for finding the relevant lines follows this list. This makes sure that the released vLLM wheel is tested on CI.
    - `.buildkite/release-pipeline.yaml`
    - `.buildkite/scripts/upload-wheels.sh`
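
One quick way to locate those lines is to search for the previous CUDA tag (a hypothetical helper command, not part of the documented workflow):

```bash
# Sketch: find hard-coded CUDA version strings that need bumping (e.g. cu126 -> cu128).
grep -nE "cu126|12\.6" .buildkite/release-pipeline.yaml .buildkite/scripts/upload-wheels.sh
```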

## Address long vLLM build time

@@ -66,7 +87,7 @@ it doesn't populate the cache, so re-running it to warm up the cache
is ineffective.

While ongoing efforts like [#17419](gh-issue:17419)
address the long build time at its source, the current workaround is to set `VLLM_CI_BRANCH`
to a custom branch provided by @khluu (`VLLM_CI_BRANCH=khluu/use_postmerge_q`)
when manually triggering a build on Buildkite. This branch accomplishes two things:
@@ -86,31 +107,34 @@ releases (which would take too much time), they can be built from
source to unblock the update process.

### FlashInfer

Here is how to build and install it from source with `torch2.7.0+cu128` in vLLM [Dockerfile](https://github.com/vllm-project/vllm/blob/27bebcd89792d5c4b08af7a65095759526f2f9e1/docker/Dockerfile#L259-L271):

```bash
export TORCH_CUDA_ARCH_LIST='7.5 8.0 8.9 9.0 10.0+PTX'
export FLASHINFER_ENABLE_SM90=1
uv pip install --system \
    --no-build-isolation "git+https://github.com/flashinfer-ai/flashinfer@v0.2.6.post1"
```

One caveat is that building FlashInfer from source adds approximately 30
minutes to the vLLM build time. Therefore, it's preferable to cache the wheel in a
public location for immediate installation, such as [this FlashInfer wheel link](https://download.pytorch.org/whl/cu128/flashinfer/flashinfer_python-0.2.6.post1%2Bcu128torch2.7-cp39-abi3-linux_x86_64.whl). For future releases, contact the PyTorch release
team if you want to get the package published there.
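
When such a cached wheel exists, it can be installed directly instead of being rebuilt (a sketch using the wheel URL above):

```bash
# Sketch: install the pre-built FlashInfer wheel; no 30-minute source build required.
uv pip install --system \
    "https://download.pytorch.org/whl/cu128/flashinfer/flashinfer_python-0.2.6.post1%2Bcu128torch2.7-cp39-abi3-linux_x86_64.whl"
```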

### xFormers

Similar to FlashInfer, here is how to build and install xFormers from source:

```bash
export TORCH_CUDA_ARCH_LIST='7.0 7.5 8.0 8.9 9.0 10.0+PTX'
MAX_JOBS=16 uv pip install --system \
    --no-build-isolation "git+https://github.com/facebookresearch/xformers@v0.0.30"
```

### Mamba

```bash
uv pip install --system \
    --no-build-isolation "git+https://github.com/state-spaces/mamba@v2.2.4"
```

### causal-conv1d
@@ -125,6 +149,6 @@ Rather than attempting to update all vLLM platforms in a single pull request, it
to handle some platforms separately. The separation of requirements and Dockerfiles
for different platforms in vLLM CI/CD allows us to selectively choose
which platforms to update. For instance, updating XPU requires the corresponding
release from [Intel Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch) by Intel.
While <gh-pr:16859> updated vLLM to PyTorch 2.7.0 on CPU, CUDA, and ROCm,
<gh-pr:17444> completed the update for XPU.
