[docs] Add install from PyPI to docs #327

Merged: 20 commits, Jul 23, 2025
2 changes: 1 addition & 1 deletion .github/workflows/lint_docs.yml
@@ -29,4 +29,4 @@ jobs:
- name: Install dependencies
run: uv sync --frozen --only-group lint
- name: Lint docs
run: tools/lint_docs.sh
Collaborator (Author) comment: `tools/lint_docs.sh` runs both `scan` and `fix` -- but during a CI run, changing files to fix lint errors is not needed.

run: pymarkdownlnt scan docs -r
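
For local development, the two steps can still be run separately; a rough sketch, assuming `pymarkdownlnt`'s `fix` subcommand accepts the same recursive `-r` flag that `scan` does:

```sh
# Report lint errors without modifying files (all that CI needs):
pymarkdownlnt scan docs -r

# Rewrite files in place to repair auto-fixable errors (local use only):
pymarkdownlnt fix docs -r
```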
4 changes: 4 additions & 0 deletions docs/contributing/continuous_batching/overview.md
@@ -10,13 +10,15 @@ Brief overview of what has been implemented so far in VLLM to test / debug conti
* **Purpose:** Debugging (i.e. using manual execution)

### Description

* Runs inference on a set of prompts with continuous batching enabled (the number of prompts is parametrizable)
* Prints the generated text for each sequence.
* All the requested sequences are defined at the beginning; no requests join the waiting queue while another request is already being decoded.
* The exact sequence of prefill and decode steps depends on the values of the parameters `max_num_seqs`, `num-prompts`, and `max-tokens`.
* If `--compare-with-CPU` is set, the output text is compared to the Hugging Face output generated on CPU (see the example invocation below). Note that only the tokens are compared here, not the logprobs.
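
As a concrete illustration, a hypothetical invocation (the script path, model name, and parameter values are assumptions for illustration, not taken from this PR):

```sh
python examples/cb_spyre_inference.py \
    --model ibm-granite/granite-3.3-8b-instruct \
    --num-prompts 8 \
    --max-tokens 20 \
    --compare-with-CPU
```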

### Parametrization

For `cb_spyre_inference.py`

* `--model`: the model
@@ -49,6 +51,7 @@ For `long_context.py`: the same parameters, but with some differences:
* Other Tests: various files including `vllm-spyre/tests/e2e/test_spyre_cb.py`

<!-- markdownlint-disable MD031 MD046 -->

### Usage (when running locally)

#### Commands
@@ -65,6 +68,7 @@ For `long_context.py`: the same parameters, but with some differences:
<!-- markdownlint-enable MD031 MD046 -->

#### Parameters description

* `-x` option: stops the execution as soon as a test fails
* `-s` option: shows all the print statements in the code
* `-v` option: verbose mode; makes the test output more detailed, showing the name of each test function and whether it passed, failed, or was skipped (all three options are combined in the example below)
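
For example, running the continuous batching end-to-end tests with all three options enabled (test path taken from the file list above, relative to the repo root):

```sh
python -m pytest -x -s -v tests/e2e/test_spyre_cb.py
```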
220 changes: 202 additions & 18 deletions docs/getting_started/installation.md
@@ -5,43 +5,227 @@ installation of the plugin and its dependencies. `uv` provides advanced
dependency resolution which is required to properly install dependencies like
`vllm` without overwriting critical dependencies like `torch`.

## Install `uv`

You can [install `uv`](https://docs.astral.sh/uv/guides/install-python/) using `pip`:

```sh
pip install uv
```
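
To confirm `uv` is installed and on your `PATH`:

```sh
uv --version
```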

## Create a Python Virtual Environment

Now create and activate a new Python (3.12) [virtual environment](https://docs.astral.sh/uv/pip/environments/):

```sh
uv venv --python 3.12 --seed .venv --system-site-packages
source .venv/bin/activate
```
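
To verify that the environment is active, `python` should now resolve inside `.venv`:

```sh
which python     # should print a path ending in .venv/bin/python
python --version # should report Python 3.12.x
```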

??? question "Why do I want the `--system-site-packages`?"
Because the full `torch_sendnn` stack is only available pre-installed in a
base environment, we need to add the `--system-site-packages` flag to the new
virtual environment in order to fully support the Spyre hardware.

**Note:** pulling in the system site packages is not required for CPU-only
installations.

## Install vLLM with the vLLM-Spyre Plugin

You can either install a released version of the vLLM-Spyre plugin directly from
[PyPI](https://pypi.org/project/vllm-spyre/) or you can install from source by
cloning the [vLLM-Spyre](https://github.com/vllm-project/vllm-spyre) repo from
GitHub.

=== "Release (PyPI)"

```sh
echo "torch; sys_platform == 'never'
torchaudio; sys_platform == 'never'
torchvision; sys_platform == 'never'
triton; sys_platform == 'never'" > overrides.txt

uv pip install vllm-spyre --overrides overrides.txt
```

??? question "Why do I need the `--overrides`?"
To avoid dependency resolution errors, we need to install `torch`
separately and tell `uv` to ignore any of its dependencies while
installing the `vllm-spyre` plugin.
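
To double-check what actually got installed (assuming a POSIX shell):

```sh
uv pip list | grep -iE 'vllm|torch'
```

`vllm` and `vllm-spyre` should be listed; `torch` comes from the base environment or the separate install step below.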

=== "Source (GitHub)"

First, clone the `vllm-spyre` repo:

```sh
git clone https://github.com/vllm-project/vllm-spyre.git
cd vllm-spyre
```

To install `vllm-spyre` locally with development dependencies, use the following command:

```sh
uv sync --frozen --active --inexact
```

To include optional linting dependencies, include `--group lint`:

```sh
uv sync --frozen --active --inexact --group lint
```

!!! tip
The `dev` group (i.e. `--group dev`) is enabled by default.

## Install PyTorch

Finally, `torch` is needed to run examples and tests. If it is not already installed,
install it using `pip`.

=== "Linux"

```sh
pip install torch=="2.7.1+cpu" --index-url "https://download.pytorch.org/whl/cpu"
```

=== "Windows/macOS"

```sh
pip install torch=="2.7.1"
```

!!! note
On Linux, the `+cpu` package should be installed, since none of the CUDA
dependencies included by default in Linux installs are needed. This is why
`--index-url https://download.pytorch.org/whl/cpu` is required on Linux.
On Windows and macOS, the CPU package is the default.
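
Either way, a quick sanity check that the expected build is importable:

```sh
python -c "import torch; print(torch.__version__)"  # e.g. 2.7.1+cpu on Linux
```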

## Troubleshooting

As the installation process has evolved over time, you may have arrived here after
following outdated installation steps. If you encounter any of the errors below,
it may be easiest to start over with a new Python virtual environment (`.venv`)
as outlined above.

### Installation using `pip` (instead of `uv`)

If you happen to have followed the pre-`uv` installation instructions, you might
encounter an error like this:

```sh
LookupError: setuptools-scm was unable to detect version for /home/senuser/multi-aiu-dev/_dev/sentient-ci-cd/_dev/sen_latest/vllm-spyre.

Make sure you're either building from a fully intact git repository or PyPI tarballs. Most other sources (such as GitHub's tarballs, a git checkout without the .git folder) don't contain the necessary metadata and will not work.

For example, if you're using pip, instead of https://github.com/user/proj/archive/master.zip use git+https://github.com/user/proj.git#egg=proj
```

Make sure to follow the latest installation steps outlined above.

### Failed to activate the Virtual Environment

If you encounter any of the following errors, it's likely you forgot to activate
the (correct) Python Virtual Environment:

```sh
File "/home/senuser/.local/lib/python3.12/site-packages/vllm/config.py", line 2260, in __post_init__
self.device = torch.device(self.device_type)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Device string must not be empty
```
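
In that case, re-activate the environment that `vllm-spyre` was installed into and confirm that `vllm` resolves from it rather than from `~/.local`:

```sh
source .venv/bin/activate
python -c "import vllm; print(vllm.__file__)"  # should point inside .venv
```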

### No module named `torch`

You may have installed PyTorch into the system-wide Python environment, not into
the virtual environment used for vLLM-Spyre:

```sh
File "/home/senuser/multi-aiu-dev/_dev/sentient-ci-cd/_dev/sen_latest/vllm-spyre/.venv/lib64/python3.12/site-packages/vllm/env_override.py", line 4, in <module>
import torch
ModuleNotFoundError: No module named 'torch'
```

Make sure to activate the same virtual environment for installing `torch` that
was used to install `vllm-spyre`. If you already have a system-wide `torch`
installation and want to reuse that for your `vllm-spyre` environment, you can
create a new virtual environment and add the `--system-site-packages` flag to
pull in the `torch` dependencies from the base Python environment:

```sh
rm -rf .venv
uv venv --python 3.12 --seed .venv --system-site-packages
source .venv/bin/activate
```
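
Afterwards, `torch` should resolve from the base environment rather than from the new `.venv`:

```sh
python -c "import torch; print(torch.__file__)"  # should point at the system site-packages
```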

If you forget to override the `torch` dependencies when installing a released
version from PyPI, you will likely see a dependency resolution error like this:

```sh
$ uv pip install vllm-spyre

Using Python 3.12.11 environment at: .venv3
Resolved 155 packages in 45ms
× Failed to build `xformers==0.0.28.post1`
├─▶ The build backend returned an error
╰─▶ Call to `setuptools.build_meta:__legacy__.build_wheel` failed (exit status: 1)

[stderr]
Traceback (most recent call last):
File "<string>", line 14, in <module>
File "~.cache/uv/builds-v0/.tmpo0aEXS/lib/python3.12/site-packages/setuptools/build_meta.py", line 331, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=[])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "~.cache/uv/builds-v0/.tmpo0aEXS/lib/python3.12/site-packages/setuptools/build_meta.py", line 301, in _get_build_requires
self.run_setup()
File "~.cache/uv/builds-v0/.tmpo0aEXS/lib/python3.12/site-packages/setuptools/build_meta.py", line 512, in run_setup
super().run_setup(setup_script=setup_script)
File "~.cache/uv/builds-v0/.tmpo0aEXS/lib/python3.12/site-packages/setuptools/build_meta.py", line 317, in run_setup
exec(code, locals())
File "<string>", line 24, in <module>
ModuleNotFoundError: No module named 'torch'

hint: This error likely indicates that `xformers@0.0.28.post1` depends on `torch`, but doesn't declare it as a build dependency. If `xformers` is a first-party package, consider adding
`torch` to its `build-system.requires`. Otherwise, `uv pip install torch` into the environment and re-run with `--no-build-isolation`.
help: `xformers` (v0.0.28.post1) was included because `vllm-spyre` (v0.1.0) depends on `vllm` (v0.2.5) which depends on `xformers`
```

To avoid this error, make sure to include the dependency `--overrides` as described
in the [Release (PyPI)](#release-pypi) installation section above.

### No solution found when resolving dependencies

If you forget to override the `torch` dependencies when installing from PyPI, you
will likely see a dependency resolution error like this:

```sh
$ uv pip install vllm-spyre==0.4.1
...
× No solution found when resolving dependencies:
╰─▶ Because fms-model-optimizer==0.2.0 depends on torch>=2.1,<2.5 and only the following versions of fms-model-optimizer are available:
fms-model-optimizer<=0.2.0
fms-model-optimizer==0.3.0
we can conclude that fms-model-optimizer<0.3.0 depends on torch>=2.1,<2.5.
And because fms-model-optimizer==0.3.0 depends on torch>=2.2.0,<2.6 and all of:
vllm>=0.9.0,<=0.9.0.1
vllm>=0.9.2
depend on torch==2.7.0, we can conclude that all versions of fms-model-optimizer and all of:
vllm>=0.9.0,<=0.9.0.1
vllm>=0.9.2
are incompatible.
And because only the following versions of vllm are available:
vllm<=0.9.0
vllm==0.9.0.1
vllm==0.9.1
vllm==0.9.2
and vllm-spyre==0.4.1 depends on fms-model-optimizer, we can conclude that all of:
vllm>=0.9.0,<0.9.1
vllm>0.9.1
and vllm-spyre==0.4.1 are incompatible.
And because vllm-spyre==0.4.1 depends on one of:
vllm>=0.9.0,<0.9.1
vllm>0.9.1
and you require vllm-spyre==0.4.1, we can conclude that your requirements are unsatisfiable.
```

To avoid this error, make sure to include the dependency `--overrides` as described
in the [Release (PyPI)](#release-pypi) installation section above.
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -136,8 +136,9 @@ markers = [

[tool.pymarkdown]
plugins.md013.enabled = false # line-length
plugins.md041.enabled = false # first-line-h1
plugins.md033.enabled = false # inline-html
plugins.md041.enabled = false # first-line-h1
plugins.md046.enabled = false # code-block-style
plugins.md024.allow_different_nesting = true # no-duplicate-headers
plugins.md007.enabled = true
plugins.md007.indent = 4