Commit f017369

DarkLight1337 authored and minpeter committed
[Doc] Update V1 Guide for embedding models (vllm-project#19141)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: minpeter <kali2005611@gmail.com>
1 parent 1aa7888 commit f017369

1 file changed: +3 −3 lines

docs/usage/v1_guide.md

Lines changed: 3 additions & 3 deletions
@@ -55,7 +55,7 @@ This living user guide outlines a few known **important changes and limitations**
 | **Spec Decode** | <nobr>🚧 WIP ([PR #13933](https://github.com/vllm-project/vllm/pull/13933))</nobr>|
 | **Prompt Logprobs with Prefix Caching** | <nobr>🟡 Planned ([RFC #13414](https://github.com/vllm-project/vllm/issues/13414))</nobr>|
 | **Structured Output Alternative Backends** | <nobr>🟡 Planned</nobr> |
-| **Embedding Models** | <nobr>🚧 WIP ([PR #18015](https://github.com/vllm-project/vllm/pull/18015))</nobr> |
+| **Embedding Models** | <nobr>🚧 WIP ([PR #16188](https://github.com/vllm-project/vllm/pull/16188))</nobr> |
 | **Mamba Models** | <nobr>🟡 Planned</nobr> |
 | **Encoder-Decoder Models** | <nobr>🟠 Delayed</nobr> |
 | **Request-level Structured Output Backend** | <nobr>🔴 Deprecated</nobr> |
@@ -145,9 +145,9 @@ vLLM V1 currently excludes model architectures with the `SupportsV0Only` protocol,
 and the majority fall into the following categories. V1 support for these models will be added eventually.
 
 **Embedding Models**
-Initially, we will create a [separate model runner](https://github.com/vllm-project/vllm/pull/18015) to provide V1 support without conflicting with other ongoing work.
+The initial support will be provided by [PR #16188](https://github.com/vllm-project/vllm/pull/16188).
 
-Later, we will consider using [hidden states processor](https://github.com/vllm-project/vllm/issues/12249), which is based on [global logits processor](https://github.com/vllm-project/vllm/pull/13360) to enable simultaneous generation and embedding using the same engine instance in V1. [PR #16188](https://github.com/vllm-project/vllm/pull/16188) is the first step towards enabling this.
+Later, we will consider using [hidden states processor](https://github.com/vllm-project/vllm/issues/12249), which is based on [global logits processor](https://github.com/vllm-project/vllm/pull/13360) to enable simultaneous generation and embedding using the same engine instance in V1.
 
 **Mamba Models**
 Models using selective state-space mechanisms (instead of standard transformer attention)
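
For context on what the embedding-model support tracked in this change looks like from the user side, here is a minimal, hypothetical sketch of offline embedding inference with vLLM's `LLM` API. It assumes the documented `task="embed"` option and `LLM.embed()` method; the model name and prompts are illustrative and are not part of this commit.

```python
# Hedged sketch: offline embedding inference with vLLM's LLM API.
# Model name and prompts are illustrative assumptions, not part of this commit.
from vllm import LLM

# task="embed" selects the pooling/embedding path instead of text generation.
llm = LLM(model="intfloat/e5-mistral-7b-instruct", task="embed")

prompts = [
    "vLLM V1 is gaining support for embedding models.",
    "A hidden states processor may later allow generation and embedding in one engine.",
]

# llm.embed() returns one EmbeddingRequestOutput per prompt.
outputs = llm.embed(prompts)
for prompt, output in zip(prompts, outputs):
    embedding = output.outputs.embedding  # pooled embedding vector (list[float])
    print(f"{prompt[:40]!r} -> dim={len(embedding)}")
```

The second hunk above notes that a hidden states processor may later allow the same V1 engine instance to serve both generation and embedding requests, rather than requiring a dedicated embedding deployment as in this sketch.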
