From 4faf2d10670897e3a4f58bae1eff4defd33a1f97 Mon Sep 17 00:00:00 2001
From: 22quinn <33176974+22quinn@users.noreply.github.com>
Date: Wed, 18 Jun 2025 23:32:32 -0700
Subject: [PATCH 1/2] [Doc] Update embedding model support status

Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
---
 docs/usage/v1_guide.md | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/docs/usage/v1_guide.md b/docs/usage/v1_guide.md
index 28c501439325..60c3e55ed665 100644
--- a/docs/usage/v1_guide.md
+++ b/docs/usage/v1_guide.md
@@ -39,9 +39,9 @@ This living user guide outlines a few known **important changes and limitations*
 
 For each item, our progress towards V1 support falls into one of the following states:
 
 - **🚀 Optimized**: Nearly fully optimized, with no further work currently planned.
-- **🟢 Functional**: Fully operational, with ongoing optimizations. 
-- **🚧 WIP**: Under active development. 
-- **🟡 Planned**: Scheduled for future implementation (some may have open PRs/RFCs). 
+- **🟢 Functional**: Fully operational, with ongoing optimizations.
+- **🚧 WIP**: Under active development.
+- **🟡 Planned**: Scheduled for future implementation (some may have open PRs/RFCs).
 - **🟠 Delayed**: Temporarily dropped in V1 but planned to be re-introduced later.
 - **🔴 Deprecated**: Not planned for V1 unless there is strong demand.
@@ -70,7 +70,7 @@ For each item, our progress towards V1 support falls into one of the following s
 |-----------------------------|------------------------------------------------------------------------------------|
 | **Decoder-only Models**     | 🚀 Optimized                                                                        |
 | **Encoder-Decoder Models**  | 🟠 Delayed                                                                          |
-| **Embedding Models**        | 🚧 WIP ([PR #16188](https://github.com/vllm-project/vllm/pull/16188))               |
+| **Embedding Models**        | 🟢 Functional                                                                       |
 | **Mamba Models**            | 🚧 WIP ([PR #19327](https://github.com/vllm-project/vllm/pull/19327))               |
 | **Multimodal Models**       | 🟢 Functional                                                                       |
@@ -82,14 +82,6 @@ vLLM V1 currently excludes model architectures with the `SupportsV0Only` protoco
 
 See below for the status of models that are still not yet supported in V1.
 
-#### Embedding Models
-
-The initial support will be provided by [PR #16188](https://github.com/vllm-project/vllm/pull/16188).
-
-Later, we will consider using [hidden states processor](https://github.com/vllm-project/vllm/issues/12249),
-which is based on [global logits processor](https://github.com/vllm-project/vllm/pull/13360)
-to enable simultaneous generation and embedding using the same engine instance in V1.
-
 #### Mamba Models
 
 Models using selective state-space mechanisms instead of standard transformer attention (e.g., `MambaForCausalLM`, `JambaForCausalLM`)

From 93445491d2f2460d612e209c4281cfcb8fc25108 Mon Sep 17 00:00:00 2001
From: 22quinn <33176974+22quinn@users.noreply.github.com>
Date: Thu, 19 Jun 2025 01:26:02 -0700
Subject: [PATCH 2/2] wording

Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
---
 docs/usage/v1_guide.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/usage/v1_guide.md b/docs/usage/v1_guide.md
index 60c3e55ed665..1ec3e72a4f56 100644
--- a/docs/usage/v1_guide.md
+++ b/docs/usage/v1_guide.md
@@ -80,7 +80,15 @@ vLLM V1 currently excludes model architectures with the `SupportsV0Only` protoco
 
 This corresponds to the V1 column in our [list of supported models][supported-models].
 
-See below for the status of models that are still not yet supported in V1.
+See below for the status of models that are not yet supported or have more features planned in V1.
+
+#### Embedding Models
+
+The initial basic support is now functional.
+
+Later, we will consider using [hidden states processor](https://github.com/vllm-project/vllm/issues/12249),
+which is based on [global logits processor](https://github.com/vllm-project/vllm/pull/13360)
+to enable simultaneous generation and embedding using the same engine instance in V1.
 
 #### Mamba Models