Commit be0cfb2

fix[Docs]: link anchor is incorrect #20309 (#20315)
Signed-off-by: zxw <1020938856@qq.com>
1 parent 1a03dd4 commit be0cfb2

8 files changed: +10 -10 lines changed


docs/configuration/engine_args.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ title: Engine Arguments
 Engine arguments control the behavior of the vLLM engine.

 - For [offline inference][offline-inference], they are part of the arguments to [LLM][vllm.LLM] class.
-- For [online serving][openai-compatible-server], they are part of the arguments to `vllm serve`.
+- For [online serving][serving-openai-compatible-server], they are part of the arguments to `vllm serve`.

 You can look at [EngineArgs][vllm.engine.arg_utils.EngineArgs] and [AsyncEngineArgs][vllm.engine.arg_utils.AsyncEngineArgs] to see the available engine arguments.

docs/design/arch_overview.md

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ python -m vllm.entrypoints.openai.api_server --model <model>

 That code can be found in <gh-file:vllm/entrypoints/openai/api_server.py>.

-More details on the API server can be found in the [OpenAI-Compatible Server][openai-compatible-server] document.
+More details on the API server can be found in the [OpenAI-Compatible Server][serving-openai-compatible-server] document.

 ## LLM Engine

docs/features/structured_outputs.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ The following parameters are supported, which must be added as extra parameters:
 - `guided_grammar`: the output will follow the context free grammar.
 - `structural_tag`: Follow a JSON schema within a set of specified tags within the generated text.

-You can see the complete list of supported parameters on the [OpenAI-Compatible Server][openai-compatible-server] page.
+You can see the complete list of supported parameters on the [OpenAI-Compatible Server][serving-openai-compatible-server] page.

 Structured outputs are supported by default in the OpenAI-Compatible Server. You
 may choose to specify the backend to use by setting the

docs/getting_started/installation/intel_gaudi.md

Lines changed: 1 addition & 1 deletion
@@ -110,7 +110,7 @@ docker run \
 ### Supported features

 - [Offline inference][offline-inference]
-- Online serving via [OpenAI-Compatible Server][openai-compatible-server]
+- Online serving via [OpenAI-Compatible Server][serving-openai-compatible-server]
 - HPU autodetection - no need to manually select device within vLLM
 - Paged KV cache with algorithms enabled for Intel Gaudi accelerators
 - Custom Intel Gaudi implementations of Paged Attention, KV cache ops,

docs/models/generative_models.md

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ outputs = llm.chat(conversation, chat_template=custom_template)

 ## Online Serving

-Our [OpenAI-Compatible Server][openai-compatible-server] provides endpoints that correspond to the offline APIs:
+Our [OpenAI-Compatible Server][serving-openai-compatible-server] provides endpoints that correspond to the offline APIs:

 - [Completions API][completions-api] is similar to `LLM.generate` but only accepts text.
 - [Chat API][chat-api] is similar to `LLM.chat`, accepting both text and [multi-modal inputs][multimodal-inputs] for models with a chat template.

docs/models/pooling_models.md

Lines changed: 1 addition & 1 deletion
@@ -113,7 +113,7 @@ A code example can be found here: <gh-file:examples/offline_inference/basic/scor

 ## Online Serving

-Our [OpenAI-Compatible Server][openai-compatible-server] provides endpoints that correspond to the offline APIs:
+Our [OpenAI-Compatible Server][serving-openai-compatible-server] provides endpoints that correspond to the offline APIs:

 - [Pooling API][pooling-api] is similar to `LLM.encode`, being applicable to all types of pooling models.
 - [Embeddings API][embeddings-api] is similar to `LLM.embed`, accepting both text and [multi-modal inputs][multimodal-inputs] for embedding models.

docs/models/supported_models.md

Lines changed: 3 additions & 3 deletions
@@ -34,7 +34,7 @@ llm.apply_model(lambda model: print(type(model)))
 If it is `TransformersForCausalLM` then it means it's based on Transformers!

 !!! tip
-    You can force the use of `TransformersForCausalLM` by setting `model_impl="transformers"` for [offline-inference][offline-inference] or `--model-impl transformers` for the [openai-compatible-server][openai-compatible-server].
+    You can force the use of `TransformersForCausalLM` by setting `model_impl="transformers"` for [offline-inference][offline-inference] or `--model-impl transformers` for the [openai-compatible-server][serving-openai-compatible-server].

 !!! note
     vLLM may not fully optimise the Transformers implementation so you may see degraded performance if comparing a native model to a Transformers model in vLLM.

@@ -53,8 +53,8 @@ For a model to be compatible with the Transformers backend for vLLM it must:

 If the compatible model is:

-- on the Hugging Face Model Hub, simply set `trust_remote_code=True` for [offline-inference][offline-inference] or `--trust-remote-code` for the [openai-compatible-server][openai-compatible-server].
-- in a local directory, simply pass directory path to `model=<MODEL_DIR>` for [offline-inference][offline-inference] or `vllm serve <MODEL_DIR>` for the [openai-compatible-server][openai-compatible-server].
+- on the Hugging Face Model Hub, simply set `trust_remote_code=True` for [offline-inference][offline-inference] or `--trust-remote-code` for the [openai-compatible-server][serving-openai-compatible-server].
+- in a local directory, simply pass directory path to `model=<MODEL_DIR>` for [offline-inference][offline-inference] or `vllm serve <MODEL_DIR>` for the [openai-compatible-server][serving-openai-compatible-server].

 This means that, with the Transformers backend for vLLM, new models can be used before they are officially supported in Transformers or vLLM!

docs/serving/openai_compatible_server.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ---
 title: OpenAI-Compatible Server
 ---
-[](){ #openai-compatible-server }
+[](){ #serving-openai-compatible-server }

 vLLM provides an HTTP server that implements OpenAI's [Completions API](https://platform.openai.com/docs/api-reference/completions), [Chat API](https://platform.openai.com/docs/api-reference/chat), and more! This functionality lets you serve models and interact with them using an HTTP client.

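For context on the repeated one-line change: vLLM's documentation appears to use MkDocs-style named anchors, where `[](){ #anchor-name }` placed in a page defines a link target, and a reference-style link `[link text][anchor-name]` anywhere in the docs resolves to it. A minimal sketch of the renamed definition/reference pair (the anchor name comes from the diff above; the cross-page resolution behavior is an assumption about the docs tooling):

    <!-- docs/serving/openai_compatible_server.md: defines the named anchor -->
    [](){ #serving-openai-compatible-server }

    <!-- any other docs page: reference-style link that resolves to that anchor -->
    See the [OpenAI-Compatible Server][serving-openai-compatible-server] for details.

Because the anchor definition in docs/serving/openai_compatible_server.md is renamed here, any page still linking to the old `openai-compatible-server` identifier would be left with a dangling reference, which is why the seven other files above are updated in the same commit.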