Skip to content

Commit b38823b

Browse files
authored
modify reasoning_output docs (#2696)
1 parent 050d965 commit b38823b

File tree

2 files changed

+23
-18
lines changed

2 files changed

+23
-18
lines changed

docs/features/reasoning_output.md

Lines changed: 21 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,30 +1,35 @@
1-
# Chain-of-Thought Content
1+
# Reasoning Outputs
22

3-
The reasoning model returns a `reasoning_content` field in the output, representing the chain-of-thought content—the reasoning steps that lead to the final conclusion.
3+
Reasoning models return an additional `reasoning_content` field in their output, which contains the reasoning steps that led to the final conclusion.
44

5-
## Currently Supported Chain-of-Thought Models
6-
| Model Name | Parser Name | Chain-of-Thought Enabled by Default |
7-
|----------------|----------------|-------------------------------------|
8-
| ernie-45-vl | ernie-45-vl | |
9-
| ernie-lite-vl | ernie-45-vl | |
5+
## Supported Models
6+
| Model Name | Parser Name | Eable_thinking by Default |
7+
|----------------|----------------|---------------------------|
8+
| baidu/ERNIE-4.5-VL-424B-A47B-Paddle | ernie-45-vl ||
9+
| baidu/ERNIE-4.5-VL-28B-A3B-Paddle | ernie-45-vl ||
1010

11-
The reasoning model requires a specified parser to interpret the reasoning content. The reasoning mode can be disabled by setting the `enable_thinking=False` parameter.
11+
The reasoning model requires a specified parser to extract reasoning content. The reasoning mode can be disabled by setting the `enable_thinking=False` parameter.
1212

1313
Interfaces that support toggling the reasoning mode:
14-
1. `/v1/chat/completions` request in OpenAI services.
15-
2. `/v1/chat/completions` request in the OpenAI Python client.
16-
3. `llm.chat` request in Offline interfaces.
14+
1. `/v1/chat/completions` requests in OpenAI services.
15+
2. `/v1/chat/completions` requests in the OpenAI Python client.
16+
3. `llm.chat` requests in Offline interfaces.
1717

1818
For reasoning models, the length of the reasoning content can be controlled via `reasoning_max_tokens`. Add `metadata={"reasoning_max_tokens": 1024}` to the request.
1919

2020
### Quick Start
2121
When launching the model service, specify the parser name using the `--reasoning-parser` argument.
2222
This parser will process the model's output and extract the `reasoning_content` field.
2323
```bash
24-
python -m fastdeploy.entrypoints.openai.api_server --model /root/merge_llm_model --enable-mm --tensor-parallel-size=8 --port 8192 --quantization wint4 --reasoning-parser=ernie-45-vl
24+
python -m fastdeploy.entrypoints.openai.api_server \
25+
--model /path/to/your/model \
26+
--enable-mm \
27+
--tensor-parallel-size 8 \
28+
--port 8192 \
29+
--quantization wint4 \
30+
--reasoning-parser ernie-45-vl
2531
```
26-
27-
Next, send a `chat completion` request to the model:
32+
Next, make a request to the model that should return the reasoning content in the response.
2833
```bash
2934
curl -X POST "http://0.0.0.0:8192/v1/chat/completions" \
3035
-H "Content-Type: application/json" \
@@ -40,8 +45,8 @@ curl -X POST "http://0.0.0.0:8192/v1/chat/completions" \
4045
```
4146
The `reasoning_content` field contains the reasoning steps to reach the final conclusion, while the `content` field holds the conclusion itself.
4247

43-
### Streaming Sessions
44-
In streaming sessions, the `reasoning_content` field can be retrieved from the `delta` in `chat completion response chunks`.
48+
### Streaming chat completions
49+
Streaming chat completions are also supported for reasoning models. The `reasoning_content` field is available in the `delta` field in `chat completion response chunks`
4550
```python
4651
from openai import OpenAI
4752
# Set OpenAI's API key and API base to use vLLM's API server.

docs/zh/features/reasoning_output.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5,8 +5,8 @@
55
##目前支持思考链的模型
66
| 模型名称 | 解析器名称 | 默认开启思考链 |
77
|---------------|-------------|---------|
8-
| ernie-45-vl | ernie-45-vl ||
9-
| ernie-lite-vl | ernie-45-vl ||
8+
| baidu/ERNIE-4.5-VL-424B-A47B-Paddle | ernie-45-vl ||
9+
| baidu/ERNIE-4.5-VL-28B-A3B-Paddle | ernie-45-vl ||
1010

1111
思考模型需要指定解析器,以便于对思考内容进行解析. 通过`enable_thinking=False` 参数可以关闭模型思考模式.
1212

0 commit comments

Comments
 (0)