Commit 7c43500

Update disaggregated.md
1 parent ea29b01 commit 7c43500

1 file changed (+2 -2 lines changed)

docs/features/disaggregated.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # Disaggregated Deployment

-Large model inference consists of two phases: Prefill and Decode, which are compute-intensive (Prefill) and compute-intensive (Decode) respectively. Deploying Prefill and Decode separately in certain scenarios can improve hardware utilization, effectively increase throughput, and reduce overall sentence latency.
+Large model inference consists of two phases: Prefill and Decode, which are compute-intensive (Prefill) and Memory access-intensive(Decode) respectively. Deploying Prefill and Decode separately in certain scenarios can improve hardware utilization, effectively increase throughput, and reduce overall sentence latency.

 * Prefill phase: Processes all input Tokens (such as user prompts), completes the model's forward propagation, and generates the first token.
 * Decode phase: Starting from the first generated token, it generates one token at a time autoregressively until reaching the stop token. For N output tokens, the Decode phase requires (N-1) forward propagations that must be executed serially. During generation, the number of tokens to attend to increases, and computational requirements gradually grow.
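
As an illustration of the Prefill/Decode split described in the document text above, the following is a minimal sketch in plain Python. It is not FastDeploy code: the function names are hypothetical, and it assumes the request produces exactly N output tokens (in practice generation may end earlier at a stop token). It shows that Prefill contributes one forward pass, Decode contributes N-1 serial passes, and the attended context grows by one token per Decode step.

```python
def count_forward_passes(num_output_tokens: int) -> dict:
    """Forward passes per phase for a request that yields num_output_tokens.

    Hypothetical helper for illustration only.
    """
    if num_output_tokens < 1:
        raise ValueError("at least one output token is required")
    # Prefill runs once over the whole prompt and emits the first token;
    # every remaining token needs its own serial Decode pass.
    return {"prefill": 1, "decode": num_output_tokens - 1}


def attended_tokens_per_decode_step(prompt_len: int, num_output_tokens: int) -> list:
    """Number of tokens each Decode step attends to (prompt + tokens so far)."""
    # Decode step i (1-based) sees the prompt plus the i tokens generated so far.
    return [prompt_len + i for i in range(1, num_output_tokens)]


if __name__ == "__main__":
    print(count_forward_passes(4))
    print(attended_tokens_per_decode_step(prompt_len=10, num_output_tokens=4))
```

The growing list returned by `attended_tokens_per_decode_step` reflects why each Decode step reads more cached state than the last while still producing only one token.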
@@ -163,4 +163,4 @@ python -m fastdeploy.entrypoints.openai.api_server \
 * --scheduler-port: Redis port to connect to
 * --scheduler-ttl: Specifies Redis TTL time in seconds
 * --pd-comm-port: Specifies PD communication port
-* --rdma-comm-ports: Specifies RDMA communication ports, multiple ports separated by commas, quantity should match GPU count
+* --rdma-comm-ports: Specifies RDMA communication ports, multiple ports separated by commas, quantity should match GPU count
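
The `--rdma-comm-ports` rule above (comma-separated ports, one per GPU) can be checked before launching the server. The sketch below is a hypothetical pre-flight helper, not part of FastDeploy; the function name and the port numbers in the usage example are assumptions chosen for illustration.

```python
def parse_rdma_comm_ports(value: str, gpu_count: int) -> list:
    """Parse a comma-separated port list and check it has one port per GPU.

    Hypothetical helper for illustration only.
    """
    # Split on commas, ignoring empty fragments such as a trailing comma.
    ports = [int(part) for part in value.split(",") if part.strip()]
    if len(ports) != gpu_count:
        raise ValueError(
            f"expected {gpu_count} RDMA ports (one per GPU), got {len(ports)}"
        )
    for port in ports:
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    return ports


if __name__ == "__main__":
    # Illustrative values: two GPUs, two ports.
    print(parse_rdma_comm_ports("7671,7672", gpu_count=2))
```

Failing fast on a count mismatch keeps a misconfigured port list from surfacing later as a communication error at runtime.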
