You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[AscendScheduler][Bugfix] Remove num_draft_tokens while allocating slots (#1718)
### What this PR does / why we need it?
Now there is no need to calculate `num_draft_tokens` when allocating
slots.
This PR follows the changes in vllm:
vllm-project/vllm#20701
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
CI passed with existing test
- vLLM version: v0.9.2
- vLLM main:
vllm-project/vllm@cc876d0
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
0 commit comments