-
Swapping was removed from v1 intentionally. Instead, when a preempted request is recomputed, we expect most of its prompt tokens to be skipped thanks to prefix caching.
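To illustrate the idea, here is a toy sketch of recompute-on-preemption with a block-level prefix cache. This is not vLLM's actual code: `ToyPrefixCache`, `block_hashes`, `tokens_to_recompute`, and `BLOCK_SIZE` are hypothetical names made up for this example, and it assumes the request's cached blocks have not been evicted by the time it is rescheduled.

```python
from hashlib import sha256

BLOCK_SIZE = 4  # tokens per KV block (toy value, chosen for illustration)


def block_hashes(tokens, block_size=BLOCK_SIZE):
    """Chain-hash each full block so a hash identifies the entire prefix up to it."""
    hashes = []
    parent = b""
    num_full = len(tokens) - len(tokens) % block_size  # ignore the partial tail block
    for start in range(0, num_full, block_size):
        block = tokens[start:start + block_size]
        digest = sha256(parent + repr(block).encode()).digest()
        hashes.append(digest)
        parent = digest
    return hashes


class ToyPrefixCache:
    """Hypothetical stand-in for a prefix cache: remembers hashes of computed blocks."""

    def __init__(self):
        self.cached = set()

    def insert(self, tokens):
        # Record every full block of this token sequence as computed.
        self.cached.update(block_hashes(tokens))

    def num_cached_tokens(self, tokens):
        # Count tokens covered by the longest run of leading blocks found in the cache.
        hit = 0
        for h in block_hashes(tokens):
            if h not in self.cached:
                break
            hit += BLOCK_SIZE
        return hit


def tokens_to_recompute(cache, tokens):
    """Tokens that must be recomputed when a preempted request is resumed."""
    return len(tokens) - cache.num_cached_tokens(tokens)


cache = ToyPrefixCache()
prompt = list(range(10))

print(tokens_to_recompute(cache, prompt))  # 10: nothing cached yet
cache.insert(prompt)                       # request runs, its full blocks get cached
print(tokens_to_recompute(cache, prompt))  # 2: only the partial last block is redone
```

In this sketch, a resumed request only pays for the tokens past the last cached full block, which is why recomputation can be cheap enough to replace swapping. If the blocks were evicted in the meantime, the whole prompt would be recomputed.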
-
As mentioned in the title, there is no swap deque in class Scheduler, and no swap-related operation in the schedule function.
Is this not implemented yet, or was it removed for some reason?
The code in question is in vllm/v1/core/scheduler.py.
I also found that the `_initialize_kv_caches` function in class EngineCore initializes the number of CPU blocks to 0. Could someone help explain this?