How can I optimize CPU Time in VLLM Sampler #6019
fnavigate84 announced in Q&A
Replies: 0 comments
When I run inference on the CPU, I notice significant differences in the sampler stage across CPUs: AMD CPUs run it much faster than Intel CPUs. Are there any methods to accelerate the sampler part of vLLM on CPU?
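Before comparing CPUs, it helps to know which phase of the sampling step actually dominates. The sketch below is illustrative only, not vLLM's actual sampler: it profiles a simplified top-k sampling loop with Python's built-in `cProfile`, so the hypothetical `sample_step` helper stands in for whatever sampler hotspot you want to measure.

```python
import cProfile
import pstats
import random

# Hypothetical stand-in for one sampler step: pick a token from the
# top-k highest "logits" (this is NOT vLLM's implementation).
def sample_step(logits, k=50):
    # Sort indices by logit value and keep the top k.
    top = sorted(range(len(logits)), key=logits.__getitem__, reverse=True)[:k]
    weights = [logits[i] for i in top]
    # Sample one token index, weighted by its logit.
    return random.choices(top, weights=weights, k=1)[0]

def run(steps=200, vocab=32000):
    logits = [random.random() for _ in range(vocab)]
    return [sample_step(logits) for _ in range(steps)]

profiler = cProfile.Profile()
profiler.enable()
tokens = run()
profiler.disable()

# Sort by cumulative time to see which part of the step dominates
# on this particular CPU; print_stats(10) shows the top 10 entries.
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(10)
```

Running the same profile on both the AMD and Intel machines should show whether the gap comes from one specific phase (e.g. the sort) or is spread evenly, which narrows down what is worth optimizing.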