vLLM CPU Phi 3 mini 128K instruct - OOM issues #5059
Replies: 5 comments 2 replies
-
Oh, in case it helps, I am running vLLM from commit 2ba80bed2732edf42b1014ea4e34757849fc93d0.
-
Wow, okay, so an interesting follow-up. I bumped this down to microsoft/Phi-3-mini-4k-instruct, and it still OOMs with 32GB of RAM available to it. 😂 😭
-
Okay, same issue with commit 8e192ff967b44b186ea02d30e49fddf656fdfe50. Backing off to v0.4.2 and trying again.
-
Okay, same issue with vLLM v0.4.2. Any ideas of what to try next?
-
I gave the container twice the amount of memory that is given to the KV cache with the
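For reference, a minimal sketch of that kind of setup, assuming vLLM's CPU backend honors the `VLLM_CPU_KVCACHE_SPACE` environment variable (in GiB) to size the KV cache; the image name and model here are placeholders, not the exact command from this thread:

```shell
# Hypothetical launch: cap the CPU KV cache at 8 GiB via
# VLLM_CPU_KVCACHE_SPACE, then give the container roughly
# twice that (plus room for the model weights).
docker run --rm \
  --memory=32g \
  --cpus=12 \
  -e VLLM_CPU_KVCACHE_SPACE=8 \
  my-vllm-cpu-image \
  python -m vllm.entrypoints.openai.api_server \
    --model microsoft/Phi-3-mini-4k-instruct \
    --max-model-len 4096
```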
-
Hi y'all, I'm trying out vLLM on Phi 3 with no GPU, and I seem to be hitting some OOM issues with the model.
These are the configurations that I am running with:
I'm running in Docker with 32GB of memory available and 12 CPU cores. I've looked at the memory requirements for the model, and I can't quite fathom why it still OOMs on me. If I do not set `--max-model-len`, then I am not able to get anywhere at all, and I receive errors similar to this:

It never seems to have enough memory. 🤔
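A quick back-of-envelope check of the KV-cache footprint at the full 128K context may explain the OOM. The model dimensions below are my assumptions based on the published Phi-3-mini config (32 layers, 32 KV heads, head dim 96, fp16), not something stated in this thread:

```python
# Back-of-envelope KV-cache sizing for Phi-3-mini at a 128K context window.
# Dimensions are assumed from the published Phi-3-mini config; adjust if yours differ.
num_layers = 32    # num_hidden_layers
num_kv_heads = 32  # full multi-head attention, no grouped-query sharing
head_dim = 96      # hidden_size 3072 / 32 attention heads
dtype_bytes = 2    # fp16 / bf16

# Each token stores one K and one V vector per layer per KV head.
bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * dtype_bytes
print(bytes_per_token)  # 393216 bytes, i.e. 384 KiB per token

max_model_len = 128 * 1024  # the 128K context window
total_gib = bytes_per_token * max_model_len / 2**30
print(total_gib)  # 48.0 GiB -- more than the 32 GiB the container has
```

If those dimensions hold, a full 128K KV cache alone needs about 48 GiB before counting the model weights, which would explain why `--max-model-len` has to be lowered to fit in 32 GiB.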