Estimate GPU Type or Total VRAM Required using HF Repo ID #8084
stikkireddy announced in Q&A
Hey team, I'm trying to estimate GPU usage given a repo ID from Hugging Face, to decide whether to deploy on an A10, 4xA10, 8xA10, A100, 2xA100, 4xA100, or 8xA100 (no consumer cards).
I am able to estimate the memory for the model weights using this:
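The original snippet didn't survive the page export. As a minimal sketch of one way to do it (not necessarily the author's original approach), you can sum the sizes of the repo's safetensors files, assuming the model is served in the dtype it is stored in; the repo ID below is just an example:

```python
# Sketch: estimate the weight footprint by summing the sizes of the
# *.safetensors files in the repo, via the Hub API.
from huggingface_hub import HfApi

def estimate_weight_gb(repo_id: str) -> float:
    info = HfApi().model_info(repo_id, files_metadata=True)
    total_bytes = sum(
        s.size or 0
        for s in info.siblings
        if s.rfilename.endswith(".safetensors")
    )
    return total_bytes / 1024**3

print(estimate_weight_gb("Qwen/Qwen2.5-7B-Instruct"))  # ~14 GB in bf16
```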
But the weights are only part of the picture, given that I also have the model's config.json. Is there a way to estimate the total memory required given the desired context length (ideally the default from config.json) and the max number of output tokens, assuming the default settings for the args? The estimate above is missing the KV cache, intermediate states, and roughly 1-2 GB of misc overhead. Any help/guidance?
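For anyone landing here, a back-of-the-envelope sketch of the missing pieces. Everything below is an assumption rather than a confirmed method: fp16/bf16 KV cache (2 bytes per element), batch size 1, 24 GB A10s, 80 GB A100s, and a flat 2 GB of misc overhead; `estimate_total_gb` and `pick_gpu` are hypothetical helpers, not part of any library:

```python
# Sketch: weights + KV cache + fixed overhead, then pick the smallest
# GPU configuration that fits with some headroom.
import json
from huggingface_hub import hf_hub_download

def estimate_total_gb(repo_id: str, weight_gb: float,
                      context_len: int | None = None,
                      max_output_tokens: int = 0,
                      kv_dtype_bytes: int = 2) -> float:
    cfg = json.load(open(hf_hub_download(repo_id, "config.json")))
    n_layers = cfg["num_hidden_layers"]
    n_heads = cfg["num_attention_heads"]
    # GQA models store fewer KV heads; fall back to MHA if the key is absent.
    n_kv_heads = cfg.get("num_key_value_heads", n_heads)
    head_dim = cfg.get("head_dim", cfg["hidden_size"] // n_heads)
    # Default the context length from config.json, as asked; batch size 1.
    seq_len = (context_len or cfg.get("max_position_embeddings", 4096)) + max_output_tokens
    # Per token: 2 (K and V) * layers * kv_heads * head_dim * bytes/element.
    kv_gb = 2 * n_layers * n_kv_heads * head_dim * kv_dtype_bytes * seq_len / 1024**3
    overhead_gb = 2.0  # assumed misc overhead (CUDA context, activations)
    return weight_gb + kv_gb + overhead_gb

# Assumed 24 GB A10s and 80 GB A100s, sorted by total VRAM.
GPU_CONFIGS = [("A10", 24), ("A100", 80), ("4xA10", 96), ("2xA100", 160),
               ("8xA10", 192), ("4xA100", 320), ("8xA100", 640)]

def pick_gpu(total_gb: float, headroom: float = 0.9) -> str:
    # Leave ~10% headroom, in the spirit of a 0.9 memory-utilization default.
    for name, vram_gb in GPU_CONFIGS:
        if total_gb <= vram_gb * headroom:
            return name
    return "does not fit"
```

One caveat: treating the multi-GPU options as a single pooled budget slightly understates the requirement, since tensor-parallel sharding duplicates some state on each GPU; the sketch ignores that.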