Estimate GPU Type or Total VRAM Required using HF Repo ID #8084

stikkireddy · 2024-09-02T12:59:01Z

stikkireddy
Sep 2, 2024

Hey team trying to estimate gpu usage given a repo id from huggingface. Trying to determine whether to use A10, 4xA10, 8xA10, A100, 2xA100, 4xA100, 8xA100... No consumer cards.

I am able to estimate model weights using this:

from huggingface_hub import parse_safetensors_file_metadata, get_safetensors_metadata

metadata = get_safetensors_metadata(repo_id="microsoft/Phi-3.5-mini-instruct")

def estimate_gpu_memory_for_weights(*, repo_id: str, revision: str = None):
    metadata = get_safetensors_metadata(repo_id=repo_id, revision=revision)
    parameter_count = metadata.parameter_count
    precision_size = {
        'FP32': 4,
        'FP16': 2,
        'BF16': 2,
        'INT8': 1,
        'INT4': 0.5
    }
    total_bytes = 0
    for precision, num_params in parameter_count.items():
        total_bytes += num_params * precision_size.get(precision, 0)
    return total_bytes / (1024 ** 3)
  
estimate_gpu_memory_for_weights(repo_id="microsoft/Phi-3.5-mini-instruct")

But i am trying to understand given that i also have the

from transformers import PretrainedConfig
config = PretrainedConfig.from_pretrained(pretrained_model_name_or_path="microsoft/Phi-3.5-mini-instruct")

Is there a way to estimate the memory required given that I know the desired context length (ideally default provided from the config.json) and max number of output tokens. Assuming i use default settings provided in the args. It is missing computation for kv cache and intermediate states and misc 1-2gb overhead. Any help/guidance?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Estimate GPU Type or Total VRAM Required using HF Repo ID #8084

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Replies: 0 comments

Select a reply

Uh oh!

Uh oh!

Estimate GPU Type or Total VRAM Required using HF Repo ID #8084

Uh oh!

Uh oh!

stikkireddy Sep 2, 2024

Replies: 0 comments

stikkireddy
Sep 2, 2024