How to get detailed metrics during offline LLM inference #2944

tohneecao · 2024-02-21T03:44:35Z


How can I get metrics such as {gpu_cache_usage, cpu_cache_usage, time_to_first_tokens, time_per_output_tokens} when using offline inference?
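For context, the two cache-usage figures can be read as the fraction of KV-cache blocks currently in use. The sketch below only illustrates that arithmetic; the function name and its two parameters are hypothetical, and the assumption that usage is computed from free and total block counts should be checked against the vLLM version in use.

```python
def cache_usage(num_free_blocks: int, num_total_blocks: int) -> float:
    """Fraction of cache blocks in use, assuming usage = 1 - free / total.

    Hypothetical helper for illustration; not part of the vLLM API.
    """
    if num_total_blocks <= 0:
        # No blocks allocated (e.g. no CPU swap space): report zero usage.
        return 0.0
    return 1.0 - num_free_blocks / num_total_blocks
```

The same formula would apply to both the GPU and the CPU cache, each with its own block counts.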

tohneecao (Author) · 2024-02-21T07:45:10Z

import time

from vllm import SamplingParams

# Add the requests to the engine.
# `llm`, `requests`, `n`, and `use_beam_search` are assumed to be defined earlier.
for prompt, _, output_len in requests:
    sampling_params = SamplingParams(
        n=n,
        temperature=0.0 if use_beam_search else 1.0,
        top_p=1.0,
        use_beam_search=use_beam_search,
        ignore_eos=True,
        max_tokens=output_len,
    )
    # FIXME(woosuk): Do not use internal method.
    llm._add_request(
        prompt=prompt,
        prompt_token_ids=None,
        sampling_params=sampling_params,
    )

# Take a single stats snapshot before generation starts.
_, scheduler_outputs = llm.llm_engine.scheduler.schedule()
print(scheduler_outputs)
stats = llm.llm_engine._get_stats(scheduler_outputs)
print(stats)
print(llm.llm_engine.stat_logger)
# log() reports the stats itself, so its return value is not printed.
llm.llm_engine.stat_logger.log(stats)

start = time.perf_counter()
# FIXME(woosuk): Do not use internal method.
outputs = llm._run_engine(use_tqdm=True)
end = time.perf_counter()

With this approach, I can only get the stats from the very beginning of the run. How can I get the stats for the whole run?

Any answers would be appreciated!
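One possible direction, sketched under assumptions: instead of taking a single snapshot before `llm._run_engine`, drive the engine one step at a time and record timestamps after every step. The sketch assumes the engine exposes `has_unfinished_requests()` and `step()`, and that `step()` returns objects with `request_id`, `finished`, and `outputs[0].token_ids`, which is how `llm.llm_engine` and `RequestOutput` appear to behave in vLLM releases from around the time of this thread; verify this against the installed version. `RequestTiming`, `run_with_timing`, `summarize`, `FakeOutput`, and `FakeEngine` are hypothetical names, and the fake classes exist only so the example runs without a GPU.

```python
import time
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class RequestTiming:
    """Timestamps (seconds since start) for one request. Hypothetical helper."""
    first_token_time: Optional[float] = None
    finish_time: Optional[float] = None
    num_output_tokens: int = 0


def run_with_timing(engine, clock=time.perf_counter):
    """Step the engine until it is idle, recording per-request timings.

    `engine` is assumed to provide has_unfinished_requests() and step().
    """
    start = clock()
    timings = {}
    finished = []
    while engine.has_unfinished_requests():
        step_outputs = engine.step()
        now = clock() - start
        for out in step_outputs:
            timing = timings.setdefault(out.request_id, RequestTiming())
            num_tokens = len(out.outputs[0].token_ids)
            # The first step that yields a token marks time-to-first-token.
            if timing.first_token_time is None and num_tokens > 0:
                timing.first_token_time = now
            timing.num_output_tokens = num_tokens
            if out.finished:
                timing.finish_time = now
                finished.append(out)
    return finished, timings


def summarize(timing):
    """Return (time_to_first_token, time_per_output_token) for one request."""
    ttft = timing.first_token_time
    decode_tokens = timing.num_output_tokens - 1
    if ttft is None or timing.finish_time is None or decode_tokens <= 0:
        return ttft, None
    return ttft, (timing.finish_time - ttft) / decode_tokens


class FakeOutput:
    """Stand-in for an engine output; not a real vLLM class."""

    def __init__(self, request_id, token_ids, finished):
        self.request_id = request_id
        self.outputs = [SimpleNamespace(token_ids=token_ids)]
        self.finished = finished


class FakeEngine:
    """Stand-in engine that emits one token per step for a single request."""

    def __init__(self, total_tokens):
        self.total_tokens = total_tokens
        self.generated = 0

    def has_unfinished_requests(self):
        return self.generated < self.total_tokens

    def step(self):
        self.generated += 1
        done = self.generated == self.total_tokens
        return [FakeOutput("req-0", list(range(self.generated)), done)]


if __name__ == "__main__":
    outputs, timings = run_with_timing(FakeEngine(total_tokens=4))
    for request_id, timing in timings.items():
        print(request_id, summarize(timing))
```

With a real model, the call would presumably be `run_with_timing(llm.llm_engine)` after the requests have been added. The per-step loop is also the natural place to sample cache statistics on every iteration rather than once, although `_get_stats` is an internal method and may change between releases.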


anjali-chadha · 2024-09-10T22:42:22Z


I have a similar use case. @tohneecao, were you able to get this working?


mohanajuhi166 · 2025-01-31T16:39:08Z


@anjali-chadha, were you able to solve this in Python?
