Is batch inference enabled now? #635

Cola-any opened this issue Apr 15, 2025 · 1 comment

@Cola-any

I found that GPU memory usage is relatively low when batch_size=1 is set for inference.
I want to make full use of the GPU by setting a larger batch_size, but I ran into the following error. Can anyone help me?

Traceback (most recent call last):
  File "/home/user/MLLM/LLaVA-NeXT/lmms-eval/lmms_eval/__main__.py", line 330, in cli_evaluate
    results, samples = cli_evaluate_single(args)
  File "/home/user/MLLM/LLaVA-NeXT/lmms-eval/lmms_eval/__main__.py", line 471, in cli_evaluate_single
    results = evaluator.simple_evaluate(
  File "/home/user/MLLM/LLaVA-NeXT/lmms-eval/lmms_eval/utils.py", line 533, in _wrapper
    return fn(*args, **kwargs)
  File "/home/user/MLLM/LLaVA-NeXT/lmms-eval/lmms_eval/evaluator.py", line 177, in simple_evaluate
    lm = lmms_eval.models.get_model(model).create_from_arg_string(
  File "/home/user/MLLM/LLaVA-NeXT/lmms-eval/lmms_eval/api/model.py", line 111, in create_from_arg_string
    return cls(**args, **args2)
  File "/home/user/MLLM/LLaVA-NeXT/lmms-eval/lmms_eval/models/llava_onevision.py", line 148, in __init__
    assert self.batch_size_per_gpu == 1, "Llava currently does not support batched generation. See https://github.com/haotian-liu/LLaVA/issues/754. HF Llava also has this issue."
AssertionError: Llava currently does not support batched generation. See https://github.com/haotian-liu/LLaVA/issues/754. HF Llava also has this issue.
By the way, I am evaluating LLaVA-OneVision on videomme.
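The run is along these lines (the checkpoint name and output path here are illustrative, not my exact paths; --batch_size is what trips the assertion):

python3 -m lmms_eval \
    --model llava_onevision \
    --model_args pretrained=lmms-lab/llava-onevision-qwen2-7b-ov \
    --tasks videomme \
    --batch_size 8 \
    --output_path ./logs/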

CLARKBENHAM (Contributor) commented Apr 17, 2025

There are some issues with their input munging before the model requests are made when batch > 1. It didn't work for me either.

For the ai2d task with the llava model and --model_args pretrained=lmms-lab/llama3-llava-next-8b,conv_template=llava_llama_3, I stepped through it in the debugger.

At models/llava.py:391

cont = self.model.generate(
    input_ids,
    attention_mask=attention_masks,
    pad_token_id=pad_token_ids,
    images=image_tensor,
    image_sizes=gen_kwargs["image_sizes"],
    do_sample=True if gen_kwargs["temperature"] > 0 else False,
    temperature=gen_kwargs["temperature"],
    top_p=gen_kwargs["top_p"],
    num_beams=gen_kwargs["num_beams"],
    max_new_tokens=gen_kwargs["max_new_tokens"],
    use_cache=self.use_cache,
)
text_outputs = self.tokenizer.batch_decode(cont, skip_special_tokens=True)

The inputs seem to be the right shape for batch size 8, but the output tensor is [1, 3] and decodes to a single gibberish string.

> input_ids.shape
torch.Size([8, 234])
> len(image_tensor)
8
> [i.shape for i in image_tensor]
[torch.Size([3, 3, 336, 336]), torch.Size([5, 3, 336, 336]), torch.Size([3, 3, 336, 336]), torch.Size([5, 3, 336, 336]), torch.Size([5, 3, 336, 336]), torch.Size([3, 3, 336, 336]), torch.Size([5, 3, 336, 336]), torch.Size([5, 3, 336, 336])]
> attention_masks.sum(axis=1)
tensor([234, 214, 179, 172, 166, 164, 155, 153], device='cuda:0')
> text_outputs
['\nD']
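
(Side note: batch_decode returns one string per row of the ids it is given, so a [1, 3] cont can only ever yield a single string, no matter how many prompts went into generate(). A quick sketch with an arbitrary tokenizer, gpt2 used purely for illustration:)

from transformers import AutoTokenizer

# batch_decode yields one string per row; a single row -> a single string.
tok = AutoTokenizer.from_pretrained("gpt2")  # any tokenizer demonstrates this
print(tok.batch_decode([[198, 35]], skip_special_tokens=True))  # -> ['\nD']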

Stepping into llava/model/language_model/llava_llama.py:137, I see that the preparation step mangles the dimensions.

> inputs.shape
torch.Size([8, 151])
> position_ids
None
> attention_mask.sum(axis=1)
tensor([151, 148, 142, 142, 140, 139, 139, 137], device='cuda:0')
Then after (inputs, position_ids, attention_mask, _, inputs_embeds, _) = self.prepare_inputs_labels_for_multimodal(inputs, position_ids, attention_mask, None, None, images, modalities, image_sizes=image_sizes):
> inputs_embeds.shape
torch.Size([1, 1426, 4096])
> print(attention_mask.sum(), attention_mask.shape)
tensor(1426, device='cuda:0') torch.Size([1, 1426])
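
So prepare_inputs_labels_for_multimodal is collapsing the batch: 8 padded rows go in, one 1426-token row comes out. For contrast, a minimal sketch of what the re-padding after image-token expansion has to produce (names and shapes here are illustrative, not the actual LLaVA internals):

import torch
from torch.nn.utils.rnn import pad_sequence

# After each <image> token is expanded into its patch embeddings, the
# per-sample sequences have different lengths, so a batched prep step must
# re-pad them into a [batch, max_len, hidden] tensor with a matching mask.
hidden = 4096
per_sample = [torch.randn(n, hidden) for n in (200, 185, 170)]  # ragged lengths

inputs_embeds = pad_sequence(per_sample, batch_first=True)   # [3, 200, 4096]
attention_mask = torch.zeros(inputs_embeds.shape[:2], dtype=torch.long)
for i, seq in enumerate(per_sample):
    attention_mask[i, : seq.shape[0]] = 1

# The buggy path instead returns a single row (the [1, 1426, 4096]
# inputs_embeds above), i.e. the batch dimension is gone by the time
# generate() runs, so only one sequence gets decoded.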
