How to use two models in the same inference code #2991
siddhantwaghjale asked in Q&A

I'm trying to run inference with two models in the same code using vLLM, but loading the second model fails with:

`AssertionError: tensor model parallel group is already initialized.`

Any help will be appreciated.
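For context, a minimal sketch of the failing pattern, assuming vLLM's offline `LLM` API (the model names are placeholders):

```python
from vllm import LLM, SamplingParams

# The first model loads and generates fine.
llm_a = LLM(model="facebook/opt-125m")
out = llm_a.generate(["Hello"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)

# Constructing a second LLM in the same process tries to initialize the
# distributed (tensor model parallel) state again and raises:
#   AssertionError: tensor model parallel group is already initialized.
llm_b = LLM(model="facebook/opt-350m")
```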
Replies: 1 comment
I think it's not feasible with vLLM currently (please correct me if I'm wrong), but you can try searching for "LLM gateway" on GitHub.
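One way to read that suggestion: run each model in its own process and route requests between them, since each process then initializes its own tensor model parallel group. A minimal sketch, assuming each model is served by its own vLLM OpenAI-compatible server; the model names, ports, and the `complete` helper below are illustrative, not vLLM APIs:

```python
# Assumes two servers were started separately, e.g. with vLLM's
# OpenAI-compatible entrypoint (exact flags may vary by version):
#   python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m --port 8000
#   python -m vllm.entrypoints.openai.api_server --model facebook/opt-350m --port 8001
import requests

# Map a logical name to each backend; names and ports are placeholders.
BACKENDS = {
    "opt-125m": ("http://localhost:8000/v1/completions", "facebook/opt-125m"),
    "opt-350m": ("http://localhost:8001/v1/completions", "facebook/opt-350m"),
}

def complete(backend: str, prompt: str, max_tokens: int = 64) -> str:
    """Forward a completion request to the chosen vLLM server."""
    url, model = BACKENDS[backend]
    resp = requests.post(url, json={
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
    })
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]

print(complete("opt-125m", "Hello, my name is"))
print(complete("opt-350m", "Hello, my name is"))
```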