docker-compose multi-gpu example #1127
Replies: 5 comments
-
Since we don't do tensor parallelism at the moment, the only way to utilize multiple GPUs is to split the workload into separate processes on different CUDA devices. In this particular case, you might run one Tabby instance per GPU.
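A minimal sketch of that one-process-per-GPU layout with plain `docker run` (the model name, host ports, and cache paths here are illustrative assumptions, not from the thread):

```shell
# One Tabby instance per GPU, each published on its own host port
# (model name, ports, and cache paths are assumptions)
docker run -d --gpus '"device=0"' -p 8080:8080 \
  -v "$HOME/.tabby-gpu0:/data" \
  tabbyml/tabby serve --model TabbyML/StarCoder-1B --device cuda

docker run -d --gpus '"device=1"' -p 8081:8080 \
  -v "$HOME/.tabby-gpu1:/data" \
  tabbyml/tabby serve --model TabbyML/StarCoder-1B --device cuda
```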
-
@wsxiaoys Any chance I could get a docker-compose example? Not sure how that would work if both containers are using port 8080.
-
Just separating the models into different containers, one per GPU, seems to work. I was hoping for a single model to use two GPUs, but that does not seem possible. For anyone looking for a multi-GPU setup: run one container per GPU, each published on its own host port.
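The original compose file did not survive here, but a sketch of that setup might look like the following (image tag, model name, device IDs, ports, and volume paths are all assumptions):

```yaml
# Sketch: two Tabby replicas, each pinned to one GPU via device_ids
services:
  tabby-gpu0:
    image: tabbyml/tabby
    command: serve --model TabbyML/StarCoder-1B --device cuda
    ports:
      - "8080:8080"
    volumes:
      - "$HOME/.tabby-gpu0:/data"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]

  tabby-gpu1:
    image: tabbyml/tabby
    command: serve --model TabbyML/StarCoder-1B --device cuda
    ports:
      - "8081:8080"
    volumes:
      - "$HOME/.tabby-gpu1:/data"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]
              capabilities: [gpu]
```

The `device_ids` reservation is the Compose equivalent of `--gpus '"device=N"'`; publishing the second replica on host port 8081 avoids the 8080 conflict mentioned earlier.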
-
Putting a reverse proxy (e.g. Caddy) in front is a natural choice in this case; on the other hand, if you're interested in Tabby's built-in distributed worker support...
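A sketch of that reverse-proxy front as a Caddyfile, assuming two Tabby replicas reachable as `tabby-gpu0` and `tabby-gpu1` on port 8080 inside the compose network (the service names, ports, and policy are assumptions):

```
# Caddy load-balancing across two Tabby replicas on a single entry port
:8000 {
    reverse_proxy tabby-gpu0:8080 tabby-gpu1:8080 {
        lb_policy round_robin
    }
}
```

Clients then point at a single endpoint (`:8000`) while requests alternate between the two GPU-backed replicas.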
-
A blog post has been added here: https://tabby.tabbyml.com/blog/2024/03/26/tabby-with-replicas-behind-reverse-proxy
-
Can anyone give an example on how to get TabbyML working with two GPUs?
I have two 3060s. This is what I have so far, but I'm not sure whether TabbyML is actually using both GPUs.
I also get this error: