Configuring localGPT for production #472
Replies: 4 comments
-
@PromtEngineer Need your input on this!
-
@PromtEngineer It's the same issue that my team and I have.
-
@AnandMoorthy @matheus-mondaini We will need to implement a queue in the API to handle multiple users; it should be relatively easy to implement. I will have a look. Getting back to this project soon.
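A minimal sketch of what that request queue could look like: a single worker thread owns the GPU and processes prompts one at a time, so concurrent Flask requests wait in line instead of all hitting the model at once. The `run_inference` stub and the `/api/prompt` route name are assumptions for illustration, not the actual localGPT API.

```python
import queue
import threading

from flask import Flask, jsonify, request

app = Flask(__name__)

# All requests funnel through this queue; a single worker thread
# serializes access to the GPU-backed model.
_task_queue = queue.Queue()


def run_inference(prompt):
    # Placeholder: swap in the real localGPT pipeline call here.
    return f"echo: {prompt}"


def _worker():
    while True:
        prompt, result_box, done = _task_queue.get()
        try:
            result_box["answer"] = run_inference(prompt)
        except Exception as exc:  # surface model errors to the caller
            result_box["error"] = str(exc)
        finally:
            done.set()
            _task_queue.task_done()


threading.Thread(target=_worker, daemon=True).start()


@app.route("/api/prompt", methods=["POST"])
def prompt_route():
    result_box, done = {}, threading.Event()
    _task_queue.put((request.json["prompt"], result_box, done))
    done.wait()  # block this request until the worker finishes its turn
    status = 500 if "error" in result_box else 200
    return jsonify(result_box), status
```

With this in place, ten simultaneous users simply see longer latency under load rather than a crash, since only one inference ever runs on the card at a time.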
-
Hello @PromtEngineer, do you have any news about this feature? Thanks ;)
-
Hi,
I am planning to configure the project for production, and I expect around 10 people to use it concurrently. My current setup is an RTX 4090 with 24 GB of memory. The Flask app works fine when a single user is using localGPT, but when multiple requests come in at the same time the app crashes.
I also see the GPU go to 100% whenever a request comes in. Is there any way, with the current configuration, to serve around 10 people concurrently? Other suggestions are welcome too :)
Thanks!
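One common way to stop the crashes on a single-GPU box, sketched under assumptions: run the Flask app under a production WSGI server with exactly one worker process (so the model is loaded into the 24 GB card only once) and multiple threads to hold the ~10 concurrent connections. The module path `run_localGPT_API:app` is an assumption about where the Flask `app` object lives in your checkout; adjust it to match.

```shell
# One worker process = one copy of the model on the GPU;
# 10 threads let ~10 users stay connected while requests
# are handled; a long timeout covers slow generations.
gunicorn --workers 1 --threads 10 --timeout 300 run_localGPT_API:app
```

Note that threads alone only keep connections alive; the GPU work itself still needs to be serialized (e.g. with the queue discussed above), otherwise simultaneous inference calls can still exhaust the 24 GB of memory.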