BLS scripting: executing / submitting multiple requests at once in a single-shot? #7928
Unanswered
vadimkantorov
asked this question in Q&A
Replies: 1 comment
Currently I'm constructing individual …
Does BLS support submitting multiple InferenceRequests at once through a single API call (e.g. to save some gRPC round trips and avoid blocking networking threads per request)? Or is there no point in this?
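For reference, the closest pattern I'm aware of on the BLS side is issuing several requests concurrently from an `async def execute` with `async_exec()` and collecting them with `asyncio.gather`. A minimal sketch, where `bert_classifier`, `TEXT`, `LOGITS`, and `OUTPUT` are placeholder names:

```python
import asyncio

import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    async def execute(self, requests):
        responses = []
        for request in requests:
            # "TEXT" is a placeholder input name on this BLS model.
            text = pb_utils.get_input_tensor_by_name(request, "TEXT")

            # Build several BLS requests against the same downstream model.
            bls_requests = [
                pb_utils.InferenceRequest(
                    model_name="bert_classifier",        # placeholder model name
                    requested_output_names=["LOGITS"],   # placeholder output name
                    inputs=[text],
                )
                for _ in range(4)
            ]

            # async_exec() returns an awaitable, so all four requests are in
            # flight concurrently; gather() waits for every response.
            bls_responses = await asyncio.gather(
                *[r.async_exec() for r in bls_requests]
            )

            logits = [
                pb_utils.get_output_tensor_by_name(r, "LOGITS").as_numpy()
                for r in bls_responses
            ]
            output = pb_utils.Tensor("OUTPUT", np.stack(logits))
            responses.append(pb_utils.InferenceResponse(output_tensors=[output]))
        return responses
```

This still issues one BLS request per downstream inference, though; it just overlaps them rather than packing them into a single submission.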
Also, is there support in InferenceClient for submitting multiple requests to the same model in a single gRPC round trip, i.e. packing them into one gRPC request to reduce overhead?
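On the client side, the closest thing I know of today is firing several `async_infer()` calls over one gRPC connection and collecting the results via callbacks; each call is still its own gRPC request, but none of them blocks waiting for a response. A rough sketch, assuming placeholder model/tensor names (`bert_classifier`, `TEXT`, `LOGITS`):

```python
import queue
from functools import partial

import numpy as np
import tritonclient.grpc as grpcclient


def on_complete(results, result, error):
    # The gRPC client invokes the callback with (result, error) per request.
    results.put((result, error))


client = grpcclient.InferenceServerClient(url="localhost:8001")
texts = ["first short string", "second one", "third one"]
results = queue.Queue()

# One async_infer() per string: all requests share the same gRPC channel and
# none of the calls blocks waiting for its response.
for text in texts:
    data = np.array([text.encode("utf-8")], dtype=np.object_)
    inp = grpcclient.InferInput("TEXT", list(data.shape), "BYTES")  # placeholder tensor name
    inp.set_data_from_numpy(data)
    client.async_infer(
        model_name="bert_classifier",                         # placeholder model name
        inputs=[inp],
        callback=partial(on_complete, results),
        outputs=[grpcclient.InferRequestedOutput("LOGITS")],  # placeholder output name
    )

# Drain the responses as they arrive.
for _ in texts:
    result, error = results.get()
    if error is not None:
        raise error
    print(result.as_numpy("LOGITS"))
```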
At least from a DX/UX standpoint, this would be a useful mode to support in the frontend methods IMO, and it could be useful for text models, e.g. BERT-based classification of multiple short strings (see the batching sketch below).
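For that BERT use case specifically, one way to get several short strings into a single gRPC round trip today is to pack them along the batch dimension of one request, assuming the model config has `max_batch_size > 0`; a sketch with the same placeholder names:

```python
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")
texts = ["first short string", "second one", "third one"]

# Pack all strings along the batch dimension of one request, so a single
# gRPC round trip carries the whole batch (assumes max_batch_size > 0 in
# the model config).
data = np.array([[t.encode("utf-8")] for t in texts], dtype=np.object_)  # shape [3, 1]
inp = grpcclient.InferInput("TEXT", list(data.shape), "BYTES")  # placeholder tensor name
inp.set_data_from_numpy(data)

result = client.infer(
    model_name="bert_classifier",                          # placeholder model name
    inputs=[inp],
    outputs=[grpcclient.InferRequestedOutput("LOGITS")],   # placeholder output name
)
print(result.as_numpy("LOGITS").shape)  # one row of logits per input string
```

That only works when the inputs can share one batch shape, though, so it doesn't cover the general "many independent requests in one round trip" case I'm asking about.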
Thanks!