How does the Python Backend create multiple instances with instance_kind: count setting?
#8256
Unanswered
IMG-PRCSNG asked this question in Q&A
Replies: 0 comments
Does the Python backend load the model once and then fork to create the instances? Can we configure it to use the equivalent of the multiprocessing spawn mode?

I am using a custom backend to run some old Caffe models, but when testing with perf_analyzer the throughput stays the same as I increase the instance count. Neither the CPU nor the GPU reaches 100% utilisation.
My guess is that Caffe is imported in the parent process and then somehow shared with the child processes, which would allow only sequential access.
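To make clear what I mean by spawn mode, here is a minimal sketch using only the standard library's multiprocessing module. It is not how the Python backend is implemented (that is exactly what I am asking about), and the names load_and_describe, worker and run_instances are made up for illustration. The json module stands in for a heavy library such as Caffe. The point is that with the spawn start method each child starts a fresh interpreter and imports the library itself, instead of inheriting an already initialised copy from the parent.

```python
import importlib
import multiprocessing as mp
import os


def load_and_describe(module_name):
    """Import the library inside the current process and report who loaded it."""
    # Deferred import: nothing is loaded until this function runs, so a
    # spawned child initialises its own copy instead of inheriting one.
    module = importlib.import_module(module_name)
    return os.getpid(), module.__name__


def worker(module_name, queue):
    """Child entry point (hypothetical): load the library, send back the result."""
    queue.put(load_and_describe(module_name))


def run_instances(count, module_name="json", start_method="spawn"):
    """Start `count` independent processes using the given start method."""
    # get_context returns an object with the same API as the multiprocessing
    # module, bound to one start method ("spawn", "fork" or "forkserver").
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    procs = [
        ctx.Process(target=worker, args=(module_name, queue))
        for _ in range(count)
    ]
    for proc in procs:
        proc.start()
    # Collect one result per child; the timeout avoids hanging forever
    # if a child fails before it can report back.
    results = [queue.get(timeout=30) for _ in procs]
    for proc in procs:
        proc.join()
    return results
```

In a script, run_instances(4) would need to be called under an if __name__ == "__main__": guard, because spawned children re-import the main module. Each returned tuple should then carry a different process ID, showing that every instance did its own import.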