Replies: 3 comments
@krrishdholakia for your awareness, is it expected that |
A simple local test setup is:

    import asyncio
    import time

    import litellm

    concurrent_requests = 100


    # Reference coroutine (not called below): a non-blocking mock delay.
    async def acompletion_with_async_sleep(mock_delay):
        await asyncio.sleep(mock_delay)
        return "pong"


    async def acompletion(i):
        start_time = time.time()
        print(f"Task {i}: Starting")
        response = await litellm.acompletion(
            model="azure/gpt-4",
            messages=[{"role": "user", "content": "ping"}],
            mock_response=True,
            mock_delay=4,
        )
        print(f"Task {i}: Finished in {round(time.time() - start_time, 2)} seconds")
        return response


    # Top-level await: run this in a notebook or the `python -m asyncio` REPL;
    # in a plain script, wrap these two lines in a coroutine and use asyncio.run().
    tasks = [acompletion(i) for i in range(concurrent_requests)]
    results = await asyncio.gather(*tasks)

with output:
But when using |
Async `acompletion` runs the sync `completion` in the event loop's executor, as a partial with all call kwargs.

If we set `mock_response` to `True` and specify a `mock_delay`, then `completion` does a sync `time.sleep` for `time_delay`. This blocks the event loop when making multiple `acompletion` calls with `mock_delay` concurrently.

We should instead use `asyncio.sleep` for the case of mocking an async response, to make sure the event loop is not blocked and concurrent calls can be mocked.
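To illustrate the difference described above, here is a minimal sketch that does not use litellm at all. The names `mock_blocking`, `mock_non_blocking`, `timed_gather`, and `compare` are hypothetical stand-ins, not litellm functions; the sketch only assumes that a sync `time.sleep` ends up running on the event loop thread, as the post suggests, and compares that with `asyncio.sleep`.

```python
import asyncio
import time


async def mock_blocking(delay: float) -> str:
    # Hypothetical stand-in: a sync sleep inside a coroutine holds the
    # event loop, so no other task can make progress until it returns.
    time.sleep(delay)
    return "pong"


async def mock_non_blocking(delay: float) -> str:
    # asyncio.sleep yields to the event loop, so other tasks run meanwhile.
    await asyncio.sleep(delay)
    return "pong"


async def timed_gather(mock, n: int, delay: float) -> float:
    # Run n mocked calls concurrently and return the elapsed wall time.
    start = time.perf_counter()
    await asyncio.gather(*(mock(delay) for _ in range(n)))
    return time.perf_counter() - start


def compare(n: int = 5, delay: float = 0.05) -> tuple[float, float]:
    # Measure both variants with the same number of calls and delay.
    blocking = asyncio.run(timed_gather(mock_blocking, n, delay))
    non_blocking = asyncio.run(timed_gather(mock_non_blocking, n, delay))
    return blocking, non_blocking


if __name__ == "__main__":
    blocking, non_blocking = compare()
    print(f"time.sleep:    {blocking:.2f}s")
    print(f"asyncio.sleep: {non_blocking:.2f}s")
```

With the blocking variant the total time grows roughly as `n * delay`, because the sleeps run one after another; with `asyncio.sleep` it should stay close to a single `delay`, which is the behavior one would want from a mocked async response.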