Add async support for dspy.Evaluate #8504


Open · wants to merge 1 commit into base: main
Conversation

chenmoneygithub (Collaborator)
We are introducing:

  • _execute_with_multithreading for running evaluation with multithreading
  • _execute_with_event_loop for running evaluation with async concurrency
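For context, this is roughly the shape of the two execution paths — a minimal sketch with stand-in names (`execute_with_multithreading`, `execute_with_event_loop`, `score`, `ascore` are all hypothetical here, not the PR's actual code):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def execute_with_multithreading(fn, devset, num_threads):
    # Fan examples out to a thread pool; each worker thread blocks on I/O.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(fn, devset))


async def execute_with_event_loop(afn, devset, num_threads):
    # Bound concurrency with a semaphore; all coroutines share one thread.
    sem = asyncio.Semaphore(num_threads)

    async def run_one(example):
        async with sem:
            return await afn(example)

    return await asyncio.gather(*(run_one(ex) for ex in devset))


def score(example):  # stand-in for a blocking evaluation call
    return example * 2


async def ascore(example):  # stand-in for an async evaluation call
    return example * 2


print(execute_with_multithreading(score, [1, 2, 3], num_threads=2))
print(asyncio.run(execute_with_event_loop(ascore, [1, 2, 3], num_threads=2)))
```

Both paths produce the same results; they differ only in how concurrent I/O waits are overlapped (OS threads vs. a single event loop).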

The weird part is that although async eval should in theory run faster than multithreading — evaluation is an I/O-bound task — I am not seeing a consistent speedup.

The testing script is pasted below:

import asyncio

import dspy
from dspy.datasets.gsm8k import GSM8K

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini", cache=False))


# Load math questions from the GSM8K dataset.
gsm8k = GSM8K()
gsm8k_trainset, gsm8k_devset = gsm8k.train[:50], gsm8k.dev[:100]


cot = dspy.ChainOfThought("question->answer")


def my_metric(example, pred):
    # Score 1.0 when the predicted answer exactly matches the gold answer.
    return 1.0 if pred.answer == example.answer else 0.0


evaluator = dspy.Evaluate(devset=gsm8k_devset, num_threads=50, display_table=False)


import time

start_time = time.time()
result = evaluator(cot, metric=my_metric)
end_time = time.time()
print(f"Time taken with multithreading: {end_time - start_time} seconds")
print(result)


async def main():
    return await evaluator.acall(cot, metric=my_metric)


start_time = time.time()
result = asyncio.run(main())
end_time = time.time()
print(f"Time taken with async: {end_time - start_time} seconds")
print(result)

About 70% of the time async runs marginally faster than multithreading, and 30% of the time it's the reverse. Two potential theories:

  • The bottleneck is on the provider side, e.g., rate limiting.
  • Behind the scenes litellm's acompletion is not truly async; I need to dig into the code a bit.
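The second theory is easy to reason about in isolation: if a coroutine blocks the event loop instead of awaiting, asyncio.gather degrades to sequential execution and the async path loses its advantage. A self-contained demonstration (unrelated to litellm itself, which would need network access):

```python
import asyncio
import time


async def fake_async_call():
    time.sleep(0.1)  # blocks the event loop: "not true async"
    return 1


async def true_async_call():
    await asyncio.sleep(0.1)  # yields to the loop while waiting
    return 1


async def timed(coro_fn, n=10):
    start = time.perf_counter()
    await asyncio.gather(*(coro_fn() for _ in range(n)))
    return time.perf_counter() - start


fake = asyncio.run(timed(fake_async_call))  # ~1.0 s: calls serialize
true = asyncio.run(timed(true_async_call))  # ~0.1 s: calls overlap
print(f"fake={fake:.2f}s true={true:.2f}s")
```

If acompletion hides a blocking call somewhere, the async evaluator would behave like the `fake_async_call` case and show no speedup over (or even regress against) the thread pool.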

queue.task_done()

workers = [asyncio.create_task(worker()) for _ in range(num_threads)]
await asyncio.gather(*workers)
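For reference, a self-contained version of this queue/worker pattern (simplified, not the PR's exact code), instrumented with threading.get_ident(): asyncio.create_task plus asyncio.gather schedules every worker on the event loop's single thread — no extra threads are involved.

```python
import asyncio
import threading


async def run_with_workers(items, num_workers):
    queue = asyncio.Queue()
    for item in items:
        queue.put_nowait(item)
    results, thread_ids = [], set()

    async def worker():
        # Drain the shared queue until it is empty.
        while True:
            try:
                item = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            thread_ids.add(threading.get_ident())
            results.append(item * 2)
            queue.task_done()

    workers = [asyncio.create_task(worker()) for _ in range(num_workers)]
    await asyncio.gather(*workers)
    return results, thread_ids


results, thread_ids = asyncio.run(run_with_workers(range(5), num_workers=3))
print(len(thread_ids))  # 1: all workers share the event loop's thread
```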
TomeHirata (Collaborator) commented on Jul 8, 2025:

q: Does asyncio.gather use multiple threads with this setup?
