💥 Expose agent testing utils #1164
        raise NotImplementedError()


class StaticTestModel(TestModel):
I liked the idea of changing this to a static factory on TestModel.
Just pushed the static factory method.
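For readers following the thread, here is a minimal sketch of what a static factory on a test model could look like. It uses stdlib-only stand-ins: FakeResponse and FakeTestModel are hypothetical names invented for illustration and are not the classes in this PR. Only the factory name returning_responses is taken from the test snippet posted later in this conversation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterator, Sequence


@dataclass(frozen=True)
class FakeResponse:
    """Hypothetical stand-in for a model response object."""

    text: str


class FakeTestModel:
    """Hypothetical stand-in for TestModel: each call delegates to a callable."""

    def __init__(self, fn: Callable[[], FakeResponse]) -> None:
        self._fn = fn

    def get_response(self) -> FakeResponse:
        # Ask the wrapped callable for the next canned response.
        return self._fn()

    @staticmethod
    def returning_responses(responses: Sequence[FakeResponse]) -> FakeTestModel:
        """Build a model that replays the given responses in order."""
        remaining: Iterator[FakeResponse] = iter(responses)
        # Each call consumes one response; StopIteration is raised once exhausted.
        return FakeTestModel(lambda: next(remaining))


model = FakeTestModel.returning_responses(
    [FakeResponse("first"), FakeResponse("second")]
)
```

Compared with a dedicated subclass, the factory keeps a single model type and moves the "canned responses" behavior into a constructor-like helper, which seems to be the appeal of the suggestion above.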
It's a bit hard to see from this PR how this looks from a user POV. One reason we did "ActivityEnvironment" and "WorkflowEnvironment" instead of only the building blocks is that users like the simplicity of one-liners and reusable constructs. I'm wondering if there's an opportunity to design something here. If it's not too much trouble, can I see what tests/openai_agents/basic/test_hello_world_workflow.py will look like using these utilities?
Part of me wonders if we can have an AgentEnvironment that basically accepts everything the plugin accepts and also some of this mock stuff. So maybe something like:
from temporalio.contrib.openai_agents.testing import AgentEnvironment

# ...

async def test_hello_world_agent_workflow(client: Client):
    async def on_model_call(req: WhateverOpenAIRequestType) -> WhateverOpenAIResponseType:
        # Do some stuff
        ...

    # on_model_call is just an advanced example, accepting direct mocks can
    # in this constructor be allowed too
    async with AgentEnvironment(on_model_call=on_model_call) as env:
        # Applies plugin and such (which is also available on env.plugin if you want it)
        client = env.applied_on_client(client)
        # Rest of the stuff w/ worker and such
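To make the shape of that proposal concrete, below is a rough, stdlib-only sketch of the pattern: an async context manager that owns a plugin and returns a new client with the plugin applied. Everything here is an assumption for illustration. FakeClient, FakePlugin, and AgentEnvironmentSketch are hypothetical stand-ins, not Temporal or OpenAI Agents APIs, and the request and response types are reduced to plain strings.

```python
import asyncio
from dataclasses import dataclass, replace
from typing import Awaitable, Callable, Tuple

ModelCallback = Callable[[str], Awaitable[str]]


@dataclass(frozen=True)
class FakePlugin:
    """Hypothetical plugin that routes model calls to a user-supplied mock."""

    on_model_call: ModelCallback


@dataclass(frozen=True)
class FakeClient:
    """Hypothetical client; it only tracks which plugins are applied."""

    plugins: Tuple[FakePlugin, ...] = ()


class AgentEnvironmentSketch:
    """Hypothetical environment bundling plugin setup behind one constructor."""

    def __init__(self, on_model_call: ModelCallback) -> None:
        # The plugin is exposed so tests can reach it directly if needed.
        self.plugin = FakePlugin(on_model_call)

    async def __aenter__(self) -> "AgentEnvironmentSketch":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Nothing to clean up in this sketch; a real one might release resources.
        return None

    def applied_on_client(self, client: FakeClient) -> FakeClient:
        # Return a new client rather than mutating the one passed in.
        return replace(client, plugins=client.plugins + (self.plugin,))


async def demo() -> str:
    async def on_model_call(prompt: str) -> str:
        return f"mocked reply to: {prompt}"

    base = FakeClient()
    async with AgentEnvironmentSketch(on_model_call=on_model_call) as env:
        client = env.applied_on_client(base)
        assert base.plugins == ()  # the original client is left untouched
        return await client.plugins[0].on_model_call("hello")
```

The point of the design is that a test configures the mock and the plugin in one place, inside the test body, instead of relying on fixtures defined elsewhere.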
Currently (with the change to static factory method I just pushed), that test would look like:
@pytest.fixture
def test_model():
    return TestModel.returning_responses(
        [ResponseBuilders.output_message("This is a haiku (not really)")]
    )


async def test_execute_workflow(client: Client):
    task_queue_name = str(uuid.uuid4())

    async with Worker(
        client,
        task_queue=task_queue_name,
        workflows=[HelloWorldAgent],
        activity_executor=ThreadPoolExecutor(5),
    ):
        result = await client.execute_workflow(
            HelloWorldAgent.run,
            "Write a recursive haiku about recursive haikus.",
            id=str(uuid.uuid4()),
            task_queue=task_queue_name,
        )
        assert isinstance(result, str)
        assert len(result) > 0
client is a fixture that depends on the test_model fixture, so you can override the test_model fixture per test or per module.
I think for most users this is missing the client and plugin configuration, which I think we should make easy for testers too. To show the full code for comparison, you'd have to include your other fixtures, like client configuration and plugin creation. Those fixtures are a little pytest-specific and external to the test, and not really how we have done test helpers in the past. I guess I was thinking of something you could easily configure inside your test for each test (but still share if you want). Basically you need an easy way to configure an existing client with the plugin and model stuff.
a31afee to d3010d2
What was changed
- Added the temporalio.contrib.openai_agents.test module, for keeping utilities to assist in writing tests of agents.
- Moved TestModel and TestModelProvider from temporalio.contrib.openai_agents to temporalio.contrib.openai_agents.test (this is a breaking change 💥)
- Added StaticTestModel and ResponseBuilders in temporalio.contrib.openai_agents.test
Why?
Writing tests of agentic code requires boilerplate setup of model mocks. This is a first attempt to make this easier for users.
Checklist
How was this tested: updated existing unit tests to use the temporalio.contrib.openai_agents.test module.
Any docs updates needed? No, there are currently no docs for testing utils. Let's add in later PRs after building samples.