# BugBash - PR #40894
## Conversation
### Pull Request Overview
This PR introduces several new sample scripts and documentation updates to demonstrate the use of Azure AI Evaluation APIs, including simulation, content safety evaluation, and direct evaluator usage.
- Added simulation_and_eval.py that integrates the AdversarialSimulator and evaluation API.
- Added content_safety_using_evaluate_api.py and content_safety_evaluator.py to demonstrate different content safety evaluation approaches.
- Updated bugbash_instructions.md with setup and usage instructions.
### Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| sdk/evaluation/azure-ai-evaluation/samples/onedp/simulation_and_eval.py | Introduces a simulation example with evaluation API usage and writes output to a file. |
| sdk/evaluation/azure-ai-evaluation/samples/onedp/content_safety_using_evaluate_api.py | Adds a sample script to call the Evaluate API for content safety evaluation. |
| sdk/evaluation/azure-ai-evaluation/samples/onedp/content_safety_evaluator.py | Provides a standalone example for using the ContentSafetyEvaluator. |
| sdk/evaluation/azure-ai-evaluation/samples/onedp/bugbash_instructions.md | Supplies detailed instructions and prerequisites for the bug bash. |
Files not reviewed (1)
- sdk/evaluation/azure-ai-evaluation/samples/onedp/oai-integration-testing/test_eval_input.jsonl: Language not supported
```python
    session_state: Any = None,
    context: Dict[str, Any] = None,
) -> dict:
    query = messages["messages"][0]["content"]
```
The 'messages' parameter is annotated as a List[Dict] but is accessed as a dictionary with a 'messages' key. Consider updating the type annotation to Dict[str, List[Dict]] (or adjusting the usage) to prevent potential runtime errors.
```python
with open(path, "w") as file:
    file.write(JsonLineChatProtocol(simulator_output[0]).to_eval_qr_json_lines())
```
Accessing simulator_output[0] without checking if simulator_output contains any elements might lead to an IndexError if the output is empty. It is recommended to validate the output before indexing.
Suggested change:

```python
if not simulator_output:
    raise ValueError("Simulator output is empty. Cannot write to file.")
with open(path, "w") as file:
    file.write(JsonLineChatProtocol(simulator_output[0]).to_eval_qr_json_lines())
```
**API change check:** API changes are not detected in this pull request.