[evaluation] refactor _evaluate_query parameter to kwargs #42168

Merged
slister1001 merged 8 commits into main from copilot/fix-842ec8e6-fb2d-4b07-a7f6-6ff4af53d525 on Jul 24, 2025

Conversation

Copilot
Contributor

@Copilot Copilot AI commented Jul 22, 2025

Problem

The _evaluate_query parameter in the RAI service evaluators was exposed only as a private, underscore-prefixed parameter, so users had to pass a private keyword argument to reach it. Relying on a private name for supported functionality is poor API design.

Solution

Refactored the _evaluate_query parameter to a public evaluate_query parameter across all RAI service evaluators while maintaining full backward compatibility.

Changes Made

Public API Change

  • Before: _evaluate_query=True (private parameter)
  • After: evaluate_query=True (public parameter)

Affected Components

  • Base Class: Updated RaiServiceEvaluatorBase with public parameter and backward compatibility logic
  • 11 Evaluators Updated:
    • ContentSafetyEvaluator
    • ViolenceEvaluator
    • SexualEvaluator
    • SelfHarmEvaluator
    • HateUnfairnessEvaluator
    • ProtectedMaterialEvaluator
    • IndirectAttackEvaluator (XPIA)
    • CodeVulnerabilityEvaluator
    • UngroundedAttributesEvaluator
    • GroundednessProEvaluator
    • EciEvaluator

Usage Examples

# ✅ NEW (Recommended) - Public API
evaluator = ViolenceEvaluator(
    credential=credential,
    azure_ai_project=project,
    evaluate_query=True  # Public parameter
)

# ⚠️ OLD (Deprecated but still works) - Shows deprecation warning
evaluator = ViolenceEvaluator(
    credential=credential,
    azure_ai_project=project,  
    _evaluate_query=True  # Private parameter (deprecated)
)

Backward Compatibility

  • Old _evaluate_query parameter continues to work but shows a deprecation warning
  • Error is raised if both evaluate_query and _evaluate_query are specified
  • All existing code continues to work unchanged
  • Internal implementation (self._evaluate_query) remains the same
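
The compatibility rules above could be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: the class name ExampleEvaluatorBase, the default of False, and the use of ValueError and DeprecationWarning are all assumptions made for the example.

```python
import warnings
from typing import Optional


class ExampleEvaluatorBase:
    """Hypothetical stand-in for a base evaluator (not the real SDK class)."""

    def __init__(
        self,
        evaluate_query: Optional[bool] = None,
        _evaluate_query: Optional[bool] = None,
    ) -> None:
        # Reject ambiguous calls that pass both the new and the old name.
        if evaluate_query is not None and _evaluate_query is not None:
            raise ValueError(
                "Specify only one of 'evaluate_query' and '_evaluate_query'."
            )
        # Honor the old name, but warn that it is deprecated.
        if _evaluate_query is not None:
            warnings.warn(
                "'_evaluate_query' is deprecated; use 'evaluate_query' instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            evaluate_query = _evaluate_query
        # The internal attribute keeps its original name.
        # Defaulting to False is an assumption of this sketch.
        self._evaluate_query = bool(evaluate_query) if evaluate_query is not None else False
```

Using None as the sentinel for both parameters is what lets the constructor tell "not passed" apart from an explicit False.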

Benefits

  • ✅ Follows proper public API design principles
  • ✅ Makes the parameter discoverable through IDEs and documentation
  • ✅ Maintains full backward compatibility with deprecation path
  • ✅ Consistent across all RAI service evaluators

The parameter controls whether queries are included in evaluation data when evaluating query-response pairs. With this change, users no longer need to rely on private kwargs to access this important functionality.
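
As a rough illustration of what the flag controls, the sketch below builds the data sent for evaluation with or without the query. The helper build_eval_payload and the dictionary shape are hypothetical; the real service payload may look different.

```python
from typing import Dict


def build_eval_payload(query: str, response: str, evaluate_query: bool) -> Dict[str, str]:
    """Hypothetical helper: assemble the data to evaluate for one query-response pair."""
    # The response is always evaluated.
    payload = {"response": response}
    # The query is included only when the caller opts in.
    if evaluate_query:
        payload["query"] = query
    return payload
```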



Copilot AI and others added 2 commits July 22, 2025 20:17
…ward compatibility

Co-authored-by: slister1001 <103153180+slister1001@users.noreply.github.com>
…efactoring

Co-authored-by: slister1001 <103153180+slister1001@users.noreply.github.com>
@Copilot Copilot AI changed the title [WIP] Find a different way to expose the _evaluate_query parameter without using private kwargs Refactor _evaluate_query parameter from private to public API across RAI service evaluators Jul 22, 2025
@Copilot Copilot AI requested a review from slister1001 July 22, 2025 20:22
Copilot finished work on behalf of slister1001 July 22, 2025 20:22
@slister1001
Member

Instead of making it public, make it a kwargs

…plicit parameter

Co-authored-by: slister1001 <103153180+slister1001@users.noreply.github.com>
Contributor Author

Copilot AI commented Jul 22, 2025

Instead of making it public, make it a kwargs

Changed the approach to use kwargs pattern instead of an explicit public parameter. The evaluate_query parameter is now handled via **kwargs in all RAI service evaluators, keeping the API signatures cleaner while maintaining full backward compatibility. Updated in commit 0470066.

Copilot finished work on behalf of slister1001 July 22, 2025 21:08
@slister1001
Member

we don't need to maintain backward compatibility since we never released _evaluate_query. also please update the changelog to reflect the updated behavior

…e changelog

Co-authored-by: slister1001 <103153180+slister1001@users.noreply.github.com>
Contributor Author

Copilot AI commented Jul 23, 2025

we don't need to maintain backward compatibility since we never released _evaluate_query. also please update the changelog to reflect the updated behavior

Removed backward compatibility logic for _evaluate_query parameter since it was never released, and updated the changelog to reflect evaluate_query as a new feature rather than a breaking change. The implementation is now simplified to only handle evaluate_query from kwargs without deprecation warnings. Changes in commit 7a9f7cc.
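
The simplified handling described here might look something like the sketch below. ExampleRaiEvaluator is a hypothetical class, and the default of False is an assumption; only the idea of reading evaluate_query from kwargs, with no deprecation path, comes from this thread.

```python
from typing import Any


class ExampleRaiEvaluator:
    """Hypothetical evaluator (not the real SDK class)."""

    def __init__(self, **kwargs: Any) -> None:
        # Read the option from kwargs; there is no legacy '_evaluate_query'
        # branch and no warning. The False default is an assumption.
        self._evaluate_query = kwargs.get("evaluate_query", False)


# Callers opt in with a plain keyword argument.
evaluator = ExampleRaiEvaluator(evaluate_query=True)
```

Keeping the option in kwargs leaves the constructor signature unchanged, at the cost of the option not appearing in the signature itself.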

Copilot finished work on behalf of slister1001 July 23, 2025 14:12
Co-authored-by: slister1001 <103153180+slister1001@users.noreply.github.com>
@Copilot Copilot AI requested a review from slister1001 July 23, 2025 14:30
Copilot finished work on behalf of slister1001 July 23, 2025 14:30
Copilot finished work on behalf of slister1001 July 23, 2025 14:38
@slister1001 slister1001 changed the title Refactor _evaluate_query parameter from private to public API across RAI service evaluators Refactor _evaluate_query parameter to kwargs Jul 23, 2025
@slister1001 slister1001 changed the title Refactor _evaluate_query parameter to kwargs [evaluation] refactor _evaluate_query parameter to kwargs Jul 23, 2025

github-actions bot commented Jul 23, 2025

API Change Check

APIView identified API level changes in this PR and created the following API reviews

azure-ai-evaluation

@slister1001 slister1001 marked this pull request as ready for review July 23, 2025 19:02
@Copilot Copilot AI review requested due to automatic review settings July 23, 2025 19:02
@slister1001 slister1001 requested a review from a team as a code owner July 23, 2025 19:02
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR replaces the private _evaluate_query parameter (prefixed with underscore) with an evaluate_query option read from kwargs across all RAI service evaluators. Because _evaluate_query was never released, no backward compatibility logic is kept.

Key changes:

  • Replaced the private _evaluate_query parameter with evaluate_query, passed via kwargs, across 11 RAI service evaluators
  • Updated base class RaiServiceEvaluatorBase to extract evaluate_query from kwargs
  • Modified ContentSafetyEvaluator to properly extract and pass the evaluate_query parameter to its constituent evaluators

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| _base_rai_svc_eval.py | Updated base class to extract evaluate_query from kwargs instead of explicit parameter |
| _content_safety.py | Updated docstring and refactored to extract evaluate_query from kwargs for sub-evaluators |
| _violence.py, _sexual.py, _self_harm.py, _hate_unfairness.py | Removed explicit _evaluate_query parameter, now using kwargs |
| _service_groundedness.py | Removed _evaluate_query parameter from constructor |
| _protected_material.py, _code_vulnerability.py, _ungrounded_attributes.py | Added default value logic for evaluate_query in kwargs |
| xpia.py, _eci.py | Simplified constructor to use kwargs instead of explicit parameter |
| CHANGELOG.md | Updated to reflect the new public parameter addition |
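
Two of the patterns in this summary, per-evaluator default logic and a composite evaluator forwarding the option to its sub-evaluators, can be sketched as below. All class names are hypothetical, and the specific defaults (False in general, True for the "default on" evaluator) are assumptions rather than the SDK's documented behavior.

```python
from typing import Any, List


class SubEvaluator:
    """Hypothetical sub-evaluator that reads the option from kwargs."""

    def __init__(self, **kwargs: Any) -> None:
        self._evaluate_query = kwargs.get("evaluate_query", False)


class DefaultOnEvaluator(SubEvaluator):
    """Hypothetical evaluator with its own default for evaluate_query."""

    def __init__(self, **kwargs: Any) -> None:
        # Apply the default only when the caller did not pass a value.
        kwargs.setdefault("evaluate_query", True)
        super().__init__(**kwargs)


class CompositeEvaluator:
    """Hypothetical composite that forwards the option to its parts."""

    def __init__(self, **kwargs: Any) -> None:
        # Extract the option once, then pass the same value to each part.
        evaluate_query = kwargs.pop("evaluate_query", False)
        self._evaluators: List[SubEvaluator] = [
            SubEvaluator(evaluate_query=evaluate_query),
            SubEvaluator(evaluate_query=evaluate_query),
        ]
```

Using setdefault keeps an explicit evaluate_query=False from the caller intact, which a plain assignment would overwrite.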

@slister1001 slister1001 enabled auto-merge (squash) July 24, 2025 15:33
@slister1001 slister1001 merged commit 89c53e7 into main Jul 24, 2025
20 checks passed
@slister1001 slister1001 deleted the copilot/fix-842ec8e6-fb2d-4b07-a7f6-6ff4af53d525 branch July 24, 2025 15:33