Judge create endpoint #42

Merged
merged 4 commits into from
Jun 5, 2025
Conversation

kkjarvis
Contributor

@kkjarvis kkjarvis commented Jun 4, 2025

Summary by CodeRabbit

  • New Features

    • Added support for specifying an intent when creating a judge.
    • Increased maximum allowed length for certain text fields in judge and evaluator requests.
    • Introduced an example script demonstrating how to create and run a custom judge using the API.
    • Enabled synchronous and asynchronous judge creation methods in the client library.
  • Documentation

    • Updated API documentation to clarify judge creation requirements and field constraints.
    • Added a new example section on creating a judge with code samples.


coderabbitai bot commented Jun 4, 2025

Walkthrough

A new example script demonstrates creating and running a custom judge using the Root Signals API. The OpenAPI specification is updated to clarify judge creation, increase string length limits, and add a required intent property for judges. The Judges class gains synchronous and asynchronous methods for judge creation.

Changes

File(s) | Change Summary
examples/create_judge.py | Added a new example script showing how to create and run a judge using the Root Signals API.
openapi.yaml | Updated judge/evaluator schemas: added required intent, increased string length limits, made some fields optional, clarified descriptions, and removed nullable from date-time fields.
src/root/judges.py | Added create and acreate methods to the Judges class for synchronous and asynchronous judge creation.
docs/examples.md | Added a new "Creating a judge" section with example usage referencing create_judge.py.

Sequence Diagram(s)

sequenceDiagram
    participant User Script
    participant JudgesClass
    participant RootSignalsAPI

    User Script->>JudgesClass: create(name, intent, evaluator_references)
    JudgesClass->>RootSignalsAPI: POST /v1/judges/ (with name, intent, evaluators)
    RootSignalsAPI-->>JudgesClass: Judge object
    JudgesClass-->>User Script: Judge instance

    User Script->>JudgesClass: run(judge, request, response, context)
    JudgesClass->>RootSignalsAPI: Evaluate judge (with input data)
    RootSignalsAPI-->>JudgesClass: Evaluation result
    JudgesClass-->>User Script: Result output
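
The call order in the diagram can be sketched with a stand-in client. The `FakeJudges` class below is hypothetical and in-memory only; it is not the Root Signals SDK and makes no network calls. The fields on the returned objects (`id`, `evaluators_run`, `inputs`) and the `{"id": ...}` shape of each evaluator reference are assumptions for illustration. Only the method names and the argument names `name`, `intent`, `evaluator_references`, `request`, `response`, and `context` come from the diagram.

```python
from dataclasses import dataclass
from typing import Optional
import uuid


@dataclass
class Judge:
    """Hypothetical stand-in for the judge object returned by the API."""
    id: str
    name: str
    intent: str
    evaluator_references: list


class FakeJudges:
    """In-memory stand-in for the Judges class; it makes no network calls."""

    def __init__(self) -> None:
        self._store: dict = {}

    def create(self, name: str, intent: str, evaluator_references: list) -> Judge:
        # The real client would send POST /v1/judges/ at this point.
        judge = Judge(str(uuid.uuid4()), name, intent, list(evaluator_references))
        self._store[judge.id] = judge
        return judge

    def run(
        self,
        judge: Judge,
        request: str,
        response: str,
        context: Optional[list] = None,
    ) -> dict:
        # The real client would ask the API to evaluate; this stub echoes its inputs.
        stored = self._store[judge.id]
        return {
            "judge_id": stored.id,
            "evaluators_run": len(stored.evaluator_references),
            "inputs": {"request": request, "response": response, "context": context or []},
        }


judges = FakeJudges()
judge = judges.create(
    name="Returns policy judge",
    intent="Check that answers about returns are accurate and polite",
    evaluator_references=[{"id": "evaluator-1"}, {"id": "evaluator-2"}],
)
result = judges.run(judge, request="Can I return shoes?", response="Yes, within 30 days.")
print(result["evaluators_run"])
```

The stub keeps creation and execution as two separate steps, mirroring the two round trips shown in the diagram.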

Poem

In the garden of code, a new judge takes the stand,
With intent and evaluators, all carefully planned.
The API grows stronger, its schema refined,
Longer strings and clear purpose, all well-aligned.
As rabbits we cheer, with a hop and a nudge—
For now, you can easily create your own judge! 🐇


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7200c66 and e45a4da.

📒 Files selected for processing (2)
  • docs/examples.md (1 hunks)
  • examples/create_judge.py (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • docs/examples.md
  • examples/create_judge.py

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
examples/create_judge.py (2)

1-1: Add module docstring
Include a brief module-level docstring summarizing the purpose of this example script.

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 1-1: Missing module docstring

(C0114)


[error] 1-1: Unable to import 'root'

(E0401)


22-26: Refactor long string literals
Lines 22–23 and 25 exceed the recommended line length. Consider using triple-quoted strings or concatenation with parentheses to improve readability.

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 25-25: Line too long (137/100)

(C0301)
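
One way to apply this suggestion is implicit concatenation inside parentheses, which keeps each source line short without changing the resulting string. The text below is a placeholder; the actual strings in create_judge.py are not shown in this review.

```python
# Placeholder text standing in for a long literal that exceeds the line limit.
long_line = "You asked about our return policy. Items can be returned within 30 days of purchase if unused."

# Adjacent string literals inside parentheses are joined into one string,
# so no newline or extra whitespace is introduced.
wrapped = (
    "You asked about our return policy. "
    "Items can be returned within 30 days of purchase if unused."
)

print(wrapped == long_line)
```

A triple-quoted string, by contrast, keeps the line breaks and any indentation as part of the value, so it only fits when that extra whitespace is acceptable.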

src/root/judges.py (1)

308-339: Fix line length formatting issue.

The asynchronous acreate method is functionally correct and follows the established async patterns, but line 337 exceeds the 100-character limit (103/100).

Apply this diff to fix the line length issue:

-        return await AJudge._awrap(
-            await api_instance.judges_create(judge_request=request, _request_timeout=_request_timeout),
-            client_context=self.client_context,
-        )
+        result = await api_instance.judges_create(
+            judge_request=request, _request_timeout=_request_timeout
+        )
+        return await AJudge._awrap(result, client_context=self.client_context)
🧰 Tools
🪛 Pylint (3.3.7)

[convention] 337-337: Line too long (103/100)

(C0301)


[warning] 336-336: Access to a protected member _awrap of a client class

(W0212)
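
The shape of the suggested refactoring can be checked in isolation with stand-in coroutines. `fake_judges_create` and `fake_awrap` below are hypothetical placeholders for the generated API call and `AJudge._awrap`; they are assumptions for illustration, and only the nested-versus-split structure is taken from the diff.

```python
import asyncio
from typing import Optional


async def fake_judges_create(
    judge_request: dict, _request_timeout: Optional[float] = None
) -> dict:
    # Stand-in for the generated API call; returns the request with an id attached.
    await asyncio.sleep(0)
    return {"id": "judge-123", **judge_request}


async def fake_awrap(raw: dict, client_context: str) -> dict:
    # Stand-in for the wrapper that attaches client context to the raw response.
    return {"judge": raw, "client_context": client_context}


async def acreate_nested(request: dict) -> dict:
    # Original shape: the inner await is nested inside the outer call.
    return await fake_awrap(
        await fake_judges_create(judge_request=request, _request_timeout=None),
        client_context="ctx",
    )


async def acreate_split(request: dict) -> dict:
    # Suggested shape: await the API call first, then wrap the result.
    result = await fake_judges_create(judge_request=request, _request_timeout=None)
    return await fake_awrap(result, client_context="ctx")


request = {"name": "demo", "intent": "demo intent"}
nested = asyncio.run(acreate_nested(request))
split = asyncio.run(acreate_split(request))
print(nested == split)
```

Both versions await the same two coroutines in the same order, so the change affects only line length and readability.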

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ce0ce0 and 7200c66.

⛔ Files ignored due to path filters (22)
  • src/root/generated/openapi_aclient/api/judges_api.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/evaluator.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/evaluator_execution_request.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/evaluator_list_output.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/judge_execution_request.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/judge_generator_response.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/judge_rectifier_request_request.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/judge_request.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/objective.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/objective_list.py is excluded by !**/generated/**
  • src/root/generated/openapi_aclient/models/patched_judge_request.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/api/judges_api.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/evaluator.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/evaluator_execution_request.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/evaluator_list_output.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/judge_execution_request.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/judge_generator_response.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/judge_rectifier_request_request.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/judge_request.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/objective.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/objective_list.py is excluded by !**/generated/**
  • src/root/generated/openapi_client/models/patched_judge_request.py is excluded by !**/generated/**
📒 Files selected for processing (3)
  • examples/create_judge.py (1 hunks)
  • openapi.yaml (8 hunks)
  • src/root/judges.py (2 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
examples/create_judge.py

[convention] 2-2: Line too long (102/100)

(C0301)


[convention] 25-25: Line too long (137/100)

(C0301)


[convention] 1-1: Missing module docstring

(C0114)


[error] 1-1: Unable to import 'root'

(E0401)


[error] 2-2: Unable to import 'root.generated.openapi_client.models.evaluator_reference_request'

(E0401)


[error] 3-3: Unable to import 'root.skills'

(E0401)

src/root/judges.py

[convention] 337-337: Line too long (103/100)

(C0301)


[warning] 303-303: Access to a protected member _wrap of a client class

(W0212)


[warning] 336-336: Access to a protected member _awrap of a client class

(W0212)

🔇 Additional comments (12)
openapi.yaml (7)

947-947: Clarify POST /v1/judges/ description
Updating the operation description to include intent and evaluator_references aligns with the new requirements. Please ensure downstream documentation and client method docstrings are updated accordingly.


2167-2167: Extend EvaluatorExecutionRequest.expected_output length
Increasing maxLength to 3.5M supports larger evaluation contexts; this change is consistent across execution schemas.


2858-2858: Extend JudgeExecutionRequest.expected_output length
Aligns with other execution requests.


3015-3015: Extend JudgeRectifierRequestRequest.expected_output length
Consistent adjustment for rectifier requests.


2916-2919: Allow nullable error_code in JudgeGeneratorResponse
Making error_code nullable and removing it from required avoids schema validation errors when no error occurs.
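
On the client side, this change might be read as in the sketch below. `GeneratorResponseSketch` is a hypothetical stand-in, not the generated `JudgeGeneratorResponse` model, and treating `error_code` as a string is an assumption. The point is only that a missing key and an explicit null are both accepted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneratorResponseSketch:
    """Hypothetical stand-in; only the error_code field comes from the review."""
    error_code: Optional[str] = None  # nullable and no longer required

    @classmethod
    def from_dict(cls, data: dict) -> "GeneratorResponseSketch":
        # dict.get returns None both when the key is absent and when it is null.
        return cls(error_code=data.get("error_code"))

    @property
    def ok(self) -> bool:
        return self.error_code is None


absent = GeneratorResponseSketch.from_dict({})
null = GeneratorResponseSketch.from_dict({"error_code": None})
failed = GeneratorResponseSketch.from_dict({"error_code": "some_error"})
print(absent.ok, null.ok, failed.ok)
```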


3054-3059: Add required intent field to JudgeRequest schema
Introducing the write-only intent property and marking it as required enforces capturing judge intent at creation time. Confirm that all client bindings and example usages pass this field correctly.

Also applies to: 3067-3067
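
To make the requirement concrete, here is a minimal sketch of a request body builder that refuses to produce a payload without `intent`. The helper name `build_judge_payload`, the `{"id": ...}` shape of each evaluator reference, and the choice to treat a blank string as missing are all assumptions; the field names `name`, `intent`, and `evaluator_references` come from the PR.

```python
import json


def build_judge_payload(name: str, intent: str, evaluator_ids: list) -> str:
    """Return a JSON body for judge creation (hypothetical helper)."""
    # Assumption: a blank intent is treated the same as a missing one.
    if not intent or not intent.strip():
        raise ValueError("intent is required when creating a judge")
    payload = {
        "name": name,
        "intent": intent,
        # Assumed shape: one object per evaluator, identified by id.
        "evaluator_references": [{"id": evaluator_id} for evaluator_id in evaluator_ids],
    }
    return json.dumps(payload)


body = build_judge_payload("Demo judge", "Check tone and accuracy", ["evaluator-1"])
print(body)

try:
    build_judge_payload("Demo judge", "", ["evaluator-1"])
except ValueError as exc:
    print(exc)
```

Because the property is write-only, a client should expect to send `intent` in requests and should not rely on it appearing in responses.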


3683-3689: Include intent in PatchedJudgeRequest
Adding intent to the partial update schema is appropriate. Verify that patch handlers accept and apply intent updates without breaking existing workflows.
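
A partial update usually sends only the fields being changed. The helper below is a hypothetical sketch of that idea, not the client's update method; it drops unset fields so that an `intent`-only patch leaves `name` untouched.

```python
from typing import Optional


def build_patch_payload(name: Optional[str] = None, intent: Optional[str] = None) -> dict:
    """Collect only the fields the caller actually set (hypothetical helper)."""
    candidates = {"name": name, "intent": intent}
    # Omitting a key signals "leave unchanged" in a partial update.
    return {key: value for key, value in candidates.items() if value is not None}


print(build_patch_payload(intent="Focus on factual accuracy"))
print(build_patch_payload())
```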

examples/create_judge.py (3)

1-3: Verify import paths
Static analysis flags import errors for root, root.generated, and root.skills. Ensure the package is installed in the environment and that these paths are correct according to the project’s packaging layout.

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 2-2: Line too long (102/100)

(C0301)


[convention] 1-1: Missing module docstring

(C0114)


[error] 1-1: Unable to import 'root'

(E0401)


[error] 2-2: Unable to import 'root.generated.openapi_client.models.evaluator_reference_request'

(E0401)


[error] 3-3: Unable to import 'root.skills'

(E0401)


13-17: Use create method for judge creation
The client.judges.create(...) invocation correctly passes name, intent, and evaluator_references.


19-27: Confirm run method existence
The example uses client.judges.run(...) to execute the judge. Verify that the client library exposes a run method (or adjust to execute/execute_create) that maps to the /judges/{id}/execute/ endpoint.

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 25-25: Line too long (137/100)

(C0301)

src/root/judges.py (2)

15-17: LGTM! Import additions align with new functionality.

The new imports for JudgeRequest and AJudgeRequest are correctly added to support the new judge creation methods.

Also applies to: 21-21


275-306: LGTM! Synchronous create method implementation is solid.

The method follows the established patterns in the codebase with proper:

  • Type hints and parameter documentation
  • Request object construction
  • API client usage
  • Response wrapping with client context

The protected-member access warning from static analysis is a false positive: _wrap is a class method called appropriately within the same module.

🧰 Tools
🪛 Pylint (3.3.7)

[warning] 303-303: Access to a protected member _wrap of a client class

(W0212)

@kkjarvis kkjarvis requested a review from juho-y June 4, 2025 11:47
Contributor

@juho-y juho-y left a comment

🏅

@juho-y juho-y merged commit adaee5f into main Jun 5, 2025
6 checks passed
@juho-y juho-y deleted the judge-create-endpoint branch June 5, 2025 12:34

2 participants