Adding a verification system to verify benchmark results #111

thomasRoglin · 2025-07-03T09:32:09Z

This PR introduces a verification system in autohecbench.py so that benchmark results can be flagged as correct or incorrect in the summary output.

In subset.json, each benchmark entry now includes a verification section:

[
    verification_type,
    [param1, param2, param3 ... ]
]
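
As a rough illustration of the layout above, the sketch below parses one verification section and checks its shape. The helper name `parse_verification` and the tokens "PASS" and "FAIL" are assumptions made for this example; they are not taken from the PR, and the rest of a subset.json entry is not shown.

```python
import json

# Hypothetical verification section, as it might appear inside one benchmark entry.
# The tokens "PASS" and "FAIL" are illustrative only.
raw = '["verification_token", ["PASS", "FAIL"]]'


def parse_verification(section):
    """Split a verification section into (type, params) after checking its shape."""
    if not isinstance(section, list) or len(section) != 2:
        raise ValueError("expected [verification_type, [params...]]")
    verification_type, params = section
    if not isinstance(verification_type, str) or not isinstance(params, list):
        raise ValueError("expected a string type and a list of parameters")
    return verification_type, params


verification_type, params = parse_verification(json.loads(raw))
```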

For now, only two verification types are supported:

no_verification: no verification is implemented or configured for the benchmark.
verification_token: takes two parameters, [success_token, fail_token].
The benchmark output is considered valid if it contains the success_token at least once and the fail_token does not appear at all.
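
The verification_token rule can be sketched as a plain substring check. This is a minimal sketch, assuming the benchmark output is available as a single string; the function name `output_is_valid` is hypothetical and the actual code in autohecbench.py may differ.

```python
def output_is_valid(output: str, success_token: str, fail_token: str) -> bool:
    """Return True when success_token occurs at least once and fail_token never occurs."""
    return success_token in output and fail_token not in output
```

With illustrative tokens "PASS" and "FAIL", an output containing both tokens is rejected, because the fail_token check takes effect even when the success_token is present.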

The autohecbench.py script gains a new --verify argument that enables verification.
When it is set, the run() function checks the benchmark's verification type and performs the corresponding check.
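
The flow described above might look roughly like the following. This is a sketch under stated assumptions: the enum name `Status`, its members, and the helpers `build_parser` and `check_output` are hypothetical, and only the --verify flag and the two verification types come from the PR description.

```python
import argparse
from enum import Enum


class Status(Enum):
    # Hypothetical status values; the enum used in the PR is not shown here.
    PASSED = "passed"
    FAILED = "failed"
    NOT_VERIFIED = "not verified"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # --verify is off by default, so existing invocations behave as before.
    parser.add_argument("--verify", action="store_true",
                        help="check benchmark output against its verification section")
    return parser


def check_output(output: str, verification) -> Status:
    """Dispatch on the verification type and return a status for the summary."""
    verification_type, params = verification
    if verification_type == "no_verification":
        return Status.NOT_VERIFIED
    if verification_type == "verification_token":
        success_token, fail_token = params
        ok = success_token in output and fail_token not in output
        return Status.PASSED if ok else Status.FAILED
    raise ValueError(f"unknown verification type: {verification_type}")
```

Raising on an unknown type, rather than silently skipping it, is a design choice of this sketch: it makes a misspelled type in subset.json visible instead of reporting the benchmark as unverified.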

thomasRoglin added 3 commits July 3, 2025 11:27

Add a verification system to autohecbench script

10e1303

Use enums for benchmark status instead of string

ed66181

Add verification info to subset.json

2344bf5
