Current Limitation
Right now, the contains evaluator is limited to checking for a single, static string across all test cases. For example, we can set it to check if the model's output always contains "thank you," but we can't change that expected string from one test case to the next.
Proposed Change
We propose allowing the contains evaluator to use variables from the test set. This would let us specify a different string to check for in each individual test case by adding a column to our dataset.
For example, we could have a dataset like this:
[
  {
    "user_message": "Here is my email.",
    "expected_phrase": "thank you"
  },
  {
    "user_message": "I have a bug.",
    "expected_phrase": "sorry"
  }
]
The evaluator would then check if the LLM output contains the value from the expected_phrase variable for that specific row ({{testcase.expected_phrase}}).
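To make the proposed behavior concrete, here is a minimal sketch of such a per-row contains check. The function name, signature, and the way the testcase row is passed in are all hypothetical illustrations, not Agenta's actual evaluator API; it only shows the idea of resolving a {{testcase.<column>}} placeholder against the current row before doing the substring check.

```python
import re

# Hypothetical sketch of a per-row "contains" evaluator; names and
# signature are illustrative, not Agenta's real API.
def contains_evaluator(llm_output: str, testcase: dict,
                       template: str = "{{testcase.expected_phrase}}") -> bool:
    """Resolve {{testcase.<column>}} placeholders against the given row,
    then check whether the resolved string appears in the model output."""
    def resolve(match: re.Match) -> str:
        key = match.group(1)
        # Fall back to the raw placeholder if the column is missing.
        return str(testcase.get(key, match.group(0)))

    expected = re.sub(r"\{\{testcase\.(\w+)\}\}", resolve, template)
    # Case-insensitive containment check on the resolved value.
    return expected.lower() in llm_output.lower()
```

For the dataset above, contains_evaluator("Thank you for your email!", {"expected_phrase": "thank you"}) would pass, while the same output against the row expecting "sorry" would fail.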
Benefit
This would make the evaluator much more flexible and powerful. It would allow us to create more specific, "unit test" style evaluations where the expected content changes depending on the input, leading to more accurate testing.
The one area I've stumbled on is the evaluator. I've made a dataset with input and correct_answer, and I want to check that the text in correct_answer is in the response from the LLM. I tried to do that like this, but I think I can't use variables here (see screenshot).
...would anyone be able to put me on the right track please?
Original Request by Henri
https://agenta-hq.slack.com/archives/C05JDQWKD6E/p1755012840660029