Replies: 4 comments 1 reply
-
Evals, or evaluations, are tests that measure how well a machine learning model performs, typically on a specific task or problem. In general, a 0% success rate is not desirable, because it means the model cannot perform the task at all. However, there are cases where a 0% success rate is acceptable. For example, if a human can easily solve the problem but the model cannot, the eval is still useful for identifying where the model needs improvement. In that case, a 0% success rate indicates that the model needs significant work before it can perform the task as well as a human. Ultimately, whether to allow evals with a 0% success rate depends on the goals of the specific project or application.
-
"Evals" are like tests for robots to see how well they can understand and do things. Sometimes a test is too hard for the robot and it can't do it, so it gets a score of zero. It's okay if a robot gets a zero sometimes, but if it keeps getting zeros all the time, we might need to change the test or help the robot get better.
-
Just copy-pasting answers from ChatGPT doesn't help here. I'd suggest refraining from doing so in the future.
-
ChatGPT said: Just Chill!
-
Are evals with a 0% success rate allowed if I believe GPT should be able to solve the task and a human can solve it easily?