- "A \"run\" is an instance of an evaluation that you would like to track metrics against. You can have multiple runs of the same evaluation. This is typically done in a CI/CD context, where the same evaluation runs at regular intervals. Since LLMs are probabilistic in nature, they can produce different outputs for the same query and context. It is a good idea to run evaluations regularly to understand how the outputs produced by your LLMs vary. In addition, runs give you the ability to choose different metrics for each run. \n",