Evaluation of a QA System Clarification #6256
I am building a pure retriever and want to evaluate its performance. For a subset of documents I have question-answer pairs, including the location of the answer in the text, and I would like to use these QA pairs to evaluate the retriever.

I don't want to use the Haystack annotation tool, so I cannot generate the required SQuAD JSON file that way. I could write it by hand, but what do I put in the "id" field? The evaluation tutorial (https://haystack.deepset.ai/tutorials/05_evaluation) mentions an "Alternative: Define queries and labels directly" and goes on to create a MultiLabel object. There is an id field there as well that I don't understand how to fill out. If anyone could help me figure this out, I would be very thankful!

More generally, I am somewhat confused about the roles of doc_index, label_index, and add_eval_data(). If anyone could explain, or link to an explanation of, what exactly these indices do and how they fit into the picture, that would also be greatly appreciated. My sincere thanks in advance to anyone taking the time to help.
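For reference, this is roughly the file I would be writing by hand (my understanding of the SQuAD-v2-style format that add_eval_data() expects; the document, question, and answer below are placeholders from my own data, and the per-question "id" is exactly the part I'm unsure about):

```python
import json
import uuid

# Minimal SQuAD-style eval file as I understand the expected layout.
# All titles, texts, and offsets below are illustrative placeholders.
eval_data = {
    "data": [
        {
            "title": "my_document.txt",  # placeholder document title
            "paragraphs": [
                {
                    "context": "Paris is the capital of France.",
                    "qas": [
                        {
                            "id": str(uuid.uuid4()),  # <-- what should this be?
                            "question": "What is the capital of France?",
                            "answers": [
                                {
                                    "text": "Paris",
                                    "answer_start": 0,  # character offset in context
                                }
                            ],
                            "is_impossible": False,
                        }
                    ],
                }
            ],
        }
    ]
}

with open("my_eval_data.json", "w") as f:
    json.dump(eval_data, f, indent=2)
```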
Replies: 1 comment
Managed to figure it out. I just put random numbers in the "id" field and it works.
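For anyone finding this later, here is roughly what I ended up with when defining labels directly (a sketch against the Haystack 1.x schema used in the tutorial; adjust to your version):

```python
from uuid import uuid4

from haystack.schema import Answer, Document, Label, MultiLabel

# The id just needs to be a unique string -- I put random values in and it
# works; a uuid is tidier. Dummy question/answer content for illustration.
doc = Document(
    content="Paris is the capital of France.",
    id=str(uuid4()),  # any unique string works here
)

label = Label(
    query="What is the capital of France?",
    answer=Answer(answer="Paris", type="extractive"),
    document=doc,
    is_correct_answer=True,
    is_correct_document=True,
    origin="gold-label",
)

eval_labels = [MultiLabel(labels=[label])]
```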