Need help reviewing my configuration #2889
Hey everyone,

I have built a chatbot with Haystack by following the tutorials on the website, and I now have some questions about the correctness of my setup. I'm using the REST API with a pipeline YAML file. There, I defined a retriever of type

I have written a Python script to upload documents to Elasticsearch. This script is divided into two sections. The first section writes general documents to Elasticsearch with
After that, I update the embeddings with
In the second section, I write FAQ data to Elasticsearch, following this tutorial.
The reasons I am confused about this setup:
Then there are other general questions that I have:
Best,
Replies: 2 comments 9 replies
I'm not entirely sure what you mean by improving the results in this case. A reason I could see for having two separate indices for general data and FAQ data is if you have two separate query pipelines for general data and FAQ data.
No, you do not have to use two separate retrievers and readers. Yes, you could write both the general documents and the FAQ data to the same document store using the same index. If you do this, then your query pipeline will return answers that could come from either the general data or the FAQ data.
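To make the single-index point concrete, here is a toy sketch in plain Python. It is not Haystack code: the `index` list, the `kind` metadata field, and the `search` helper are all hypothetical stand-ins for a document store index and a retriever. It only illustrates that a query against one shared index can return hits from both kinds of data unless you filter.

```python
import re

# One shared "index": general documents and FAQ entries side by side,
# distinguished only by a hypothetical "kind" metadata field.
index = [
    {"content": "Our office is open Monday to Friday.", "kind": "general"},
    {"content": "How do I reset my password?", "kind": "faq"},
    {"content": "Password resets are handled by the support team.", "kind": "general"},
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def search(query, documents, kind=None):
    """Return documents sharing at least one word with the query, best match first.

    If `kind` is given, only documents with that metadata value are considered.
    """
    query_terms = set(tokenize(query))
    hits = []
    for doc in documents:
        if kind is not None and doc["kind"] != kind:
            continue
        overlap = len(query_terms & set(tokenize(doc["content"])))
        if overlap:
            hits.append((overlap, doc))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in hits]


# Without a filter, results come from both kinds of data.
mixed = search("reset password", index)
# With a filter, the same index behaves like a dedicated FAQ index.
faq_only = search("reset password", index, kind="faq")
```

If you later decide you want separate query pipelines after all, a metadata filter like this is one way to keep a single index; two indices is the other.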
To answer your first question:
The BM25Retriever actually does not use vector embeddings when retrieving documents. It's based off of TF-IDF (more details can be found here). So in short, this combination is intended to work, but you are currently not using the embeddings for document retrieval. If you never plan on using a DensePassageRetriever in your query pipeline, then you do not need to run document_store.updat…
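To illustrate why embeddings play no role in BM25 retrieval, here is a minimal, self-contained sketch of Okapi BM25 scoring in plain Python. It is not the Haystack or Elasticsearch implementation: the `bm25_scores` function, the whitespace tokenizer, and the sample documents are assumptions made for illustration, and the `k1` and `b` defaults are just commonly used values. The point is that the score depends only on term counts and document lengths.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25.

    Only term frequencies and document lengths are used; no vectors involved.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "how do i reset my password",
    "shipping takes three to five days",
    "password rules and account security",
]
scores = bm25_scores("reset password", docs)
```

A document with no query terms scores zero, and a document matching both terms outranks one matching a single term. A dense retriever, by contrast, compares a query embedding against stored document embeddings, which is the only case where the stored embeddings are needed.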