Replies: 1 comment 1 reply
-
Hello, @Peveld! Two quick ideas:
-
Hi, my idea for a RAG project was to feed only relevant documents into the prompt for the LLM, so I'm trying to find significant differences in the returned scores. I followed the tutorial and have something like:
```python
from haystack import Document
from haystack.document_stores.faiss import FAISSDocumentStore
from haystack.nodes import EmbeddingRetriever

# "Flat" index = exact (non-approximate) nearest-neighbour search
document_store = FAISSDocumentStore(
    faiss_index_factory_str="Flat",
    sql_url="sqlite:////tmp/faiss_document_store.db",
)

documents = [
    Document(content="The english channel is 30 kilometers wide."),
    Document(content="la le li la di da"),
]
document_store.write_documents(documents)

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
)
document_store.update_embeddings(retriever)

myquery = "How wide is the english channel?"
docs = retriever.retrieve(query=myquery, top_k=2)
print(docs)
```
However, the difference in score is pretty small: 0.5734 vs. 0.5290. On my real text base I find nearly identical scores for perfectly matching docs and completely non-matching ones. My idea was to apply a general threshold... Do I misunderstand something, or is there maybe a better approach?
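One likely reason for the small gap: `multi-qa-mpnet-base-dot-v1` produces unbounded dot-product scores, and as far as I can tell Haystack scales them into [0, 1] by default (`scale_score=True`), which squeezes most results toward the middle of the range. Below is a minimal sketch of two things worth trying, assuming the same Haystack 1.x API as above; the cosine-similarity store, the cross-encoder model, and the threshold value are illustrative choices, not anything from the tutorial:

```python
from haystack import Document
from haystack.document_stores.faiss import FAISSDocumentStore
from haystack.nodes import EmbeddingRetriever, SentenceTransformersRanker

# Index with cosine similarity so scores live on a fixed scale instead of
# unbounded dot products (separate DB/index file from the one above).
document_store = FAISSDocumentStore(
    faiss_index_factory_str="Flat",
    sql_url="sqlite:////tmp/faiss_cosine_store.db",
    similarity="cosine",
)
document_store.write_documents([
    Document(content="The english channel is 30 kilometers wide."),
    Document(content="la le li la di da"),
])

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
)
document_store.update_embeddings(retriever)

# A cross-encoder scores each (query, document) pair jointly and tends to
# separate matches from non-matches much more sharply than a bi-encoder.
ranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2"
)

query = "How wide is the english channel?"
candidates = retriever.retrieve(query=query, top_k=2)
reranked = ranker.predict(query=query, documents=candidates)

THRESHOLD = 0.5  # illustrative value; tune on a handful of labeled queries
relevant = [d for d in reranked if d.score is not None and d.score >= THRESHOLD]
print([(d.content, round(d.score, 4)) for d in relevant])
```

Even with re-ranking, retrieval scores are not calibrated probabilities, so a single universal threshold is unlikely to transfer across corpora; picking the cutoff per corpus from a few labeled queries is usually more robust.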