-
To Whom It May Concern,

I'm currently choosing a Document Store/Vector Database that can support traditional search via either filtered BM25 retrieval + cross-encoder ranking or filtered bi-encoder retrieval + cross-encoder ranking, as well as ANN lookup via a vector/tensor. It would also be nice to add QA functionality, but that's a value-add, not a necessity. I found your library through Dmitry Kan's Vector Podcast and am impressed with its capabilities. However, after examining 8+ Document Stores/Vector Databases, I'm still left with three choices.

I initially thought Weaviate checked most of the boxes above. However, it doesn't offer filtered BM25 retrieval, though this will be added in 1.18 (Q1 2023), based on weaviate/weaviate#2393. Also, based on weaviate/weaviate#2111, it doesn't look like cross-encoders are currently supported by Weaviate. It seems like a feature Weaviate is considering, but it doesn't appear to be on the 1.17 or 1.18 roadmap. Does Weaviate + Haystack support cross-encoder ranking?

Pinecone doesn't seem to provide filtered BM25, only filtered TF-IDF retrieval, and doesn't appear to support cross-encoder ranking either, but please correct me if I'm wrong here. Does Pinecone + Haystack support cross-encoder ranking?

Finally, Vespa seems to check every box above, but it isn't integrated into Haystack. I was curious why this is the case. It's backed and battle-tested by Yahoo, has had the above features for a year, etc. I feel like I must be missing something obvious.

My gut is telling me to go with Haystack + Weaviate, but my mind is telling me to go with Vespa and its Python API, and to use some of the Haystack features for document preparation, summarization, etc. Would Haystack Cloud be able to provide the above functionality? Any advice you can provide here would be excellent.

Thank you for your time and attention to this matter. I hope this discussion finds you well and that you have a great rest of the week. God bless.

Very Respectfully,
Replies: 2 comments 1 reply
-
Hey @CMobley7, I'll copy my Discord response to you here for others who only read the discussion boards.

"Unfortunately, there was some confusion about support for Weaviate BM25 filters, and we mistakenly announced support for them. The idea behind Haystack is to combine the capabilities of many components to accomplish your overall goal/task. In some cases, you can add reranking independently of the store. I've done this myself in Wikipedia Assistant to further improve responses.

Re: Vespa - we have heard great things about it, but so far we have received few requests for it from our community. That is why it is not yet supported. We might consider it now that this signal is increasing. My colleagues and I would like to learn more about your insights into Vespa and your particular use case. Perhaps that's the best way to take this forward. Are you available for a call?"
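To make "reranking independently of the store" concrete, here is a minimal, store-agnostic sketch in plain Python. It is not Haystack, Weaviate, Pinecone, or Vespa code: `bm25_retrieve`, `rerank`, and `overlap_scorer` are hypothetical names, the document layout (`id`, `text`, `meta`) is assumed for illustration, and `overlap_scorer` is only a stand-in for a real cross-encoder that scores (query, passage) pairs.

```python
import math
from collections import Counter
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]  # assumed layout: {"id": str, "text": str, "meta": dict}


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer; a real system would use a proper analyzer."""
    return text.lower().split()


def bm25_retrieve(query: str, docs: List[Doc], filters: Dict[str, Any],
                  top_k: int = 10, k1: float = 1.5, b: float = 0.75) -> List[Doc]:
    """Stage 1: apply exact-match metadata filters, then rank the survivors with BM25."""
    candidates = [d for d in docs
                  if all(d["meta"].get(key) == value for key, value in filters.items())]
    if not candidates:
        return []
    tokenized = [tokenize(d["text"]) for d in candidates]
    n = len(candidates)
    avg_len = sum(len(toks) for toks in tokenized) / n
    # Document frequency: in how many filtered documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    scored = []
    for doc, toks in zip(candidates, tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        if score > 0:
            scored.append((doc, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in scored[:top_k]]


def rerank(query: str, candidates: List[Doc],
           pair_scorer: Callable[[str, str], float], top_k: int = 3) -> List[Doc]:
    """Stage 2: re-score each (query, text) pair; no access to the store is needed."""
    ranked = sorted(candidates, key=lambda d: pair_scorer(query, d["text"]), reverse=True)
    return ranked[:top_k]


def overlap_scorer(query: str, text: str) -> float:
    """Placeholder for a cross-encoder: Jaccard overlap of the two token sets."""
    q, t = set(tokenize(query)), set(tokenize(text))
    return len(q & t) / len(q | t) if q | t else 0.0
```

Because the second stage only needs the query and the retrieved texts, the same pattern applies whether the first stage is filtered BM25, filtered TF-IDF, or a bi-encoder lookup; swapping `overlap_scorer` for a function that wraps an actual cross-encoder model would not change the retrieval side.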
-
I'll upvote this; I'm curious about Vespa support too.