Replies: 4 comments 6 replies
-
To persist and load back a keyword index at scale, you can use Elasticsearch. It handles large datasets efficiently and avoids the limitations and potential issues associated with using pickle for persistence [1].
-
What if I don't want to use Elasticsearch?
-
The bottleneck is in the from_texts method: building texts_processed and then vectorizer = BM25Okapi(texts_processed, **bm25_params). You can persist the vectorizer to avoid repeatedly processing the texts.
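A minimal stdlib-only sketch of that idea, caching the expensive preprocessing step on disk so it runs only once (the cache path, tokenizer, and helper name are assumptions; the fitted BM25Okapi object itself can be pickled the same way):

```python
import pickle
from pathlib import Path

CACHE = Path("texts_processed.pkl")  # assumed cache location

def tokenize(text: str) -> list[str]:
    # Stand-in for the retriever's real preprocess_func.
    return text.lower().split()

def load_or_process(texts: list[str]) -> list[list[str]]:
    """Return the tokenized corpus, processing texts only on the first call."""
    if CACHE.exists():
        # Subsequent calls skip tokenization and read the cached result.
        with CACHE.open("rb") as f:
            return pickle.load(f)
    texts_processed = [tokenize(t) for t in texts]
    with CACHE.open("wb") as f:
        pickle.dump(texts_processed, f)
    return texts_processed
```

The same pattern applies one level up: pickle the constructed BM25Okapi vectorizer once and reload it at query time, instead of re-running from_texts on every start.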
-
Hi, I opened a PR supporting an alternative, persistable implementation based on bm25s: #28123.
-
Checked other resources
Commit to Help
Example Code
Description
I am working on a hybrid search implementation and use this code for keyword search:
However, this does not scale to large datasets because the index is persisted with pickle. What is the right way to persist the keyword index?
System Info
langchain==0.2.15
langchain-aws==0.1.7
langchain-community==0.2.10
langchain-core==0.2.36
langchain-google-vertexai==1.0.8
langchain-milvus==0.1.4
langchain-openai==0.1.23
langchain-text-splitters==0.2.2