Using bm25 or e5_large in disconnected environments #41190
-
Hi. I will use pymilvus for an application. If I'm not mistaken, pymilvus automatically downloads the models for bm25 and e5_large at runtime. My environment will be disconnected from the internet, so downloading them at runtime will not be possible, and I couldn't find how to download them beforehand. My plan is to build a container with everything needed, in which I will run my Python application using pymilvus. How can I download everything required for bm25 and e5_large ahead of time? Thank you.
Replies: 2 comments 6 replies
-
BM25 is built into Milvus v2.5 (https://milvus.io/docs/full-text-search.md#Full-Text-Search-BM25), so you don't need bm25 on the client side.
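As a rough illustration of what "built in" means here, the sketch below declares a collection schema in which the Milvus server derives the BM25 sparse vectors from a text field, so no client-side bm25 model has to be downloaded. The field names, the function name `text_bm25_emb`, and the `max_length` value are illustrative assumptions; the linked full-text search page is the authoritative reference.

```python
# Field names are illustrative assumptions, not taken from the thread.
TEXT_FIELD = "text"
SPARSE_FIELD = "sparse"


def build_bm25_schema(client):
    """Return a schema whose sparse vectors are computed by Milvus itself via BM25."""
    # Imported lazily so this module can be loaded where pymilvus is not installed.
    from pymilvus import DataType, Function, FunctionType

    schema = client.create_schema()
    schema.add_field(field_name="id", datatype=DataType.INT64,
                     is_primary=True, auto_id=True)
    # The raw text; the analyzer tokenizes it on the server.
    schema.add_field(field_name=TEXT_FIELD, datatype=DataType.VARCHAR,
                     max_length=1000, enable_analyzer=True)
    # Filled in by the server, never by the client.
    schema.add_field(field_name=SPARSE_FIELD, datatype=DataType.SPARSE_FLOAT_VECTOR)
    schema.add_function(Function(
        name="text_bm25_emb",
        input_field_names=[TEXT_FIELD],
        output_field_names=[SPARSE_FIELD],
        function_type=FunctionType.BM25,
    ))
    return schema


def build_bm25_index(client):
    """Return index parameters for the BM25 sparse field."""
    index_params = client.prepare_index_params()
    index_params.add_index(field_name=SPARSE_FIELD,
                           index_type="SPARSE_INVERTED_INDEX",
                           metric_type="BM25")
    return index_params
```

With a `MilvusClient` connected to a Milvus 2.5+ server, both results would be passed to `client.create_collection(collection_name=..., schema=..., index_params=...)`; after that, inserts only need to supply the text field.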
-
Milvus 2.6 supports integrating with external inference engines such as Hugging Face or vLLM. If you know how to deploy vLLM offline and install embedding models there, that could be much easier and more performant.
The SentenceTransformerEmbeddingFunction calls sentence_transformers.SentenceTransformer to get the model.
This line will automatically download the model:
https://github.com/milvus-io/milvus-model/blob/2fe97ac8ba923fec5d7e3afa8b2b9e9c25ebbce0/src/pymilvus/model/dense/sentence_transformer.py#L29
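For a disconnected container, one possible approach is to run that download once in a build step that has internet access, save the model to a plain directory, and point pymilvus at the directory at runtime. The sketch below is only an outline under stated assumptions: the thread does not say which Hugging Face model id "e5_large" refers to, so `intfloat/e5-large-v2` is a guess, and `/opt/models/e5-large-v2` is a hypothetical path. It also assumes that `SentenceTransformerEmbeddingFunction` forwards `model_name` unchanged to `SentenceTransformer`, which accepts a local directory.

```python
import os
from pathlib import Path

# Assumption: the thread does not name the exact model id behind "e5_large".
MODEL_ID = "intfloat/e5-large-v2"
# Hypothetical location inside the container image.
LOCAL_DIR = Path("/opt/models/e5-large-v2")


def offline_env() -> dict:
    """Environment variables that tell the Hugging Face libraries to stay offline."""
    return {"HF_HUB_OFFLINE": "1", "TRANSFORMERS_OFFLINE": "1"}


def download_model(model_id: str = MODEL_ID, target: Path = LOCAL_DIR) -> Path:
    """Build step (needs internet): fetch the model and save it to a directory."""
    # Imported lazily so the runtime image can load this module before use.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)  # downloads from the Hugging Face Hub
    target.mkdir(parents=True, exist_ok=True)
    model.save(str(target))
    return target


def load_offline(target: Path = LOCAL_DIR):
    """Runtime step (no internet): load the saved directory through pymilvus."""
    # Set before the Hugging Face libraries are first imported in this process.
    os.environ.update(offline_env())
    from pymilvus import model

    return model.dense.SentenceTransformerEmbeddingFunction(
        model_name=str(target),  # a local path instead of a Hub model id
        device="cpu",
    )
```

`download_model()` would be called while building the image and `load_offline()` inside the disconnected container; setting the same variables with `ENV` in the Dockerfile is a sturdier alternative to `os.environ`, since it applies before any Python import happens.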
The linked line is equivalent to this: