EmbeddingRetriever - Load model from memory #4154
Hello, is there a way to load the transformer model into an EmbeddingRetriever from memory? Basically, the idea is to load it from disk once, upfront, so that queries run faster (instead of loading from disk every time). Otherwise I would have to come up with another architecture where I keep an EmbeddingRetriever object alive and reuse it for subsequent queries...
Replies: 1 comment 1 reply
Hi @wilsonlimaneto
The best pattern I can recommend from my own experience is to instantiate once any node that loads a transformer model under the hood, and then keep that node in memory in some way. You may choose to use Ray, a global object, or any other solution that meets your needs.
Be aware that transformer models are not thread-safe, so you will need to guard concurrent access to them yourself or use some sort of multiprocessing pool.
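
As a rough illustration of the "global object" option, here is a minimal sketch that builds the expensive object on first use, keeps it in memory, and lets only one thread touch it at a time. `SharedModel`, `StandInRetriever`, `RETRIEVER`, and `answer` are hypothetical names made up for this example, not part of Haystack; in real use the factory would construct your actual EmbeddingRetriever, with whatever arguments your setup needs.

```python
import threading
from typing import Any, Callable, Optional


class SharedModel:
    """Hold one expensive object in memory and serialize access to it."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory            # builds the real retriever; called at most once
        self._instance: Optional[Any] = None
        self._lock = threading.Lock()      # only one caller at a time touches the model

    def call(self, fn: Callable[[Any], Any]) -> Any:
        with self._lock:
            if self._instance is None:     # the first call pays the loading cost
                self._instance = self._factory()
            return fn(self._instance)


class StandInRetriever:
    """Hypothetical placeholder for a real retriever; counts how often it is built."""

    loads = 0

    def __init__(self) -> None:
        StandInRetriever.loads += 1

    def retrieve(self, query: str) -> str:
        return f"results for {query!r}"


# Module-level (global) holder: created once per process, reused by every query.
RETRIEVER = SharedModel(StandInRetriever)


def answer(query: str) -> str:
    return RETRIEVER.call(lambda r: r.retrieve(query))
```

The lock trades throughput for safety, since queries run one at a time. If that becomes a bottleneck, the multiprocessing route gives each worker process its own copy of the model, at the cost of more memory.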