Request for a better support for RAG with large datasets #40725

gland1 · 2025-03-18T08:59:02Z

gland1
Mar 18, 2025

The issue we see is as follows:
Suppose you'd like to build a RAG system with Milvus on a dataset of 1B vectors.
The source content for each vector needs to be retrieved during or after the ANN search.
If you keep the source content alongside the vector in one collection, as LangChain does for example, you end up fetching a huge amount of data from object storage to the querynode's local drive.
The workaround we currently use is to keep the content separately in object storage and fetch it directly once the ANN search has returned the required vector IDs. It would help if Milvus could skip loading such a field during load (I understand this is available in beta) and then serve the source content through a simple get API, or offered any other solution that better supports the RAG case with big datasets.
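For readers who want to see the shape of this workaround, here is a minimal sketch. It is not Milvus code: `search_ids` and `fetch_content` are hypothetical callables standing in for an ANN search that returns only IDs and for a lookup against the external object store.

```python
from typing import Callable, Dict, List, Sequence


def retrieve_with_external_content(
    search_ids: Callable[[Sequence[float], int], List[int]],
    fetch_content: Callable[[int], str],
    query_vector: Sequence[float],
    top_k: int = 5,
) -> Dict[int, str]:
    """Search for IDs only, then fetch each document from external storage.

    Both callables are assumptions made for illustration:
    - search_ids(query_vector, top_k) returns the top-k vector IDs.
    - fetch_content(doc_id) returns the source text stored outside Milvus.
    """
    ids = search_ids(query_vector, top_k)
    # Dicts preserve insertion order, so the ranking from the search is kept.
    return {doc_id: fetch_content(doc_id) for doc_id in ids}
```

In a real deployment `search_ids` would wrap the vector search and `fetch_content` would read from S3/MinIO or a similar store; only the vector and ID fields would then need to be loaded on the querynode.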

yhmo · 2025-03-18T09:35:50Z

yhmo
Mar 18, 2025
Collaborator

"fetch the source content by a simple get api by milvus": I believe the query() interface already serves this purpose.
I think what you expect is:

Load only a few fields for ANN search: the vector field, the ID field, etc.
Run the ANN search to get the top-k IDs.
Get the content by the top-k IDs.

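In v2.5, the approach would be:

# Load only the fields needed for ANN search.
client.load_collection(collection_name, load_fields=["vector", "id"])
results = client.search(collection_name, data=[query_vector], limit=topk)
ids = [hit["id"] for hit in results[0]]
# Fetch the source content for the top-k IDs.
client.query(collection_name, filter=f"id in {ids}", output_fields=["content"])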
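The filter string passed to query() has to contain the actual ID values. A small helper along these lines could build it; `build_id_filter` is a hypothetical name, and the sketch assumes an integer primary key (string keys would need quoting).

```python
from typing import Iterable


def build_id_filter(ids: Iterable[int], field: str = "id") -> str:
    """Build a boolean expression of the form 'id in [1, 2, 3]'.

    Assumes integer primary keys; the field name defaults to "id".
    """
    id_list = [int(i) for i in ids]
    if not id_list:
        # An empty list would produce a filter that matches nothing.
        raise ValueError("ids must not be empty")
    return f"{field} in {id_list}"
```

The result can be passed as the `filter` argument of `client.query(...)`.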

4 replies

gland1 Mar 18, 2025
Author

So the query API will work even though I didn't load the content field?

yhmo Mar 18, 2025
Collaborator

You can enable mmap for scalar fields. The scalar data files are then automatically downloaded to the querynode's local disk and memory-mapped to reduce memory usage.
...
queryNode:
  mmap:
    vectorField: false
    vectorIndex: false
    scalarField: true
    scalarIndex: true
....

If mmap is disabled, the query node reads the scalar content (a small chunk file that contains the content) from S3/MinIO, so the latency of query() depends mainly on I/O.

gland1 Mar 18, 2025
Author

I've tried your suggestion. After loading only the 2 fields, the query method returns:
code=65535, message=field content is not loaded)>

yhmo Mar 19, 2025
Collaborator

Ok, I'm wrong. As @xiaofan-luan mentioned, only loaded fields can be retrieved. The new feature of v2.6 might meet your requirements.

xiaofan-luan · 2025-03-18T16:17:34Z

xiaofan-luan
Mar 18, 2025
Maintainer

The issue we see is as follows: Suppose you'd like to build a RAG system with Milvus on a dataset of 1B vectors. The source content for each vector needs to be retrieved during or after the ANN search. If you keep the source content alongside the vector in one collection, as LangChain does for example, you end up fetching a huge amount of data from object storage to the querynode's local drive. The workaround we currently use is to keep the content separately in object storage and fetch it directly once the ANN search has returned the required vector IDs. It would help if Milvus could skip loading such a field during load (I understand this is available in beta) and then serve the source content through a simple get API, or offered any other solution that better supports the RAG case with big datasets.

Data that is not loaded cannot be retrieved or used in filters.

In Milvus 2.6, we will support a new data type named TEXT.

The TEXT data type cannot be loaded or used for filtering. It supports only three operations:

retrieve by ID
text match, when enabled
conversion to a BM25 sparse index

3 replies

gland1 Mar 19, 2025
Author

Thanks for the info. I believe I already saw the TEXT field in the master branch. Can we start checking it out?

xiaofan-luan Mar 19, 2025
Maintainer

It's not ready for integration testing yet.
Our plan is to test it in the next 2-3 weeks and put it into production next month.

gland1 Mar 19, 2025
Author

ok, thanks

xiaofan-luan · 2025-03-18T16:20:09Z

xiaofan-luan
Mar 18, 2025
Maintainer

Building a 1B RAG case is cool!

0 replies
