Slow Retrieval Performance: 5 Minutes for 1000 Queries on FLAT Index (2.4M Docs, 113k Questions) #40973
Unanswered · Bhagyashreet20 asked this question in Q&A and General discussion · Replies: 1 comment, 1 reply
Hi Milvus team 👋,
I'm running a retrieval pipeline on Milvus and have hit a performance issue I'd like help with.
Setup Details:
Collection size: 2.4 million document chunks
Chunk size: 256 tokens
Embedding model: text-embedding-3-small (OpenAI)
Embedding dimension: 1536
Index type: FLAT
Query size: 113,000 questions total (tested with 1,000 for benchmarking)
Query batch size: 1,000
Deployment: standalone
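For reference, the full workload (113,000 questions at a batch size of 1,000) works out to 113 search calls. A minimal sketch of the batching I'm doing (the `chunked` helper here is illustrative, not my actual script):

```python
# Illustrative batching of the query set into Milvus-sized requests.
# chunked() is a hypothetical helper, not taken from the attached script.
def chunked(items, size):
    """Yield successive fixed-size batches from a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

queries = list(range(113_000))           # stand-in for 113k embedded questions
batches = list(chunked(queries, 1_000))  # batch size used in the benchmark
print(len(batches))                      # 113 search calls in total
```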
Problem:
When I run a top-K retrieval (top-5 or top-10) over just 1,000 questions, it takes approximately 4 minutes to fetch results with the FLAT index.
This seems unexpectedly slow for a batch of only 1,000 queries, even accounting for the 2.4M-vector collection.
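To put the cost in perspective, FLAT search is exhaustive: every query computes a distance against all 2.4M vectors. A rough back-of-envelope count of the arithmetic involved in the 1,000-query benchmark (pure arithmetic, no assumptions beyond the numbers above):

```python
# Back-of-envelope cost of exhaustive (FLAT) search for this benchmark.
n_queries = 1_000
n_vectors = 2_400_000
dim = 1536

# Each L2/IP distance needs roughly one multiply and one add per dimension.
flops = n_queries * n_vectors * dim * 2
print(f"{flops:.2e}")  # on the order of 7e12 floating-point operations
```

So a multi-minute runtime is at least in a plausible range for a single-node brute-force scan, though I'd still expect a SIMD-optimized implementation to do better.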
Questions:
Is this expected behavior with a FLAT index on this scale?
What is the typical or expected retrieval latency for 1000 queries in similar settings?
Would switching to an approximate index like IVF_FLAT, HNSW, or DISKANN help improve this latency?
Are there any tuning parameters I can set (e.g., nprobe for IVF indexes, cache settings) to speed up search?
Any guidance or best practices to improve performance would be greatly appreciated! 🙏
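In case an approximate index is the way to go, here is a rough sketch of the HNSW parameters I'd start from. The values are illustrative starting points, not tuned recommendations; with pymilvus these dicts would be passed to `Collection.create_index()` and `Collection.search()`. (I'm assuming inner product is safe here because OpenAI's text-embedding-3-small vectors are unit-normalized, so IP is equivalent to cosine similarity.)

```python
# Illustrative HNSW parameters for Milvus (not tuned for this workload).
# These dicts would be passed to Collection.create_index() and
# Collection.search() in pymilvus; the values are starting-point guesses.

index_params = {
    "index_type": "HNSW",
    "metric_type": "IP",  # unit-normalized embeddings: IP == cosine
    "params": {"M": 16, "efConstruction": 200},
}

search_params = {
    "metric_type": "IP",
    "params": {"ef": 64},  # must be >= top-K; raise for better recall
}
```

Raising `ef` (and `M`/`efConstruction` at build time) trades speed for recall, so I'd benchmark recall against the FLAT ground truth before committing.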
I've attached my benchmark script for reference as well.