Recall increase while having same HNSW parameters #42108

768D, 1M vectors ≈ 3GB
768D, 10M vectors ≈ 30GB
1536D, 50K vectors ≈ 300MB
1536D, 500K vectors ≈ 3GB
1536D, 5M vectors ≈ 30GB

In Milvus, segments can be merged up to 1GB per segment. Each segment has its own independent index, built with the same index parameters.
300MB = 1 segment
3GB = 3 segments
30GB = 30 segments
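
The arithmetic behind these figures can be reproduced with a rough sketch. It assumes vectors are stored as float32 (4 bytes per dimension) and that 1GB means 10^9 bytes. Neither assumption is stated in the post, and the helper names `raw_size_gb` and `estimated_segments` are made up for illustration. Actual segment counts in a deployment depend on the Milvus configuration and on how far compaction has progressed.

```python
import math

BYTES_PER_DIM = 4   # assumption: float32 vectors
SEGMENT_GB = 1.0    # merged segment size quoted in the post


def raw_size_gb(dim: int, num_vectors: int) -> float:
    """Raw vector data size in GB (10^9 bytes), ignoring index overhead."""
    return dim * BYTES_PER_DIM * num_vectors / 1e9


def estimated_segments(size_gb: float, segment_gb: float = SEGMENT_GB) -> int:
    """Rough segment count if the data is packed into full-size segments."""
    return max(1, math.ceil(size_gb / segment_gb))


# The post rounds the raw sizes down to 300MB, 3GB, and 30GB.
size_768d_1m = raw_size_gb(768, 1_000_000)    # about 3.07 GB
size_1536d_50k = raw_size_gb(1536, 50_000)    # about 0.31 GB

# Segment counts for the rounded sizes used in the post.
segments_for_rounded_sizes = [estimated_segments(gb) for gb in (0.3, 3, 30)]
```

With the rounded sizes, this gives 1, 3, and 30 segments, matching the list above.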

A search request scans every segment and retrieves the topk nearest items from each one. Finally, the N * topk items (N being the number of segments) are merged into a single topk result set.
Let's say that, ideally, each segment's recall rate is 95%, and assume topk = 100.

If there is only 1 segment, 100 items are retrieved directly, and the recall rate of the search request is 95%.
If there are 30 segments, 30 * 100 items are retrieved in the first step. In the second step, mil…
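
The two-step search described above (a per-segment topk, then a merge of the N * topk candidates into one topk result set) can be sketched as follows. This is a toy illustration and not Milvus code: the `(distance, id)` tuples, the function names, and the sample data are all hypothetical, and it assumes a smaller distance means a nearer item.

```python
import heapq

Hit = tuple[float, str]  # hypothetical (distance, id) pair; smaller is nearer


def segment_topk(hits: list[Hit], k: int) -> list[Hit]:
    """Step 1: keep the k nearest hits found inside a single segment."""
    return heapq.nsmallest(k, hits)


def merge_topk(per_segment: list[list[Hit]], k: int) -> list[Hit]:
    """Step 2: reduce the N * k candidates to one global top-k list."""
    candidates = [hit for hits in per_segment for hit in hits]
    return heapq.nsmallest(k, candidates)


# Three toy segments, each holding its own search candidates.
segments = [
    [(0.10, "a1"), (0.40, "a2"), (0.90, "a3")],
    [(0.20, "b1"), (0.30, "b2"), (0.80, "b3")],
    [(0.05, "c1"), (0.70, "c2"), (0.95, "c3")],
]

k = 2
per_segment = [segment_topk(hits, k) for hits in segments]  # 3 * 2 candidates
merged = merge_topk(per_segment, k)
# merged == [(0.05, "c1"), (0.10, "a1")]
```

In this sketch, the merge step sees N * k candidates but returns only k of them, so the final result is drawn from a larger pool than any single segment provides.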

Replies: 1 comment

Answer selected by nuvotex-tk