Recall increase while having same HNSW parameters #42108
Hello, the benchmark results were achieved using the following configuration:
HNSW parameters used:
Replies: 1 comment
768D 1M = 3GB
768D 10M = 30GB
1536D 50K = 300MB
1536D 500K = 3GB
1536D 5M = 30GB
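The sizes above are consistent with storing each vector component as a 4-byte float32. A minimal sketch of that arithmetic follows, assuming float32 components and decimal units (1 GB = 10^9 bytes). The helper name `raw_vector_bytes` is hypothetical, and the result ignores any index or metadata overhead.

```python
def raw_vector_bytes(dim: int, num_vectors: int, bytes_per_component: int = 4) -> int:
    """Raw storage for num_vectors dense vectors, ignoring index overhead.

    bytes_per_component=4 assumes float32 components (an assumption, not
    something stated in the thread).
    """
    return dim * num_vectors * bytes_per_component


# (dimension, vector count) pairs taken from the list above
datasets = [
    (768, 1_000_000),
    (768, 10_000_000),
    (1536, 50_000),
    (1536, 500_000),
    (1536, 5_000_000),
]

for dim, count in datasets:
    size_gb = raw_vector_bytes(dim, count) / 1e9  # decimal GB
    print(f"{dim}D x {count:,} vectors = {size_gb:.2f} GB")
```

The computed values come out slightly above the rounded figures in the list (for example, about 3.07 GB rather than 3GB).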
In Milvus, segments can be merged until each segment is about 1GB. Each segment has its own independent index, built with the same index parameters.
300MB = 1 segment
3GB = 3 segments
30GB = 30 segments
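The segment counts above follow from dividing the data size by the 1GB segment size and rounding up. This is only a sketch of that arithmetic, using the rounded sizes from the post. `estimated_segments` is a hypothetical helper, and the real number of segments in a running Milvus cluster depends on its configuration and compaction state.

```python
import math


def estimated_segments(data_bytes: int, segment_bytes: int = 10**9) -> int:
    """Estimate how many segments hold data_bytes if each holds segment_bytes.

    Rounds up, and always reports at least one segment.
    """
    return max(1, math.ceil(data_bytes / segment_bytes))


# Rounded sizes from the post: 300MB, 3GB, 30GB
for label, size in [("300MB", 300 * 10**6), ("3GB", 3 * 10**9), ("30GB", 30 * 10**9)]:
    print(f"{label} = {estimated_segments(size)} segment(s)")
```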
A search request scans every segment and retrieves the topk nearest items from each one. Finally, the N * topk candidates (where N is the number of segments) are merged into a single topk result set.
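The merge step can be illustrated with a small sketch. It assumes each hit is a `(distance, id)` pair where a smaller distance means a closer match. It is not Milvus's actual implementation, and `merge_topk` is a hypothetical name.

```python
import heapq
from typing import List, Tuple

Hit = Tuple[float, int]  # (distance, item id); smaller distance = closer


def merge_topk(per_segment_hits: List[List[Hit]], topk: int) -> List[Hit]:
    """Merge the per-segment candidate lists into one global top-k list."""
    # Pool the N * topk candidates from all segments.
    candidates = [hit for segment_hits in per_segment_hits for hit in segment_hits]
    # Keep the topk candidates with the smallest distance, sorted ascending.
    return heapq.nsmallest(topk, candidates)


# Two segments, each returning its own top-2 candidates
segment_results = [
    [(0.10, 1), (0.40, 2)],
    [(0.20, 3), (0.30, 4)],
]
print(merge_topk(segment_results, topk=2))
```

In this example the final result takes one item from each segment: the second-best hit of the first segment is pushed out by the better candidates from the second segment.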
Let's say, ideally, each segment's recall rate is 95%. Assume topk = 100.
If there is only 1 segment, 100 items are retrieved immediately, and the recall rate of the search request is 95%.
If there are 30 segments, 30 * 100 items are retrieved in the first step. In the second step, mil…