Inconsistent search results #42717
Replies: 4 comments 7 replies
-
Possible reason: there are duplicate primary keys in the collection.
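
One way to test this hypothesis before inserting is to count the primary-key values in the batch. The sketch below is a plain-Python check, not a Milvus API. It assumes each row is a dict and that the primary-key field is named `pk` (the name listed in `output_fields` in the search code in this thread); the helper name `find_duplicate_pks` is hypothetical.

```python
from collections import Counter


def find_duplicate_pks(rows, pk_field="pk"):
    """Return the primary-key values that occur more than once in rows."""
    counts = Counter(row[pk_field] for row in rows)
    return sorted(pk for pk, n in counts.items() if n > 1)


# Illustrative data only; the third row reuses the first row's key.
rows = [
    {"pk": 1, "text": "a"},
    {"pk": 2, "text": "b"},
    {"pk": 1, "text": "c"},
]
duplicates = find_duplicate_pks(rows)
print(duplicates)
```

An empty result means the batch itself has no repeated keys; it does not rule out collisions with rows already stored in the collection.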
-
Why would the data not have a primary key?
-
I found a new problem. Two servers differ only in memory size, and the issue occurs only on the one with less memory, even though the collection holds only 3,000 records.
-
Thank you, everyone. I found the problem: I need to flush after inserting.
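
A minimal sketch of that fix, assuming `client` is a pymilvus `MilvusClient` (as `self.client` appears to be in the search code in this thread) and that the installed pymilvus version exposes `flush` on it. The helper name `insert_and_flush` and the `RecordingClient` stand-in are hypothetical; the stand-in exists only so the snippet runs without a Milvus server.

```python
def insert_and_flush(client, collection_name, rows):
    """Insert rows, then flush the collection before any search is issued."""
    result = client.insert(collection_name=collection_name, data=rows)
    # Flush asks the server to seal and persist the newly inserted data.
    client.flush(collection_name=collection_name)
    return result


class RecordingClient:
    """Stand-in for a real client: records the order of calls it receives."""

    def __init__(self):
        self.calls = []

    def insert(self, collection_name, data):
        self.calls.append(("insert", collection_name, len(data)))
        return {"insert_count": len(data)}

    def flush(self, collection_name):
        self.calls.append(("flush", collection_name))


demo_client = RecordingClient()
demo_result = insert_and_flush(demo_client, "docs", [{"pk": 1}, {"pk": 2}])
print(demo_client.calls)
```

With a real deployment, the same helper would be called with the actual client and collection name. Flushing after every small insert can be costly, so batching inserts and flushing once per batch is probably the safer pattern.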
-
After I insert data, a search returns the correct results, but after a while the retrieved results are incorrect.
If I insert the data again, it can be retrieved correctly again.
Why is this the case?

    def _search(self, query_text, top_k=3, mode="vector", score_threshold=0.0,
                partition_name=None, ranker_params=[0.8, 0.2]):
        self.client.load_collection(collection_name=self.collection_name)
        print("Running hybrid search...")
        query_vector = self._embed_text(query_text)

        # Dense (embedding) request
        dense_req = AnnSearchRequest(
            data=[query_vector],
            anns_field="dense_vector",
            param={"metric_type": "IP", "index_type": "HNSW",
                   "params": {"M": 8, "efConstruction": 64}},
            limit=top_k
        )
        # Sparse (BM25 full-text) request
        sparse_req = AnnSearchRequest(
            data=[query_text],
            anns_field="sparse_vector",
            param={"metric_type": "BM25", "params": {"drop_ratio_build": 0.2}},
            limit=top_k
        )

        # Use the weighted ranking strategy
        ranker_weight = WeightedRanker(ranker_params[0], ranker_params[1])
        # Use RRFRanker (defined here but not passed to the search below)
        ranker_rrf = RRFRanker(100)

        search_kwargs = {
            "collection_name": self.collection_name,
            "reqs": [dense_req, sparse_req],
            "ranker": ranker_weight,
            "limit": top_k,
            "output_fields": ["pk", "text", "metadata"],
            "consistency_level": "Strong",
            "partition_names": ["partition_a", "partition_b"]
        }
        results = self.client.hybrid_search(**search_kwargs)