Java SDK: Milvus iterator query over the full dataset returns an inaccurate count #42557
Answered by yhmo
phstan asked this question in Q&A and General discussion
Test code:

    import io.milvus.orm.iterator.QueryIterator;
    import io.milvus.response.QueryResultsWrapper;
    import io.milvus.v2.client.ConnectConfig;
    import io.milvus.v2.client.MilvusClientV2;
    import io.milvus.v2.common.ConsistencyLevel;
    import io.milvus.v2.service.collection.request.LoadCollectionReq;
    import io.milvus.v2.service.vector.request.QueryIteratorReq;
    import io.milvus.v2.service.vector.request.QueryReq;
    import io.milvus.v2.service.vector.response.QueryResp;

    import java.util.Collections;
    import java.util.List;

    public class TestDemo {
        private MilvusClientV2 client;

        public static void main(String[] args) throws InterruptedException {
            TestDemo test = new TestDemo();
            test.testInitClient("http://xxxx:19530", null);
            test.testCount("default", "all_field_type_collection");
            test.testQueryIterator("default", "all_field_type_collection", "id_field");
        }

        void testInitClient(String uri, String token) {
            client = new MilvusClientV2(ConnectConfig.builder()
                    .uri(uri)
                    .token(token)
                    .connectTimeoutMs(60 * 1000L)
                    .build());
        }

        // Ask the server for the row count via count(*).
        void testCount(String databaseName, String collectionName) throws InterruptedException {
            client.useDatabase(databaseName);
            LoadCollectionReq loadCollectionReq = LoadCollectionReq.builder().collectionName(collectionName).build();
            client.loadCollection(loadCollectionReq);
            QueryResp query = client.query(QueryReq.builder().databaseName(databaseName).collectionName(collectionName).filter("")
                    .outputFields(Collections.singletonList("count(*)")).build());
            System.out.println(query.getQueryResults().get(0).getEntity());
        }

        // Walk the whole collection with a query iterator and count the rows returned.
        void testQueryIterator(String databaseName, String collectionName, String pkName) throws InterruptedException {
            client.useDatabase(databaseName);
            LoadCollectionReq loadCollectionReq = LoadCollectionReq.builder().collectionName(collectionName).build();
            client.loadCollection(loadCollectionReq);
            QueryIteratorReq request = QueryIteratorReq.builder().databaseName(databaseName).collectionName(collectionName)
                    .outputFields(Collections.singletonList("*")).consistencyLevel(ConsistencyLevel.STRONG).build();
            QueryIterator iterator = null;
            try {
                iterator = client.queryIterator(request);
                long count = 0;
                while (true) {
                    List<QueryResultsWrapper.RowRecord> rowRecords = iterator.next();
                    // An empty batch means the iterator is exhausted.
                    if (rowRecords == null || rowRecords.isEmpty()) {
                        break;
                    }
                    System.out.println("record batch size: " + rowRecords.size());
                    count = count + rowRecords.size();
                }
                System.out.println("count: " + count);
            } finally {
                if (iterator != null) {
                    iterator.close();
                }
            }
        }
    }

Output:
Answered by yhmo, Jun 6, 2025
Replies: 3 comments, 6 replies
query_iterator fetches data by primary key, so the mismatch may be caused by a few rows in the collection that share the same primary key.
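To illustrate why duplicate primary keys can make an iterator-based count fall short, here is a minimal sketch in plain Java. It is a simplified model, not the SDK's actual implementation: it assumes each page is fetched as "rows with pk greater than the last pk seen, up to a batch size". The class `PkPagingDemo` and its methods are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PkPagingDemo {
    // Hypothetical stand-in for one page of results: rows whose pk is
    // strictly greater than lastPk, in ascending order, capped at batchSize.
    static List<Long> nextBatch(List<Long> sortedPks, long lastPk, int batchSize) {
        List<Long> batch = new ArrayList<>();
        for (long pk : sortedPks) {
            if (pk > lastPk && batch.size() < batchSize) {
                batch.add(pk);
            }
        }
        return batch;
    }

    // Counts rows the same way the test code does: sum the batch sizes
    // until an empty batch comes back.
    static long countByPkPaging(List<Long> sortedPks, int batchSize) {
        long count = 0;
        long lastPk = Long.MIN_VALUE;
        while (true) {
            List<Long> batch = nextBatch(sortedPks, lastPk, batchSize);
            if (batch.isEmpty()) {
                break;
            }
            count += batch.size();
            // The next page starts strictly after the last pk of this page,
            // so a duplicate of that pk sitting just past the page boundary is skipped.
            lastPk = batch.get(batch.size() - 1);
        }
        return count;
    }

    public static void main(String[] args) {
        // Six stored rows, two of which reuse an existing primary key.
        List<Long> pks = Arrays.asList(1L, 2L, 2L, 3L, 4L, 4L);
        System.out.println("rows stored: " + pks.size());
        System.out.println("rows seen with batch size 2: " + countByPkPaging(pks, 2));
        System.out.println("rows seen with batch size 3: " + countByPkPaging(pks, 3));
    }
}
```

In this model, a batch size of 2 counts 4 of the 6 rows, while a batch size of 3 happens to count all 6, because the result depends on whether a duplicate falls on a page boundary.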
It looks like the data contains duplicate primary keys. Milvus does not deduplicate inserted data; this is a known issue.
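Since inserts are not deduplicated on the server side, one possible workaround is to drop duplicate primary keys on the client before inserting. The sketch below is plain Java with no Milvus calls. It assumes rows are held as `Map<String, Object>` and that keeping the last row per key is acceptable; `PkDedupDemo`, `dedupByPk`, and `row` are hypothetical helpers, not part of the SDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PkDedupDemo {
    // Hypothetical helper: keeps one row per primary key. A later row with the
    // same key replaces the earlier one; keys keep their first-seen order.
    static List<Map<String, Object>> dedupByPk(List<Map<String, Object>> rows, String pkName) {
        Map<Object, Map<String, Object>> byPk = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            byPk.put(row.get(pkName), row);
        }
        return new ArrayList<>(byPk.values());
    }

    // Builds a small example row; the field names are illustrative only.
    static Map<String, Object> row(long id, String name) {
        Map<String, Object> r = new HashMap<>();
        r.put("id_field", id);
        r.put("name", name);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = Arrays.asList(row(1L, "a"), row(1L, "b"), row(2L, "c"));
        List<Map<String, Object>> unique = dedupByPk(rows, "id_field");
        System.out.println("rows before: " + rows.size() + ", rows after: " + unique.size());
    }
}
```

This only guards against duplicates within a single batch held in memory; it does not detect keys that already exist in the collection.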
birdwatcher can be used to check for rows with duplicate primary keys.
As @xiaofan-luan said, the birdwatcher tool can check for this.
The tool is introduced here: https://milvus.io/docs/birdwatcher_overview.md
Steps:
Download birdwatcher (an engineering tool for Milvus) from here: https://github.com/milvus-io/birdwatcher/releases
Download the binary that matches your OS.
Extract the tar package.
Enter the extracted folder and start the birdwatcher executable from your terminal: ./bin/birdwatcher
It is a command-line tool, so you will enter its command mode.
Type "connect --etcd 127.0.0.1:2379" to connect to the etcd service, replacing the IP with your etcd IP.
Use "show collections" to list all collections and find the collection ID.
Use command "inspect-pk --collection 458519828685301464 --minioAddr 1…