CPU vs. GPU: almost no difference in search speed between FLAT, IVF_FLAT, and GPU_IVF_FLAT #41030

wabye430 · 2025-04-01T08:45:09Z


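I installed Milvus GPU with Docker, following the official documentation:
https://milvus.io/docs/install_standalone-docker-compose-gpu.md

I have the same text data in every test: 100,000 texts, embedded with bge-m3 into 1024-dimensional vectors and written to Milvus. Whether index_type is FLAT, IVF_FLAT, or GPU_IVF_FLAT, searching with the vector generated from the same query takes about the same time: sometimes 200 ms, sometimes 400 ms. I have tried everything I can think of and cannot figure out why.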

# Insert data
import time

from FlagEmbedding import BGEM3FlagModel
from pymilvus import DataType, MilvusClient

client = MilvusClient(uri="http://localhost:19530", token="root:Milvus", db_name='default')
collection_name = "zhaobiao_content_cpu"

# Drop the collection
client.drop_collection(collection_name=collection_name)

# Create the collection
schema = MilvusClient.create_schema()
schema.add_field(field_name="contentId", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="province", datatype=DataType.VARCHAR, max_length=10)
schema.add_field(field_name="contentType", datatype=DataType.INT8)
schema.add_field(field_name="updateTime", datatype=DataType.INT32)
schema.add_field(field_name="dense_vector", datatype=DataType.FLOAT_VECTOR, dim=1024)
schema.add_field(field_name="sparse_vector", datatype=DataType.SPARSE_FLOAT_VECTOR)
index_params = client.prepare_index_params()
index_params.add_index(field_name="sparse_vector", index_type="SPARSE_INVERTED_INDEX", metric_type="IP")
# Only one dense index is active per run; the GPU variant is commented out here.
# index_params.add_index(field_name="dense_vector", index_type="GPU_IVF_FLAT", metric_type="IP", nlist=1024)
index_params.add_index(field_name="dense_vector", index_type="IVF_FLAT", metric_type="IP", nlist=1024)
client.create_collection(
    collection_name=collection_name,
    schema=schema,
    index_params=index_params,
    consistency_level="Strong"
)
state = client.get_load_state(collection_name=collection_name)
print(state)

# Load the model
embedding_model = BGEM3FlagModel('huggingface/bge-m3', use_fp16=False,
                                 pooling_method='cls',
                                 query_max_length=8192,
                                 passage_max_length=8192,
                                 devices='cuda')
# insert_data is the poster's own helper; its definition is not included in the post.
insert_data("/raw_data/content_info_2025-03-26.json", embedding_model, batch_size=20)

model = BGEM3FlagModel('/home/huwenqiang/huggingface/bge-m3', use_fp16=False,
                       pooling_method='cls',
                       query_max_length=100,
                       passage_max_length=8192,
                       devices='cuda')

k = 100
for _ in range(100):
    # Note: the measured interval covers both query encoding and the Milvus search.
    t0 = time.time()
    # Query text: veterinary drugs, veterinary medicine, animal medicines, veterinary pharmaceuticals
    query = "兽药,兽医药, 动物药品, 兽用药物"
    query_embeddings = model.encode_queries([query], max_length=30, return_dense=True, return_sparse=True,
                                            return_colbert_vecs=False)
    query_dense_vector = query_embeddings["dense_vecs"]
    res = client.search(collection_name=collection_name, data=query_dense_vector, anns_field='dense_vector',
                        search_params={"metric_type": "IP", "nprobe": 10}, limit=k, output_fields=['contentId'])
    t1 = time.time()
    print(f"passed time: {t1-t0}")
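The loop above starts its timer before the query is encoded, so each printed figure mixes embedding time with Milvus search time. A minimal sketch of timing the two stages separately might look like the following. The helper name `time_stages` and the stand-in callables are hypothetical and only there so the sketch runs without a model or a Milvus server.

```python
import statistics
import time


def time_stages(encode_fn, search_fn, query, runs=100):
    """Time the encode step and the search step separately over `runs` iterations."""
    encode_ms, search_ms = [], []
    for _ in range(runs):
        t0 = time.perf_counter()
        vectors = encode_fn(query)   # e.g. wraps model.encode_queries(...)
        t1 = time.perf_counter()
        search_fn(vectors)           # e.g. wraps client.search(...)
        t2 = time.perf_counter()
        encode_ms.append((t1 - t0) * 1000)
        search_ms.append((t2 - t1) * 1000)
    return {
        "encode_median_ms": statistics.median(encode_ms),
        "search_median_ms": statistics.median(search_ms),
        "search_max_ms": max(search_ms),
    }


if __name__ == "__main__":
    # Hypothetical stand-ins; replace with the real model and client calls.
    fake_encode = lambda q: [[0.0] * 1024]
    fake_search = lambda v: time.sleep(0.001)
    print(time_stages(fake_encode, fake_search, "veterinary drugs", runs=5))
```

With the objects from this post, `encode_fn` would wrap `model.encode_queries([q], ...)["dense_vecs"]` and `search_fn` would wrap the `client.search(...)` call, so the index types can be compared on search latency alone.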

Answered by yhmo


yhmo (Collaborator) · 2025-04-01T09:48:11Z

"Sometimes 200 ms, sometimes 400 ms": that is because you set consistency_level="Strong".
Change it to Bounded:

client.create_collection(
  collection_name=collection_name,
  schema=schema,
  index_params=index_params,
  consistency_level="Bounded"
)
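The snippet above sets the level when the collection is created. For a collection that already exists, one option may be to request Bounded consistency on each search instead. The sketch below assumes `MilvusClient.search` accepts a `consistency_level` keyword argument; please check this against your installed pymilvus version. The wrapper name `search_with_bounded_consistency` is hypothetical, and the field names are the ones from the question.

```python
def search_with_bounded_consistency(client, collection_name, query_vectors, k=100):
    """Run one dense-vector search that asks for Bounded consistency."""
    return client.search(
        collection_name=collection_name,
        data=query_vectors,
        anns_field="dense_vector",
        search_params={"metric_type": "IP", "nprobe": 10},
        limit=k,
        output_fields=["contentId"],
        # Assumption: a per-request override of the collection's default level.
        consistency_level="Bounded",
    )
```

Called as `search_with_bounded_consistency(client, collection_name, query_dense_vector)`, this leaves the collection's own default untouched, which makes it easy to compare latency at both levels on the same data.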

1 reply

wabye430 Apr 1, 2025
Author

Thanks, that was indeed the cause.
