bm25 recall case #41014

Mar 31, 2025 · 2 comments · 2 replies
Since the text content is Chinese, you need to configure the field's analyzer for Chinese text. See https://milvus.io/docs/analyzer-overview.md
Try this script:


```python
from pymilvus import (
    MilvusClient, DataType, Function, FunctionType,
)

import random

client = MilvusClient(
    uri="http://localhost:19530",
    token="root:Milvus"
)
print(client.get_server_version())

collection_name = "BBB"

schema = client.create_schema()

schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True, auto_id=True)
schema.add_field(field_name="text",
                 datatype=DataType.VARCHAR,
                 max_length=1000,
                 enable_analyzer=True,
                 analyzer_params={"type"…
```
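
To illustrate why the analyzer choice affects BM25 recall on Chinese text, here is a minimal, self-contained sketch that does not use Milvus at all. It compares a split on whitespace and punctuation with a character-bigram tokenizer. The bigram tokenizer is only a rough stand-in for a real Chinese segmenter, the function names and sample strings are hypothetical, and Milvus's own analyzers may behave differently from either one.

```python
import re

def whitespace_tokens(text):
    """Split only on whitespace and common ASCII/CJK punctuation."""
    return [t for t in re.split(r"[\s,.!?,。!?]+", text) if t]

def bigram_tokens(text):
    """Overlapping character bigrams within each punctuation-free run.

    A crude substitute for a Chinese word segmenter, used here only
    to show the effect of splitting text that has no spaces.
    """
    tokens = []
    for run in whitespace_tokens(text):
        if len(run) == 1:
            tokens.append(run)
        tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens

def matches(query, doc, tokenize):
    """True if the query and the document share at least one token."""
    return bool(set(tokenize(query)) & set(tokenize(doc)))

doc = "向量数据库支持全文检索"
query = "全文检索"

# Without segmentation the whole sentence is a single token,
# so a term-based scorer has nothing to match on.
print(matches(query, doc, whitespace_tokens))  # False

# Once the text is split into smaller units, the query terms overlap.
print(matches(query, doc, bigram_tokens))      # True
```

A term-based scorer such as BM25 can only rank documents that share tokens with the query, so the segmentation step decides whether a Chinese query can be recalled at all. In Milvus that step is controlled by the field's `analyzer_params`; the value used in the script above is truncated here, so consult the linked analyzer documentation for the exact setting.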

Answer selected by zhu3359
2 replies
@zhu3359
@xiaofan-luan