Error in embedding document using BM25EmbeddingFunction #40954

@shalini0311
Downgrade your scipy version to 1.14.1.

I believe the behavior of scipy.sparse.vstack changed in scipy 1.15.0: the output of scipy.sparse.vstack(...).tocsr() is different from earlier versions.
milvus_model.sparse.BM25EmbeddingFunction calls scipy.sparse.vstack(...).tocsr() to generate a sparse array.
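
Before downgrading, it may help to confirm which scipy version the environment actually has. The following is a minimal sketch, not part of milvus_model or scipy: the helper names `is_affected` and `installed_scipy_version` are hypothetical, and the 1.15 threshold is an assumption taken only from the versions reported in this thread (1.14.1 works, 1.15.0 fails).

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption from this thread: 1.14.1 works, 1.15.0 fails.
FIRST_AFFECTED = (1, 15)


def is_affected(scipy_version: str) -> bool:
    """Hypothetical helper: True if the version is 1.15 or newer."""
    major, minor = scipy_version.split(".")[:2]
    return (int(major), int(minor)) >= FIRST_AFFECTED


def installed_scipy_version():
    """Return the installed scipy version string, or None if scipy is absent."""
    try:
        return version("scipy")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_scipy_version()
    if found is None:
        print("scipy is not installed")
    elif is_affected(found):
        print(f"scipy {found}: likely affected, consider pinning scipy==1.14.1")
    else:
        print(f"scipy {found}: not in the affected range reported here")
```

If the check reports an affected version, pinning scipy to 1.14.1 as suggested above is the workaround reported in this discussion.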

Use this script to test:
With scipy 1.14.1, it works fine.
With scipy 1.15.0, it throws a "not enough values to unpack" error.

import numpy as np
from scipy.sparse import csr_array, vstack

sparse_embs = []

values = [1.0687022900763359, 1.4973262032085561]
rows = [0, 0]
cols = [1, 0]
sparse = csr_array((values, (rows, cols)), shape=(1, 4)).astype(np.float32)
sparse_embs.append(sparse)

values = [1.3…

Replies: 4 comments 10 replies

@yhmo
yhmo Mar 27, 2025
Collaborator

@xiaofan-luan
@yhmo
yhmo Mar 31, 2025
Collaborator

@shalini0311
@yhmo
yhmo Apr 1, 2025
Collaborator

Answer selected by shalini0311
@shalini0311
4 participants