Milvus OpenAi embedding function #42867

anshdavid · 2025-06-20T01:19:53Z

Hello guys, I just want to know whether the embedding function for OpenAI sends texts in batches or makes a single API call per text?

Answered by yhmo

yhmo · 2025-06-20T02:21:31Z

yhmo
Jun 20, 2025
Collaborator

An insert request is received by the proxy node, which calls Function.ProcessInsert() to generate embeddings here:

milvus/internal/proxy/task_insert.go

Line 177 in 6798fdc

if err := exec.ProcessInsert(ctx, it.insertMsg); err != nil {

The implementation of TextEmbeddingFunction.ProcessInsert() is here:

milvus/internal/util/function/text_embedding_function.go

Line 227 in 6798fdc

texts := inputs[0].GetScalars().GetStringData().GetData()

The TextEmbeddingFunction.ProcessInsert() calls EmbeddingsProvider.CallEmbedding() here:

milvus/internal/util/function/text_embedding_function.go

Line 241 in 6798fdc

embds, err := runner.embProvider.CallEmbedding(texts, InsertMode)

Before TextEmbeddingFunction.ProcessInsert() is called, TextEmbeddingFunction.Check() is called with a test text "check" to verify the embedding provider:

milvus/internal/util/function/text_embedding_function.go

Line 136 in 6798fdc

embds, err := runner.embProvider.CallEmbedding([]string{"check"}, InsertMode)

The OpenAIEmbeddingProvider implementation has a member named "maxBatch", whose default value is 128:

milvus/internal/util/function/openai_embedding_provider.go

Line 128 in 6798fdc

maxBatch: 128,

The OpenAIEmbeddingProvider calls the OpenAI client API to generate embeddings batch by batch in this loop:

milvus/internal/util/function/openai_embedding_provider.go

Line 158 in 6798fdc

resp, err := provider.client.Embedding(provider.modelName, texts[i:end], int(provider.embedDimParam), provider.user, provider.timeoutSec)
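
To make the batching pattern concrete, here is a minimal, self-contained sketch of a loop that slices the input into chunks of at most maxBatch texts and makes one call per chunk. It is not the Milvus source: embedFn and embedInBatches are hypothetical names, and the callback only stands in for the real client call.

```go
package main

import "fmt"

// embedFn is a hypothetical stand-in for the provider's client call;
// it receives one batch of texts per invocation.
type embedFn func(batch []string) error

// embedInBatches walks texts in slices of at most maxBatch items and
// calls embed once per slice. It returns the number of successful calls.
func embedInBatches(texts []string, maxBatch int, embed embedFn) (int, error) {
	if maxBatch <= 0 {
		return 0, fmt.Errorf("maxBatch must be positive, got %d", maxBatch)
	}
	calls := 0
	for i := 0; i < len(texts); i += maxBatch {
		end := i + maxBatch
		if end > len(texts) {
			end = len(texts) // the last batch may be shorter
		}
		if err := embed(texts[i:end]); err != nil {
			return calls, err
		}
		calls++
	}
	return calls, nil
}

func main() {
	texts := make([]string, 300) // 300 placeholder rows
	calls, err := embedInBatches(texts, 128, func(batch []string) error {
		fmt.Println("batch size:", len(batch))
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

With 300 rows and a batch limit of 128, the callback sees batches of 128, 128, and 44 texts.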

So, if you insert N rows in one insert request, the loop makes math.Ceil(N/128) calls to the OpenAI API (with the default maxBatch of 128), plus the one "check" call described above.
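
As a rough worked example of the arithmetic, the number of calls made by a batching loop alone is the ceiling of N divided by the batch limit. The helper below is a sketch with a hypothetical name (batchCalls). It assumes the loop advances by exactly maxBatch rows per iteration, and it does not count the separate "check" call.

```go
package main

import "fmt"

// batchCalls returns ceil(n / maxBatch) using integer arithmetic.
// It returns 0 for non-positive inputs.
func batchCalls(n, maxBatch int) int {
	if n <= 0 || maxBatch <= 0 {
		return 0
	}
	return (n + maxBatch - 1) / maxBatch
}

func main() {
	for _, n := range []int{1, 127, 128, 129, 1000} {
		fmt.Printf("N=%d -> %d batch call(s)\n", n, batchCalls(n, 128))
	}
}
```

Note that N=128 needs a single batch call, which is where a floor-based formula would overcount by one.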

1 reply

anshdavid Jun 20, 2025
Author

@yhmo thank you for the answer and guiding me through the code.
