Colbert Interaction correct?

Thank you very much for BGE-M3!

I am implementing something similar and found a line in your code that puzzles me a bit:

https://github.com/FlagOpen/FlagEmbedding/blob/2225aacb54cf9e807aa116dfffeb0cceb291b38b/FlagEmbedding/finetune/embedder/encoder_only/m3/modeling.py#L227

Could it be that the ColBERT interaction is incorrect?

The einsum includes the CLS token:
```python
token_scores = torch.einsum('qin,pjn->qipj', q_reps, p_reps)
```

The scaling mask does not:
```python
q_mask[:, 1:].sum(-1, keepdim=True)
```

So the sum runs over n query positions but is divided by n - 1. In the limit n → ∞ the discrepancy vanishes, but for short sequences it can become significant.
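To make the size of the effect concrete, here is a minimal sketch in plain Python. It is not the FlagEmbedding code: `normalized_score` and `inflation` are hypothetical helper names, the per-token MaxSim values are toy numbers, and it assumes the token representations still contain the CLS position, as described above.

```python
def normalized_score(max_sims, include_cls):
    """Average per-token MaxSim values over the number of non-CLS tokens.

    max_sims[0] is assumed to belong to the CLS position (toy assumption).
    """
    # Divisor counts only the non-CLS tokens, mirroring q_mask[:, 1:].sum(-1).
    denom = len(max_sims) - 1
    # Either keep the CLS term in the numerator or drop it to match the divisor.
    terms = max_sims if include_cls else max_sims[1:]
    return sum(terms) / denom


def inflation(seq_len):
    """Ratio of the two variants when every position has MaxSim 1.0."""
    sims = [1.0] * seq_len
    return normalized_score(sims, True) / normalized_score(sims, False)


# Print the inflation factor for a few sequence lengths (CLS included in seq_len).
for seq_len in (2, 3, 9, 33, 513):
    print(seq_len, inflation(seq_len))
```

Under these assumptions the ratio is `seq_len / (seq_len - 1)`: a factor of 1.5 for a query with CLS plus two tokens, shrinking towards 1 as the sequence grows.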

What do you think?



Colbert Interaction correct? #1515
