Thank you very much for BGE-M3!

I am implementing something similar, and I found a line in your code that puzzles me a bit:

`token_scores = torch.einsum('qin,pjn->qipj', q_reps, p_reps)`

Might it be that the ColBERT interaction is incorrect? The einsum includes the CLS token:

`token_scores = torch.einsum('qin,pjn->qipj', q_reps, p_reps)`

whereas the scaling mask does not:

`q_mask[:, 1:].sum(-1, keepdim=True)`

In the limit n → ∞ this works out correctly, but for small sequence lengths the discrepancy can become significant.
What do you think?
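To make the mismatch concrete, here is a minimal NumPy reproduction of the same logic (shapes and values are made up for illustration; the names `q_reps`, `p_reps`, `q_mask` follow the snippets above, and `np.einsum` mirrors the `torch.einsum` call):

```python
import numpy as np

# Hypothetical shapes: 1 query with 3 tokens (CLS at i=0 plus 2 word tokens),
# 1 passage with 3 tokens, embedding dim 4.
rng = np.random.default_rng(0)
q_reps = rng.normal(size=(1, 3, 4))   # (q, i, n): query token embeddings, incl. CLS
p_reps = rng.normal(size=(1, 3, 4))   # (p, j, n): passage token embeddings
q_mask = np.ones((1, 3))              # all query tokens valid

# Same interaction as the torch line: pairwise token scores,
# then max over passage tokens and sum over ALL query tokens, including CLS.
token_scores = np.einsum('qin,pjn->qipj', q_reps, p_reps)
scores = token_scores.max(axis=-1).sum(axis=1)   # numerator sums over i = 0..2

# The scaling mask drops the CLS token, so the denominator counts one token fewer
# than the numerator sums over.
denom_without_cls = q_mask[:, 1:].sum(-1, keepdims=True)  # 2 tokens
denom_with_cls = q_mask.sum(-1, keepdims=True)            # 3 tokens

print(scores / denom_without_cls)  # current scaling
print(scores / denom_with_cls)     # scaling consistent with the einsum
```

With only two non-CLS tokens the two normalizations differ by a factor of 3/2, which is exactly the small-sequence-length effect described above.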