Training a Dense Retriever Model: Embedding Retriever (sentence_transformers) #4247

Hey @nachoperezzv, the training data format differs from the SQuAD format. Instead, each training example should have:

question: the question string
pos_doc: the positive document string
neg_doc: the negative document string
score: the score margin

As the error message states, the {'question', 'pos_doc'} fields are required for the 'mnrl' loss. See the API Reference for details.
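To make the expected shape concrete, here is a minimal sketch of training examples as a list of dictionaries, with a small check for the two fields the error message names. The helper `missing_fields`, the constant `MNRL_REQUIRED_FIELDS`, and the example strings and score are hypothetical illustrations, not part of Haystack. The commented-out `retriever.train(...)` call at the end is an assumption about how the data would be passed in, so verify it against the API Reference for your version.

```python
from typing import Dict, List, Set

# Fields the error message names as required for the 'mnrl' loss.
MNRL_REQUIRED_FIELDS: Set[str] = {"question", "pos_doc"}


def missing_fields(
    example: Dict[str, object],
    required: Set[str] = MNRL_REQUIRED_FIELDS,
) -> Set[str]:
    """Return the required top-level keys that are absent from one example."""
    return required - set(example)


# Hypothetical training examples; the texts and the score are made up.
training_data: List[Dict[str, object]] = [
    {
        "question": "What is the capital of France?",
        "pos_doc": "Paris is the capital and largest city of France.",
        "neg_doc": "Berlin is the capital of Germany.",
        "score": 0.8,
    },
    {
        # Only the two fields named in the error message.
        "question": "Who wrote Don Quixote?",
        "pos_doc": "Don Quixote is a novel by Miguel de Cervantes.",
    },
]

# A SQuAD-style record nests questions under other keys, so the
# top-level 'question' and 'pos_doc' fields are both missing.
squad_style_example: Dict[str, object] = {
    "context": "Paris is the capital and largest city of France.",
    "qas": [{"question": "What is the capital of France?"}],
}

for example in training_data:
    assert not missing_fields(example)

# Assumed call, not verified here; check the API Reference:
# retriever.train(training_data=training_data, train_loss="mnrl")
```

Running a check like this before training surfaces a SQuAD-style record early, instead of as an error from the training call.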

Answer selected by nachoperezzv