Hey guys!
I had fun reading the paper, and thanks for open-sourcing the model.
In the paper, you mention that [COL] and [VAL] are special tokens indicating the start of attribute names and values, respectively. I take this to mean that [COL] and [VAL] should be added to the tokenizer as special tokens. However, in the repo (https://github.com/megagonlabs/ditto/blob/master/ditto_light/dataset.py#L12), they are not added as special tokens to the vocabulary of the pre-trained tokenizer.
Any reason why?
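
To make the question concrete, here is a toy sketch of what I suspect happens. It is my own illustration, not code from the repo or from any tokenizer library: `toy_tokenize` is a hypothetical stand-in that only mimics the lowercase-and-split-punctuation step of a BERT-style tokenizer, and the serialized record is made up. The real pre-trained tokenizers may split the markers differently.

```python
import re

# Hypothetical toy tokenizer, NOT the Ditto or Hugging Face implementation.
# It lowercases text and splits punctuation into separate tokens, unless a
# whitespace-delimited chunk is registered as a special token.
def toy_tokenize(text, special_tokens=()):
    protected = set(special_tokens)
    tokens = []
    for chunk in text.split():
        if chunk in protected:
            # Registered special token: kept whole, so it could map to one id.
            tokens.append(chunk)
        else:
            # Runs of word characters, or single punctuation marks.
            tokens.extend(re.findall(r"\w+|[^\w\s]", chunk.lower()))
    return tokens

# Made-up record serialized in the [COL] ... [VAL] ... style from the paper.
serialized = "[COL] title [VAL] ditto [COL] year [VAL] 2020"

without_special = toy_tokenize(serialized)
with_special = toy_tokenize(serialized, special_tokens=("[COL]", "[VAL]"))

print(without_special)  # markers are broken into "[", "col", "]" pieces
print(with_special)     # markers survive as single tokens
```

With a Hugging Face tokenizer, I would have expected something along the lines of `tokenizer.add_special_tokens({"additional_special_tokens": ["[COL]", "[VAL]"]})` followed by `model.resize_token_embeddings(len(tokenizer))`, so that each marker gets its own embedding.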