Sources/Tokenizers/Tokenizer.swift
Lines changed: 50 additions & 1 deletion
@@ -17,6 +17,7 @@ enum TokenizerError: Error {
     case malformedVocab
     case chatTemplate(String)
     case tooLong(String)
+    case mismatchedConfig(String)
 }
 
 public protocol TokenizingModel {
@@ -530,6 +531,49 @@ class T5Tokenizer : UnigramTokenizer {}
 
 let sentencePieceUnderline = "▁"
 
+// Hack for Llama tokenizers, see https://github.com/huggingface/transformers/blob/bcb841f0073fcd7a4fb88ea8064313c17dcab04a/src/transformers/models/llama/tokenization_llama_fast.py#L181
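
The diff is cut off after this comment, so the added Swift code is not shown here. As a rough illustration only, the sketch below shows one way the new `mismatchedConfig` case could be used. It assumes that the linked Python code rebuilds a post-processor template from the `add_bos_token` / `add_eos_token` flags and rejects a configuration where a flag is set but the token is missing. `SketchTokenizerError` and `singleTemplate` are hypothetical names invented for this example and are not taken from the commit.

```swift
// Hypothetical stand-in for the TokenizerError enum in the diff;
// only the newly added case is reproduced here.
enum SketchTokenizerError: Error, Equatable {
    case mismatchedConfig(String)
}

/// Builds a single-sequence template string such as "<s>:0 $A:0".
/// Assumption: this mirrors the template the linked Python code constructs;
/// it is not the code added by this commit.
func singleTemplate(bos: String?, eos: String?, addBos: Bool, addEos: Bool) throws -> String {
    var parts: [String] = []
    if addBos {
        // A flag that asks for a BOS token without a BOS token is a config mismatch.
        guard let bos = bos else {
            throw SketchTokenizerError.mismatchedConfig("add_bos_token is true but bos_token is nil")
        }
        parts.append("\(bos):0")
    }
    parts.append("$A:0")
    if addEos {
        guard let eos = eos else {
            throw SketchTokenizerError.mismatchedConfig("add_eos_token is true but eos_token is nil")
        }
        parts.append("\(eos):0")
    }
    return parts.joined(separator: " ")
}
```

Called with `bos: "<s>"`, `addBos: true`, and `addEos: false`, the sketch returns a template that prepends the BOS token to the sequence placeholder. Called with `bos: nil` and `addBos: true`, it throws `mismatchedConfig` instead of silently producing a template without the token.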