Releases: huggingface/tokenizers

Python v0.2.0

20 Jan 14:24

This release fixes some inconsistencies between the BPETokenizer and the original Python version of this tokenizer. If you created your own vocabulary using this tokenizer, you will need to either train a new one or use a modified version in which you set the PreTokenizer back to Whitespace (instead of WhitespaceSplit).
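
To illustrate why a vocabulary trained with one pre-tokenizer may not match the other, here is a minimal standard-library sketch of the two splitting behaviours. It assumes that Whitespace splits on the pattern `\w+|[^\w\s]+` and that WhitespaceSplit splits on whitespace only. The function names `whitespace` and `whitespace_split` are hypothetical stand-ins for this example and are not the library's API.

```python
import re


def whitespace_split(text):
    # Assumed WhitespaceSplit behaviour: split on runs of whitespace only,
    # so punctuation stays attached to the neighbouring word.
    return text.split()


def whitespace(text):
    # Assumed Whitespace behaviour: emit runs of word characters and runs of
    # non-word, non-space characters as separate pieces.
    return re.findall(r"\w+|[^\w\s]+", text)


sample = "Hello, world! It's 2020."
print(whitespace_split(sample))
print(whitespace(sample))
```

Under these assumptions, the first call keeps pieces such as `Hello,` whole, while the second separates the punctuation, so the two pre-tokenizers would feed different word units to BPE training.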

Python v0.1.1

12 Jan 07:37
  • Fix a bug where special tokens get split while encoding

Python v0.1.0

10 Jan 18:49
Bump python version for release

v0.0.13

08 Jan 18:43
Hotfix Python bindings for 32-bit systems

v0.0.12

07 Jan 02:05
Bump version for release

v0.0.11

27 Dec 15:44
Fixes the sdist build for Python

v0.0.10

26 Dec 19:56
Bump for release

v0.0.9

23 Dec 22:31
Bump version for release

v0.0.8

20 Dec 15:27
Bump version and update Readme

v0.0.7

17 Dec 23:43
Bump version for release