crates/bpe/README.md: 13 additions & 7 deletions
@@ -183,8 +183,8 @@ On average it is about ~4x faster, since the short-cuts usually pay off.

## Benchmarks

-We ran several benchmarks to compare performance of different encoders and a tiktoken implementation.
-For the tiktoken implementation we used [tiktoken-rs](https://crates.io/crates/tiktoken-rs) library, a wrapper around OpenAI's tiktoken implementation.
+We ran several benchmarks to compare performance of different encoders, and tiktoken and Huggingface tokenizers.
+We used [tiktoken-rs](https://crates.io/crates/tiktoken-rs), a wrapper around OpenAI's tiktoken implementation, and Huggingface's [tokenizers](https://crates.io/crates/tokenizers).
Note that tiktoken does not run BPE on the full input text.
Instead it splits it into large chunks using a regex and runs BPE on the individual chunks.
We have not tried to see if that approach is compatible with our BPE implementation.
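As an illustrative aside (not part of the README change above, and not the repository's benchmark harness): the two third-party tokenizers mentioned in the added lines could be driven on the same input roughly as sketched below. The choice of `cl100k_base` and the `"gpt2"` pretrained tokenizer is arbitrary, and `Tokenizer::from_pretrained` requires the `tokenizers` crate feature that enables downloading pretrained definitions.

```rust
// Sketch only: encode the same text with tiktoken-rs and Huggingface tokenizers.
use tiktoken_rs::cl100k_base;
use tokenizers::Tokenizer;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let text = "The quick brown fox jumps over the lazy dog.";

    // tiktoken-rs: wrapper around OpenAI's tiktoken implementation.
    let tiktoken = cl100k_base()?;
    let tiktoken_tokens = tiktoken.encode_ordinary(text);
    println!("tiktoken-rs: {} tokens", tiktoken_tokens.len());

    // Huggingface tokenizers: "gpt2" is just an example identifier here.
    let hf = Tokenizer::from_pretrained("gpt2", None)?;
    let hf_encoding = hf.encode(text, false)?;
    println!("tokenizers:  {} tokens", hf_encoding.get_ids().len());

    Ok(())
}
```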
@@ -225,13 +225,13 @@ The backtracking encoder, the fastest encoder that still returns correct results,
The fully dynamic programming solution and the heap implementation are still quite competitive to TikToken (especially for smaller inputs).
If the requirement of correct BPE output can be relaxed, then the Greedy approach or the minimal encoding approach are the clear winners.
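To make the "relaxed correctness" trade-off mentioned in the context lines above concrete, here is a toy greedy longest-match encoder over a hypothetical vocabulary. It is only a sketch of the general idea, not the crate's actual Greedy implementation; its output is fast to compute but need not match the token sequence a correct BPE run would produce.

```rust
use std::collections::HashSet;

/// Toy greedy encoder: at each position take the longest vocabulary entry that
/// matches, falling back to a single raw byte if nothing matches. This is fast
/// but not guaranteed to reproduce the tokens of a proper BPE merge run.
fn greedy_encode<'a>(vocab: &HashSet<&[u8]>, text: &'a [u8]) -> Vec<&'a [u8]> {
    let mut tokens = Vec::new();
    let mut pos = 0;
    while pos < text.len() {
        let mut end = pos + 1; // fallback: emit one raw byte
        for candidate in (pos + 1..=text.len()).rev() {
            if vocab.contains(&text[pos..candidate]) {
                end = candidate;
                break;
            }
        }
        tokens.push(&text[pos..end]);
        pos = end;
    }
    tokens
}

fn main() {
    // Hypothetical vocabulary; a real one would come from the BPE merge table.
    let vocab: HashSet<&[u8]> = ["a", "b", "c", "ab", "cb", "abc"]
        .iter()
        .map(|s| s.as_bytes())
        .collect();
    // Prints "abc" then "b"; a correct BPE run over the same vocabulary and its
    // merge rules could produce a different segmentation.
    for token in greedy_encode(&vocab, b"abcb") {
        println!("{}", String::from_utf8_lossy(token));
    }
}
```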
@@ -246,7 +246,7 @@ The graph below shows encoding runtime vs slice length.
The overall runtime of the byte-by-byte incremental encoder for encoding the full text is comparable to the runtime of the backtracking encoder, with only a constant factor overhead.
Note that this is a huge win for incremental use cases, which would otherwise require retokenization after each append, resulting in a quadratic slowdown.
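A toy cost model (illustrative only, and not using the crate's incremental encoder API) makes the quadratic-versus-linear gap explicit: re-encoding the whole text after every appended byte does O(n²) total work, while a byte-by-byte incremental encoder processes each appended byte once, i.e. O(n) work times a constant factor.

```rust
// Toy cost model, counting bytes touched; the input length below is hypothetical.
fn main() {
    let n: u64 = 10_000; // number of appended bytes

    // Full retokenization after each append: the i-th append re-reads i bytes,
    // so the total is n * (n + 1) / 2, i.e. quadratic in n.
    let retokenize_per_append: u64 = (1..=n).sum();

    // Incremental encoding: each appended byte is processed once.
    let incremental: u64 = n;

    println!("bytes touched, retokenize after each append: {retokenize_per_append}");
    println!("bytes touched, incremental encoder:          {incremental}");
}
```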