Commit b3bac7e: Add Huggingface benchmark
1 parent: ee843cd
18 files changed: +600250 -168 lines

Cargo.toml

Lines changed: 1 addition & 0 deletions
```diff
@@ -2,6 +2,7 @@
 
 members = [
     "crates/*",
+    "crates/bpe/benchmarks",
 ]
 resolver = "2"
 
```
crates/bpe/Cargo.toml

Lines changed: 0 additions & 7 deletions
```diff
@@ -12,12 +12,6 @@ categories = ["algorithms", "data-structures", "encoding", "science"]
 crate-type = ["lib", "staticlib"]
 bench = false
 
-[[bench]]
-name = "performance"
-path = "benches/performance.rs"
-harness = false
-test = false
-
 [features]
 rand = ["dep:rand"]
 tiktoken-rs = ["dep:tiktoken-rs"]
@@ -33,4 +27,3 @@ tiktoken-rs = { version = "0.5", optional = true }
 
 [dev-dependencies]
 bpe = { path = ".", features = ["rand", "tiktoken-rs"] }
-criterion = "0.5"
```

crates/bpe/README.md

Lines changed: 13 additions & 7 deletions
````diff
@@ -183,8 +183,8 @@ On average it is about ~4 faster, since the short-cuts usually pay off.
 
 ## Benchmarks
 
-We ran several benchmarks to compare performance of different encoders and a tiktoken implementation.
-For the tiktoken implementation we used [tiktoken-rs](https://crates.io/crates/tiktoken-rs) library, a wrapper around OpenAI's tiktoken implementation.
+We ran several benchmarks to compare performance of different encoders, and tiktoken and Huggingface tokenizers.
+We used [tiktoken-rs](https://crates.io/crates/tiktoken-rs), a wrapper around OpenAI's tiktoken implementation, and Huggingface's [tokenizers](https://crates.io/crates/tokenizers).
 Note that tiktoken does not run BPE on the full input text.
 Instead it splits it into large chunks using a regex and runs BPE on the individual chunks.
 We have not tried to see if that approach is compatible with our BPE implementation.
@@ -225,13 +225,13 @@ The backtracking encoder, the fastest encoder that still returns correct results
 The fully dynamic programming solution and the heap implementation are still quite competitive to TikToken (especially for smaller inputs).
 If the requirement of correct BPE output can be relaxed, then the Greedy approach or the minimal encoding approach are the clear winners.
 
-![encoding runtime comparison](./benches/result/encoding-o200k.svg)
+![encoding runtime comparison](./images/performance-encoding.svg)
 
 The graph below shows encoding results for input that is particularly challenging for tiktoken.
 The input consists of random ranges taken from the continuous list of all Unicode code points excluding whitespace.
 This inhibits tiktoken ability to split the input before applying BPE revealing its quadratic runtime complexity.
 
-![worst-case encoding runtime comparison](./benches/result/worstcase-o200k.svg)
+![worst-case encoding runtime comparison](./images/performance-worstcase.svg)
 
 ### Incremental encoding
 
@@ -246,7 +246,7 @@ The graph below shows encoding runtime vs slice length.
 The overall runtime of byte-by-byte incremental encoder for encoding the full text is comparable to the runtime of the backtracking encoder, with only a constant factor overhead.
 Note that this is a huge win for incremental use cases, which would otherwise require retokenization after each append, resulting in a quadratic slowdown.
 
-![appending runtime comparison](./benches/result/appending-o200k.svg)
+![appending runtime comparison](./images/performance-appending.svg)
 
 ### Interval counting
 
@@ -264,10 +264,16 @@ The graph below shows counting runtime vs slice length.
 The runtime of the backtracking encoder grows with the length of the slice.
 The interval encoder counts any interval in typically constant time.
 
-![counting runtime comparison](./benches/result/counting-o200k.svg)
+![counting runtime comparison](./images/performance-counting.svg)
 
 ### Running the benchmarks
 
+Benchmarks are located in a separate crate in the `benchmarks` directory.
+
+```sh
+cd benchmarks
+```
+
 Run the benchmark as follows (required [cargo-criterion](https://crates.io/crates/cargo-criterion) installed):
 
 ```sh
@@ -280,5 +286,5 @@ Open the full report which should be located in `target/criterion/reports/index.
 Update the figures in this repo as follows (requires `rsvg-convert` from `librsvg` installed):
 
 ```sh
-script/copy-benchmark-results
+script/copy-results
 ```
````
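
The README change above describes a worst-case benchmark input: random non-whitespace Unicode code points, which defeat tiktoken's regex pre-splitting. That kind of input can be approximated with a short std-only sketch. The `xorshift64` generator and the per-code-point sampling here are illustrative stand-ins for the crate's actual `rand`-based generation, which draws ranges of code points rather than single ones:

```rust
// Sketch: build a pathological input string from random non-whitespace
// code points. The PRNG and the sampling scheme are illustrative only,
// not the benchmark crate's actual code.
fn xorshift64(state: &mut u64) -> u64 {
    // Simple xorshift64 step; stands in for the `rand` crate.
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

fn worst_case_input(len: usize, seed: u64) -> String {
    let mut state = seed; // must be nonzero for xorshift
    let mut out = String::new();
    let mut count = 0;
    while count < len {
        // Draw candidate scalar values; keep valid, non-whitespace chars.
        let v = (xorshift64(&mut state) % 0x11_0000) as u32;
        if let Some(c) = char::from_u32(v) {
            if !c.is_whitespace() {
                out.push(c);
                count += 1;
            }
        }
    }
    out
}
```

Because such a string contains no whitespace, a tokenizer that relies on regex pre-splitting sees one huge chunk, which is what exposes the quadratic behaviour described in the README.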

crates/bpe/benchmarks/.gitignore

Lines changed: 1 addition & 0 deletions
```diff
@@ -0,0 +1 @@
+target/
```

crates/bpe/benchmarks/Cargo.toml

Lines changed: 17 additions & 0 deletions
```diff
@@ -0,0 +1,17 @@
+[package]
+name = "bpe-benches"
+edition = "2021"
+
+[[bench]]
+name = "performance"
+path = "performance.rs"
+harness = false
+test = false
+
+[dev-dependencies]
+bpe = { path = "../../bpe", features = ["rand", "tiktoken-rs"] }
+bpe-openai = { path = "../../bpe-openai" }
+criterion = "0.5"
+rand = "0.8"
+tiktoken-rs = "0.5"
+tokenizers = "0.20"
```
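
The manifest above registers a `[[bench]]` target, but the `performance.rs` harness body is not part of this excerpt. The core of what a criterion harness automates (running a workload repeatedly and summarizing the timings robustly) can be sketched with std only; `bench_median_ns` is a hypothetical helper for illustration, not criterion's API:

```rust
use std::time::Instant;

// Sketch of the measurement loop a criterion harness automates:
// run the workload many times and report a robust summary statistic.
// `bench_median_ns` is a hypothetical helper, not part of criterion.
fn bench_median_ns<F: FnMut()>(mut workload: F, samples: usize) -> u128 {
    let mut times: Vec<u128> = Vec::with_capacity(samples);
    for _ in 0..samples {
        let start = Instant::now();
        workload();
        times.push(start.elapsed().as_nanos());
    }
    times.sort_unstable();
    times[samples / 2] // median is less noise-sensitive than the mean
}
```

Criterion additionally handles warm-up, outlier detection, and report generation, which is why the crate uses it instead of a hand-rolled loop like this.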

crates/bpe/benchmarks/criterion.toml

Lines changed: 18 additions & 0 deletions
```diff
@@ -0,0 +1,18 @@
+# save report in this directory, even if a custom target directory is set
+criterion_home = "./target/criterion"
+
+# The colors table allows users to configure the colors used by the charts
+# cargo-criterion generates.
+[colors]
+# Color-blind friendly color scheme from https://personal.sron.nl/~pault/.
+comparison_colors = [
+  { r = 51, g = 34, b = 136 },   # indigo
+  { r = 136, g = 204, b = 238 }, # cyan
+  { r = 68, g = 170, b = 153 },  # teal
+  { r = 17, g = 119, b = 51 },   # green
+  { r = 153, g = 153, b = 51 },  # olive
+  { r = 221, g = 204, b = 119 }, # sand
+  { r = 204, g = 102, b = 119 }, # rose
+  { r = 136, g = 34, b = 85 },   # wine
+  { r = 170, g = 68, b = 153 },  # purple
+]
```
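
The palette in this config is Paul Tol's color-blind friendly scheme; its RGB triples can be checked against their familiar hex forms with a tiny sketch (`to_hex` is an illustrative helper, not part of cargo-criterion):

```rust
// Render an RGB triple like those in criterion.toml as a hex color string.
// `to_hex` is an illustrative helper, not part of cargo-criterion.
fn to_hex(r: u8, g: u8, b: u8) -> String {
    format!("#{:02x}{:02x}{:02x}", r, g, b)
}
```

For example, the first entry `{ r = 51, g = 34, b = 136 }` renders as `#332288`, the indigo of Tol's scheme.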
