forked from openai/tiktoken
Open
Description
I use tiktoken as a dependency and discovered a bug.
Script to reproduce:
import tiktoken
bad_string = "X" * 1000000
encoder = tiktoken.get_encoding("cl100k_base")
token_count = len(encoder.encode(bad_string))
print(f"Token count: {token_count}")
Result of running the script:
thread '<unnamed>' panicked at src/lib.rs:250:33:
called `Result::unwrap()` on an `Err` value: RuntimeError(StackOverflow)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/Users/biobootloader/code/butler/reproduce_crash.py", line 5, in <module>
    token_count = len(encoder.encode(bad_string))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/biobootloader/code/butler/.venv/lib/python3.12/site-packages/tiktoken/core.py", line 124, in encode
    return self._core_bpe.encode(text, allowed_special)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: RuntimeError(StackOverflow)
python reproduce_crash.py 0.16s user 0.03s system 91% cpu 0.197 total
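A possible stopgap while the root cause is open is to encode the input in smaller pieces and sum the counts. The sketch below has not been verified against tiktoken. The helper names `split_into_chunks` and `count_tokens_chunked` are hypothetical, and the default `chunk_size` is an untested guess, not a known safe limit. Splitting at fixed character offsets can also change token boundaries, so the total may differ slightly from what a single `encode` call would return.

```python
from typing import Callable, Iterator, Sequence


def split_into_chunks(text: str, chunk_size: int) -> Iterator[str]:
    """Yield consecutive slices of `text`, each at most `chunk_size` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def count_tokens_chunked(
    text: str,
    encode: Callable[[str], Sequence[int]],
    chunk_size: int = 10_000,  # assumption: untested guess, tune as needed
) -> int:
    """Sum the token counts of each chunk instead of encoding `text` in one call."""
    return sum(len(encode(chunk)) for chunk in split_into_chunks(text, chunk_size))


# Intended usage with the encoder from the reproduction script
# (left commented out so the sketch runs without tiktoken installed):
#
#   import tiktoken
#   encoder = tiktoken.get_encoding("cl100k_base")
#   print(count_tokens_chunked("X" * 1000000, encoder.encode))
```

The `encode` function is passed in as a parameter so the helper can be exercised with any callable, not only a tiktoken encoder.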
Can you figure this out and fix it?
mentatbot