iriacardiel/playBPE

playBPE

A playground to understand BPE (Byte-Pair Encoding) Tokenization
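For readers new to the idea, here is a minimal sketch of byte-level BPE training. It is an illustration only and is not taken from this repository's scripts: the function names (`most_frequent_pair`, `merge`, `train_bpe`, `decode`) and the choice to start from raw UTF-8 bytes with new token ids beginning at 256 are assumptions made for the example.

```python
from collections import Counter


def most_frequent_pair(ids):
    """Return the most common adjacent pair of token ids, or None if there is none."""
    pairs = Counter(zip(ids, ids[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges, starting from the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new_id, in the order the merges were learned
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        if pair is None:
            break
        new_id = 256 + step  # ids 0..255 are reserved for single bytes
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges


def decode(ids, merges):
    """Expand token ids back into the original string."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8")


ids, merges = train_bpe("aaabdaaabac", num_merges=3)
print(ids, merges)
print(decode(ids, merges))
```

Each training step counts adjacent pairs, merges the most frequent one into a new token, and records the merge so the sequence can be decoded later. Real tokenizers typically add pre-tokenization and special tokens on top of this loop.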

Set up the environment

  1. Create virtual environment: python -m venv venv

  2. Activate virtual environment: source venv/bin/activate

  3. Install Python dependencies: pip install -r requirements.txt

Tokenizer Tester (Python Script)

Run tokenizer_tester.py

python scripts/tokenizer_tester.py

Tokenizer Visualizer (Streamlit App)

Run tokenizer_viewer.py

streamlit run scripts/tokenizer_viewer.py

Then open http://localhost:8501 (Streamlit's default address) in your browser.
