sbalaji1996/tokenizer

tokenizer

A command-line tokenizer written in Rust. It reads a text file, splits it into words, and counts how many times each word appears in the file.
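The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `count_words`, the choice to split on whitespace, and the lowercasing and punctuation trimming are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Counts how often each whitespace-separated word occurs in `text`.
/// Lowercasing and trimming non-alphanumeric characters from the ends
/// of each word are assumptions of this sketch, not documented behavior.
fn count_words(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for raw in text.split_whitespace() {
        let word = raw
            .trim_matches(|c: char| !c.is_alphanumeric())
            .to_lowercase();
        // Skip pieces that were pure punctuation.
        if word.is_empty() {
            continue;
        }
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_words("The cat saw the other cat.");
    // Sort the entries so the printed order is stable.
    let mut entries: Vec<_> = counts.iter().collect();
    entries.sort();
    for (word, n) in entries {
        println!("{}: {}", word, n);
    }
}
```

A `HashMap` keeps the per-word update cheap; sorting happens only at print time because a hash map has no guaranteed iteration order.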

usage

./target/debug/tokenizer $TEXT_FILE.txt$

todo

  • replace the hardcoded text with command-line input
  • let the user choose where the output JSON file is written
  • let the user configure the delimiters
  • refactor the code to follow the style guide
  • add benchmark tests and optimize performance based on the results
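As one possible direction for the command-line input and delimiter items above, here is a hedged sketch using only the standard library. The argument layout (input path first, then an optional string of delimiter characters), the default delimiters, and the helper names `tokenize` and `parse_args` are hypothetical; the project may settle on a different interface.

```rust
use std::env;
use std::fs;

/// Splits `text` on any of the given delimiter characters,
/// dropping the empty pieces left by adjacent delimiters.
fn tokenize<'a>(text: &'a str, delimiters: &[char]) -> Vec<&'a str> {
    text.split(|c: char| delimiters.contains(&c))
        .filter(|piece| !piece.is_empty())
        .collect()
}

/// Interprets the arguments as: <input path> [delimiter characters].
/// This layout is an assumption of the sketch. Returns None when the
/// input path is missing.
fn parse_args(args: &[String]) -> Option<(String, Vec<char>)> {
    let path = args.get(1)?.clone();
    let delimiters = match args.get(2) {
        Some(s) => s.chars().collect(),
        // Assumed default: split on common whitespace.
        None => vec![' ', '\n', '\t'],
    };
    Some((path, delimiters))
}

fn main() {
    let args: Vec<String> = env::args().collect();
    let (path, delimiters) = match parse_args(&args) {
        Some(parsed) => parsed,
        None => {
            eprintln!("usage: tokenizer <TEXT_FILE> [DELIMITERS]");
            return;
        }
    };
    match fs::read_to_string(&path) {
        Ok(text) => {
            for token in tokenize(&text, &delimiters) {
                println!("{}", token);
            }
        }
        Err(err) => eprintln!("could not read {}: {}", path, err),
    }
}
```

Keeping `parse_args` separate from `main` means the argument handling can be exercised without touching the file system, which may also help with the benchmark item.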
