feat: add supertokenizers #236

Merged: 40 commits, May 26, 2025

update lock file

bae0193

Codecov / codecov/patch failed May 22, 2025 in 0s

84.38% of diff hit (target 93.17%)

Annotations

Check warning on line 86 in model2vec/distill/distillation.py

@codecov codecov / codecov/patch

model2vec/distill/distillation.py#L86

Added line #L86 was not covered by tests

Check warning on line 13 in model2vec/tokenizer/model.py

model2vec/tokenizer/model.py#L13

Added line #L13 was not covered by tests

Check warning on line 31 in model2vec/tokenizer/model.py

model2vec/tokenizer/model.py#L28-L31

Added lines #L28 - L31 were not covered by tests

Check warning on line 35 in model2vec/tokenizer/model.py

model2vec/tokenizer/model.py#L33-L35

Added lines #L33 - L35 were not covered by tests

Check warning on line 37 in model2vec/tokenizer/model.py

model2vec/tokenizer/model.py#L37

Added line #L37 was not covered by tests

Check warning on line 29 in model2vec/tokenizer/normalizer.py

model2vec/tokenizer/normalizer.py#L29

Added line #L29 was not covered by tests

Check warning on line 29 in model2vec/tokenizer/pretokenizer.py

model2vec/tokenizer/pretokenizer.py#L24-L29

Added lines #L24 - L29 were not covered by tests

Check warning on line 31 in model2vec/tokenizer/pretokenizer.py

model2vec/tokenizer/pretokenizer.py#L31

Added line #L31 was not covered by tests

Check warning on line 40 in model2vec/tokenizer/pretokenizer.py

model2vec/tokenizer/pretokenizer.py#L40

Added line #L40 was not covered by tests

Check warning on line 47 in model2vec/tokenizer/pretokenizer.py

model2vec/tokenizer/pretokenizer.py#L43-L47

Added lines #L43 - L47 were not covered by tests

Check warning on line 50 in model2vec/tokenizer/pretokenizer.py

model2vec/tokenizer/pretokenizer.py#L49-L50

Added lines #L49 - L50 were not covered by tests

Check warning on line 52 in model2vec/tokenizer/pretokenizer.py

model2vec/tokenizer/pretokenizer.py#L52

Added line #L52 was not covered by tests

Check warning on line 85 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L85

Added line #L85 was not covered by tests

Check warning on line 157 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L157

Added line #L157 was not covered by tests

Check warning on line 161 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L160-L161

Added lines #L160 - L161 were not covered by tests

Check warning on line 263 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L263

Added line #L263 was not covered by tests

Check warning on line 287 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L287

Added line #L287 was not covered by tests

Check warning on line 312 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L312

Added line #L312 was not covered by tests

Check warning on line 323 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L323

Added line #L323 was not covered by tests

Check warning on line 343 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L342-L343

Added lines #L342 - L343 were not covered by tests

Check warning on line 350 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L349-L350

Added lines #L349 - L350 were not covered by tests

Check warning on line 352 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L352

Added line #L352 was not covered by tests

Check warning on line 377 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L374-L377

Added lines #L374 - L377 were not covered by tests

Check warning on line 379 in model2vec/tokenizer/tokenizer.py

model2vec/tokenizer/tokenizer.py#L379

Added line #L379 was not covered by tests

Check warning on line 25 in tests/test_distillation.py

tests/test_distillation.py#L25

Added line #L25 was not covered by tests