Vishnunkumar/transformerslite

transformerslite

Train simple, lite transformer models in a few lines of code.

Implementation

  • Sequence Classification

from transformerslite import pipeline
from datasets import load_dataset

# both a train file and a valid file must be provided for now
data = load_dataset('csv', data_files={
    "train": "hg.csv",
    "valid": "hg2.csv"
})


training_pipeline = pipeline.SeqClassifier(data, 
                                           epochs=4, 
                                           max_input_length=32, 
                                           batch_size=1,
                                           learning_rate=0.0001, 
                                           num_class=2)
trainer, tokenizer = training_pipeline.model()
trainer.train()
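
The example above loads `hg.csv` and `hg2.csv` without showing their layout. As a minimal sketch, assuming a two-column layout with hypothetical column names `text` and `label` (the README does not state which columns `SeqClassifier` expects), such a pair of files could be produced like this:

```python
import csv

# Hypothetical column names: the README does not say which columns
# SeqClassifier reads, so "text" and "label" are placeholders.
FIELDNAMES = ["text", "label"]


def write_split(path, rows):
    """Write (text, label) rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(FIELDNAMES)
        writer.writerows(rows)


# Two label values, matching num_class=2 in the example above.
train_rows = [("good movie", 1), ("terrible plot", 0), ("loved it", 1)]
valid_rows = [("not great", 0), ("really enjoyable", 1)]

# File names match the ones passed to load_dataset above.
write_split("hg.csv", train_rows)
write_split("hg2.csv", valid_rows)
```

Adjust the column names to whatever the pipeline actually expects before training.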

  • Sequence to Sequence Modeling (t5-small)

from transformerslite import pipeline
from datasets import load_dataset

# both a train file and a valid file must be provided for now
data = load_dataset('csv', data_files={
    "train": "hg.csv",
    "valid": "hg2.csv"
})


training_pipeline = pipeline.T5Seq2Seq(data,
                                       max_input_length=32,
                                       max_target_length=32, 
                                       prefix='seq: ',
                                       epochs=4, 
                                       batch_size=1,
                                       learning_rate=0.0001)

trainer, tokenizer = training_pipeline.model()
trainer.train()
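
How `T5Seq2Seq` applies `prefix` internally is not shown here. With T5 models the task prefix is conventionally prepended to every input string before tokenization, so a rough sketch of that step, assuming the wrapper follows the convention (the helper `add_prefix` and the sample pairs are hypothetical), might look like:

```python
def add_prefix(examples, prefix="seq: "):
    """Return (input, target) pairs with the task prefix prepended to each input."""
    return [(prefix + source, target) for source, target in examples]


# Hypothetical source/target pairs, for illustration only.
pairs = [("helo wrld", "hello world"), ("thsi is a tset", "this is a test")]
prefixed = add_prefix(pairs)

print(prefixed[0][0])  # seq: helo wrld
```

The same prefix would then need to be applied to inputs at inference time so they match what the model saw during training.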

A spellchecker application, fine-tuned on 50,000 sentences with randomly injected errors, is hosted on Hugging Face Spaces. Do try it out here.
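
The exact procedure used to corrupt the spellchecker's training sentences is not described. As an illustrative sketch only, assuming simple character-level noise (random deletion, duplication, or swapping of letters), noisy/clean training pairs could be generated as follows; `inject_errors`, `make_pairs`, and `error_rate` are hypothetical names, not part of transformerslite:

```python
import random


def inject_errors(sentence, rng, error_rate=0.1):
    """Randomly delete, duplicate, or swap letters to mimic typos."""
    chars = list(sentence)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        if c.isalpha() and rng.random() < error_rate:
            op = rng.choice(["delete", "duplicate", "swap"])
            if op == "delete":
                i += 1  # drop this character
                continue
            if op == "duplicate":
                out.extend([c, c])
                i += 1
                continue
            if op == "swap" and i + 1 < len(chars):
                out.extend([chars[i + 1], c])  # transpose with the next character
                i += 2
                continue
        out.append(c)  # keep the character unchanged
        i += 1
    return "".join(out)


def make_pairs(sentences, seed=0, error_rate=0.1):
    """Build (noisy input, clean target) pairs with a seeded RNG."""
    rng = random.Random(seed)
    return [(inject_errors(s, rng, error_rate), s) for s in sentences]


clean = [
    "the quick brown fox jumps over the lazy dog",
    "transformers are fun to train",
]
pairs = make_pairs(clean, error_rate=0.2)
for noisy, target in pairs:
    print(noisy, "->", target)
```

Seeding the generator keeps the corrupted dataset reproducible, and the pairs can be written out as the train and valid CSV files used in the sequence-to-sequence example.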

About

Data pre-processing for transformer models using a simple Python wrapper
