Labels are tokenized instead of the text column #75

@suryapa1

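from pathlib import Path
import multifit

# Load the German pretrained model, build the classifier dataset, and preview one batch
exp = multifit.from_pretrained("de_multifit_paper_version")
cls_dataset = exp.arch.dataset(Path('data/de_sentiment'), exp.pretrain_lm.tokenizer)
cls_dataset.load_clas_databunch(bs=exp.finetune_lm.bs).show_batch()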

The data/de_sentiment directory contains train.csv and test.csv, each with labels and text as columns. Even after shuffling, show_batch displays the labels as the tokenized text instead of the text column. I am not sure why it is populated this way; any help is greatly appreciated.
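One thing that may be worth ruling out is a header or column-order mismatch. The sketch below is only a diagnostic aid: it assumes, without confirmation from the multifit documentation, that the classifier loader might read the CSV positionally (label in the first column, text in the second, no header row). The helper names `to_label_first_rows` and `write_rows` are hypothetical, and the column names `labels` and `text` are taken from the description above.

```python
import csv
import io


def to_label_first_rows(csv_text, label_col="labels", text_col="text"):
    """Parse CSV text that has a header row and return [label, text] rows.

    Columns are selected by name, so the order in the input does not matter.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [[row[label_col], row[text_col]] for row in reader]


def write_rows(rows):
    """Serialize rows back to CSV text without a header row."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Tiny in-memory stand-in for train.csv (made-up example rows).
sample = "labels,text\npos,Das Essen war gut\nneg,Sehr schlechter Service\n"

rows = to_label_first_rows(sample)
headerless_csv = write_rows(rows)
print(headerless_csv)
```

If the loader does turn out to expect this layout, rewriting train.csv and test.csv this way and calling show_batch again would show whether the text column is now the one being tokenized. If nothing changes, the cause lies elsewhere.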

What I am trying to do:

  1. Load the German pretrained model using multifit.from_pretrained("de_multifit_paper_version")
  2. Create a custom classifier dataset and fine-tune on top of the German pretrained model
  3. Classify the custom dataset

An example would be greatly appreciated as well.
