DPO training not working #16

I cannot get DPO training to run (conda environment, mlx-lm==0.22.1). I get the following four errors:

ERROR 1
========
File ".../mlx-conda/lib/python3.12/site-packages/sillm/dpo.py", line 97, in <module>
    template = sillm.Template(model.tokenizer, template=args.template)
TypeError: Template.__init__() got an unexpected keyword argument 'template'

ERROR 2
========
File ".../mlx-conda/lib/python3.12/site-packages/sillm/dpo.py", line 137, in <module>
    model.train(dataset_training,
TypeError: TrainableLLM.train() got an unexpected keyword argument 'grad_checkpoint'
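
ERROR 1 and ERROR 2 are both "unexpected keyword argument" failures, meaning the caller passes a keyword that the installed callable's signature does not declare. A minimal sketch of how one might check this with the standard library's `inspect.signature` follows. The `Template` class and its `template_name` parameter are hypothetical stand-ins, not the real sillm API, and `supported_kwargs` is an illustrative helper name.

```python
import inspect


def supported_kwargs(func, kwargs):
    """Return only the entries of kwargs that func's signature accepts."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the callable accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}


# Hypothetical stand-in, NOT the real sillm.Template: its constructor
# has no `template` parameter, which mirrors the TypeError in ERROR 1.
class Template:
    def __init__(self, tokenizer, template_name=None):
        self.tokenizer = tokenizer
        self.template_name = template_name


requested = {"template": "chatml", "template_name": "chatml"}
accepted = supported_kwargs(Template.__init__, requested)
rejected = sorted(set(requested) - set(accepted))
print("accepted:", sorted(accepted))
print("rejected:", rejected)
```

Running the same helper against the installed `Template.__init__` and the model's `train` method would show whether `template` and `grad_checkpoint` are present in the signatures of the installed version.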


ERROR 3
========
File ".../mlx-conda/lib/python3.12/site-packages/sillm/training/dpo.py", line 108, in forward
    logits, _ = model(inputs)
    ^^^^^^^^^
ValueError: too many values to unpack (expected 2)
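
The unpacking error in ERROR 3 appears whenever the right-hand side of `logits, _ = ...` yields more than two items, for example a three-element tuple, or a single array-like object that gets iterated along its first axis. Which of these the installed model actually returns cannot be read from the traceback. The sketch below reproduces the failure with hypothetical stand-in functions (not the real sillm or mlx-lm models) and shows one tolerant way to take the first output; `split_model_output` is an illustrative helper name.

```python
def split_model_output(output):
    """Return (logits, extras) whether output is a tuple/list or a single object."""
    if isinstance(output, (tuple, list)):
        logits, *extras = output
        return logits, extras
    # A single object (e.g. one array) is treated as the logits themselves.
    return output, []


# Hypothetical stand-ins for a model call, NOT the real sillm/mlx-lm models.
def model_returning_three(inputs):
    return "logits", "cache", "aux"


def model_returning_single(inputs):
    return "logits"


# The two-target unpacking from the traceback fails when three items come back.
try:
    logits, _ = model_returning_three([1, 2, 3])
except ValueError as err:
    message = str(err)
    print(message)

# The tolerant helper handles both return shapes.
logits_a, extras_a = split_model_output(model_returning_three([1, 2, 3]))
logits_b, extras_b = split_model_output(model_returning_single([1, 2, 3]))
print(logits_a, extras_a)
print(logits_b, extras_b)
```

This only illustrates the mismatch between the call site and the return value. The right fix depends on what the installed model's `__call__` returns.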

ERROR 4
========
File ".../mlx-conda/lib/python3.12/site-packages/sillm/training/trainer.py", line 214, in train
    loss_value, reward, num_tokens = step(batch)
                                     ^^^^^^^^^^^
ValueError: [compile] Attempting to compile a function with uncaptured inputs is not allowed.

