
Llama3.1 models do not allow configuring max_seq_len #2202

@akashc1

Description


Llama 3.1 model builders hardcode the max context length, even though the underlying component builders allow specifying it.
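Concretely, the fix would just be to thread the argument through the model builder. Here is a hedged sketch of what that could look like; the signature, import path, default of 131072, and the 8B architecture constants are paraphrased from my reading of the builders and may not match the current source exactly:

```python
# Sketch of the proposed change (paraphrased, not the exact torchtune source):
# expose max_seq_len on the model builder and forward it to the component builder
# instead of hardcoding it.
from torchtune.models.llama3_1 import lora_llama3_1  # component builder; accepts max_seq_len

def lora_llama3_1_8b(
    lora_attn_modules,
    apply_lora_to_mlp: bool = False,
    apply_lora_to_output: bool = False,
    lora_rank: int = 8,
    lora_alpha: float = 16,
    lora_dropout: float = 0.0,
    max_seq_len: int = 131072,  # new argument; the default keeps today's behavior
    quantize_base: bool = False,
):
    return lora_llama3_1(
        lora_attn_modules=lora_attn_modules,
        apply_lora_to_mlp=apply_lora_to_mlp,
        apply_lora_to_output=apply_lora_to_output,
        vocab_size=128_256,       # Llama 3.1 8B architecture constants (assumed here)
        num_layers=32,
        num_heads=32,
        num_kv_heads=8,
        embed_dim=4096,
        intermediate_dim=14336,
        max_seq_len=max_seq_len,  # forwarded instead of a hardcoded 131072
        rope_base=500_000,
        lora_rank=lora_rank,
        lora_alpha=lora_alpha,
        lora_dropout=lora_dropout,
        quantize_base=quantize_base,
    )
```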

Since the QLoRA builders wrap these same functions, they are affected as well. This prevents anyone from specifying the model's max_seq_len from a config or the CLI. For example, this config throws an error:

```yaml
output_dir: /tmp/torchtune/llama3_1_8B/lora # /tmp may be deleted by your system. Change it to your preference.
max_seq_len: 8192

# Tokenizer
tokenizer:
  _component_: torchtune.models.llama3.llama3_tokenizer
  path: /models/meta-llama/Llama-3.1-8B-Instruct/original/tokenizer.model
  max_seq_len: ${max_seq_len}

# Model Arguments
model:
  _component_: torchtune.models.llama3_1.lora_llama3_1_8b
  lora_attn_modules: ['q_proj', 'v_proj', 'output_proj']
  apply_lora_to_mlp: True
  apply_lora_to_output: False
  lora_rank: 8  # higher increases accuracy and memory
  lora_alpha: 16  # usually alpha=2*rank
  lora_dropout: 0.0
  max_seq_len: ${max_seq_len}
```

```
[rank4]: Traceback (most recent call last):
[rank4]:   File "/torchtune/recipes/lora_finetune_distributed.py", line 938, in <module>
[rank4]:     sys.exit(recipe_main())
[rank4]:   File "/torchtune/torchtune/config/_parse.py", line 99, in wrapper
[rank4]:     sys.exit(recipe_main(conf))
[rank4]:   File "/torchtune/recipes/lora_finetune_distributed.py", line 932, in recipe_main
[rank4]:     recipe.setup(cfg=cfg)
[rank4]:   File "/torchtune/recipes/lora_finetune_distributed.py", line 272, in setup
[rank4]:     self._model = self._setup_model(
[rank4]:   File "/torchtune/recipes/lora_finetune_distributed.py", line 453, in _setup_model
[rank4]:     model = config.instantiate(cfg_model)
[rank4]:   File "/torchtune/torchtune/config/_instantiate.py", line 112, in instantiate
[rank4]:     return _instantiate_node(OmegaConf.to_object(config), *args)
[rank4]:   File "/torchtune/torchtune/config/_instantiate.py", line 33, in _instantiate_node
[rank4]:     return _create_component(_component_, args, kwargs)
[rank4]:   File "/torchtune/torchtune/config/_instantiate.py", line 22, in _create_component
[rank4]:     return _component_(*args, **kwargs)
[rank4]: TypeError: lora_llama3_1_8b() got an unexpected keyword argument 'max_seq_len'
```
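The same error reproduces outside the recipe, assuming the builder signature is unchanged from what ships today:

```python
# Minimal repro (outside the recipe): the model builder rejects max_seq_len.
from torchtune.models.llama3_1 import lora_llama3_1_8b

model = lora_llama3_1_8b(
    lora_attn_modules=['q_proj', 'v_proj', 'output_proj'],
    max_seq_len=8192,  # TypeError: lora_llama3_1_8b() got an unexpected keyword argument 'max_seq_len'
)
```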

In my workload, and I'm sure in others' as well, I need to set the context length to something other than the hardcoded value.
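The only workaround I can see is to bypass the model builder and point the config at the component builder directly, which means spelling out the architecture by hand. Something like the config below should work, assuming torchtune.models.llama3_1.lora_llama3_1 is importable by that path and that these are the right 8B constants (both are assumptions on my part, not verified against the source):

```yaml
# Hypothetical workaround: instantiate the component builder directly so that
# max_seq_len is configurable. The architecture values are the usual Llama 3.1 8B
# numbers and are assumptions here; check them against your torchtune version.
model:
  _component_: torchtune.models.llama3_1.lora_llama3_1
  lora_attn_modules: ['q_proj', 'v_proj', 'output_proj']
  apply_lora_to_mlp: True
  apply_lora_to_output: False
  lora_rank: 8
  lora_alpha: 16
  lora_dropout: 0.0
  vocab_size: 128256
  num_layers: 32
  num_heads: 32
  num_kv_heads: 8
  embed_dim: 4096
  intermediate_dim: 14336
  rope_base: 500000
  max_seq_len: ${max_seq_len}
```

But that duplicates constants the size-specific builders exist to own, so exposing max_seq_len on the builders themselves seems like the right fix.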
