
Llama3.1 models do not allow configuring max_seq_len #2202

@akashc1

Description


Llama 3.1 model builders hardcode the max context length, even though the underlying component builders allow specifying it.
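Concretely, the fix would just be to thread the argument through the model builder. Here is a hedged sketch of what that could look like; the signature, import path, default of 131072, and the 8B architecture constants are paraphrased from my reading of the builders and may not match the current source exactly:

```python
# Sketch of the proposed change (paraphrased, not the exact torchtune source):
# expose max_seq_len on the model builder and forward it to the component builder
# instead of hardcoding it.
from torchtune.models.llama3_1 import lora_llama3_1  # component builder; accepts max_seq_len

def lora_llama3_1_8b(
    lora_attn_modules,
    apply_lora_to_mlp: bool = False,
    apply_lora_to_output: bool = False,
    lora_rank: int = 8,
    lora_alpha: float = 16,
    lora_dropout: float = 0.0,
    max_seq_len: int = 131072,  # new argument; the default keeps today's behavior
    quantize_base: bool = False,
):
    return lora_llama3_1(
        lora_attn_modules=lora_attn_modules,
        apply_lora_to_mlp=apply_lora_to_mlp,
        apply_lora_to_output=apply_lora_to_output,
        vocab_size=128_256,       # Llama 3.1 8B architecture constants (assumed here)
        num_layers=32,
        num_heads=32,
        num_kv_heads=8,
        embed_dim=4096,
        intermediate_dim=14336,
        max_seq_len=max_seq_len,  # forwarded instead of a hardcoded 131072
        rope_base=500_000,
        lora_rank=lora_rank,
        lora_alpha=lora_alpha,
        lora_dropout=lora_dropout,
        quantize_base=quantize_base,
    )
```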

Since the QLoRA builders wrap these same functions, they are affected as well. This prevents anyone from specifying the model's max_seq_len from a config or the CLI. For example, this config throws an error:

```yaml
output_dir: /tmp/torchtune/llama3_1_8B/lora # /tmp may be deleted by your system. Change it to your preference.
max_seq_len: 8192

# Tokenizer
tokenizer:
  _component_: torchtune.models.llama3.llama3_tokenizer
  path: /models/meta-llama/Llama-3.1-8B-Instruct/original/tokenizer.model
  max_seq_len: ${max_seq_len}

# Model Arguments
model:
  _component_: torchtune.models.llama3_1.lora_llama3_1_8b
  lora_attn_modules: ['q_proj', 'v_proj', 'output_proj']
  apply_lora_to_mlp: True
  apply_lora_to_output: False
  lora_rank: 8  # higher increases accuracy and memory
  lora_alpha: 16  # usually alpha=2*rank
  lora_dropout: 0.0
  max_seq_len: ${max_seq_len}
```

```
[rank4]: Traceback (most recent call last):
[rank4]:   File "/torchtune/recipes/lora_finetune_distributed.py", line 938, in <module>
[rank4]:     sys.exit(recipe_main())
[rank4]:   File "/torchtune/torchtune/config/_parse.py", line 99, in wrapper
[rank4]:     sys.exit(recipe_main(conf))
[rank4]:   File "/torchtune/recipes/lora_finetune_distributed.py", line 932, in recipe_main
[rank4]:     recipe.setup(cfg=cfg)
[rank4]:   File "/torchtune/recipes/lora_finetune_distributed.py", line 272, in setup
[rank4]:     self._model = self._setup_model(
[rank4]:   File "/torchtune/recipes/lora_finetune_distributed.py", line 453, in _setup_model
[rank4]:     model = config.instantiate(cfg_model)
[rank4]:   File "/torchtune/torchtune/config/_instantiate.py", line 112, in instantiate
[rank4]:     return _instantiate_node(OmegaConf.to_object(config), *args)
[rank4]:   File "/torchtune/torchtune/config/_instantiate.py", line 33, in _instantiate_node
[rank4]:     return _create_component(_component_, args, kwargs)
[rank4]:   File "/torchtune/torchtune/config/_instantiate.py", line 22, in _create_component
[rank4]:     return _component_(*args, **kwargs)
[rank4]: TypeError: lora_llama3_1_8b() got an unexpected keyword argument 'max_seq_len'
```
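The same error reproduces outside the recipe, assuming the builder signature is unchanged from what ships today:

```python
# Minimal repro (outside the recipe): the model builder rejects max_seq_len.
from torchtune.models.llama3_1 import lora_llama3_1_8b

model = lora_llama3_1_8b(
    lora_attn_modules=['q_proj', 'v_proj', 'output_proj'],
    max_seq_len=8192,  # TypeError: lora_llama3_1_8b() got an unexpected keyword argument 'max_seq_len'
)
```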

In my workload, and I'm sure in others' as well, I need to set the context length to something other than the hardcoded value.
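The only workaround I can see is to bypass the model builder and point the config at the component builder directly, which means spelling out the architecture by hand. Something like the config below should work, assuming torchtune.models.llama3_1.lora_llama3_1 is importable by that path and that these are the right 8B constants (both are assumptions on my part, not verified against the source):

```yaml
# Hypothetical workaround: instantiate the component builder directly so that
# max_seq_len is configurable. The architecture values are the usual Llama 3.1 8B
# numbers and are assumptions here; check them against your torchtune version.
model:
  _component_: torchtune.models.llama3_1.lora_llama3_1
  lora_attn_modules: ['q_proj', 'v_proj', 'output_proj']
  apply_lora_to_mlp: True
  apply_lora_to_output: False
  lora_rank: 8
  lora_alpha: 16
  lora_dropout: 0.0
  vocab_size: 128256
  num_layers: 32
  num_heads: 32
  num_kv_heads: 8
  embed_dim: 4096
  intermediate_dim: 14336
  rope_base: 500000
  max_seq_len: ${max_seq_len}
```

But that duplicates constants the size-specific builders exist to own, so exposing max_seq_len on the builders themselves seems like the right fix.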
