
Commit ba3fb3b

Cap Mistral's context length at 2k (#495)
Temporary fix to prevent multiple TB of memory allocated just to attention masks
1 parent 19b3bc8 commit ba3fb3b
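For a sense of scale, here is a rough back-of-envelope sketch (an illustration, not TransformerLens's exact allocation): if a dense float32 attention mask/score tensor of shape [n_heads, n_ctx, n_ctx] were materialized for each of Mistral-7B's 32 layers and 32 heads, memory would grow quadratically in n_ctx.

# Rough estimate of dense attention mask/score memory.
# The assumed shape [n_layers, n_heads, n_ctx, n_ctx] and float32 dtype are
# illustrative only; the exact tensors TransformerLens allocates may differ.
def dense_attention_bytes(n_ctx: int, n_heads: int = 32, n_layers: int = 32,
                          bytes_per_element: int = 4) -> int:
    return n_layers * n_heads * n_ctx * n_ctx * bytes_per_element

for n_ctx in (2048, 32768):
    print(f"n_ctx={n_ctx:>6}: ~{dense_attention_bytes(n_ctx) / 2**30:,.0f} GiB")
# n_ctx=  2048: ~16 GiB
# n_ctx= 32768: ~4,096 GiB (about 4 TiB), consistent with the "multiple TB" above

Under these assumptions, a 32,768-token context lands in the terabyte range, while capping n_ctx at 2,048 keeps it to tens of gigabytes; hence the temporary cap below.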

File tree

1 file changed (+1, −1)

transformer_lens/loading_from_pretrained.py

Lines changed: 1 addition & 1 deletion
@@ -815,7 +815,7 @@ def convert_hf_model_config(model_name: str, **kwargs):
 "n_heads": 32,
 "d_mlp": 14336,
 "n_layers": 32,
-"n_ctx": 32768,
+"n_ctx": 2048, # Capped due to memory issues
 "d_vocab": 32000,
 "act_fn": "silu",
 "normalization_type": "RMS",
