Commit 0b97023

overriding model length for zephyr 7b alpha (#398)
1 parent 4339cf9 commit 0b97023

1 file changed (+1, -0 lines changed)

model-engine/model_engine_server/domain/use_cases/llm_model_endpoint_use_cases.py

Lines changed: 1 addition & 0 deletions
@@ -194,6 +194,7 @@
 # Can also see 13B, 34B there too
 "llama-2": {"max_model_len": None, "max_num_batched_tokens": 4096},
 "mistral": {"max_model_len": 8000, "max_num_batched_tokens": 8000},
+"zephyr": {"max_model_len": 32768, "max_num_batched_tokens": 32768},
 }
 
 
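
For illustration only, here is a minimal sketch of how a per-model override table like the one in this hunk might be consulted. The diff does not show the dictionary's variable name or the lookup logic, so `MODEL_LENGTH_OVERRIDES`, `lookup_overrides`, and the prefix-matching rule are all hypothetical; only the three entries are taken from the diff.

```python
from typing import Dict, Optional

# Hypothetical variable name; the entries themselves are copied from the diff.
MODEL_LENGTH_OVERRIDES: Dict[str, Dict[str, Optional[int]]] = {
    "llama-2": {"max_model_len": None, "max_num_batched_tokens": 4096},
    "mistral": {"max_model_len": 8000, "max_num_batched_tokens": 8000},
    "zephyr": {"max_model_len": 32768, "max_num_batched_tokens": 32768},
}


def lookup_overrides(model_name: str) -> Dict[str, Optional[int]]:
    """Return the override entry whose key is a prefix of model_name.

    Prefix matching is an assumption made for this sketch; the real
    use case may match model names differently. Returns an empty dict
    when no entry applies.
    """
    for prefix, overrides in MODEL_LENGTH_OVERRIDES.items():
        if model_name.startswith(prefix):
            return overrides
    return {}


if __name__ == "__main__":
    # With the new entry, a zephyr model name picks up the 32768 limits.
    print(lookup_overrides("zephyr-7b-alpha"))
```

Under this assumed lookup, a name such as `zephyr-7b-alpha` resolves to the entry added by this commit, while a name with no matching prefix yields no overrides.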
