@@ -317,7 +317,7 @@ Specified using `--task generate`.
 | `ArcticForCausalLM` | Arctic | `Snowflake/snowflake-arctic-base`, `Snowflake/snowflake-arctic-instruct`, etc. | | ✅︎ | ✅︎ |
 | `BaiChuanForCausalLM` | Baichuan2, Baichuan | `baichuan-inc/Baichuan2-13B-Chat`, `baichuan-inc/Baichuan-7B`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `BailingMoeForCausalLM` | Ling | `inclusionAI/Ling-lite-1.5`, `inclusionAI/Ling-plus`, etc. | ✅︎ | ✅︎ | ✅︎ |
-| `BambaForCausalLM` | Bamba | `ibm-ai-platform/Bamba-9B-fp8`, `ibm-ai-platform/Bamba-9B` | ✅︎ | ✅︎ | |
+| `BambaForCausalLM` | Bamba | `ibm-ai-platform/Bamba-9B-fp8`, `ibm-ai-platform/Bamba-9B` | ✅︎ | ✅︎ | ✅︎ |
 | `BloomForCausalLM` | BLOOM, BLOOMZ, BLOOMChat | `bigscience/bloom`, `bigscience/bloomz`, etc. | | ✅︎ | |
 | `BartForConditionalGeneration` | BART | `facebook/bart-base`, `facebook/bart-large-cnn`, etc. | | | |
 | `ChatGLMModel`, `ChatGLMForConditionalGeneration` | ChatGLM | `THUDM/chatglm2-6b`, `THUDM/chatglm3-6b`, `ShieldLM-6B-chatglm3`, etc. | ✅︎ | ✅︎ | ✅︎ |
@@ -333,7 +333,7 @@ Specified using `--task generate`.
 | `ExaoneForCausalLM` | EXAONE-3 | `LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `FalconForCausalLM` | Falcon | `tiiuae/falcon-7b`, `tiiuae/falcon-40b`, `tiiuae/falcon-rw-7b`, etc. | | ✅︎ | ✅︎ |
 | `FalconMambaForCausalLM` | FalconMamba | `tiiuae/falcon-mamba-7b`, `tiiuae/falcon-mamba-7b-instruct`, etc. | | ✅︎ | ✅︎ |
-| `FalconH1ForCausalLM` | Falcon-H1 | `tiiuae/Falcon-H1-34B-Base`, `tiiuae/Falcon-H1-34B-Instruct`, etc. | ✅︎ | ✅︎ | |
+| `FalconH1ForCausalLM` | Falcon-H1 | `tiiuae/Falcon-H1-34B-Base`, `tiiuae/Falcon-H1-34B-Instruct`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `GemmaForCausalLM` | Gemma | `google/gemma-2b`, `google/gemma-1.1-2b-it`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `Gemma2ForCausalLM` | Gemma 2 | `google/gemma-2-9b`, `google/gemma-2-27b`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `Gemma3ForCausalLM` | Gemma 3 | `google/gemma-3-1b-it`, etc. | ✅︎ | ✅︎ | ✅︎ |
@@ -346,7 +346,7 @@ Specified using `--task generate`.
 | `GPTNeoXForCausalLM` | GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM | `EleutherAI/gpt-neox-20b`, `EleutherAI/pythia-12b`, `OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, `databricks/dolly-v2-12b`, `stabilityai/stablelm-tuned-alpha-7b`, etc. | | ✅︎ | ✅︎ |
 | `GraniteForCausalLM` | Granite 3.0, Granite 3.1, PowerLM | `ibm-granite/granite-3.0-2b-base`, `ibm-granite/granite-3.1-8b-instruct`, `ibm/PowerLM-3b`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `GraniteMoeForCausalLM` | Granite 3.0 MoE, PowerMoE | `ibm-granite/granite-3.0-1b-a400m-base`, `ibm-granite/granite-3.0-3b-a800m-instruct`, `ibm/PowerMoE-3b`, etc. | ✅︎ | ✅︎ | ✅︎ |
-| `GraniteMoeHybridForCausalLM` | Granite 4.0 MoE Hybrid | `ibm-granite/granite-4.0-tiny-preview`, etc. | ✅︎ | ✅︎ | |
+| `GraniteMoeHybridForCausalLM` | Granite 4.0 MoE Hybrid | `ibm-granite/granite-4.0-tiny-preview`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `GraniteMoeSharedForCausalLM` | Granite MoE Shared | `ibm-research/moe-7b-1b-active-shared-experts` (test model) | ✅︎ | ✅︎ | ✅︎ |
 | `GritLM` | GritLM | `parasail-ai/GritLM-7B-vllm`. | ✅︎ | ✅︎ | |
 | `Grok1ModelForCausalLM` | Grok1 | `hpcai-tech/grok-1`. | ✅︎ | ✅︎ | ✅︎ |
@@ -358,14 +358,14 @@ Specified using `--task generate`.
 | `JambaForCausalLM` | Jamba | `ai21labs/AI21-Jamba-1.5-Large`, `ai21labs/AI21-Jamba-1.5-Mini`, `ai21labs/Jamba-v0.1`, etc. | ✅︎ | ✅︎ | |
 | `LlamaForCausalLM` | Llama 3.1, Llama 3, Llama 2, LLaMA, Yi | `meta-llama/Meta-Llama-3.1-405B-Instruct`, `meta-llama/Meta-Llama-3.1-70B`, `meta-llama/Meta-Llama-3-70B-Instruct`, `meta-llama/Llama-2-70b-hf`, `01-ai/Yi-34B`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MambaForCausalLM` | Mamba | `state-spaces/mamba-130m-hf`, `state-spaces/mamba-790m-hf`, `state-spaces/mamba-2.8b-hf`, etc. | | ✅︎ | |
-| `Mamba2ForCausalLM` | Mamba2 | `mistralai/Mamba-Codestral-7B-v0.1`, etc. | | ✅︎ | |
+| `Mamba2ForCausalLM` | Mamba2 | `mistralai/Mamba-Codestral-7B-v0.1`, etc. | | ✅︎ | ✅︎ |
 | `MiniCPMForCausalLM` | MiniCPM | `openbmb/MiniCPM-2B-sft-bf16`, `openbmb/MiniCPM-2B-dpo-bf16`, `openbmb/MiniCPM-S-1B-sft`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MiniCPM3ForCausalLM` | MiniCPM3 | `openbmb/MiniCPM3-4B`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MistralForCausalLM` | Mistral, Mistral-Instruct | `mistralai/Mistral-7B-v0.1`, `mistralai/Mistral-7B-Instruct-v0.1`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MixtralForCausalLM` | Mixtral-8x7B, Mixtral-8x7B-Instruct | `mistralai/Mixtral-8x7B-v0.1`, `mistralai/Mixtral-8x7B-Instruct-v0.1`, `mistral-community/Mixtral-8x22B-v0.1`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MPTForCausalLM` | MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter | `mosaicml/mpt-7b`, `mosaicml/mpt-7b-storywriter`, `mosaicml/mpt-30b`, etc. | | ✅︎ | ✅︎ |
 | `NemotronForCausalLM` | Nemotron-3, Nemotron-4, Minitron | `nvidia/Minitron-8B-Base`, `mgoin/Nemotron-4-340B-Base-hf-FP8`, etc. | ✅︎ | ✅︎ | ✅︎ |
-| `NemotronHForCausalLM` | Nemotron-H | `nvidia/Nemotron-H-8B-Base-8K`, `nvidia/Nemotron-H-47B-Base-8K`, `nvidia/Nemotron-H-56B-Base-8K`, etc. | ✅︎ | ✅︎ | |
+| `NemotronHForCausalLM` | Nemotron-H | `nvidia/Nemotron-H-8B-Base-8K`, `nvidia/Nemotron-H-47B-Base-8K`, `nvidia/Nemotron-H-56B-Base-8K`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `OLMoForCausalLM` | OLMo | `allenai/OLMo-1B-hf`, `allenai/OLMo-7B-hf`, etc. | | ✅︎ | ✅︎ |
 | `OLMo2ForCausalLM` | OLMo2 | `allenai/OLMo-2-0425-1B`, etc. | | ✅︎ | ✅︎ |
 | `OLMoEForCausalLM` | OLMoE | `allenai/OLMoE-1B-7B-0924`, `allenai/OLMoE-1B-7B-0924-Instruct`, etc. | | ✅︎ | ✅︎ |
@@ -390,7 +390,7 @@ Specified using `--task generate`.
 | `XverseForCausalLM` | XVERSE | `xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MiniMaxM1ForCausalLM` | MiniMax-Text | `MiniMaxAI/MiniMax-M1-40k`, `MiniMaxAI/MiniMax-M1-80k`, etc. | | | |
 | `MiniMaxText01ForCausalLM` | MiniMax-Text | `MiniMaxAI/MiniMax-Text-01`, etc. | | | |
-| `Zamba2ForCausalLM` | Zamba2 | `Zyphra/Zamba2-7B-instruct`, `Zyphra/Zamba2-2.7B-instruct`, `Zyphra/Zamba2-1.2B-instruct`, etc. | | | |
+| `Zamba2ForCausalLM` | Zamba2 | `Zyphra/Zamba2-7B-instruct`, `Zyphra/Zamba2-2.7B-instruct`, `Zyphra/Zamba2-1.2B-instruct`, etc. | | | ✅︎ |
 
 !!! note
     Currently, the ROCm version of vLLM supports Mistral and Mixtral only for context lengths up to 4096.
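
For context on what the `--task generate` rows in this table correspond to, here is a minimal sketch using vLLM's offline API. It assumes vLLM is installed and enough GPU memory is available; `ibm-ai-platform/Bamba-9B` (one of the rows updated in this diff) and the sampling values are arbitrary example choices:

```python
# Minimal sketch: offline generation with a model from the table above.
# The model and sampling settings are example choices, not requirements.
from vllm import LLM, SamplingParams

# `task="generate"` mirrors the `--task generate` CLI flag this table covers.
llm = LLM(model="ibm-ai-platform/Bamba-9B", task="generate")
sampling = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["The capital of France is"], sampling)
print(outputs[0].outputs[0].text)
```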