@@ -316,7 +316,7 @@ Specified using `--task generate`.
 | `AquilaForCausalLM` | Aquila, Aquila2 | `BAAI/Aquila-7B`, `BAAI/AquilaChat-7B`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `ArcticForCausalLM` | Arctic | `Snowflake/snowflake-arctic-base`, `Snowflake/snowflake-arctic-instruct`, etc. | | ✅︎ | ✅︎ |
 | `BaiChuanForCausalLM` | Baichuan2, Baichuan | `baichuan-inc/Baichuan2-13B-Chat`, `baichuan-inc/Baichuan-7B`, etc. | ✅︎ | ✅︎ | ✅︎ |
-| `BambaForCausalLM` | Bamba | `ibm-ai-platform/Bamba-9B-fp8`, `ibm-ai-platform/Bamba-9B` | ✅︎ | ✅︎ | |
+| `BambaForCausalLM` | Bamba | `ibm-ai-platform/Bamba-9B-fp8`, `ibm-ai-platform/Bamba-9B` | ✅︎ | ✅︎ | ✅︎ |
 | `BloomForCausalLM` | BLOOM, BLOOMZ, BLOOMChat | `bigscience/bloom`, `bigscience/bloomz`, etc. | | ✅︎ | |
 | `BartForConditionalGeneration` | BART | `facebook/bart-base`, `facebook/bart-large-cnn`, etc. | | | |
 | `ChatGLMModel`, `ChatGLMForConditionalGeneration` | ChatGLM | `THUDM/chatglm2-6b`, `THUDM/chatglm3-6b`, `ShieldLM-6B-chatglm3`, etc. | ✅︎ | ✅︎ | ✅︎ |
@@ -332,7 +332,7 @@ Specified using `--task generate`.
 | `ExaoneForCausalLM` | EXAONE-3 | `LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `FalconForCausalLM` | Falcon | `tiiuae/falcon-7b`, `tiiuae/falcon-40b`, `tiiuae/falcon-rw-7b`, etc. | | ✅︎ | ✅︎ |
 | `FalconMambaForCausalLM` | FalconMamba | `tiiuae/falcon-mamba-7b`, `tiiuae/falcon-mamba-7b-instruct`, etc. | | ✅︎ | ✅︎ |
-| `FalconH1ForCausalLM` | Falcon-H1 | `tiiuae/Falcon-H1-34B-Base`, `tiiuae/Falcon-H1-34B-Instruct`, etc. | ✅︎ | ✅︎ | |
+| `FalconH1ForCausalLM` | Falcon-H1 | `tiiuae/Falcon-H1-34B-Base`, `tiiuae/Falcon-H1-34B-Instruct`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `GemmaForCausalLM` | Gemma | `google/gemma-2b`, `google/gemma-1.1-2b-it`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `Gemma2ForCausalLM` | Gemma 2 | `google/gemma-2-9b`, `google/gemma-2-27b`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `Gemma3ForCausalLM` | Gemma 3 | `google/gemma-3-1b-it`, etc. | ✅︎ | ✅︎ | ✅︎ |
@@ -345,7 +345,7 @@ Specified using `--task generate`.
 | `GPTNeoXForCausalLM` | GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM | `EleutherAI/gpt-neox-20b`, `EleutherAI/pythia-12b`, `OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, `databricks/dolly-v2-12b`, `stabilityai/stablelm-tuned-alpha-7b`, etc. | | ✅︎ | ✅︎ |
 | `GraniteForCausalLM` | Granite 3.0, Granite 3.1, PowerLM | `ibm-granite/granite-3.0-2b-base`, `ibm-granite/granite-3.1-8b-instruct`, `ibm/PowerLM-3b`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `GraniteMoeForCausalLM` | Granite 3.0 MoE, PowerMoE | `ibm-granite/granite-3.0-1b-a400m-base`, `ibm-granite/granite-3.0-3b-a800m-instruct`, `ibm/PowerMoE-3b`, etc. | ✅︎ | ✅︎ | ✅︎ |
-| `GraniteMoeHybridForCausalLM` | Granite 4.0 MoE Hybrid | `ibm-granite/granite-4.0-tiny-preview`, etc. | ✅︎ | ✅︎ | |
+| `GraniteMoeHybridForCausalLM` | Granite 4.0 MoE Hybrid | `ibm-granite/granite-4.0-tiny-preview`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `GraniteMoeSharedForCausalLM` | Granite MoE Shared | `ibm-research/moe-7b-1b-active-shared-experts` (test model) | ✅︎ | ✅︎ | ✅︎ |
 | `GritLM` | GritLM | `parasail-ai/GritLM-7B-vllm`. | ✅︎ | ✅︎ | |
 | `Grok1ModelForCausalLM` | Grok1 | `hpcai-tech/grok-1`. | ✅︎ | ✅︎ | ✅︎ |
@@ -357,14 +357,14 @@ Specified using `--task generate`.
 | `JambaForCausalLM` | Jamba | `ai21labs/AI21-Jamba-1.5-Large`, `ai21labs/AI21-Jamba-1.5-Mini`, `ai21labs/Jamba-v0.1`, etc. | ✅︎ | ✅︎ | |
 | `LlamaForCausalLM` | Llama 3.1, Llama 3, Llama 2, LLaMA, Yi | `meta-llama/Meta-Llama-3.1-405B-Instruct`, `meta-llama/Meta-Llama-3.1-70B`, `meta-llama/Meta-Llama-3-70B-Instruct`, `meta-llama/Llama-2-70b-hf`, `01-ai/Yi-34B`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MambaForCausalLM` | Mamba | `state-spaces/mamba-130m-hf`, `state-spaces/mamba-790m-hf`, `state-spaces/mamba-2.8b-hf`, etc. | | ✅︎ | |
-| `Mamba2ForCausalLM` | Mamba2 | `mistralai/Mamba-Codestral-7B-v0.1`, etc. | | ✅︎ | |
+| `Mamba2ForCausalLM` | Mamba2 | `mistralai/Mamba-Codestral-7B-v0.1`, etc. | | ✅︎ | ✅︎ |
 | `MiniCPMForCausalLM` | MiniCPM | `openbmb/MiniCPM-2B-sft-bf16`, `openbmb/MiniCPM-2B-dpo-bf16`, `openbmb/MiniCPM-S-1B-sft`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MiniCPM3ForCausalLM` | MiniCPM3 | `openbmb/MiniCPM3-4B`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MistralForCausalLM` | Mistral, Mistral-Instruct | `mistralai/Mistral-7B-v0.1`, `mistralai/Mistral-7B-Instruct-v0.1`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MixtralForCausalLM` | Mixtral-8x7B, Mixtral-8x7B-Instruct | `mistralai/Mixtral-8x7B-v0.1`, `mistralai/Mixtral-8x7B-Instruct-v0.1`, `mistral-community/Mixtral-8x22B-v0.1`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MPTForCausalLM` | MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter | `mosaicml/mpt-7b`, `mosaicml/mpt-7b-storywriter`, `mosaicml/mpt-30b`, etc. | | ✅︎ | ✅︎ |
 | `NemotronForCausalLM` | Nemotron-3, Nemotron-4, Minitron | `nvidia/Minitron-8B-Base`, `mgoin/Nemotron-4-340B-Base-hf-FP8`, etc. | ✅︎ | ✅︎ | ✅︎ |
-| `NemotronHForCausalLM` | Nemotron-H | `nvidia/Nemotron-H-8B-Base-8K`, `nvidia/Nemotron-H-47B-Base-8K`, `nvidia/Nemotron-H-56B-Base-8K`, etc. | ✅︎ | ✅︎ | |
+| `NemotronHForCausalLM` | Nemotron-H | `nvidia/Nemotron-H-8B-Base-8K`, `nvidia/Nemotron-H-47B-Base-8K`, `nvidia/Nemotron-H-56B-Base-8K`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `OLMoForCausalLM` | OLMo | `allenai/OLMo-1B-hf`, `allenai/OLMo-7B-hf`, etc. | | ✅︎ | ✅︎ |
 | `OLMo2ForCausalLM` | OLMo2 | `allenai/OLMo-2-0425-1B`, etc. | | ✅︎ | ✅︎ |
 | `OLMoEForCausalLM` | OLMoE | `allenai/OLMoE-1B-7B-0924`, `allenai/OLMoE-1B-7B-0924-Instruct`, etc. | | ✅︎ | ✅︎ |
@@ -389,7 +389,7 @@ Specified using `--task generate`.
 | `XverseForCausalLM` | XVERSE | `xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `MiniMaxM1ForCausalLM` | MiniMax-Text | `MiniMaxAI/MiniMax-M1-40k`, `MiniMaxAI/MiniMax-M1-80k`, etc. | | | |
 | `MiniMaxText01ForCausalLM` | MiniMax-Text | `MiniMaxAI/MiniMax-Text-01`, etc. | | | |
-| `Zamba2ForCausalLM` | Zamba2 | `Zyphra/Zamba2-7B-instruct`, `Zyphra/Zamba2-2.7B-instruct`, `Zyphra/Zamba2-1.2B-instruct`, etc. | | | |
+| `Zamba2ForCausalLM` | Zamba2 | `Zyphra/Zamba2-7B-instruct`, `Zyphra/Zamba2-2.7B-instruct`, `Zyphra/Zamba2-1.2B-instruct`, etc. | | | ✅︎ |
 
 !!! note
     Currently, the ROCm version of vLLM supports Mistral and Mixtral only for context lengths up to 4096.
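
Every generative model in the table above loads through the same `--task generate` path. As a minimal sketch of that usage, assuming the `ibm-ai-platform/Bamba-9B` checkpoint from the `BambaForCausalLM` row (the prompt and sampling values are purely illustrative):

```python
# Minimal offline-inference sketch for a model from the table above.
# The checkpoint ID comes from the `BambaForCausalLM` row; temperature
# and max_tokens are illustrative, not recommendations.
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-ai-platform/Bamba-9B", task="generate")
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```

The server form is equivalent: `vllm serve ibm-ai-platform/Bamba-9B --task generate` exposes the same model over the OpenAI-compatible API.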