Commit 7a956e3

update docs to show model len / context windows (#401)
* update docs to show model len / context windows
* make title clearer
* make title clearer pt2
1 parent b1023a7 commit 7a956e3

File tree

1 file changed: +26 −26 lines changed


docs/model_zoo.md

Lines changed: 26 additions & 26 deletions
@@ -2,32 +2,32 @@
 
 Scale hosts the following models in the LLM Engine Model Zoo:
 
-| Model Name | Inference APIs Available | Fine-tuning APIs Available | Inference Frameworks Available |
-| --------------------- | ------------------------ | -------------------------- | ------------------------------ |
-| `llama-7b` | ✅ | ✅ | deepspeed, text-generation-inference |
-| `llama-2-7b` | ✅ | ✅ | text-generation-inference, vllm |
-| `llama-2-7b-chat` | ✅ | | text-generation-inference, vllm |
-| `llama-2-13b` | ✅ | | text-generation-inference, vllm |
-| `llama-2-13b-chat` | ✅ | | text-generation-inference, vllm |
-| `llama-2-70b` | ✅ | ✅ | text-generation-inference, vllm |
-| `llama-2-70b-chat` | ✅ | | text-generation-inference, vllm |
-| `falcon-7b` | ✅ | | text-generation-inference, vllm |
-| `falcon-7b-instruct` | ✅ | | text-generation-inference, vllm |
-| `falcon-40b` | ✅ | | text-generation-inference, vllm |
-| `falcon-40b-instruct` | ✅ | | text-generation-inference, vllm |
-| `mpt-7b` | ✅ | | deepspeed, text-generation-inference, vllm |
-| `mpt-7b-instruct` | ✅ | ✅ | deepspeed, text-generation-inference, vllm |
-| `flan-t5-xxl` | ✅ | | deepspeed, text-generation-inference |
-| `mistral-7b` | ✅ | ✅ | vllm |
-| `mistral-7b-instruct` | ✅ | ✅ | vllm |
-| `codellama-7b` | ✅ | ✅ | text-generation-inference, vllm |
-| `codellama-7b-instruct` | ✅ | ✅ | text-generation-inference, vllm |
-| `codellama-13b` | ✅ | ✅ | text-generation-inference, vllm |
-| `codellama-13b-instruct` | ✅ | ✅ | text-generation-inference, vllm |
-| `codellama-34b` | ✅ | ✅ | text-generation-inference, vllm |
-| `codellama-34b-instruct` | ✅ | ✅ | text-generation-inference, vllm |
-| `zephyr-7b-alpha` | ✅ | | text-generation-inference, vllm |
-| `zephyr-7b-beta` | ✅ | | text-generation-inference, vllm |
+| Model Name | Inference APIs Available | Fine-tuning APIs Available | Inference Frameworks Available | Inference max total tokens (prompt + response) |
+| --------------------- | ------------------------ | -------------------------- | ------------------------------ | ---------------------------------------------- |
+| `llama-7b` | ✅ | ✅ | deepspeed, text-generation-inference | 2048 |
+| `llama-2-7b` | ✅ | ✅ | text-generation-inference, vllm | 4096 |
+| `llama-2-7b-chat` | ✅ | | text-generation-inference, vllm | 4096 |
+| `llama-2-13b` | ✅ | | text-generation-inference, vllm | 4096 |
+| `llama-2-13b-chat` | ✅ | | text-generation-inference, vllm | 4096 |
+| `llama-2-70b` | ✅ | ✅ | text-generation-inference, vllm | 4096 |
+| `llama-2-70b-chat` | ✅ | | text-generation-inference, vllm | 4096 |
+| `falcon-7b` | ✅ | | text-generation-inference, vllm | 2048 |
+| `falcon-7b-instruct` | ✅ | | text-generation-inference, vllm | 2048 |
+| `falcon-40b` | ✅ | | text-generation-inference, vllm | 2048 |
+| `falcon-40b-instruct` | ✅ | | text-generation-inference, vllm | 2048 |
+| `mpt-7b` | ✅ | | deepspeed, text-generation-inference, vllm | 2048 |
+| `mpt-7b-instruct` | ✅ | ✅ | deepspeed, text-generation-inference, vllm | 2048 |
+| `flan-t5-xxl` | ✅ | | deepspeed, text-generation-inference | 2048 |
+| `mistral-7b` | ✅ | ✅ | vllm | 8000 |
+| `mistral-7b-instruct` | ✅ | ✅ | vllm | 8000 |
+| `codellama-7b` | ✅ | ✅ | text-generation-inference, vllm | 16384 |
+| `codellama-7b-instruct` | ✅ | ✅ | text-generation-inference, vllm | 16384 |
+| `codellama-13b` | ✅ | ✅ | text-generation-inference, vllm | 16384 |
+| `codellama-13b-instruct` | ✅ | ✅ | text-generation-inference, vllm | 16384 |
+| `codellama-34b` | ✅ | ✅ | text-generation-inference, vllm | 16384 |
+| `codellama-34b-instruct` | ✅ | ✅ | text-generation-inference, vllm | 16384 |
+| `zephyr-7b-alpha` | ✅ | | text-generation-inference, vllm | 32768 |
+| `zephyr-7b-beta` | ✅ | | text-generation-inference, vllm | 32768 |
 
 ## Usage
 