Commit f743e6b

force use of V0 in serve tests
Signed-off-by: Linkun Chen <github@lkchen.net>
1 parent 5a204e3 commit f743e6b

4 files changed: +15 -0 lines changed

release/llm_tests/serve/configs/model_config/llama_3dot1_8b_lora.yaml

Lines changed: 4 additions & 0 deletions
@@ -4,6 +4,10 @@ model_loading_config:
 
 accelerator_type: A10G
 
+runtime_env:
+  env_vars:
+    VLLM_USE_V1: "0"
+
 engine_kwargs:
   max_model_len: 2048
   enable_lora: true

release/llm_tests/serve/configs/model_config/llama_3dot1_8b_quantized_tp1.yaml

Lines changed: 4 additions & 0 deletions
@@ -3,5 +3,9 @@ model_loading_config:
 
 accelerator_type: A10G
 
+runtime_env:
+  env_vars:
+    VLLM_USE_V1: "0"
+
 engine_kwargs:
   max_model_len: 8192

release/llm_tests/serve/configs/model_config/llama_3dot1_8b_tp2.yaml

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@ model_loading_config:
 
 accelerator_type: A10G
 
+runtime_env:
+  env_vars:
+    VLLM_USE_V1: "0"
+
 engine_kwargs:
   max_model_len: 8192
   tensor_parallel_size: 2

release/llm_tests/serve/configs/serve_llama_3dot2_1b_s3.yaml

Lines changed: 3 additions & 0 deletions
@@ -8,6 +8,9 @@ applications:
       accelerator_type: A10G
       engine_kwargs:
         max_model_len: 8192
+      runtime_env:
+        env_vars:
+          VLLM_USE_V1: "0"
   import_path: ray.serve.llm:build_openai_app
   name: llm-endpoint
   route_prefix: /
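
All four hunks add the same `runtime_env.env_vars` entry. A reviewer could confirm the setting on a parsed config with a check along these lines. This is only a sketch: the helper name `pins_vllm_v0` and the sample dictionaries are hypothetical, and the sample contains only the keys visible in the `llama_3dot1_8b_tp2.yaml` hunk, not the full file. It assumes the YAML has already been loaded into a plain dictionary.

```python
from typing import Any, Mapping


def pins_vllm_v0(llm_config: Mapping[str, Any]) -> bool:
    """Return True if the config sets VLLM_USE_V1 to the string "0".

    Hypothetical helper, not part of Ray or vLLM. Missing or null
    `runtime_env` / `env_vars` sections are treated as "not set".
    """
    runtime_env = llm_config.get("runtime_env") or {}
    env_vars = runtime_env.get("env_vars") or {}
    # Compare against the string "0": the diff quotes the value so that
    # YAML loads it as a string rather than the integer 0.
    return env_vars.get("VLLM_USE_V1") == "0"


# Hypothetical parsed form of the config after this commit,
# restricted to the keys shown in the hunk.
config_after = {
    "accelerator_type": "A10G",
    "runtime_env": {"env_vars": {"VLLM_USE_V1": "0"}},
    "engine_kwargs": {"max_model_len": 8192, "tensor_parallel_size": 2},
}

# The same config without the lines this commit adds.
config_before = {k: v for k, v in config_after.items() if k != "runtime_env"}

print(pins_vllm_v0(config_after))
print(pins_vllm_v0(config_before))
```

The check is deliberately strict about the value's type, so an unquoted `0` in the YAML (loaded as an integer) would be reported as not pinned.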
