Commit cfa2b6f

rocm: use enforce-eager to avoid OOM errors
1 parent 51867fe commit cfa2b6f

11 files changed, 13 additions and 11 deletions

Qwen/Qwen2.5-7B-Instruct/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.8
+enforce-eager: true

RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8-dynamic/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.6
+enforce-eager: true

RedHatAI/Mistral-Small-24B-Instruct-2501-FP8-Dynamic/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.8
+enforce-eager: true

RedHatAI/Mistral-Small-24B-Instruct-2501-quantized.w8a8/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.8
+enforce-eager: true

RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-FP8-dynamic/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.6
+enforce-eager: true

RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w8a8/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.6
+enforce-eager: true

RedHatAI/Qwen2.5-7B-Instruct-FP8-dynamic/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.6
+enforce-eager: true

RedHatAI/phi-4-FP8-dynamic/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.6
+enforce-eager: true

ibm-granite/granite-3.1-8b-instruct/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.6
+enforce-eager: true

meta-llama/Llama-3.1-8B-Instruct/accuracy/server-rocm.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ trust-remote-code: true
 tensor-parallel-size: 1
 max-model-len: 16384
 # override
-gpu_memory_utilization: 0.6
+enforce-eager: true

Lines changed: 3 additions & 1 deletion
@@ -1,5 +1,7 @@
 # https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
-model: "mistralai/Mixtral-8x7B-Instruct-v0.1"
+model: 'mistralai/Mixtral-8x7B-Instruct-v0.1'
 trust-remote-code: true
 tensor-parallel-size: 2
 max-model-len: 16384
+# override
+enforce-eager: true
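
Note: the commit does not show what consumes these server-rocm.yml files. As a rough sketch, assuming each key maps one-to-one onto a vLLM-style command-line flag (boolean keys become bare flags, underscores become hyphens), the Mixtral config after this change could be rendered as follows. The helper names `parse_flat_yaml` and `to_cli_args` are hypothetical and exist only for this illustration; the parser handles flat `key: value` lines only and is not a general YAML parser.

```python
def parse_flat_yaml(text):
    """Parse flat `key: value` lines, skipping blanks and # comments.

    Illustrative only: no nesting, lists, or multi-line values.
    """
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        # Drop surrounding single or double quotes, as in the `model` line.
        config[key.strip()] = value.strip().strip("'\"")
    return config


def to_cli_args(config):
    """Render a flat config as flags (assumed one-to-one key/flag mapping)."""
    args = []
    for key, value in config.items():
        flag = "--" + key.replace("_", "-")
        if value.lower() == "true":
            args.append(flag)          # boolean switch, e.g. --enforce-eager
        elif value.lower() == "false":
            continue                   # disabled switch: emit nothing
        else:
            args.extend([flag, value])
    return args


# Contents of the Mixtral config as it stands after this commit.
SERVER_ROCM_YML = """\
# https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
model: 'mistralai/Mixtral-8x7B-Instruct-v0.1'
trust-remote-code: true
tensor-parallel-size: 2
max-model-len: 16384
# override
enforce-eager: true
"""

args = to_cli_args(parse_flat_yaml(SERVER_ROCM_YML))
print(" ".join(args))
```

In vLLM, `enforce-eager` keeps the model in eager-mode PyTorch and skips graph capture, which generally lowers memory use at some cost in throughput. That fits the commit's stated aim of avoiding OOM errors without tuning `gpu_memory_utilization` per model.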
