Commit aea1eca

ggerganov authored and Minh141120 committed
metal : batch rows copy in a single threadgroup (ggml-org#14384)
* metal : batch rows copy in a single threadgroup

ggml-ci

* metal : handle some edge cases when threadgroup size is not a power of 2

ggml-ci
1 parent 63bae38 commit aea1eca

1 file changed: +1 -0 lines changed

ggml/src/ggml-metal/ggml-metal.m

Lines changed: 1 addition & 0 deletions
@@ -2571,6 +2571,7 @@ static bool ggml_metal_encode_node(
                     nth *= 2;
                 }
 
+                nth = MIN(nth, (int) pipeline.maxTotalThreadsPerThreadgroup);
                 nth = MIN(nth, ne00);
 
                 ggml_metal_kargs_sum_rows args = {
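
To illustrate why the added clamp matters, here is a minimal standalone sketch in plain C. The hunk only shows the tail of the loop (`nth *= 2; }`), so the starting value of 32 and the loop condition below are assumptions, and `pick_threadgroup_size` and `max_threads` are hypothetical names standing in for the surrounding code and for `pipeline.maxTotalThreadsPerThreadgroup`. When the pipeline limit is not a power of 2, doubling can overshoot it, and the first `MIN` brings the value back under the limit before the existing clamp to `ne00`.

```c
#include <assert.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper: mirrors the clamping order shown in the diff.
 * ne00        - row length (number of elements to reduce per row)
 * max_threads - stand-in for pipeline.maxTotalThreadsPerThreadgroup */
static int pick_threadgroup_size(int ne00, int max_threads) {
    assert(ne00 > 0 && max_threads > 0);

    /* Assumed loop shape: start at 32 and double while below both limits. */
    int nth = 32;
    while (nth < ne00 && nth < max_threads) {
        nth *= 2;
    }

    /* The line added by this commit: doubling may overshoot a limit
     * that is not a power of 2 (for example 512 -> 1024 with a limit of 896). */
    nth = MIN(nth, max_threads);

    /* Pre-existing clamp: never use more threads than row elements. */
    nth = MIN(nth, ne00);

    return nth;
}
```

Under these assumptions, `pick_threadgroup_size(4096, 896)` yields 896 rather than the 1024 the loop alone would produce, while `pick_threadgroup_size(4096, 1024)` is unchanged at 1024 and `pick_threadgroup_size(100, 1024)` is still limited to 100 by the `ne00` clamp.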
