Commit 36019c3

graph : make FA compatible with MLA + add initial Metal kernels (llama/12953)
* graph : make mla compatible with FA
* metal : add exp FA kernels for DeepSeek models
ggml-ci
* llama : minor naming updates
ggml-ci
* ggml : disable FA for DS head sizes
* tests : add FA tests for MLA shapes
ggml-ci
1 parent 4e936e2 commit 36019c3

4 files changed, +99 -6 lines changed


ggml/src/ggml-cuda/ggml-cuda.cu

Lines changed: 4 additions & 0 deletions
@@ -3237,6 +3237,10 @@ static bool ggml_backend_cuda_device_supports_op(ggml_backend_dev_t dev, const g
             if (op->src[0]->ne[0] == 192) {
                 return false;
             }
+            if (op->src[0]->ne[0] == 576) {
+                // DeepSeek MLA
+                return false;
+            }
             if (op->src[0]->ne[3] != 1) {
                 return false;
             }
