Commit 1f6a41e

yeahdongcn authored and JohannesGaessler committed
musa: enable fp16 mma (all) and cublas on qy2 (ggml-org#13842)
* musa: enable fp16 mma (all) and cublas on qy2

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Update ggml/src/ggml-cuda/ggml-cuda.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* Address review comments

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Address review comments

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: disable MUL_MAT_ID (q2_k × f32) due to precision issues

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
1 parent 7beb268 commit 1f6a41e

1 file changed, 0 additions and 16 deletions

ggml/src/ggml-cuda/common.cuh

Lines changed: 0 additions & 16 deletions
@@ -262,22 +262,6 @@ static bool fp16_mma_hardware_available(const int cc) {
         (GGML_CUDA_CC_IS_MTHREADS(cc) && cc >= GGML_CUDA_CC_QY2);
 }
 
-static bool bf16_mma_hardware_available(const int cc) {
-    return (GGML_CUDA_CC_IS_NVIDIA(cc) && cc >= GGML_CUDA_CC_AMPERE) || GGML_CUDA_CC_IS_CDNA(cc) || cc >= GGML_CUDA_CC_RDNA3;
-}
-
-static bool fp32_mma_hardware_available(const int cc) {
-    return GGML_CUDA_CC_IS_CDNA(cc);
-}
-
-static bool bf16_mma_hardware_available(const int cc) {
-    return (GGML_CUDA_CC_IS_NVIDIA(cc) && cc >= GGML_CUDA_CC_AMPERE) || GGML_CUDA_CC_IS_CDNA(cc) || cc >= GGML_CUDA_CC_RDNA3;
-}
-
-static bool fp32_mma_hardware_available(const int cc) {
-    return GGML_CUDA_CC_IS_CDNA(cc);
-}
-
 // Volta technically had FP16 tensor cores but they work very differently compared to Turing and later.
 static bool new_mma_available(const int cc) {
     return GGML_CUDA_CC_IS_NVIDIA(cc) && ggml_cuda_highest_compiled_arch(cc) >= GGML_CUDA_CC_TURING;
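
The Moore Threads clause in the hunk context pairs a vendor check with a threshold (`GGML_CUDA_CC_IS_MTHREADS(cc) && cc >= GGML_CUDA_CC_QY2`). The following is a minimal sketch of why that pairing matters, assuming a scheme in which each vendor's capability codes occupy a separate numeric range. Every name and value in it (`CC_OFFSET_MTHREADS`, `CC_OFFSET_AMD`, `CC_QY1`, `CC_QY2`, `cc_is_mthreads`, `mthreads_fp16_mma_sketch`) is a placeholder invented for illustration and is not taken from `common.cuh`.

```cpp
// Illustrative sketch only: all constants are hypothetical placeholder values,
// not the definitions used by ggml.
constexpr int CC_AMPERE          = 800;        // assumed plain NVIDIA-style code
constexpr int CC_OFFSET_MTHREADS = 0x0100000;  // assumed start of the Moore Threads range
constexpr int CC_OFFSET_AMD      = 0x1000000;  // assumed start of the AMD range
constexpr int CC_QY1             = CC_OFFSET_MTHREADS + 0x210;
constexpr int CC_QY2             = CC_OFFSET_MTHREADS + 0x220;

// A device counts as Moore Threads only if its code lies inside that vendor's range.
constexpr bool cc_is_mthreads(int cc) {
    return cc >= CC_OFFSET_MTHREADS && cc < CC_OFFSET_AMD;
}

// Same shape as the clause in the diff context: vendor check first, then threshold.
constexpr bool mthreads_fp16_mma_sketch(int cc) {
    return cc_is_mthreads(cc) && cc >= CC_QY2;
}

// Compile-time checks of the sketch's behaviour.
static_assert(mthreads_fp16_mma_sketch(CC_QY2),     "QY2 meets the threshold");
static_assert(!mthreads_fp16_mma_sketch(CC_QY1),    "QY1 is below the threshold");
static_assert(!mthreads_fp16_mma_sketch(CC_AMPERE), "code outside the Moore Threads range");
// Without the vendor check, this larger code would pass `cc >= CC_QY2` by accident.
static_assert(!mthreads_fp16_mma_sketch(CC_OFFSET_AMD + 0x100), "other vendor's range is rejected");
```

Under these assumptions, a bare comparison such as `cc >= CC_QY2` would also accept any code from a higher-numbered vendor range, which is the case the last assertion covers.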
