# SYCL CUTLASS Changelog
+ ## [Cutlass 3.9.2 SYCL backend Version 0.3](https://github.com/codeplay/cutlass-fork/releases/tag/v3.9.2-0.3) (2025-06-30)
+ - Add support for GEMM FP8 (E5M2 and E4M3)
+ - Add example for GEMM FP8 with support for channel-wise and group-wise quantization
+ - Add support for Grouped GEMM FP8
+ - Improve performance for FP8 to FP16 conversion
+ - Add support for epilogue data conversion
+ - Add support for FP16 GEMM with FP16 accumulator
+ - Add support for BF16 GEMM with BF16 accumulator
+ - Add support for mixed dtype GEMM with support for tensor-wise, channel-wise and group-wise quantization
+ - Add example of mixed dtype BF16 + INT8 using channel-wise and group-wise quantization
+ - Add example of mixed dtype FP16 + INT8 using tensor-wise quantization
+ - Add example of mixed dtype FP16 + INT4 using channel-wise and group-wise quantization
+ - Add support for zero-point quantization in INT4 and INT8 data types
+ - Add support for Flash Attention prefill FP8 with and without KV cache
+ - Add support for Flash Attention decode FP8 with and without KV cache
+
## [Cutlass 3.9.2 SYCL backend Version 0.2](https://github.com/codeplay/cutlass-fork/releases/tag/v3.9.2-0.2) (2025-05-30)
- GEMM/StreamK/SplitK with support for FP16 data type
- Flash attention prefill with Paged KV cache with support for FP16 data type