June release changelog (#451)

aacostadiaz · mehdi-goli · web-flow · commit 467a2bbec3fd · 2025-06-30T22:06:40.000+01:00
This PR updates the changelog to reflect the changes and new features
included in the June release.

---------

Co-authored-by: Mehdi Goli <mehdi.goli@codeplay.com>
diff --git a/CHANGELOG-SYCL.md b/CHANGELOG-SYCL.md
@@ -1,5 +1,21 @@
 # SYCL CUTLASS Changelog
 
+## [Cutlass 3.9.2 SYCL backend Version 0.3](https://github.com/codeplay/cutlass-fork/releases/tag/v3.9.2-0.3) (2025-06-30)
+- Add support for GEMM FP8 (E5M2 and E4M3)
+- Add example for GEMM FP8 with support for channel-wise and group-wise quantization
+- Add support for Grouped GEMM FP8
+- Improve performance for FP8 to FP16 conversion
+- Add support for epilogue data conversion
+- Add support for FP16 GEMM with FP16 accumulator
+- Add support for BF16 GEMM with BF16 accumulator
+- Add support for mixed dtype GEMM with tensor-wise, channel-wise and group-wise quantization
+- Add example of mixed dtype BF16 + INT8 using channel-wise and group-wise quantization
+- Add example of mixed dtype FP16 + INT8 using tensor-wise quantization
+- Add example of mixed dtype FP16 + INT4 using channel-wise and group-wise quantization
+- Add support for zero-point quantization in INT4 and INT8 data types
+- Add support for Flash Attention prefill FP8 with and without KV cache
+- Add support for Flash Attention decode FP8 with and without KV cache
+
 ## [Cutlass 3.9.2 SYCL backend Version 0.2](https://github.com/codeplay/cutlass-fork/releases/tag/v3.9.2-0.2) (2025-05-30)
 - GEMM/StreamK/SplitK with support for FP16 data type
 - Flash attention prefill with Paged KV cache with support for FP16 data type