Commit 92de1de

jiawenliu64 authored and facebook-github-bot committed
Support zero tensor in FP4 GEMM (#4460)
Summary:
Pull Request resolved: #4460
X-link: facebookresearch/FBGEMM#1520

As title. Handle corner cases when M, N, or K = 0 in E2E.

Reviewed By: jianyuh
Differential Revision: D77986266
fbshipit-source-id: 517ba131ba783320162d7955cebc0f1e4b1fc2be
1 parent e835e29 commit 92de1de

File tree

1 file changed, +7 -0 lines changed
  • fbgemm_gpu/experimental/gen_ai/src/quantize/cutlass_extensions

fbgemm_gpu/experimental/gen_ai/src/quantize/cutlass_extensions/f4f4bf16.cu

Lines changed: 7 additions & 0 deletions
@@ -34,6 +34,13 @@ at::Tensor dispatch_f4f4bf16_kernel(
       N % BLOCK_SIZE == 0 && K % BLOCK_SIZE == 0,
       "Weight dimensions N and K must be multiples of block size 16");
 
+  auto out_sizes = XQ.sizes().vec();
+  out_sizes.back() = N;
+  if (M == 0 || N == 0 || K == 0) {
+    // Use zeros instead of empty for special case where K=0.
+    return at::zeros(out_sizes, XQ.options().dtype(at::kBFloat16));
+  }
+
   // MXFP4
   if (use_mx) {
     if (M <= 128) {
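
The comment in the added lines says zeros are used rather than an empty tensor because of the K=0 case. The likely reason is that a GEMM with K=0 still has an M x N output, and every element is a sum over zero terms, so it must be exactly 0 rather than uninitialized memory. The sketch below illustrates this with plain `std::vector` instead of ATen. The helpers `gemm_out_sizes`, `numel`, and `reference_gemm` are hypothetical names made up for illustration and are not part of FBGEMM. The float row-major layout is also an assumption and does not model FP4 packing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not FBGEMM API): mirrors the diff's
// `out_sizes.back() = N` by replacing the trailing dimension with N.
std::vector<int64_t> gemm_out_sizes(std::vector<int64_t> xq_sizes, int64_t N) {
  assert(!xq_sizes.empty());
  xq_sizes.back() = N;
  return xq_sizes;
}

// Number of elements implied by a shape (product of all dimensions).
int64_t numel(const std::vector<int64_t>& sizes) {
  int64_t n = 1;
  for (int64_t s : sizes) {
    n *= s;
  }
  return n;
}

// Reference GEMM in float, assuming row-major x[M][K] and w[N][K]:
//   out[m][n] = sum over k of x[m][k] * w[n][k]
// With K == 0 the inner loop never runs, so every output stays 0.
std::vector<float> reference_gemm(
    const std::vector<float>& x,
    const std::vector<float>& w,
    int64_t M,
    int64_t N,
    int64_t K) {
  std::vector<float> out(static_cast<std::size_t>(M * N), 0.0f);
  for (int64_t m = 0; m < M; ++m) {
    for (int64_t n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (int64_t k = 0; k < K; ++k) {
        acc += x[m * K + k] * w[n * K + k];
      }
      out[m * N + n] = acc;
    }
  }
  return out;
}
```

When M or N is 0 the output has no elements, so zeros and empty are indistinguishable. Only K=0 with nonzero M and N yields a non-empty output whose contents matter, which is the case the in-code comment singles out.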
