[BUG] Example 77 backward would fail under compute-sanitizer? #2626

@henrylhtsang

Which component has the problem?

CUTLASS C++

Bug Report

Describe the bug
Hi. I noticed that I can hit a CUDA illegal memory access (IMA) in the backward pass when testing Example 77 on a Blackwell machine (B200 / GB200). I also seem to get errors from compute-sanitizer in racecheck and synccheck modes. Note that the default compute-sanitizer mode (memcheck) seems fine.

TL;DR of the errors:

Error: Race reported between Write access at void cutlass::device_kernel<cutlass::fmha::kernel::Sm100FmhaBwdKernelTmaWarpSpecialized...

========= Barrier error detected. Missing wait.
=========     at void cutlass::device_kernel<cutlass::fmha::kernel::Sm100FmhaBwdKernelTmaWarpSpecialized

Please let me know if my setup is correct, and if this is indeed an issue.

Steps/Code to reproduce bug

mkdir build
cd build
cmake .. -DCUTLASS_NVCC_ARCHS=100a
targets=(
    test_examples_77_blackwell_fmha_bwd_fp16_test_basic
    test_examples_77_blackwell_fmha_bwd_fp16_test_varlen
)

for test in "${targets[@]}"
do
    echo "Running $test" 2>&1 | tee -a ~/cutlass/output.log
    make "$test" 2>&1 | tee -a ~/cutlass/output.log
    echo "Running compute sanitizer $test" 2>&1 | tee -a ~/cutlass/output.log
    compute-sanitizer make "$test" 2>&1 | tee -a ~/cutlass/output.log
    echo "Running compute sanitizer memcheck $test" 2>&1 | tee -a ~/cutlass/output.log
    compute-sanitizer --tool=memcheck make "$test" 2>&1 | tee -a ~/cutlass/output.log
    echo "Running compute sanitizer racecheck $test" 2>&1 | tee -a ~/cutlass/output.log
    compute-sanitizer --tool=racecheck make "$test" 2>&1 | tee -a ~/cutlass/output.log
    echo "Running compute sanitizer synccheck $test" 2>&1 | tee -a ~/cutlass/output.log
    compute-sanitizer --tool=synccheck make "$test" 2>&1 | tee -a ~/cutlass/output.log
done

cmake logs: https://gist.github.com/henrylhtsang/f6c02c62008d6deaf36dc71ac2dfb0ac
compute sanitizer logs: https://gist.github.com/henrylhtsang/3938ab718a1c9c31863e858a6d0060ca
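
Since every step of the loop appends to the same log, a quick tally of the two error kinds quoted above may help when skimming it. The following is a minimal sketch, not part of the original report: the function name `summarize_sanitizer_log` is hypothetical, and it assumes the log contains the same message prefixes shown in the TL;DR ("Race reported between" and "Barrier error detected"). It counts matching lines only, so totals are aggregated across all tools and both test targets.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count sanitizer findings in a combined log file.
# Assumes the message prefixes match those quoted in this report.
summarize_sanitizer_log() {
    local log="$1"
    local races barriers

    if [ ! -r "$log" ]; then
        echo "cannot read log: $log" >&2
        return 1
    fi

    # grep -c prints 0 and exits non-zero when nothing matches,
    # so '|| true' keeps the function usable under 'set -e'.
    races=$(grep -c 'Race reported between' "$log" || true)
    barriers=$(grep -c 'Barrier error detected' "$log" || true)

    echo "racecheck hazards: $races"
    echo "synccheck barrier errors: $barriers"
}
```

With the reproduction script above, this would be called as `summarize_sanitizer_log ~/cutlass/output.log` once the loop has finished.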

Expected behavior
No errors from compute sanitizer.

Environment details (please complete the following information):

  • cutlass commit: 56f0718
  • B200
  • CUDA toolkit V12.8.93
  • Driver Version: 570.124.06

Additional context
cc @sryap
