[QST]Errors When Building and Running Blackwell SM120 Examples

Hi CUTLASS team,

I'm encountering errors when building and running the following SM120 examples from cutlass/examples on an SM120 GPU:

1. 79_blackwell_geforce_gemm
Build command:
```
nvcc -std=c++17 -gencode=arch=compute_120,code=sm_120 \
  -I /cutlass/include -I /cutlass/tools/util/include -I /cutlass/examples/common \
  -o /cutlass/examples/79_blackwell_geforce_gemm/79a_blackwell_geforce_nvfp4_bf16_gemm \
  /cutlass/examples/79_blackwell_geforce_gemm/79a_blackwell_geforce_nvfp4_bf16_gemm.cu -lcuda
```
Run command:

```
./examples/79_blackwell_geforce_gemm/79a_blackwell_geforce_nvfp4_bf16_gemm --m=2048 --n=2048 --k=2048
```
Runtime error:
```
ERROR : Arch conditional MMA instruction used without targeting appropriate compute capability. Aborting.
```

2. 80_blackwell_geforce_sparse_gemm
Build command:
```
nvcc -std=c++17 -gencode=arch=compute_120,code=sm_120 \
  -I /cutlass/include -I /cutlass/tools/util/include -I /cutlass/examples/common \
  -o /cutlass/examples/80_blackwell_geforce_sparse_gemm/80b_blackwell_geforce_nvfp4_nvfp4_sparse_gemm \
  /cutlass/examples/80_blackwell_geforce_sparse_gemm/80b_blackwell_geforce_nvfp4_nvfp4_sparse_gemm.cu -lcuda
```

Run command:
```
./examples/80_blackwell_geforce_sparse_gemm/80a_blackwell_geforce_mxfp8_bf16_sparse_gemm --m=1024 --n=1024 --k=1024
```

Errors (reported by ptxas):
```
ptxas ... error   : Feature '.kind::mxf8f6f4' not supported on .target 'sm_120'
ptxas ... error   : Feature '.block_scale' not supported on .target 'sm_120'
ptxas ... error   : Feature '.scale_vec::1X' not supported on .target 'sm_120'
ptxas ... error   : Instruction 'mma with block scale' not supported on .target 'sm_120'
ptxas fatal   : Ptx assembly aborted due to errors
```

Additional Information
I can run `./tools/profiler/cutlass_profiler ....` on an SM120 machine without errors, but these examples fail. My CUDA Toolkit version is 12.9 and my GPU is SM120. Both examples are supposed to target the Blackwell SM120 architecture.

Could you please advise if there are any additional requirements, known issues, or workarounds for running these examples on SM120? Is there a specific CUDA version, driver, or CUTLASS branch required for these kernels to work on Blackwell GPUs? 
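One thing I have not verified: both error messages mention architecture-conditional features, so the build may need the architecture-specific target `sm_120a` rather than plain `sm_120`. The sketch below only prints the adjusted nvcc command under that assumption; `build_cmd` is a hypothetical helper written for illustration, not part of CUTLASS, and the include paths are the ones from my commands above.

```shell
#!/usr/bin/env bash
# Sketch only: prints the nvcc command for a given arch suffix; it does not run nvcc.
# ASSUMPTION: the kernels need the architecture-specific target (sm_120a).
# build_cmd is a hypothetical helper, not part of CUTLASS.
build_cmd() {
  local arch="$1"   # arch suffix, e.g. 120 or 120a
  local src="$2"    # path to the example .cu file
  printf 'nvcc -std=c++17 -gencode=arch=compute_%s,code=sm_%s ' "$arch" "$arch"
  printf -- '-I /cutlass/include -I /cutlass/tools/util/include -I /cutlass/examples/common '
  # Output binary is the source path with the .cu extension removed.
  printf -- '-o %s %s -lcuda\n' "${src%.cu}" "$src"
}

build_cmd 120a /cutlass/examples/79_blackwell_geforce_gemm/79a_blackwell_geforce_nvfp4_bf16_gemm.cu
```

If the printed command looks right, it can be pasted into the shell or run with `eval "$(build_cmd 120a <path-to-example>.cu)"`. Whether this actually resolves either error is something I would appreciate confirmation on.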

Thank you!

[QST]Errors When Building and Running Blackwell SM120 Examples #2451

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[QST]Errors When Building and Running Blackwell SM120 Examples #2451

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions