[QST] Build for sm100 Blackwell GPUs #2072

@phantaurus

What is your question?
Hello!

I noticed this:
"Note: The NVIDIA Blackwell SM100 architecture used in the datacenter products has a different compute capability than the one underpinning NVIDIA Blackwell GeForce RTX 50 series GPUs. As a result, kernels compiled for Blackwell SM100 architecture with arch conditional features (using sm100a) are not compatible with RTX 50 series GPUs."
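The compatibility rule in that note can be sketched in a few lines. This is only an illustration, not part of CUTLASS or the CUDA toolkit: the helper name `cubin_runs_on` is hypothetical, and it assumes compute capability 10.0 for datacenter Blackwell and 12.0 for the GeForce RTX 50 series. It models only precompiled binaries (cubins), not PTX that is JIT-compiled at load time.

```python
import re

def cubin_runs_on(target: str, device_cc: tuple) -> bool:
    """Hypothetical helper: can a cubin built for `target` (e.g. "sm_100a")
    load on a device with compute capability `device_cc` = (major, minor)?"""
    m = re.fullmatch(r"sm_?(\d+)(\d)(a?)", target)
    if m is None:
        raise ValueError(f"unrecognized target: {target!r}")
    major, minor = int(m.group(1)), int(m.group(2))
    arch_specific = m.group(3) == "a"
    dev_major, dev_minor = device_cc
    if arch_specific:
        # Arch-conditional targets ("a" suffix) run only on that exact
        # compute capability.
        return (dev_major, dev_minor) == (major, minor)
    # Plain targets are binary-compatible within the same major version,
    # on devices with an equal or higher minor version.
    return dev_major == major and dev_minor >= minor

# Assumed compute capabilities (not stated in the quoted note).
DATACENTER_BLACKWELL = (10, 0)
RTX_50_SERIES = (12, 0)

print(cubin_runs_on("sm_100a", DATACENTER_BLACKWELL))  # True
print(cubin_runs_on("sm_100a", RTX_50_SERIES))         # False
```

Under these assumptions, a kernel compiled for sm100a loads only on a compute capability 10.0 device, which matches the incompatibility with RTX 50 series GPUs described in the note.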

When building the CUTLASS examples, I tried both -DCUTLASS_NVCC_ARCHS="100a" and -DCUTLASS_NVCC_ARCHS="100".
With "100", examples such as 70_blackwell_gemm disappeared from the generated Makefile.

Does this mean that non-datacenter sm_100 Blackwell GPUs do not have the new Tensor Core features? If so, do they fall back to Hopper? Can I use the Hopper Tensor Core examples to reach maximum TFLOPS on sm_100 GPUs? Or does this mean that CUTLASS currently supports only sm_100a Tensor Core operations?

Thank you so much!
