Description
What is your question?
Hello!
I noticed this:
"Note: The NVIDIA Blackwell SM100 architecture used in the datacenter products has a different compute capability than the one underpinning NVIDIA Blackwell GeForce RTX 50 series GPUs. As a result, kernels compiled for Blackwell SM100 architecture with arch conditional features (using sm100a) are not compatible with RTX 50 series GPUs."
When building the CUTLASS examples, I tried both -DCUTLASS_NVCC_ARCHS="100a" and -DCUTLASS_NVCC_ARCHS="100".
With "100", examples such as 70_blackwell_gemm disappeared from the generated Makefile.
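To make the two settings concrete, here is a minimal sketch of how I understand the arch value to be formed: the compute capability digits (10.0 becomes "100"), plus an "a" suffix when arch conditional features are requested. The helper names `nvcc_arch` and `cmake_arch_flag` are hypothetical and only for illustration; they are not part of CUTLASS or CUDA.

```python
def nvcc_arch(major: int, minor: int, arch_conditional: bool = False) -> str:
    """Format a compute capability as an arch value such as "100" or "100a".

    Hypothetical helper for illustration only, not a CUTLASS or CUDA API.
    """
    if major < 1 or not 0 <= minor <= 9:
        raise ValueError(f"unexpected compute capability {major}.{minor}")
    # The "a" suffix selects the arch conditional feature set (e.g. sm_100a).
    suffix = "a" if arch_conditional else ""
    return f"{major}{minor}{suffix}"


def cmake_arch_flag(major: int, minor: int, arch_conditional: bool = False) -> str:
    """Build the CMake definition that is passed when configuring CUTLASS."""
    return f'-DCUTLASS_NVCC_ARCHS="{nvcc_arch(major, minor, arch_conditional)}"'


if __name__ == "__main__":
    # The two configurations tried above, for compute capability 10.0.
    print(cmake_arch_flag(10, 0, arch_conditional=True))
    print(cmake_arch_flag(10, 0, arch_conditional=False))
```

The `major` and `minor` values would come from the device itself, for example the `major` and `minor` fields reported by `cudaGetDeviceProperties`.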
Does this mean that non-datacenter sm_100 Blackwell GPUs do not have the new Tensor Core features? If so, do they fall back to Hopper? Can I use the Hopper Tensor Core examples to get maximum TFLOPS on sm_100 GPUs? Or does this mean CUTLASS currently supports only sm_100a Tensor Core operations?
Thank you so much!