[FEA] BFloat16x2 Atomics #1616

@HanGuo97

Description

Currently, CUTLASS implements a specialization of atomic_add for half2 but not for nv_bfloat162. As a result, BlockStripedReduce can be specialized for half2 but not for nv_bfloat162.

Is there any reason not to provide a specialization for nv_bfloat162? It looks like a very simple change, but maybe I'm missing something. Thanks in advance for the help!
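
For concreteness, here is a rough sketch of the kind of specialization I have in mind. It is standalone and does not use CUTLASS headers: `atomic_add_sketch`, `accumulate_kernel`, and `run_bf16x2_demo` are hypothetical names made up for illustration, not CUTLASS code. It assumes the `atomicAdd(__nv_bfloat162*, __nv_bfloat162)` overload from `cuda_bf16.h` is available, which as far as I know needs sm_80 or newer and a recent toolkit (it may be emulated rather than native on some architectures).

```cuda
#include <cuda_runtime.h>
#include <cuda_bf16.h>

// Hypothetical stand-in for a CUTLASS-style atomic_add functor (not library code).
template <typename T>
struct atomic_add_sketch {
  __device__ void operator()(T *ptr, const T &data) const {
    atomicAdd(ptr, data);
  }
};

// Sketch of a packed-bfloat16 specialization.
// Assumption: the bf16x2 atomicAdd overload exists only for sm_80 and newer.
template <>
struct atomic_add_sketch<__nv_bfloat162> {
  __device__ void operator()(__nv_bfloat162 *ptr, const __nv_bfloat162 &data) const {
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 800)
    // Atomicity is per 16-bit element, not for the 32-bit pair as a whole.
    atomicAdd(ptr, data);
#else
    // No-op on older architectures in this sketch.
    (void)ptr;
    (void)data;
#endif
  }
};

// Every thread adds the pair (1, 2) into the same packed location.
__global__ void accumulate_kernel(__nv_bfloat162 *out) {
  atomic_add_sketch<__nv_bfloat162> add;
  add(out, __floats2bfloat162_rn(1.0f, 2.0f));
}

// Host helper: launches 64 threads and reads back both halves as floats.
// Returns false if any CUDA call fails.
bool run_bf16x2_demo(float *lo, float *hi) {
  __nv_bfloat162 *d_out = nullptr;
  if (cudaMalloc(&d_out, sizeof(__nv_bfloat162)) != cudaSuccess) return false;

  // All-zero bits encode (+0, +0) in bfloat16.
  cudaError_t err = cudaMemset(d_out, 0, sizeof(__nv_bfloat162));
  if (err == cudaSuccess) {
    accumulate_kernel<<<1, 64>>>(d_out);
    err = cudaGetLastError();
  }

  __nv_bfloat162 h_out;
  if (err == cudaSuccess) {
    // cudaMemcpy synchronizes with the preceding kernel on the default stream.
    err = cudaMemcpy(&h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
  }
  cudaFree(d_out);
  if (err != cudaSuccess) return false;

  *lo = __bfloat162float(h_out.x);
  *hi = __bfloat162float(h_out.y);
  return true;
}
```

The thread count is kept small on purpose: bfloat16 has only 8 bits of significand precision, so larger running sums would start to round. Whether the real specialization should call `atomicAdd` or use inline PTX is something I have not checked.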
