[FEA] BFloat16x2 Atomics #1616

Currently, CUTLASS only implements a specialization of `atomic_add` for [`half2`](https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/functional.h#L613), but not for `nv_bfloat162`. As a result, [BlockStripedReduce](https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/block_striped.h#L241) is specialized for `half2` but not for `nv_bfloat162`.

Is there any reason not to provide a specialization for `nv_bfloat162`? It looks like a very simple change, but maybe I'm missing something. Thanks in advance for the help!
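
For concreteness, here is a minimal, standalone sketch of what such a specialization might look like. It is not CUTLASS code: `atomic_add_sketch` is a hypothetical stand-in for the `atomic_add` functor, and the sketch assumes the `atomicAdd(__nv_bfloat162*, __nv_bfloat162)` overload from `cuda_bf16.h`, which the CUDA documentation lists as supported on compute capability 8.x and higher. The no-op fallback on older architectures is also an assumption, chosen only to keep the sketch compilable everywhere.

```cuda
#include <cuda_bf16.h>
#include <cstdio>

// Hypothetical stand-in for the primary functor template; only the
// nv_bfloat162 specialization is sketched here.
template <typename T>
struct atomic_add_sketch;

// Sketch of a packed-bfloat16 specialization (assumption, not CUTLASS source).
template <>
struct atomic_add_sketch<nv_bfloat162> {
  __device__ void operator()(nv_bfloat162 *ptr, const nv_bfloat162 &data) {
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 800)
    // Packed bf16x2 atomic add provided by cuda_bf16.h.
    atomicAdd(ptr, data);
#else
    // Assumed fallback: do nothing on architectures without bf16x2 atomics.
    (void)ptr;
    (void)data;
#endif
  }
};

// Every thread atomically adds the pair (1, 2) into the same location.
__global__ void accumulate(nv_bfloat162 *out) {
  atomic_add_sketch<nv_bfloat162> add;
  add(out, __floats2bfloat162_rn(1.0f, 2.0f));
}

int main() {
  nv_bfloat162 *d_out = nullptr;
  cudaMalloc(&d_out, sizeof(nv_bfloat162));
  cudaMemset(d_out, 0, sizeof(nv_bfloat162));  // all-zero bits is (0, 0)

  accumulate<<<1, 64>>>(d_out);

  nv_bfloat162 h_out;
  cudaMemcpy(&h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
  printf("low = %f, high = %f\n",
         __bfloat162float(h_out.x), __bfloat162float(h_out.y));

  cudaFree(d_out);
  return 0;
}
```

Compiling with something like `nvcc -arch=sm_80` should exercise the atomic path. Whether CUTLASS would prefer the `atomicAdd` intrinsic or inline PTX, as the `half2` specialization uses, is exactly the kind of detail I may be missing.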
