
Industry (not IEEE) standard FTZ and DAZ support, micro-scaling floats #36

@GregoryMorse


Flush-To-Zero (FTZ), where subnormal outputs are flushed to zero, and Denormal-As-Zero (DAZ), where subnormal inputs to the various IEEE operations are treated as zero, are worthy of serious consideration for support, ultimately not just here but also in the sibling testfloat-3 project.

IEEE 754 does not define these modes, but especially in the AI era, where performance matters and the expense of long paths or hardware area for subnormal support is commonly saved, their practical relevance cannot go unnoticed. For float32 and beyond (float64, float80, float128), it is extremely common to drop such support. For float16 or micro-scaling formats, subnormal support can generally be expected.
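As a rough illustration of the requested semantics, here is a minimal Python sketch of a float32 multiply with optional DAZ and FTZ applied to raw bit patterns. It is not taken from SoftFloat; the names `f32_mul`, `flush_subnormal`, and the `daz`/`ftz` flags are hypothetical. It assumes that FTZ flushes any result whose rounded encoding is subnormal and that the sign is preserved on flush; real hardware differs in details such as tininess detection and exception flags, and overflowing products are out of scope here.

```python
import struct

F32_SIGN_MASK = 0x80000000
F32_EXP_MASK = 0x7F800000
F32_FRAC_MASK = 0x007FFFFF

def f32_to_bits(x):
    """Round a Python float (double) to float32 and return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_f32(bits):
    """Interpret a 32-bit pattern as a float32 value (held in a double)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def is_subnormal(bits):
    """Subnormal: exponent field all zeros, fraction field nonzero."""
    return (bits & F32_EXP_MASK) == 0 and (bits & F32_FRAC_MASK) != 0

def flush_subnormal(bits):
    """Replace a subnormal with a zero of the same sign (assumed policy)."""
    return bits & F32_SIGN_MASK if is_subnormal(bits) else bits

def f32_mul(a_bits, b_bits, daz=False, ftz=False):
    """Hypothetical float32 multiply with optional DAZ and FTZ."""
    if daz:
        # DAZ: subnormal inputs are treated as zero before the operation.
        a_bits = flush_subnormal(a_bits)
        b_bits = flush_subnormal(b_bits)
    # The product of two float32 values is exact in double precision,
    # so packing it back to float32 rounds only once.
    product = bits_to_f32(a_bits) * bits_to_f32(b_bits)
    result = f32_to_bits(product)
    if ftz:
        # FTZ: a subnormal result is replaced by a signed zero.
        result = flush_subnormal(result)
    return result
```

For example, multiplying the smallest normal (`0x00800000`) by 0.5 (`0x3F000000`) yields the subnormal `0x00400000` by default and `0x00000000` with `ftz=True`, while multiplying the subnormal `0x00400000` by 2.0 (`0x40000000`) yields `0x00800000` by default and `0x00000000` with `daz=True`.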

I also recommend implementing the micro-scaling (MX) standard formats, specified here:
https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf
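To give a sense of what supporting an MX format involves, here is a small Python sketch that decodes an MXFP4-style block: a shared E8M0 scale (a power of two with bias 127, where `0xFF` encodes NaN) applied to FP4 E2M1 elements. This reflects my reading of the spec and should be checked against it; the function names are hypothetical, and the sketch does not enforce the block size (32 elements in the spec, as far as I recall). Note that E2M1 has a subnormal value (0.5), which is why subnormal support matters for these formats.

```python
import math

E8M0_BIAS = 127
E8M0_NAN = 0xFF

def decode_e8m0(scale_bits):
    """Shared block scale: 2**(bits - 127); 0xFF is assumed to encode NaN."""
    if scale_bits == E8M0_NAN:
        return math.nan
    return math.ldexp(1.0, scale_bits - E8M0_BIAS)

def decode_e2m1(nibble):
    """FP4 E2M1 element: 1 sign bit, 2 exponent bits (bias 1), 1 mantissa bit."""
    sign = -1.0 if nibble & 0x8 else 1.0
    exp = (nibble >> 1) & 0x3
    man = nibble & 0x1
    if exp == 0:
        # Zero or the single subnormal magnitude 0.5.
        magnitude = man * 0.5
    else:
        # Normal values: 1, 1.5, 2, 3, 4, 6.
        magnitude = (1.0 + man * 0.5) * 2.0 ** (exp - 1)
    return sign * magnitude

def decode_mxfp4_block(scale_bits, nibbles):
    """Decode a block of E2M1 elements that share one E8M0 scale."""
    scale = decode_e8m0(scale_bits)
    return [scale * decode_e2m1(n) for n in nibbles]
```

With a scale byte of 128 (that is, a factor of 2), the elements `0x1`, `0x2`, and `0x7` decode to 1.0, 2.0, and 12.0; a scale byte of `0xFF` makes every element NaN.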

This would give the project new relevance given the direction of changes in industry. IEEE is important, but FTZ and DAZ are de facto standards at this point, and the OCP MX spec is now also considered an industry standard.
