@yhmtsai yhmtsai commented Apr 11, 2025

This PR adds bfloat16 precision, but for now users can select only one of bfloat16 or half in Ginkgo.
A follow-up PR will make it possible to enable both at the same time; it should mostly contain the convert functions/precision chain and some improvements to reduce the copy-pasted code.

This PR currently uses gko::float16 for either half or bfloat16. I think float16 is confusing to people because it is the same term used for half.
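
To make the difference between the two 16-bit formats concrete, here is a minimal sketch of how a bfloat16 value relates to a float. It relies only on the format definition (1 sign bit, 8 exponent bits, 7 mantissa bits, i.e. the upper half of an IEEE binary32 value). The helper names float_to_bf16_bits and bf16_bits_to_float are hypothetical and are not part of Ginkgo or of any vendor API; the rounding mode (round to nearest, ties to even) is an assumption chosen for the illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; not Ginkgo's implementation.

// Convert a float to the 16-bit bfloat16 pattern, rounding to nearest
// with ties to even. bfloat16 keeps the upper 16 bits of a binary32 value.
std::uint16_t float_to_bf16_bits(float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    // NaN: truncate and force a mantissa bit so the result stays a NaN.
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }
    // Adding 0x7fff plus the lowest kept bit implements ties-to-even.
    const std::uint32_t rounding_bias = 0x7fffu + ((bits >> 16) & 1u);
    return static_cast<std::uint16_t>((bits + rounding_bias) >> 16);
}

// Widen a bfloat16 pattern back to float; this direction is exact.
float bf16_bits_to_float(std::uint16_t bf16)
{
    const std::uint32_t bits = static_cast<std::uint32_t>(bf16) << 16;
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

With these helpers, 65536.0f survives a round trip unchanged, although it is above the largest finite half value (65504), while 1.00390625f (1 + 2^-8) comes back as 1.0f because only 7 mantissa bits are kept. The same range and a coarser precision than half is what makes a shared name like float16 easy to misread.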

Summary of bfloat16 support across vendors:

  • CUDA:
    • CUDA uses __nv_bfloat16.
    • Some PTX instructions require sm_80 (this PR does not add a guard for that). Normal operations require at least sm_80 or CUDA 12.2, similar to half.
    • PTX only uses b16 in .reg for bfloat16, unlike half precision, which uses f16.
  • HIP:
    • HIP has two bfloat16 formats, hip_bfloat16 and __hip_bfloat16. hip_bfloat16 has been supported since quite early but likely uses float internally for arithmetic operations. __hip_bfloat16 has more native operations on bfloat16. __hip_bfloat16 is supported from 5.6.0, but we need at least 6.2.0 to get enough of the operator overloads and conversions implemented. Before 5.4.0, it does not contain operator=(float).
    • Finding sqrt causes more trouble. HIP tries to use the system sqrt only inside a lambda function in the kernel (it works in __global__). I need to add __device__; I guess this limits the search space, because we only provide sqrt(bfloat16) on the device.
  • SYCL:
    • bfloat16 is in the experimental namespace.
    • std::numeric_limits either has no proper implementation for bfloat16 or falls back to some default implementation, because I do not get any compilation issue. This PR provides device_numeric_limits for now.
    • The unary operator - on an rvalue gives float before 2025.0.1, because it only accepted a non-const reference before that version.
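
The last SYCL point follows from ordinary C++ overload resolution and can be reproduced without SYCL. The sketch below uses two hypothetical stand-in types, bf16_like and bf16_fixed, which are not the SYCL class; it assumes, as described above, that the old unary minus took a non-const reference and that the type converts implicitly to float. An rvalue cannot bind to a non-const lvalue reference, so the compiler falls back to the conversion to float and the built-in minus.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for illustration; this is not the SYCL bfloat16 class.
struct bf16_like {
    float stored;
    explicit bf16_like(float v) : stored(v) {}
    // Implicit conversion to float (assumed for this sketch).
    operator float() const { return stored; }
    // Unary minus that only accepts a non-const lvalue reference.
    friend bf16_like operator-(bf16_like& v) { return bf16_like(-v.stored); }
};

// Same type, but the operator takes a const reference and binds to rvalues.
struct bf16_fixed {
    float stored;
    explicit bf16_fixed(float v) : stored(v) {}
    operator float() const { return stored; }
    friend bf16_fixed operator-(const bf16_fixed& v)
    {
        return bf16_fixed(-v.stored);
    }
};

// An lvalue picks the user-defined operator and keeps the type.
static_assert(
    std::is_same<decltype(-std::declval<bf16_like&>()), bf16_like>::value,
    "lvalue keeps the 16-bit type");
// An rvalue cannot bind to bf16_like&, so the result is a float.
static_assert(
    std::is_same<decltype(-std::declval<bf16_like>()), float>::value,
    "rvalue silently becomes float");
// With a const reference parameter the rvalue case keeps the type too.
static_assert(
    std::is_same<decltype(-std::declval<bf16_fixed>()), bf16_fixed>::value,
    "const reference fixes the rvalue case");
```

The values are still correct in the float case; only the result type changes, which matters for templated code that expects the negated value to keep the 16-bit type.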

@yhmtsai yhmtsai self-assigned this Apr 11, 2025
@ginkgo-bot ginkgo-bot added reg:build This is related to the build system. reg:testing This is related to testing. reg:documentation This is related to documentation. type:solver This is related to the solvers type:preconditioner This is related to the preconditioners type:matrix-format This is related to the Matrix formats type:factorization This is related to the Factorizations type:reordering This is related to the matrix(LinOp) reordering reg:helper-scripts This issue/PR is related to the helper scripts mainly concerned with development of Ginkgo. mod:all This touches all Ginkgo modules. labels Apr 11, 2025
@yhmtsai yhmtsai requested review from a team April 11, 2025 16:55
@yhmtsai yhmtsai added the 1:ST:ready-for-review This PR is ready for review label Apr 11, 2025
@MarcelKoch MarcelKoch left a comment

I mostly have smaller comments on this; the rest looks good.

@yhmtsai yhmtsai requested a review from MarcelKoch April 17, 2025 09:13
@yhmtsai yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels May 2, 2025
@MarcelKoch MarcelKoch added this to the Ginkgo 1.10.0 milestone May 6, 2025
@yhmtsai yhmtsai merged commit 93df224 into develop May 6, 2025
11 of 13 checks passed
@yhmtsai yhmtsai deleted the add_bfloat16 branch May 6, 2025 20:16
sonarqubecloud bot commented May 7, 2025


