Skip to content

Conversation

yhmtsai
Copy link
Member

@yhmtsai yhmtsai commented Jan 5, 2023

I face some type issue when I develop the half precision computation in ginkgo. I extract some fix here because the current way I tried in half precision computation may not be workable in the end.

There are two main fix in this PR:

  1. the tuple type mismatch of indextype and valuetype
  2. passing the remove_complex<ValueType> with as_device_type because we will use __half from vendor library not gko::half

@yhmtsai yhmtsai self-assigned this Jan 5, 2023
@ginkgo-bot ginkgo-bot added mod:cuda This is related to the CUDA module. mod:hip This is related to the HIP module. type:factorization This is related to the Factorizations type:matrix-format This is related to the Matrix formats type:multigrid This is related to multigrid labels Jan 5, 2023
@yhmtsai
Copy link
Member Author

yhmtsai commented Jan 7, 2023

rebase!

Copy link
Member

@thoasm thoasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
I would like to see a PR description change to detail why you now need to change the type between devices for remove_complex<T> floating point types.

old_row_ptrs, as_cuda_type(old_vals), num_rows, threshold,
new_row_ptrs, lower);
old_row_ptrs, as_cuda_type(old_vals), num_rows,
as_cuda_type(threshold), new_row_ptrs, lower);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need that? As far as I can see, gko::half should be the same on CPU vs. GPU.
Are you changing the type so that gko::half is a different type on CPU and GPU?
Looks like that is your plan in #1257.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will use __half on the GPU but gko::half on CPU.

@yhmtsai yhmtsai added the 1:ST:ready-for-review This PR is ready for review label Feb 8, 2023
@yhmtsai yhmtsai requested review from a team and thoasm February 8, 2023 10:34
Copy link
Member

@upsj upsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@pratikvn pratikvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few extraneous as_device_type in the following kernels:

  1. csr_kernels.cu: 302, 304, 315, 317 and maybe a few other places in this file
  2. cb_gmres_kernels.cu: 351, 414

Probably the same in the hip kernels as well.

I think these places below are also missing a as_device_type ?:

  1. idr_kernels.cu: 102

Does cublas, curand and cusparse have half precision support ? We dont seem to currently use as_device_type while passing the pointers to those wrappers.

Otherwise LGTM!

@yhmtsai
Copy link
Member Author

yhmtsai commented Feb 8, 2023

curandGenerateNormal from cuda does not support half.
Some of cublas and cusparse might have half support. it will use as_culib_type

Copy link
Member

@thoasm thoasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@yhmtsai yhmtsai added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels Feb 9, 2023
@yhmtsai
Copy link
Member Author

yhmtsai commented Feb 9, 2023

@pratikvn for idr curand, I do not convert it to device now because we only use the float double and generate 2x for complex (currently based on std::complex)

@yhmtsai yhmtsai changed the title fix some type issue Fix the wrong type and pass real-number value with device_type to devices Feb 9, 2023
Co-authored-by: Pratik Nayak <pratikvn@protonmail.com>
@yhmtsai yhmtsai merged commit 19b7249 into develop Feb 9, 2023
@yhmtsai yhmtsai deleted the fixtype branch February 9, 2023 22:46
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
10.7% 10.7% Duplication

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

1:ST:ready-to-merge This PR is ready to merge. mod:cuda This is related to the CUDA module. mod:hip This is related to the HIP module. type:factorization This is related to the Factorizations type:matrix-format This is related to the Matrix formats type:multigrid This is related to multigrid

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants