fp16_performance_good on NVIDIA GeForce GTX 1660 Ti #5329

LostRuins · 2024-02-05T02:49:51Z

LostRuins
Feb 5, 2024
Collaborator

While troubleshooting some performance issues that popped up a while ago, we noticed this was added:

const bool fp16_performance_good = min_compute_capability >= CC_VOLTA;

The NVIDIA GeForce GTX 1660 Ti reports a compute capability of 7.5, but it apparently does not have any tensor cores and has absolutely abysmal fp16 performance. I wonder if there is any way to better differentiate when to use the fp16 path.

slaren · 2024-02-05T18:16:17Z

slaren
Feb 5, 2024
Maintainer

It seems that the 1660 Ti has some kind of hardware support for the WMMA instructions but no actual tensor cores, so I guess the feature is technically supported; the performance is just very bad. I don't think the CUDA API offers any way to query the availability of tensor cores other than checking the compute capability, so I don't see any immediate solution other than excluding this GPU specifically by name. In the future we may want to run some performance tests at startup to determine which features to use, but that's probably not going to happen any time soon.
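
For illustration, here is a rough sketch of what excluding a GPU by name could look like. It is not the project's actual fix. It assumes the compute capability is encoded as 100 * major + 10 * minor (so 7.5 becomes 750) and that CC_VOLTA is 700. The helper names encode_cc, k_slow_fp16_gpus and is_fp16_denylisted are hypothetical, and the check here is per device rather than over the minimum capability across devices as in the quoted line. It is written without CUDA headers so it compiles on its own; in a CUDA build the inputs would presumably come from the major, minor and name fields that cudaGetDeviceProperties fills in.

```cpp
#include <string_view>

// Assumption: compute capability is encoded as 100 * major + 10 * minor,
// and Volta (7.0) is the threshold used by the quoted check.
constexpr int CC_VOLTA = 700;

constexpr int encode_cc(int major, int minor) {
    return 100 * major + 10 * minor;
}

// Hypothetical deny-list: GPUs that report >= Volta but have poor fp16
// throughput. Only the card named in this thread is listed.
constexpr std::string_view k_slow_fp16_gpus[] = {
    "GTX 1660 Ti",
};

// Returns true if the device name contains any deny-listed substring.
constexpr bool is_fp16_denylisted(std::string_view device_name) {
    for (std::string_view entry : k_slow_fp16_gpus) {
        if (device_name.find(entry) != std::string_view::npos) {
            return true;
        }
    }
    return false;
}

// Compute capability check combined with the name-based exclusion.
constexpr bool fp16_performance_good(int compute_capability,
                                     std::string_view device_name) {
    return compute_capability >= CC_VOLTA && !is_fp16_denylisted(device_name);
}

// The capability check alone passes for the 1660 Ti (750 >= 700);
// the name check is what turns the fp16 path off.
static_assert(encode_cc(7, 5) >= CC_VOLTA);
static_assert(!fp16_performance_good(encode_cc(7, 5), "NVIDIA GeForce GTX 1660 Ti"));
```

A substring match keeps the list short but is fragile if the driver reports the name differently, which is part of why a startup benchmark would be the more robust long-term option.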
