fix compilation problems with MSVC+Cuda 12.9 #27522

chacha21 · 2025-07-08T10:29:46Z

Actually, when ENABLE_CUDA_FIRST_CLASS_LANGUAGE is enabled, the fix is not necessary. However, I have checked that the fix is harmless even when ENABLE_CUDA_FIRST_CLASS_LANGUAGE is enabled, so I propose to keep it simple for now and enable the fix regardless of the state of ENABLE_CUDA_FIRST_CLASS_LANGUAGE.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

Fix for opencv#27521.

cudawarped · 2025-07-08T11:50:17Z

cmake/OpenCVDetectCUDAUtils.cmake

+	if (NOT (CUDA_VERSION VERSION_LESS "11.2"))
+		set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcudafe --display_error_number --diag-suppress 1394,1388)
+	endif()
+	if(CUDA_VERSION GREATER "12.7")


Which compilation errors do you get with this version of CUDA without the --expt-relaxed-constexpr flag? I'm interested because I can build against CUDA 12.8 on Windows without this experimental flag.

I get a lot of errors like:

D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/warp/reduce.hpp(140): error : calling a constexpr __host__ function("make_tuple") from a __device__ function("smem_tuple") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
5>    return make_tuple((volatile T0*) t0, (volatile T1*) t1);
5>           ^
5>  detected during:
5>    instantiation of "cuda::std::__4::tuple<volatile T0 *, volatile T1 *> cv::cudev::smem_tuple(T0 *, T1 *) [with T0=float, T1=float]" at line 379 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5>    instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=uchar, work_type=float, BLOCK_SIZE=256]" at line 412 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5>    instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 421 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5>    instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 460 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5>    instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 206 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5>    instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 349 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5>    instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 137 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/expr/reduction.hpp
5>    instantiation of "void cv::cudev::FindMinMaxValExprBody<SrcPtr>::assignTo(cv::cudev::GpuMat_<T> &, cv::cuda::Stream &) const [with SrcPtr=cv::cudev::GpuMat_<uchar>, T=float]" at line 325 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5>    instantiation of "cv::cudev::GpuMat_<T> &cv::cudev::GpuMat_<T>::operator=(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 319 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5>    instantiation of "cv::cudev::GpuMat_<T>::GpuMat_(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 130 of D:\github\chacha21\opencv_contrib\modules\cudev\test\test_reduction.cu

You get this with CUDA 12.8 and the latest version of Visual Studio 2022? I only see those errors with CUDA 12.9 without the addition of -Xcompiler /Zc:preprocessor.

I'm building with Ninja; are you building directly in VS?

Update: I can't recreate this. Can you please provide your CMake arguments and the versions of VS, CUDA and CMake you are using? I can build OpenCV without the --expt-relaxed-constexpr flag in VS 17.14.7 with CUDA 12.8 and CMake 4.01, using both the Ninja and VS generators.

You are right, I can't reproduce it either with 12.8. I must have fooled myself among my different experiments.
I will update the PR.

@chacha21 Is that flag needed at all? In my testing only the -Xcompiler /Zc:preprocessor is required to fix the bug. See my proposed fix.

@chacha21 Yes, details below.

-- OpenCV modules:
-- To be built: aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape signal stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto
-- Disabled: world
-- Disabled by dependency: -
-- Unavailable: alphamat cannops cvv fastcv freetype hdf java julia matlab ovis python2 sfm viz
-- Applications: tests perf_tests examples apps
-- Documentation: doxygen python
-- Non-free algorithms: NO

The only addition I had to make was the fix in opencv/opencv_contrib#3968 to get cudev to build.

I confirm that on my build system I do need "--expt-relaxed-constexpr"; otherwise the first compilation errors start with opencv_test_cudev:

D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/warp/reduce.hpp(140): error : calling a constexpr __host__ function("make_tuple") from a __device__ function("smem_tuple") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
5>    return make_tuple((volatile T0*) t0, (volatile T1*) t1);
5>           ^
5>  detected during:
5>    instantiation of "cuda::std::__4::tuple<volatile T0 *, volatile T1 *> cv::cudev::smem_tuple(T0 *, T1 *) [with T0=float, T1=float]" at line 379 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5>    instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=uchar, work_type=float, BLOCK_SIZE=256]" at line 412 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5>    instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 421 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5>    instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 460 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5>    instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 206 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5>    instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 349 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5>    instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 137 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/expr/reduction.hpp
5>    instantiation of "void cv::cudev::FindMinMaxValExprBody<SrcPtr>::assignTo(cv::cudev::GpuMat_<T> &, cv::cuda::Stream &) const [with SrcPtr=cv::cudev::GpuMat_<uchar>, T=float]" at line 325 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5>    instantiation of "cv::cudev::GpuMat_<T> &cv::cudev::GpuMat_<T>::operator=(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 319 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5>    instantiation of "cv::cudev::GpuMat_<T>::GpuMat_(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 130 of D:\opencv_contrib-4.12.0\modules\cudev\test\test_reduction.cu
5>

CMakeCache.txt

@chacha21 Is that with the fix I linked to?

It is with

if(WIN32)
  if (NOT (CUDA_VERSION VERSION_LESS "11.2"))
    set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcudafe --display_error_number --diag-suppress 1394,1388)
  endif()
  if(CUDA_VERSION GREATER "12.8")
    #set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} --expt-relaxed-constexpr)
    set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler=/Zc:preprocessor)
  endif()
endif()

To fix the cudev error you need to apply the PR I linked to. Without it I get the errors you mentioned above even with the experimental flag (--expt-relaxed-constexpr).

cudawarped · 2025-07-12T07:09:21Z

cmake/OpenCVDetectCUDAUtils.cmake

+	endif()
+	if(CUDA_VERSION GREATER "12.8")
+		set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} --expt-relaxed-constexpr)
+		set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler /Zc:preprocessor)


For consistency with the rest of the file I would use -Xcompiler=/Zc:preprocessor
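
For illustration, a minimal sketch of how the reviewed block might read with that spelling applied. It is an assumption about the final shape, not the merged code: the enclosing if(WIN32) is taken from the snippet quoted earlier in the thread, and VERSION_GREATER is swapped in for the PR's GREATER on the assumption that a component-wise version comparison is intended (plain GREATER compares the two values as real numbers).

```cmake
# Sketch only, not the merged change.
if(WIN32)
  # VERSION_GREATER is an assumed substitution for the PR's GREATER.
  if(CUDA_VERSION VERSION_GREATER "12.8")
    # Whether --expt-relaxed-constexpr is also required is still being
    # discussed in this thread, so it is left out of this sketch.
    set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler=/Zc:preprocessor)
  endif()
endif()
```

The -Xcompiler=<option> form passes the option to the host compiler as a single token, which avoids relying on the two list elements staying adjacent in CUDA_NVCC_FLAGS.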

asmorkalov · 2025-07-16T07:12:23Z

@chacha21 Friendly reminder. Let's drive it to conclusion.

cudawarped · 2025-07-17T13:21:24Z

Here is the file that can be of interest to detect if I did something wrong
test_reduction.cu.cache.zip

Where is this file located?

Can you provide your CMake command line arguments, please? I'm having difficulty recreating the issue; I've tried both Visual Studio and Ninja.

cudawarped reviewed Jul 8, 2025

asmorkalov added category: build/install category: gpu/cuda (contrib) OpenCV 4.0+: moved to opencv_contrib labels Jul 8, 2025

asmorkalov added this to the 4.13.0 milestone Jul 8, 2025

my mistake : --expt-relaxed-constexpr is not needed for 12.8

c854065

cudawarped mentioned this pull request Jul 12, 2025

cudev: add -Xcompiler=/Zc:preprocessor to fix bug in CUDA 12.9 opencv/opencv_contrib#3968

cudawarped reviewed Jul 12, 2025


chacha21 commented Jul 8, 2025 (edited)
cudawarped Jul 8, 2025 (edited)
chacha21 Jul 8, 2025
cudawarped Jul 8, 2025 (edited)
chacha21 Jul 8, 2025
cudawarped Jul 9, 2025
cudawarped Jul 12, 2025
chacha21 Jul 15, 2025
cudawarped Jul 15, 2025 (edited)
chacha21 Jul 15, 2025
cudawarped Jul 15, 2025 (edited)
cudawarped Jul 12, 2025
asmorkalov commented Jul 16, 2025
cudawarped commented Jul 17, 2025
