fix compilation problems with MSVC+Cuda 12.9 #27522

cudawarped · 2025-07-08T11:50:17Z

Which compilation errors do you get with this version of CUDA without the --expt-relaxed-constexpr flag? I'm asking because I can build against CUDA 12.8 on Windows without this experimental flag.

I get a lot of errors like:

D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/warp/reduce.hpp(140): error : calling a constexpr __host__ function("make_tuple") from a __device__ function("smem_tuple") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
5> return make_tuple((volatile T0*) t0, (volatile T1*) t1);
5> ^
5> detected during:
5> instantiation of "cuda::std::__4::tuple<volatile T0 *, volatile T1 *> cv::cudev::smem_tuple(T0 *, T1 *) [with T0=float, T1=float]" at line 379 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=uchar, work_type=float, BLOCK_SIZE=256]" at line 412 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 421 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 460 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 206 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5> instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 349 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5> instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 137 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/expr/reduction.hpp
5> instantiation of "void cv::cudev::FindMinMaxValExprBody<SrcPtr>::assignTo(cv::cudev::GpuMat_<T> &, cv::cuda::Stream &) const [with SrcPtr=cv::cudev::GpuMat_<uchar>, T=float]" at line 325 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5> instantiation of "cv::cudev::GpuMat_<T> &cv::cudev::GpuMat_<T>::operator=(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 319 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5> instantiation of "cv::cudev::GpuMat_<T>::GpuMat_(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 130 of D:\github\chacha21\opencv_contrib\modules\cudev\test\test_reduction.cu

You get this with CUDA 12.8 and the latest version of Visual Studio 2022? I only see those errors with CUDA 12.9 without the addition of -Xcompiler /Zc:preprocessor.

I'm building with Ninja; are you building directly in VS?

Update: I can't recreate this. Can you provide your CMake arguments and the versions of VS, CUDA and CMake you are using, please? I can build OpenCV without the --expt-relaxed-constexpr flag in VS 17.14.7 with CUDA 12.8 and CMake 4.01, using both the Ninja and VS generators.

You are right, I can't reproduce it either with 12.8. I must have fooled myself among my different experiments.
I will update the PR.

@chacha21 Is that flag needed at all? In my testing, only -Xcompiler /Zc:preprocessor is required to fix the bug. See my proposed fix.

@chacha21 Yes, details below.

-- OpenCV modules:
-- To be built: aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape signal stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto
-- Disabled: world
-- Disabled by dependency: -
-- Unavailable: alphamat cannops cvv fastcv freetype hdf java julia matlab ovis python2 sfm viz
-- Applications: tests perf_tests examples apps
-- Documentation: doxygen python
-- Non-free algorithms: NO

The only addition I had to make was the fix in opencv/opencv_contrib#3968 to get cudev to build.

I confirm that on my build system I do need "--expt-relaxed-constexpr"; otherwise the first compilation errors start with opencv_test_cudev:

D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/warp/reduce.hpp(140): error : calling a constexpr __host__ function("make_tuple") from a __device__ function("smem_tuple") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
5> return make_tuple((volatile T0*) t0, (volatile T1*) t1);
5> ^
5> detected during:
5> instantiation of "cuda::std::__4::tuple<volatile T0 *, volatile T1 *> cv::cudev::smem_tuple(T0 *, T1 *) [with T0=float, T1=float]" at line 379 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=uchar, work_type=float, BLOCK_SIZE=256]" at line 412 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 421 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 460 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 206 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5> instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 349 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5> instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 137 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/expr/reduction.hpp
5> instantiation of "void cv::cudev::FindMinMaxValExprBody<SrcPtr>::assignTo(cv::cudev::GpuMat_<T> &, cv::cuda::Stream &) const [with SrcPtr=cv::cudev::GpuMat_<uchar>, T=float]" at line 325 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5> instantiation of "cv::cudev::GpuMat_<T> &cv::cudev::GpuMat_<T>::operator=(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 319 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5> instantiation of "cv::cudev::GpuMat_<T>::GpuMat_(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 130 of D:\opencv_contrib-4.12.0\modules\cudev\test\test_reduction.cu
5>

CMakeCache.txt

@chacha21 Is that with the fix I linked to?

It is with

    if(WIN32)
      if (NOT (CUDA_VERSION VERSION_LESS "11.2"))
        set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcudafe --display_error_number --diag-suppress 1394,1388)
      endif()
      if(CUDA_VERSION GREATER "12.8")
        #set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} --expt-relaxed-constexpr)
        set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler=/Zc:preprocessor)
      endif()
    endif()

To fix the cudev error you need to apply the PR I linked to. Without it I get the errors you mentioned above even with the experimental flag (--expt-relaxed-constexpr).

cudawarped · 2025-07-12T07:09:21Z

For consistency with the rest of the file I would use -Xcompiler=/Zc:preprocessor

@@ -388,8 +388,16 @@ macro(ocv_nvcc_flags)
         set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler=-fno-finite-math-only)
       endif()
-      if(WIN32 AND NOT (CUDA_VERSION VERSION_LESS "11.2"))
-        set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcudafe --display_error_number --diag-suppress 1394,1388)
+      if(WIN32)
+    	if (NOT (CUDA_VERSION VERSION_LESS "11.2"))
+    		set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcudafe --display_error_number --diag-suppress 1394,1388)
+    	endif()
+    	if(CUDA_VERSION GREATER "12.7")
+    		set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} --expt-relaxed-constexpr)
+    	endif()
+    	if(CUDA_VERSION GREATER "12.8")
+    		set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler /Zc:preprocessor)
+    	endif()
       endif()
       if(CMAKE_CROSSCOMPILING AND (ARM OR AARCH64))
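
A side note on the version checks in this hunk: CMake's `GREATER` compares its operands as real numbers, while `VERSION_GREATER` compares dotted versions component by component. The two agree for 12.8 versus 12.9, but they would differ for a two-digit minor version. A minimal sketch, runnable with `cmake -P`, where the file name and the `12.10` value are assumptions made for illustration and are not part of the PR:

```cmake
# compare_versions.cmake (hypothetical file name)
# Run with: cmake -P compare_versions.cmake
cmake_minimum_required(VERSION 3.10)

# Assumption: CUDA_VERSION has the "major.minor" form used in the hunk above.
# "12.10" is a hypothetical release number chosen only to show the difference.
set(CUDA_VERSION "12.10")

# GREATER parses both sides as real numbers, so "12.10" is read as 12.1.
if(CUDA_VERSION GREATER "12.8")
  message(STATUS "GREATER: ${CUDA_VERSION} is treated as newer than 12.8")
else()
  message(STATUS "GREATER: ${CUDA_VERSION} is treated as not newer than 12.8")
endif()

# VERSION_GREATER compares component-wise: 12 equals 12, then 10 exceeds 8.
if(CUDA_VERSION VERSION_GREATER "12.8")
  message(STATUS "VERSION_GREATER: ${CUDA_VERSION} is treated as newer than 12.8")
else()
  message(STATUS "VERSION_GREATER: ${CUDA_VERSION} is treated as not newer than 12.8")
endif()
```

Whether this matters for the PR depends on how future CUDA releases are numbered; the existing `VERSION_LESS "11.2"` check in the same block already uses the version-aware form.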
