fix compilation problems with MSVC+Cuda 12.9 #27522
base: 4.x
Fix for opencv#27521. Actually, when ENABLE_CUDA_FIRST_CLASS_LANGUAGE is enabled, the fix is not necessary. However, I have checked that the fix is harmless even in that case, so I propose to keep it simple for now and enable the fix whatever the state of ENABLE_CUDA_FIRST_CLASS_LANGUAGE.
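The trade-off in the description can be sketched roughly as follows. This is only an illustration, assuming the flag and version threshold that come up later in the review (`-Xcompiler=/Zc:preprocessor`, CUDA newer than 12.8); it is not the merged code, and the gated variant is shown purely for contrast.

```cmake
# Sketch only: mirrors the flag discussed in this review, not the merged change.
if(WIN32 AND CUDA_VERSION VERSION_GREATER "12.8")
  # Unconditional variant (what the description proposes): add the flag
  # whatever the state of ENABLE_CUDA_FIRST_CLASS_LANGUAGE.
  set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler=/Zc:preprocessor)

  # Gated alternative (NOT what the description proposes), kept as a comment:
  # if(NOT ENABLE_CUDA_FIRST_CLASS_LANGUAGE)
  #   set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler=/Zc:preprocessor)
  # endif()
endif()
```

The unconditional form has one fewer branch to maintain, which matches the stated wish to "keep it simple for now".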
cmake/OpenCVDetectCUDAUtils.cmake
if (NOT (CUDA_VERSION VERSION_LESS "11.2"))
  set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcudafe --display_error_number --diag-suppress 1394,1388)
endif()
if(CUDA_VERSION GREATER "12.7")
Which compilation errors do you get with this version of CUDA without the --expt-relaxed-constexpr flag? I'm interested because I can build against CUDA 12.8 on Windows without this experimental flag.
I get a lot of errors like:
D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/warp/reduce.hpp(140): error : calling a constexpr __host__ function("make_tuple") from a __device__ function("smem_tuple") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
5> return make_tuple((volatile T0*) t0, (volatile T1*) t1);
5> ^
5> detected during:
5> instantiation of "cuda::std::__4::tuple<volatile T0 *, volatile T1 *> cv::cudev::smem_tuple(T0 *, T1 *) [with T0=float, T1=float]" at line 379 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=uchar, work_type=float, BLOCK_SIZE=256]" at line 412 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 421 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 460 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 206 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5> instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 349 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5> instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 137 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev/expr/reduction.hpp
5> instantiation of "void cv::cudev::FindMinMaxValExprBody<SrcPtr>::assignTo(cv::cudev::GpuMat_<T> &, cv::cuda::Stream &) const [with SrcPtr=cv::cudev::GpuMat_<uchar>, T=float]" at line 325 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5> instantiation of "cv::cudev::GpuMat_<T> &cv::cudev::GpuMat_<T>::operator=(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 319 of D:\github\chacha21\opencv_contrib\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5> instantiation of "cv::cudev::GpuMat_<T>::GpuMat_(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 130 of D:\github\chacha21\opencv_contrib\modules\cudev\test\test_reduction.cu
You get this with CUDA 12.8 and the latest version of Visual Studio 2022? I only see those errors with CUDA 12.9 without the addition of -Xcompiler /Zc:preprocessor.
I'm building with Ninja; are you building directly in VS?
Update: I can't recreate this. Can you provide your CMake arguments and the versions of VS, CUDA and CMake you are using, please? I can build OpenCV without the --expt-relaxed-constexpr flag in VS 17.14.7 with CUDA 12.8 and CMake 4.01, using both the Ninja and VS generators.
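One way to share a reproducible configuration, as requested above, is a CMake initial-cache script passed with `cmake -C`. The following is a minimal sketch, not the configuration either participant used: the file name and the contrib path are hypothetical, and the option values are assumptions chosen to resemble the module summary quoted later in the thread.

```cmake
# repro-cache.cmake (hypothetical file name). Possible usage:
#   cmake -C repro-cache.cmake -G Ninja -S <opencv-source> -B <build-dir>
# Every value below is an assumption for illustration.

# Enable the CUDA modules and point at the contrib checkout (hypothetical path).
set(WITH_CUDA ON CACHE BOOL "Build with CUDA support")
set(OPENCV_EXTRA_MODULES_PATH "D:/opencv_contrib/modules" CACHE PATH "Path to opencv_contrib modules")

# The option whose state the PR description discusses.
set(ENABLE_CUDA_FIRST_CLASS_LANGUAGE OFF CACHE BOOL "Use CMake first-class CUDA language support")

# Build the tests, since the reported errors come from opencv_test_cudev.
set(BUILD_TESTS ON CACHE BOOL "Build accuracy tests")
set(BUILD_opencv_world OFF CACHE BOOL "Do not build the combined world module")
```

Keeping the settings in one file makes it easier to run the same configuration with both the Ninja and the Visual Studio generators and compare the results.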
You are right, I can't reproduce it either with 12.8. I must have fooled myself among my different experiments.
I will update the PR.
@chacha21 Is that flag needed at all? In my testing only the -Xcompiler /Zc:preprocessor flag is required to fix the bug. See my proposed fix.
@chacha21 Yes, details below.
-- OpenCV modules:
-- To be built: aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape signal stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto
-- Disabled: world
-- Disabled by dependency: -
-- Unavailable: alphamat cannops cvv fastcv freetype hdf java julia matlab ovis python2 sfm viz
-- Applications: tests perf_tests examples apps
-- Documentation: doxygen python
-- Non-free algorithms: NO
The only addition I had to make was the fix in opencv/opencv_contrib#3968 to get cudev to build.
I confirm that on my build system I do need "--expt-relaxed-constexpr"; otherwise the first compilation errors start with opencv_test_cudev:
D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/warp/reduce.hpp(140): error : calling a constexpr __host__ function("make_tuple") from a __device__ function("smem_tuple") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
5> return make_tuple((volatile T0*) t0, (volatile T1*) t1);
5> ^
5> detected during:
5> instantiation of "cuda::std::__4::tuple<volatile T0 *, volatile T1 *> cv::cudev::smem_tuple(T0 *, T1 *) [with T0=float, T1=float]" at line 379 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, src_type, work_type>::reduceGrid<BLOCK_SIZE>(work_type *, int) [with src_type=uchar, work_type=float, BLOCK_SIZE=256]" at line 412 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,BLOCK_SIZE,PATCH_X,PATCH_Y,SrcPtr,ResType,MaskPtr>(SrcPtr, ResType *, MaskPtr, int, int) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, BLOCK_SIZE=256, PATCH_X=4, PATCH_Y=4, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 421 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::reduce<Reductor,Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Reductor=cv::cudev::grid_reduce_detail::MinMaxReductor<cv::cudev::grid_reduce_detail::both, uchar, float>, Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 460 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp
5> instantiation of "void cv::cudev::grid_reduce_detail::minMaxVal<Policy,SrcPtr,ResType,MaskPtr>(const SrcPtr &, ResType *, const MaskPtr &, int, int, cudaStream_t) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GlobPtr<uchar>, ResType=float, MaskPtr=cv::cudev::WithOutMask]" at line 206 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5> instantiation of "void cv::cudev::gridFindMinMaxVal_<Policy,SrcPtr,ResType>(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with Policy=cv::cudev::DefaultGlobReducePolicy, SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 349 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/grid/reduce.hpp
5> instantiation of "void cv::cudev::gridFindMinMaxVal(const SrcPtr &, cv::cudev::GpuMat_<ResType> &, cv::cuda::Stream &) [with SrcPtr=cv::cudev::GpuMat_<uchar>, ResType=float]" at line 137 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev/expr/reduction.hpp
5> instantiation of "void cv::cudev::FindMinMaxValExprBody<SrcPtr>::assignTo(cv::cudev::GpuMat_<T> &, cv::cuda::Stream &) const [with SrcPtr=cv::cudev::GpuMat_<uchar>, T=float]" at line 325 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5> instantiation of "cv::cudev::GpuMat_<T> &cv::cudev::GpuMat_<T>::operator=(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 319 of D:\opencv_contrib-4.12.0\modules\cudev\include\opencv2\cudev\ptr2d\detail/gpumat.hpp
5> instantiation of "cv::cudev::GpuMat_<T>::GpuMat_(const cv::cudev::Expr<Body> &) [with T=float, Body=cv::cudev::FindMinMaxValExprBody<cv::cudev::GpuMat_<uchar>>]" at line 130 of D:\opencv_contrib-4.12.0\modules\cudev\test\test_reduction.cu
5>
@chacha21 Is that with the fix I linked to?
It is with:
if(WIN32)
  if (NOT (CUDA_VERSION VERSION_LESS "11.2"))
    set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcudafe --display_error_number --diag-suppress 1394,1388)
  endif()
  if(CUDA_VERSION GREATER "12.8")
    #set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} --expt-relaxed-constexpr)
    set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler=/Zc:preprocessor)
  endif()
endif()
To fix the cudev error you need to apply the PR I linked to. Without it I get the errors you mentioned above even with the experimental flag (--expt-relaxed-constexpr).
endif()
if(CUDA_VERSION GREATER "12.8")
  set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} --expt-relaxed-constexpr)
  set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -Xcompiler /Zc:preprocessor)
For consistency with the rest of the file I would use -Xcompiler=/Zc:preprocessor
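A related detail in the hunk above is that the new condition uses `GREATER`, while the neighbouring check uses `VERSION_LESS`. In CMake, `GREATER` compares its operands as real numbers, whereas `VERSION_GREATER` compares version strings component by component. The script below is a small sketch of that difference; the file name is hypothetical, and "12.10" is a made-up version string used only to expose the divergence.

```cmake
# compare_versions.cmake (hypothetical file name); run with:
#   cmake -P compare_versions.cmake
cmake_minimum_required(VERSION 3.10)

# "12.10" is not a real CUDA release here; it is an assumed value for illustration.
foreach(ver "12.8" "12.9" "12.10" "13.0")
  # Numeric comparison: operands are parsed as real numbers.
  if(ver GREATER "12.8")
    set(numeric "true")
  else()
    set(numeric "false")
  endif()

  # Version comparison: major, then minor, compared as integers.
  if(ver VERSION_GREATER "12.8")
    set(by_component "true")
  else()
    set(by_component "false")
  endif()

  message(STATUS "${ver}: GREATER=${numeric} VERSION_GREATER=${by_component}")
endforeach()
```

Under `GREATER`, "12.10" is read as the number 12.1, so it is not considered newer than 12.8, while `VERSION_GREATER` treats minor version 10 as newer than 8. For the versions actually discussed in this review (12.8 and 12.9) both operators agree, so this is a robustness consideration rather than a bug in the proposed change.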
@chacha21 Friendly reminder. Let's drive it to conclusion.
Where is this file located? Can you provide your CMake command line arguments, please? I'm having difficulty recreating the issue; I've tried Visual Studio and Ninja.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.