@bdevorem bdevorem commented Aug 1, 2025

resolves #4193

Updates:

  • Instead of using pragmas to suppress the deprecation warnings around the two library includes, I have used Paul's idea (roughly the upstream solution): check whether the warning occurs and, if it does and libstdc++11 exists, explicitly specify the library path. This avoids suppressing the warning and should keep external projects from affecting us like this again. I can also always specify the library path explicitly, if that is preferable.
  • There are two additional warnings, but they come from external projects: rocMLIR and the upstream libs of cget. These aren't really solvable on our side, but I have an issue open on rocMLIR right now. If I can suppress the cget warnings, I'll make an additional PR. Ethan solved the msgpack one. As for rocMLIR, I have added an ignore for deprecated declarations, and it is up to the rocMLIR project to handle the warning.

Original:
Two main warnings currently:

  • /workspace/AMDMIGraphX/src/include/migraphx/value.hpp:354:9: warning: 'switch' missing 'default' label [-Wswitch-default]
    • Some switch statements are missing a default case. However, once a default case is added, a new warning appears: "default label in switch which covers all enumeration values" [-Wcovered-switch-default]. We have -Weverything enabled for clang builds, which allows contradictory warnings. This is a known issue, and the usual recommendation is to use -Wall and -Wextra instead. However, disabling just one of the contradicting flags is narrower in scope and already has precedent in migraphx. This PR disables the -Wswitch-default warning in the same way.
  • /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_tempbuf.h:263:8: warning: 'get_temporary_buffer<migraphx::builtin::param>' is deprecated [-Wdeprecated-declarations]
    • We are using the GCC 12 standard library headers with C++17; they call an STL function, std::get_temporary_buffer, that is deprecated in C++17 and removed entirely in C++20. Until we upgrade gcc, the narrowest solution seems to be disabling the warning for specific #includes. This avoids disabling the warning for the entire project, which is probably a bad idea for this specific warning. Note that this is just a stopgap and should be removed when ROCm environments upgrade gcc.
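
To make the contradictory switch warnings concrete, here is a minimal sketch. The enum `kind` and the function `name_of` are hypothetical and are not taken from migraphx/value.hpp; they only show the shape of code that trips both diagnostics under -Weverything.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum for illustration only.
enum class kind
{
    integer,
    real,
    text
};

// Every enumerator has its own case, so there is no default label.
// Clang's -Wswitch-default reports the missing default; adding a default
// label instead triggers -Wcovered-switch-default. -Weverything enables both,
// so one of the two has to be turned off.
inline std::string name_of(kind k)
{
    switch(k)
    {
    case kind::integer: return "integer";
    case kind::real: return "real";
    case kind::text: return "text";
    }
    // Reached only if k holds a value outside the declared enumerators.
    return "unknown";
}
```

Keeping the switch without a default label preserves the compiler's check that a newly added enumerator gets its own case, which is one reason to prefer disabling -Wswitch-default over -Wcovered-switch-default.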

@bdevorem bdevorem requested a review from causten as a code owner August 1, 2025 07:02
@bdevorem bdevorem requested review from causten and Copilot and removed request for causten August 1, 2025 07:02
Copilot AI left a comment

Pull Request Overview

This PR addresses compiler warnings encountered when building with ROCm 7.0, specifically handling two types of warnings that prevent clean builds.

  • Disables contradictory switch-default warnings in Clang builds where -Weverything creates conflicting requirements
  • Suppresses deprecated declaration warnings for get_temporary_buffer in specific header includes until GCC is upgraded

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
cmake/EnableCompilerWarnings.cmake Adds -Wno-switch-default flag to disable contradictory switch warnings in Clang builds
src/include/migraphx/shape.hpp Wraps memory header include with pragma to suppress deprecated declarations warning
src/include/migraphx/algorithm.hpp Wraps algorithm header include with pragma to suppress deprecated declarations warning
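
The pragma wrapping described for shape.hpp and algorithm.hpp could look roughly like the sketch below. The exact pragmas used in the PR are not shown in this conversation, so this is an assumption; `sorted_copy` is a hypothetical helper that only exercises the wrapped header.

```cpp
#include <cassert>
#include <vector>

// Save the current diagnostic state, silence -Wdeprecated-declarations for
// the wrapped include only, then restore the saved state. Clang accepts the
// "GCC diagnostic" spelling as well as its own.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#include <algorithm>
#pragma GCC diagnostic pop

// Hypothetical helper: std::stable_sort is the algorithm whose libstdc++
// implementation reaches the deprecated function in the reported warning.
inline std::vector<int> sorted_copy(std::vector<int> v)
{
    std::stable_sort(v.begin(), v.end());
    return v;
}
```

This kind of suppression is tied to source locations, so it likely only helps if the wrapped include is the first inclusion of that header in the translation unit; otherwise the include guard makes the second inclusion a no-op.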

@pfultz2
Collaborator

pfultz2 commented Aug 1, 2025

/usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_tempbuf.h:263:8: warning: 'get_temporary_buffer<migraphx::builtin::param>' is deprecated [-Wdeprecated-declarations]

* we are using gcc 12 but c++17; gcc 12 uses a stl function that is deprecated in the newer c++, and even removed entirely in even newer c++ (20). Until we upgrade gcc, the narrowest solution seems to be disabling the warning for specific #includes. This lets us avoid disabling the warning for the entire project, which is probably a bad idea for this specific warning. Also note, this is just a stop gap. Should be removed when rocm environments upgrade gcc.

Is there a larger backtrace for this? I am curious why this gets instantiated with our migraphx::builtin::param class. Is this being used by another algorithm indirectly?

@bdevorem
Contributor Author

bdevorem commented Aug 1, 2025

Is there a larger backtrace for this? I am curious why this get instantiated with our migraphx::builtin::param class. Is this being used by another algorithm indirectly?

@pfultz2 Yeah, below is the trace for algorithm.hpp that I was seeing. That being said, this morning the trace is different. I'm using the same container and I un-rebased, and I am still not able to repro the original trace, so I'm moving this PR back to draft until I have that answered.

In file included from /src/AMDMIGraphX/src/module.cpp:25:
In file included from /src/AMDMIGraphX/src/include/migraphx/algorithm.hpp:27:
In file included from /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/algorithm:61:
In file included from /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_algo.h:61:
/usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_tempbuf.h:263:8: warning: 'get_temporary_buffer<migraphx::builtin::param>' is deprecated [-Wdeprecated-declarations]
  263 |                 std::get_temporary_buffer<value_type>(_M_original_len));
      |                      ^
/usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_algo.h:4996:15: note: in instantiation of member function 'std::_Temporary_buffer<__gnu_cxx::__normal_iterator<migraphx::builtin::param *, std::vector<migraphx::builtin::param>>, migraphx::builtin::param>::_Temporary_buffer' requested here
 4996 |       _TmpBuf __buf(__first, (__last - __first + 1) / 2);
      |               ^
/usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_algo.h:5070:23: note: in instantiation of function template specialization 'std::__stable_sort<__gnu_cxx::__normal_iterator<migraphx::builtin::param *, std::vector<migraphx::builtin::param>>, __gnu_cxx::__ops::_Iter_comp_iter<(lambda at /src/AMDMIGraphX/src/include/migraphx/functional.hpp:195:12)>>' requested here
 5070 |       _GLIBCXX_STD_A::__stable_sort(__first, __last,
      |                       ^
/src/AMDMIGraphX/src/module.cpp:594:10: note: in instantiation of function template specialization 'std::stable_sort<__gnu_cxx::__normal_iterator<migraphx::builtin::param *, std::vector<migraphx::builtin::param>>, (lambda at /src/AMDMIGraphX/src/include/migraphx/functional.hpp:195:12)>' requested here
  594 |     std::stable_sort(
      |          ^
/usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_tempbuf.h:99:5: note: 'get_temporary_buffer<migraphx::builtin::param>' has been explicitly marked deprecated here
   99 |     _GLIBCXX17_DEPRECATED
      |     ^
/usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/x86_64-linux-gnu/c++/12/bits/c++config.h:119:34: note: expanded from macro '_GLIBCXX17_DEPRECATED'
  119 | # define _GLIBCXX17_DEPRECATED [[__deprecated__]]
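
The trace bottoms out in a std::stable_sort call over a vector of a user-defined type. A minimal sketch of that shape follows; the `param` struct here is a hypothetical stand-in for migraphx::builtin::param (its fields are assumptions), and `sort_by_order` is an illustrative name.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for migraphx::builtin::param.
struct param
{
    std::string name;
    int order;
};

// std::stable_sort may allocate scratch space for merging. In the libstdc++
// shipped with GCC 12 that allocation goes through
// std::get_temporary_buffer<param>, which is the call the warning points at.
inline void sort_by_order(std::vector<param>& params)
{
    std::stable_sort(params.begin(), params.end(), [](const param& a, const param& b) {
        return a.order < b.order;
    });
}
```

The deprecated call lives inside the standard library, not in the calling code, which is why the element type shows up in the diagnostic only as a template argument.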

@bdevorem bdevorem marked this pull request as draft August 1, 2025 16:11
@pfultz2
Collaborator

pfultz2 commented Aug 1, 2025

Just to document some of my findings, which you probably saw too: this is a bug in clang (that hasn't been fixed) when using stable_sort with gcc 12:

llvm/llvm-project#76515

Which is caused by this commit in clang:

llvm/llvm-project@aafad2d

We could disable the warning globally but conditionally check for this case since this is only a problem with gcc 12, which is what was done in the ceph project: https://github.com/ceph/ceph/pull/62622/files

I don't think we need to do the push/pop stuff in cmake; we can just use check_cxx_source_compiles:

include(CheckCXXSourceCompiles)
# Promote the deprecation warning to an error so the probe fails to compile when the warning fires.
set(CMAKE_REQUIRED_FLAGS "-Werror=deprecated-declarations")
check_cxx_source_compiles("
#include <algorithm>
int main() { std::stable_sort((int*)0, (int*)0); }
" COMPILER_IGNORES_DEPRECATED_DECL_IN_SYSTEM_HEADERS)

@bdevorem
Contributor Author

bdevorem commented Aug 5, 2025

We could disable the warning globally but conditionally check for this case since this is only a problem with gcc 12, which is what was done in the ceph project: https://github.com/ceph/ceph/pull/62622/files

I dont think we need to do the push/pop stuff in cmake, we can just use the check_cxx_source_compiles:

set(CMAKE_REQUIRED_FLAGS "-Werror=deprecated-declarations")
check_cxx_source_compiles("
#include <algorithm>
int main() { std::stable_sort((int *)0, (int*)0); }
" COMPILER_IGNORES_DEPRECATED_DECL_IN_SYSTEM_HEADERS)

@pfultz2 I just added a commit that does essentially this, but instead of conditionally turning off the warning, it conditionally tells clang where the gcc lib dir is. What are your thoughts? I think turning off the warning entirely is pretty broad and might hide warnings that we would want to see. This way, it specifies exactly which lib to use if the warning occurs and libstdc++11 exists.

fyi not moving this from draft just yet because there are a ton of rocmlir warnings now. I think they might need to make some changes on their end but I need to make sure.

@codecov

codecov bot commented Aug 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #4192   +/-   ##
==========================================
  Coverage           ?   92.23%           
==========================================
  Files              ?      553           
  Lines              ?    25628           
  Branches           ?        0           
==========================================
  Hits               ?    23636           
  Misses             ?     1992           
  Partials           ?        0           

@bdevorem bdevorem marked this pull request as ready for review August 6, 2025 17:59
@bdevorem bdevorem force-pushed the bdevorem/compiler-warnings branch from d35165a to e5ecc7e Compare August 7, 2025 16:53
@migraphx-bot
Collaborator

Test Batch Rate new (c9ff52) Rate old (284037) Diff Compare
torchvision-resnet50 64 3,230.38 3,247.43 -0.52%
torchvision-resnet50_fp16 64 6,935.41 6,948.21 -0.18%
torchvision-densenet121 32 2,438.15 2,450.59 -0.51%
torchvision-densenet121_fp16 32 4,154.94 4,171.75 -0.40%
torchvision-inceptionv3 32 1,626.84 1,634.45 -0.47%
torchvision-inceptionv3_fp16 32 2,747.30 2,761.48 -0.51%
cadene-inceptionv4 16 767.09 771.24 -0.54%
cadene-resnext64x4 16 809.00 814.23 -0.64%
slim-mobilenet 64 7,423.47 7,457.62 -0.46%
slim-nasnetalarge 64 210.01 211.13 -0.53%
slim-resnet50v2 64 3,328.31 3,343.56 -0.46%
bert-mrpc-onnx 8 1,136.22 1,144.72 -0.74%
bert-mrpc-tf 1 440.01 442.76 -0.62%
pytorch-examples-wlang-gru 1 299.29 353.41 -15.32% 🔴
pytorch-examples-wlang-lstm 1 408.22 406.54 0.41%
torchvision-resnet50_1 1 768.20 762.51 0.75%
cadene-dpn92_1 1 390.22 389.54 0.18%
cadene-resnext101_1 1 386.59 393.49 -1.75%
onnx-taau-downsample 1 394.85 396.45 -0.41%
dlrm-criteoterabyte 1 33.67 33.77 -0.30%
dlrm-criteoterabyte_fp16 1 51.11 51.22 -0.21%
agentmodel 1 8,971.67 9,068.49 -1.07%
unet_fp16 2 58.91 59.14 -0.38%
resnet50v1_fp16 1 987.70 966.77 2.17%
resnet50v1_int8 1 1,041.32 1,044.17 -0.27%
bert_base_cased_fp16 64 1,099.66 1,107.24 -0.69%
bert_large_uncased_fp16 32 343.44 345.26 -0.53%
bert_large_fp16 1 197.37 196.54 0.42%
distilgpt2_fp16 16 2,103.52 2,117.52 -0.66%
yolov5s 1 569.20 575.45 -1.09%
tinyllama 1 43.77 43.97 -0.45%
vicuna-fastchat 1 45.02 45.35 -0.72%
whisper-tiny-encoder 1 415.60 417.21 -0.39%
whisper-tiny-decoder 1 409.27 399.93 2.34%
llama2_7b 1 19.11 19.15 -0.22%
qwen1.5-7b 1 23.42 23.53 -0.45%
phi3-3.8b 1 26.59 26.66 -0.29%
mask-rcnn 1 12.40 12.39 0.09%
llama3-8b 1 21.65 21.77 -0.52%
whisper-large-encoder 1 10.16 10.22 -0.53%
whisper-large-decoder 1 96.31 96.06 0.27%
mistral-7b 1 23.62 23.72 -0.44%
FLUX.1-schnell 1 740.81 745.73 -0.66%

This build is not recommended to merge 🔴

@migraphx-bot
Collaborator


     ✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

❌bert-mrpc-tf: ERROR - check error output

error: unknown warning option '-Wnrvo' [-Werror,-Wunknown-warning-option]

error: unknown warning option '-Wnrvo' [-Werror,-Wunknown-warning-option]

2025-08-08 05:05:45.833059: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1754647551.046720 168411 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 62951 MB memory: -> device: 0, name: AMD Instinct MI250X/MI250, pci bus id: 0000:32:00.0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1754647551.908417 168411 mlir_graph_optimization_pass.cc:401] MLIR V1 optimization pass is not enabled
2025-08-08 05:06:00.435415: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 05:06:00.435588: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 05:06:00.435633: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 05:06:00.435674: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 05:06:00.435703: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 05:06:00.435749: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 05:06:00.435792: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-08 05:06:00.435838: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
2025-08-08 05:06:00.436910: E tensorflow/compiler/mlir/tools/kernel_gen/tf_framework_c_interface.cc:228] INTERNAL: Generating device code failed.
2025-08-08 05:06:00.438002: W tensorflow/core/framework/op_kernel.cc:1829] UNKNOWN: JIT compilation failed.
2025-08-08 05:06:00.438023: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
2025-08-08 05:06:00.438036: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
2025-08-08 05:06:00.438053: I tensorflow/core/framework/local_rendezvous.cc:424] Local rendezvous recv item cancelled. Key hash: 11217777527359497193
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1407, in _do_call
return fn(*args)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1390, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1483, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
(1) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in <module>
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 335, in main
y_out = sess.run(y, feed_dict=tf_dict)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 977, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1220, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1400, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1426, in _do_call
raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.UnknownError: Graph execution error:

Detected at node 'import/bert/embeddings/LayerNorm/moments/SquaredDifference' defined at (most recent call last):
Node: 'import/bert/embeddings/LayerNorm/moments/SquaredDifference'
Detected at node 'import/bert/embeddings/LayerNorm/moments/SquaredDifference' defined at (most recent call last):
Node: 'import/bert/embeddings/LayerNorm/moments/SquaredDifference'
2 root error(s) found.
(0) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
(1) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'import/bert/embeddings/LayerNorm/moments/SquaredDifference':


     ✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

     ✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

     ✅ agentmodel: PASSED: MIGraphX meets tolerance

🔴unet: FAILED: MIGraphX is not within tolerance - check verbose output


     ✅ resnet50v1: PASSED: MIGraphX meets tolerance

     ✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output


     ✅ bert_large: PASSED: MIGraphX meets tolerance

     ✅ yolov5s: PASSED: MIGraphX meets tolerance

     ✅ tinyllama: PASSED: MIGraphX meets tolerance

     ✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

     ✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

     ✅ llama2_7b: PASSED: MIGraphX meets tolerance

     ✅ qwen1.5-7b: PASSED: MIGraphX meets tolerance

     ✅ phi3-3.8b: PASSED: MIGraphX meets tolerance

🔴mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output


     ✅ llama3-8b: PASSED: MIGraphX meets tolerance

     ✅ whisper-large-decoder: PASSED: MIGraphX meets tolerance

     ✅ mistral-7b: PASSED: MIGraphX meets tolerance

     ✅ FLUX.1-schnell: PASSED: MIGraphX meets tolerance

@bdevorem
Contributor Author

@pfultz2 @causten ping on this PR

@causten causten requested a review from pfultz2 August 20, 2025 20:03
@bdevorem bdevorem force-pushed the bdevorem/compiler-warnings branch from b573411 to 334827c Compare August 21, 2025 22:34
@bdevorem
Contributor Author

@pfultz2 I removed the CXX flags line as requested, ready to be merged now

@causten causten merged commit 4f6d340 into develop Aug 26, 2025
48 of 52 checks passed
@causten causten deleted the bdevorem/compiler-warnings branch August 26, 2025 16:23
Development

Successfully merging this pull request may close these issues.

compiler warnings on mainline

5 participants