
Conversation

@shivadbhavsar
Contributor

@shivadbhavsar shivadbhavsar commented Aug 1, 2025

Currently, when using the MIGRAPHX_MLIR_DUMP flag, compilation will error out due to the filename being too long in cases of large fusions.

To avoid this, this PR adds logic to:

  1. Abbreviate/shorten op names
  2. If 1. is not enough, truncate the name to make sure program execution does not error out

Note:
Most common filesystems have a 255-byte limit on file names, so the number is hard-coded. I was not able to find a reliable programmatic way to determine this limit based on the user's system; please comment if you know of one!

Example shortening:
Before: ./mlir_where_slice_reshape_transpose_slice_reshape_transpose_slice_reshape_transpose_dot_add_convert_mul_reshape_reduce_max_reshape_sub_exp_reshape_reduce_sum_reshape_div_convert_dot_transpose_reshape_32x16x512x64s524288x64x1024x1_32x16x64x512s524288x64x1x1024_32x16x512x512_32x16x512x64s524288x64x1024x1.mlir

Now:

./mlir_where_slc_rsp_trp_slc_rsp_trp_slc_rsp_trp_dot_add_cvt_mul_rsp_rmax_rsp_sub_exp_rsp_rsum_rsp_div_cvt_dot_trp_rsp_32x16x512x64s524288x64x1024x1_32x16x64x512s524288x64x1x1024_32x16x512x512_32x16x512x64s524288x64x1024x1.mlir

./mlir_where_slc_rsp_trp_slc_rsp_trp_slc_rsp_trp_dot_add_convert_mul_rsp_softmax_convert_dot_trp_rsp_32x16x512x64s524288x64x1024x1_32x16x64x512s524288x64x1x1024_32x16x512x512_32x16x512x64s524288x64x1024x1.mlir
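For illustration, the two steps above can be sketched roughly as follows (a minimal sketch; the helper names `abbreviate_ops` and `truncate_name` are hypothetical, and the abbreviation table here is only a subset of what the PR uses):

```cpp
#include <cassert>
#include <map>
#include <string>

// Step 1: shorten well-known op names inside the file name.
inline std::string abbreviate_ops(std::string name)
{
    const std::map<std::string, std::string> abbrs = {
        {"transpose", "trp"}, {"reshape", "rsp"}, {"slice", "slc"}, {"convert", "cvt"}};
    for(const auto& [key, val] : abbrs)
    {
        std::string::size_type pos = 0;
        while((pos = name.find(key, pos)) != std::string::npos)
        {
            name.replace(pos, key.size(), val);
            pos += val.size();
        }
    }
    return name;
}

// Step 2: if abbreviation is not enough, hard-truncate so the name fits
// the common 255-byte filesystem limit, keeping the extension intact.
// Assumes max_len >= ext.size().
inline std::string truncate_name(std::string name, const std::string& ext, std::size_t max_len)
{
    if(name.size() > max_len)
        name = name.substr(0, max_len - ext.size()) + ext;
    return name;
}
```

With these helpers, `abbreviate_ops("mlir_slice_reshape_transpose")` yields `"mlir_slc_rsp_trp"`, and any name longer than the limit is cut down to exactly `max_len` characters.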

@shivadbhavsar shivadbhavsar requested a review from causten as a code owner August 1, 2025 22:44
@shivadbhavsar shivadbhavsar requested review from Copilot, pfultz2 and umangyadav and removed request for causten August 1, 2025 22:44
@shivadbhavsar shivadbhavsar self-assigned this Aug 1, 2025


mlir_program::get_symbol_name(m) + "_" + shape::to_sizes_string(to_shapes(sizes)) + ext;
replace_string_inplace(name, ", ", "_");
replace_string_inplace(name, ":", "s");
truncate_file_name(name, ext, 255);
Member

We should document somewhere why this limit exists.

abbreviate_symbol_names(fname);

// if file name is still too long after shortening op names, truncate
if(fname.length() > max_len)
Member

Truncating would mean the same filename for different kernels in some cases, so we may not see the MLIR dump in that case. It's a problem.

Contributor Author

What do you suggest we do in that case?

Member

I am not sure what the best approach would be. We could do a few things, e.g. maintain a counter and, if names are the same, suffix them with _0, _1, etc.

Or completely redesign the kernel name, e.g.:

for the attention case, keep only dot_softmax_dot in the name followed by a hash value computed from the MLIR kernel, so that each unique MLIR kernel gets a different name.
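As a sketch of that idea (hypothetical helper; a real implementation would pick its own tag extraction and hash function), each unique kernel text maps to a short, unique name:

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>

// Sketch of the suggestion above: keep only a coarse tag (e.g.
// "dot_softmax_dot") and append a hash of the full MLIR kernel text, so
// each unique kernel gets a distinct, short file name. Names are
// illustrative, not the PR's API.
inline std::string hashed_name(const std::string& tag, const std::string& mlir_text)
{
    std::ostringstream os;
    os << "mlir_" << tag << "_" << std::hex << std::hash<std::string>{}(mlir_text)
       << ".mlir";
    return os.str();
}
```

The name stays bounded in length regardless of how large the fusion is, at the cost of losing the shape information from the filename.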

Contributor Author

Updated to address some of this. There is definitely room for improvement with the added counter; currently it only exists to make sure file names don't conflict and cause overwrites, but the number is meaningless. If this is reasonable for now, let's move forward and leave further improvements for a future update?

Contributor

@lakhinderwalia lakhinderwalia Aug 15, 2025

Incidentally, the need for an added counter is debatable here: the shortened (new) names don't collide with any existing name, so a unique file name will stay unique under your new scheme.

mlirRockTuningParamDestroy);

// NOLINTNEXTLINE
static std::atomic<int> dump_counter = 0;
Contributor

  1. If you can use a dump_counter, then what prevents using just that counter as the file name?
  2. Of course, we shouldn't dump duplicates (which perhaps get overwritten right now) with an increasing counter value -- a possible problem in this scheme.

Contributor Author

  1. Nothing stops you, but it's not desirable. A half-truncated name with a counter will give you more info about what to expect in the file than just a counter.
  2. Not sure what you mean. In general we don't dump duplicates (each kernel name will be unique), but if you have two really large names where only the ends differ, truncating will cause the names to match. That's the issue this is solving.

Contributor

@lakhinderwalia lakhinderwalia Aug 15, 2025

  1. Nothing stops you, but it's not desirable. A half-truncated name with a counter will give you more info about what to expect in the file than just a counter.

"Not desirable" is very subjective. An algorithmic solution is needed here. One could find a scheme to drastically reduce the filename length, or just pick the rolling number that you have coded in, and solve this issue for good! :-)

Unfortunately, the idea of calling transpose something like tpose is not the answer here, because the file names are still very large. Large dimension strings are appended, and the trend toward fusing (and thus longer names) isn't going away! We would be forced to revisit the file name issue again. Since the name has been munged already, it might as well have a simple name, was my thought.

  1. Not sure what you mean. In general we don't dump duplicates (each kernel name will be unique), but if you have two really large names where only the ends differ, truncating will cause the names to match. That's the issue this is solving.

If the names were duplicates (in the old or your new renaming scheme), we currently overwrite that file. But with a rolling number appended, the duplicated file name now becomes unique -- so it will be dumped more than once.

Contributor Author

I don't think this needs to be over-engineered; this is a developer-only debug tool and doesn't affect users at all. The main consumers of this tool are our MLIR friends, so I will let @umangyadav, @dhernandez0 and team decide whether they like the current scheme or prefer a completely uninformative name.

If the full names were duplicates, it's the same module and we would not ask MLIR to compile it more than once. We don't currently overwrite the file because that scenario is not possible, as far as I understand.

@dhernandez0 dhernandez0 Aug 20, 2025

From my side, I generally debug one MLIR kernel at a time, because that's generally what we get from you (a failing or slow single kernel). So I don't have a preference about what the filename should look like.

Copilot AI left a comment

Pull Request Overview

This PR addresses filename length issues when using the MIGRAPHX_MLIR_DUMP flag by implementing a filename shortening mechanism. The solution prevents compilation errors that occur when generated MLIR dump filenames exceed filesystem limits (typically 255 bytes).

  • Implements operation name abbreviation to reduce filename length
  • Adds filename truncation with counter suffix as fallback when abbreviation isn't sufficient
  • Introduces atomic counter to ensure unique filenames when truncation is required


auto sym_names = mlir_program::get_symbol_name(m);
abbreviate_symbol_names(sym_names);

static int max_file_length = 255;
Copilot AI Aug 13, 2025

The magic number 255 should be defined as a named constant at the top of the file or in a header, making it easier to maintain and understand its purpose across the codebase.

Suggested change
static int max_file_length = 255;
constexpr int max_file_length = kMaxFileNameLength;

if(sym_names.length() + shape_str.length() + ext.length() > max_file_length)
{
auto cnt = "_" + std::to_string(dump_counter++);
auto cutoff = max_file_length - shape_str.length() - ext.length() - cnt.length();
Copilot AI Aug 13, 2025

The cutoff calculation could result in a negative value if the combined length of shape_str, ext, and cnt exceeds max_file_length. This could cause undefined behavior in the substr call on line 1137.
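One way to make this concern concrete (a sketch, not the PR's code): the lengths involved are `std::size_t`, so the subtraction would wrap around to a huge unsigned value rather than go negative; computing the cutoff in a signed type makes a `cutoff > 0` guard meaningful. The helper name here is illustrative:

```cpp
#include <cassert>
#include <cstddef>

// Compute the number of name characters that still fit, in a signed
// type. With plain std::size_t arithmetic, max_len minus a larger sum
// would wrap to a huge positive value and a "cutoff > 0" check would
// silently pass; with std::ptrdiff_t it correctly goes negative.
inline std::ptrdiff_t safe_cutoff(std::size_t max_len,
                                  std::size_t shape_len,
                                  std::size_t ext_len,
                                  std::size_t cnt_len)
{
    return static_cast<std::ptrdiff_t>(max_len) -
           static_cast<std::ptrdiff_t>(shape_len + ext_len + cnt_len);
}
```

A negative result then signals that even the non-symbol parts of the name exceed the limit, and the caller can fall back to a minimal name.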

@shivadbhavsar shivadbhavsar requested a review from pfultz2 August 15, 2025 22:49
// mlir_x to avoid erroring out
fname =
cutoff > 0 ? sym_names.substr(0, cutoff) + shape_str + cnt + ext : "mlir" + cnt + ext;
}
Contributor

If a file with a name mlir0 or mlir1 is acceptable, then why not just dump every file with that consistent scheme: mlir_1, mlir_2? That was my point earlier.

Contributor Author

Because if you use this flag with a large model you might get tens of mlir_ files -- would you rather open every single one to see what they are about? I think it would be nice to see straight away what kind of kernel each file represents with a quick look at the name. As I mentioned in the other comment, I'm happy to do this if the MLIR team would prefer it :)

Contributor

Please have a consistent naming scheme. The MLIR team is okay with either naming scheme, but we shouldn't have two. You can take the first approach but truncate the long name to the appropriate length and add a numeral to it if you want to see what it is straight away -- a very personal approach that I find very subjective. So either truncate the full name, or use a simple naming scheme. We should not add confusion while debugging, IMHO!

Contributor

@lakhinderwalia lakhinderwalia left a comment

(simple change suggested above).


@codecov

codecov bot commented Aug 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4197   +/-   ##
========================================
  Coverage    92.23%   92.23%           
========================================
  Files          553      553           
  Lines        25627    25628    +1     
========================================
+ Hits         23635    23636    +1     
  Misses        1992     1992           

see 1 file with indirect coverage changes


}
fname += cnt + ext;

replace_string_inplace(fname, ", ", "_");
Contributor

If this is shortening the file name by replacing the two characters ", " with a single _, then that should be accounted for in the new name-truncation logic.

Contributor Author

Good catch, I totally missed the space after the comma.

{"transpose", "trp"},
{"slice", "slc"}};

for(auto const& [key, val] : abbrs)
Contributor

There is a bug lurking here: should the loop iterate over the string reshape before a longer string (the one for softmax) which contains the same substring, the replacement would go wrong. It is just that the std::map happens to be in an order which works here. But change it to another data structure, e.g. a hash map, and the renaming scheme could be a surprise for us :-)
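To make the ordering hazard concrete, here is a sketch (hypothetical table entries, not the PR's actual map) where sorting the keys longest-first makes the replacement independent of container iteration order:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// If "reduce" were replaced before "reduce_max", the longer key could
// never match ("reduce_max" would already have become "red_max").
// Sorting keys longest-first removes the dependence on container order.
inline std::string abbreviate_ordered(std::string name)
{
    std::vector<std::pair<std::string, std::string>> abbrs = {
        {"reduce", "red"}, {"reduce_max", "rmax"}};
    std::sort(abbrs.begin(), abbrs.end(), [](const auto& a, const auto& b) {
        return a.first.size() > b.first.size(); // longest key first
    });
    for(const auto& [key, val] : abbrs)
    {
        std::string::size_type pos = 0;
        while((pos = name.find(key, pos)) != std::string::npos)
        {
            name.replace(pos, key.size(), val);
            pos += val.size();
        }
    }
    return name;
}
```

With the sort in place, "mlir_reduce_max" becomes "mlir_rmax" rather than "mlir_red_max", regardless of which container holds the table.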

if(fname.length() + cnt.length() + ext.length() > max_file_length)
{
auto cutoff = max_file_length - ext.length() - cnt.length();
fname = fname.substr(0, cutoff);
Contributor

@lakhinderwalia lakhinderwalia Aug 21, 2025

I have to agree with the copilot here :-)
Correction: it is the cppcheck suggestion.


// On most commonly used file systems, the max file name size is 255 characters
const int max_file_length = 255;
auto cnt = "_" + std::to_string(dump_counter()++);
Contributor

This nitpicking likely shows my poor programming aesthetics :-)
But does it need a struct dump_counter above? Or would a single line defining a static variable with this name also suffice? This is fine, however.
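For reference, the function-local-static variant under discussion can be sketched as follows (names illustrative):

```cpp
#include <atomic>
#include <cassert>

// A function returning a reference to a local static atomic. The local
// static is initialized on first use, which sidesteps static
// initialization order issues across translation units; a plain
// file-scope "static std::atomic<int>" also works when the counter is
// only used in one translation unit.
inline std::atomic<int>& dump_counter_sketch()
{
    static std::atomic<int> counter{0};
    return counter;
}
```

Each call site then does `dump_counter_sketch()++` to get a unique, monotonically increasing suffix.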

Contributor

@lakhinderwalia lakhinderwalia left a comment

Approving this PR. (Thanks for addressing the file naming concerns.) But please see the issues that I have highlighted in my comments, and fix as you see fit @shivadbhavsar.

@migraphx-bot
Collaborator

Test    Batch    Rate new (a486ac)    Rate old (7a7354)    Diff Compare
torchvision-resnet50 64 3,220.69 3,229.81 -0.28%
torchvision-resnet50_fp16 64 6,921.20 6,930.63 -0.14%
torchvision-densenet121 32 2,440.33 2,436.33 0.16%
torchvision-densenet121_fp16 32 4,158.07 4,139.00 0.46%
torchvision-inceptionv3 32 1,627.16 1,627.21 -0.00%
torchvision-inceptionv3_fp16 32 2,740.78 2,747.94 -0.26%
cadene-inceptionv4 16 766.76 767.33 -0.07%
cadene-resnext64x4 16 813.23 813.57 -0.04%
slim-mobilenet 64 7,420.45 7,424.00 -0.05%
slim-nasnetalarge 64 209.93 209.93 0.00%
slim-resnet50v2 64 3,330.30 3,326.94 0.10%
bert-mrpc-onnx 8 1,137.78 1,136.69 0.10%
bert-mrpc-tf 1 445.21 438.33 1.57%
pytorch-examples-wlang-gru 1 296.66 296.29 0.13%
pytorch-examples-wlang-lstm 1 413.94 414.88 -0.23%
torchvision-resnet50_1 1 762.18 768.79 -0.86%
cadene-dpn92_1 1 382.56 386.86 -1.11%
cadene-resnext101_1 1 389.91 391.34 -0.37%
onnx-taau-downsample 1 394.90 393.85 0.27%
dlrm-criteoterabyte 1 33.65 33.66 -0.04%
dlrm-criteoterabyte_fp16 1 51.09 51.09 -0.00%
agentmodel 1 8,700.57 8,794.46 -1.07%
unet_fp16 2 58.99 59.02 -0.05%
resnet50v1_fp16 1 987.57 989.83 -0.23%
resnet50v1_int8 1 1,050.95 1,038.27 1.22%
bert_base_cased_fp16 64 1,099.77 1,101.69 -0.17%
bert_large_uncased_fp16 32 343.82 343.82 -0.00%
bert_large_fp16 1 196.57 194.83 0.90%
distilgpt2_fp16 16 2,102.43 2,102.25 0.01%
yolov5s 1 576.06 576.37 -0.05%
tinyllama 1 43.73 43.75 -0.05%
vicuna-fastchat 1 44.92 45.05 -0.30%
whisper-tiny-encoder 1 416.03 416.67 -0.16%
whisper-tiny-decoder 1 406.91 407.68 -0.19%
llama2_7b 1 19.11 19.10 0.08%
qwen1.5-7b 1 23.44 23.45 -0.04%
phi3-3.8b 1 26.61 26.62 -0.03%
mask-rcnn 1 12.48 12.33 1.20%
llama3-8b 1 21.64 21.64 -0.02%
whisper-large-encoder 1 10.17 10.17 -0.02%
whisper-large-decoder 1 96.06 95.84 0.23%
mistral-7b 1 23.62 23.59 0.13%
FLUX.1-schnell 1 742.09 743.47 -0.18%

This build is not recommended to merge 🔴

@migraphx-bot
Collaborator


     ✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

❌ bert-mrpc-tf: ERROR - check error output

error: unknown warning option '-Wnrvo' [-Werror,-Wunknown-warning-option]

error: unknown warning option '-Wnrvo' [-Werror,-Wunknown-warning-option]

2025-08-21 13:16:12.914340: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1755800178.078858 172401 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 62951 MB memory: -> device: 0, name: AMD Instinct MI250X/MI250, pci bus id: 0000:32:00.0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1755800179.017844 172401 mlir_graph_optimization_pass.cc:401] MLIR V1 optimization pass is not enabled
2025-08-21 13:16:27.978743: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-21 13:16:27.978788: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-21 13:16:27.978831: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-21 13:16:27.979042: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-21 13:16:27.979072: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-21 13:16:27.979116: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-21 13:16:27.979161: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
2025-08-21 13:16:27.979203: E external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:250] bitcode module is required by this HLO module but was not found at ./opencl.bc
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
error: Failure when generating HSACO
2025-08-21 13:16:27.980241: E tensorflow/compiler/mlir/tools/kernel_gen/tf_framework_c_interface.cc:228] INTERNAL: Generating device code failed.
2025-08-21 13:16:27.981494: W tensorflow/core/framework/op_kernel.cc:1829] UNKNOWN: JIT compilation failed.
2025-08-21 13:16:27.981513: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
2025-08-21 13:16:27.981524: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
2025-08-21 13:16:27.981539: I tensorflow/core/framework/local_rendezvous.cc:424] Local rendezvous recv item cancelled. Key hash: 11217777527359497193
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1407, in _do_call
return fn(*args)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1390, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1483, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
(1) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 335, in main
y_out = sess.run(y, feed_dict=tf_dict)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 977, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1220, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1400, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/client/session.py", line 1426, in _do_call
raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.UnknownError: Graph execution error:

Detected at node 'import/bert/embeddings/LayerNorm/moments/SquaredDifference' defined at (most recent call last):
Node: 'import/bert/embeddings/LayerNorm/moments/SquaredDifference'
Detected at node 'import/bert/embeddings/LayerNorm/moments/SquaredDifference' defined at (most recent call last):
Node: 'import/bert/embeddings/LayerNorm/moments/SquaredDifference'
2 root error(s) found.
(0) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
[[import/loss/output/_21]]
(1) UNKNOWN: JIT compilation failed.
[[{{node import/bert/embeddings/LayerNorm/moments/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'import/bert/embeddings/LayerNorm/moments/SquaredDifference':


     ✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

     ✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

     ✅ agentmodel: PASSED: MIGraphX meets tolerance

     ✅ unet: PASSED: MIGraphX meets tolerance

     ✅ resnet50v1: PASSED: MIGraphX meets tolerance

     ✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output


     ✅ bert_large: PASSED: MIGraphX meets tolerance

     ✅ yolov5s: PASSED: MIGraphX meets tolerance

     ✅ tinyllama: PASSED: MIGraphX meets tolerance

     ✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

     ✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

     ✅ llama2_7b: PASSED: MIGraphX meets tolerance

     ✅ qwen1.5-7b: PASSED: MIGraphX meets tolerance

     ✅ phi3-3.8b: PASSED: MIGraphX meets tolerance

🔴mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output


     ✅ llama3-8b: PASSED: MIGraphX meets tolerance

     ✅ whisper-large-decoder: PASSED: MIGraphX meets tolerance

     ✅ mistral-7b: PASSED: MIGraphX meets tolerance

     ✅ FLUX.1-schnell: PASSED: MIGraphX meets tolerance

@causten causten merged commit 3979191 into develop Aug 22, 2025
64 of 67 checks passed
@causten causten deleted the mlir_dump_fix branch August 22, 2025 17:36
bdevorem added a commit that referenced this pull request Aug 26, 2025