-
Notifications
You must be signed in to change notification settings - Fork 14.4k
Silence MLIRTargetLLVMTests failures pre-ROCm6.4 #147610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
COV6 was made default in 6.4, so check that the ROCm we find isn't older before assuming the tests should pass.
@llvm/pr-subscribers-mlir-llvm Author: Scott Linder (slinder1) ChangesCOV6 was made default in 6.4, so check that the ROCm we find isn't older before assuming the tests should pass. Not sure this is worth the code, and it may make more sense to fail open if e.g. the .info/version file isn't present or doesn't have the version in the expected format. I had already written the code before noting this, so figured I'd just see what others think. Full diff: https://github.com/llvm/llvm-project/pull/147610.diff 1 Files Affected:
diff --git a/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp b/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
index a015e1d7dde62..ef314623956d0 100644
--- a/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
+++ b/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
@@ -20,6 +20,7 @@
#include "llvm/IRReader/IRReader.h"
#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/MemoryBufferRef.h"
#include "llvm/Support/Path.h"
#include "llvm/Support/TargetSelect.h"
@@ -52,6 +53,27 @@ class MLIRTargetLLVMROCDL : public ::testing::Test {
StringRef rocmPath = ROCDL::getROCMPath();
if (rocmPath.empty())
return false;
+ llvm::SmallString<128> rocmVerPath(rocmPath);
+ llvm::sys::path::append(rocmVerPath, ".info", "version");
+ auto bufOrErr = llvm::MemoryBuffer::getFile(rocmVerPath, /*IsText=*/true);
+ if (!bufOrErr)
+ return false;
+ SmallVector<StringRef, 2> majorMinorRest;
+ bufOrErr.get()->getBuffer().split(majorMinorRest, '.', /*MaxSplit=*/2);
+ if (majorMinorRest.size() != 3)
+ return false;
+ auto asInt = [](StringRef s) -> int {
+ unsigned i;
+ if (s.getAsInteger(/*Radix=*/10, i))
+ return -1;
+ return i;
+ };
+ int major = asInt(majorMinorRest[0]);
+ int minor = asInt(majorMinorRest[1]);
+ if (major < 6)
+ return false;
+ if (major == 6 && minor < 4)
+ return false;
llvm::SmallString<128> lldPath(rocmPath);
llvm::sys::path::append(lldPath, "llvm", "bin", "ld.lld");
return llvm::sys::fs::can_execute(lldPath);
@@ -185,7 +207,7 @@ TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(SerializeROCDLToPTX)) {
// Test ROCDL serialization to Binary.
TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(SerializeROCDLToBinary)) {
if (!hasROCMTools())
- GTEST_SKIP() << "ROCm installation not found, skipping test.";
+ GTEST_SKIP() << "Compatible ROCm installation not found, skipping test.";
MLIRContext context(registry);
@@ -212,7 +234,7 @@ TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(SerializeROCDLToBinary)) {
// Test ROCDL metadata.
TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(GetELFMetadata)) {
if (!hasROCMTools())
- GTEST_SKIP() << "ROCm installation not found, skipping test.";
+ GTEST_SKIP() << "Compatible ROCm installation not found, skipping test.";
MLIRContext context(registry);
|
@llvm/pr-subscribers-mlir Author: Scott Linder (slinder1) ChangesCOV6 was made default in 6.4, so check that the ROCm we find isn't older before assuming the tests should pass. Not sure this is worth the code, and it may make more sense to fail open if e.g. the .info/version file isn't present or doesn't have the version in the expected format. I had already written the code before noting this, so figured I'd just see what others think. Full diff: https://github.com/llvm/llvm-project/pull/147610.diff 1 Files Affected:
diff --git a/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp b/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
index a015e1d7dde62..ef314623956d0 100644
--- a/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
+++ b/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
@@ -20,6 +20,7 @@
#include "llvm/IRReader/IRReader.h"
#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/MemoryBufferRef.h"
#include "llvm/Support/Path.h"
#include "llvm/Support/TargetSelect.h"
@@ -52,6 +53,27 @@ class MLIRTargetLLVMROCDL : public ::testing::Test {
StringRef rocmPath = ROCDL::getROCMPath();
if (rocmPath.empty())
return false;
+ llvm::SmallString<128> rocmVerPath(rocmPath);
+ llvm::sys::path::append(rocmVerPath, ".info", "version");
+ auto bufOrErr = llvm::MemoryBuffer::getFile(rocmVerPath, /*IsText=*/true);
+ if (!bufOrErr)
+ return false;
+ SmallVector<StringRef, 2> majorMinorRest;
+ bufOrErr.get()->getBuffer().split(majorMinorRest, '.', /*MaxSplit=*/2);
+ if (majorMinorRest.size() != 3)
+ return false;
+ auto asInt = [](StringRef s) -> int {
+ unsigned i;
+ if (s.getAsInteger(/*Radix=*/10, i))
+ return -1;
+ return i;
+ };
+ int major = asInt(majorMinorRest[0]);
+ int minor = asInt(majorMinorRest[1]);
+ if (major < 6)
+ return false;
+ if (major == 6 && minor < 4)
+ return false;
llvm::SmallString<128> lldPath(rocmPath);
llvm::sys::path::append(lldPath, "llvm", "bin", "ld.lld");
return llvm::sys::fs::can_execute(lldPath);
@@ -185,7 +207,7 @@ TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(SerializeROCDLToPTX)) {
// Test ROCDL serialization to Binary.
TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(SerializeROCDLToBinary)) {
if (!hasROCMTools())
- GTEST_SKIP() << "ROCm installation not found, skipping test.";
+ GTEST_SKIP() << "Compatible ROCm installation not found, skipping test.";
MLIRContext context(registry);
@@ -212,7 +234,7 @@ TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(SerializeROCDLToBinary)) {
// Test ROCDL metadata.
TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(GetELFMetadata)) {
if (!hasROCMTools())
- GTEST_SKIP() << "ROCm installation not found, skipping test.";
+ GTEST_SKIP() << "Compatible ROCm installation not found, skipping test.";
MLIRContext context(registry);
|
int minor = asInt(majorMinorRest[1]); | ||
if (major < 6) | ||
return false; | ||
if (major == 6 && minor < 4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Please add inline comments about what happens here and maybe move this to a separate function. Otherwise, it will be very unclear what's going on here, I fear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not blocking on the merits of the patch. Just preventing merging until we agree on the path forward.
Can you expand a bit more on the issue? For example, what happens if you compile with ROCm 6.4 or earlier?
What about adding a CMake flag like MLIR_DEFAULT_ROCM_ABI=600
controlled by the detected ROCm version and substituting it here rengolin@84cb08e#diff-3da7b8747032bb581f119fd49037d06706443c3c5db56d4d28cc74053a9b754dL189 ?
Makes sense, and sorry for the half-baked patch, I just realized the path I was on probably wasn't ideal so wanted to get input before continuing
Note: my patch mistakenly checks for the version where the default switched, not the version when support was added to
Something like this seems reasonable, but technically there are ways to point ROCDL at another ROCM at runtime, right? I suppose in those cases the user just has to set the cmake cache var themselves? To handle this generally might add more complexity than the use-case warrants (or put another way, I'm not sure there is any use-case to begin with, I just wanted my tests clean on an older box). The failing test could also be desirable considering the dependency on the installed ROCM? I'm fine with just dropping this after thinking through it more |
Yes, but:
is also an option as well, so the user could set I think the patch through CMake should be around a 50 line change or so. And in that case we would end up increasing the support matrix which is always nice.
The proper fix in this case is just to pass the correct value of the ABI, not deactivating the test. Which again, shouldn't be that hard, as the ABI value is a configurable value either via a pass otpion or by setting a different value in the attribute. Just in case, nice catch! This is something that definitely needs a fix. |
COV6 was made default in 6.4, so check that the ROCm we find isn't older before assuming the tests should pass.
Not sure this is worth the code, and it may make more sense to fail open if e.g. the .info/version file isn't present or doesn't have the version in the expected format. I had already written the code before noting this, so figured I'd just see what others think.