AMDGPU: Fix runtime unrolling when cascaded GEPs present #147700

macurtis-amd · 2025-07-09T11:31:02Z

Cascaded GEP (i.e. GEP of GEP) are not handled when determining if it is ok to runtime unroll loops.

This change simply uses getUnderlyingObjects to look through cascaded GEPs.

llvmbot · 2025-07-09T11:31:34Z

@llvm/pr-subscribers-llvm-transforms

Author: None (macurtis-amd)

Changes

Cascaded GEP (i.e. GEP of GEP) are not handled when determining if it is ok to runtime unroll loops.

This change simply uses getUnderlyingObjects to look through cascaded GEPs.

Full diff: https://github.com/llvm/llvm-project/pull/147700.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp (+6-3)
(added) llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-runtime.ll (+44)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
index f693580929518..24f4df2aff9d1 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
@@ -216,10 +216,13 @@ void AMDGPUTTIImpl::getUnrollingPreferences(
         // a variable, most likely we will be unable to combine it.
         // Do not unroll too deep inner loops for local memory to give a chance
         // to unroll an outer loop for a more important reason.
-        if (LocalGEPsSeen > 1 || L->getLoopDepth() > 2 ||
-            (!isa<GlobalVariable>(GEP->getPointerOperand()) &&
-             !isa<Argument>(GEP->getPointerOperand())))
+        if (LocalGEPsSeen > 1 || L->getLoopDepth() > 2)
           continue;
+
+        const Value *V = getUnderlyingObject(GEP->getPointerOperand());
+        if (!isa<GlobalVariable>(V) && !isa<Argument>(V))
+          continue;
+
         LLVM_DEBUG(dbgs() << "Allow unroll runtime for loop:\n"
                           << *L << " due to LDS use.\n");
         UP.Runtime = UnrollRuntimeLocal;
diff --git a/llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-runtime.ll b/llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-runtime.ll
new file mode 100644
index 0000000000000..9414a1e24e542
--- /dev/null
+++ b/llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-runtime.ll
@@ -0,0 +1,44 @@
+; RUN: opt -mtriple=amdgcn-unknown-amdhsa -passes=loop-unroll -S %s | FileCheck %s
+
+%struct.wombat = type { %struct.zot, i32, [16 x i32], [16 x i32], i32, i32, [16 x i32], i32 }
+%struct.zot = type { i32, i32, [1024 x i32] }
+
+@global = external addrspace(3) global %struct.wombat
+
+; Ensure that a cascaded GEP for local address space does not inhibit unrolling
+;
+; CHECK-LABEL: @unroll_when_cascaded_gep
+; CHECK: bb:
+; CHECK:   br {{.*}}, label %bb2.unr-lcssa, label %bb.new
+; CHECK: bb.new:
+; CHECK:   %unroll_iter = 
+; CHECK:   br label %bb1
+; CHECK: bb1:
+; CHECK:   br {{.*}}, label %bb2.unr-lcssa.loopexit, label %bb1
+; CHECK: bb2.unr-lcssa.loopexit:
+; CHECK:   br label %bb2.unr-lcssa
+; CHECK: bb2.unr-lcssa:
+; CHECK:   br {{.*}}, label %bb1.epil.preheader, label %bb2
+; CHECK: bb1.epil.preheader:
+; CHECK:   br label %bb1.epil
+; CHECK: bb1.epil:
+; CHECK:   br {{.*}}, label %bb1.epil, label %bb2.epilog-lcssa
+; CHECK: bb2.epilog-lcssa:
+; CHECK:   br label %bb2
+; CHECK: bb2:
+; CHECK:   ret void
+define amdgpu_kernel void @unroll_when_cascaded_gep(i32 %arg) {
+bb:
+  br label %bb1
+
+bb1:                                              ; preds = %bb1, %bb
+  %phi = phi i32 [ 0, %bb ], [ %add, %bb1 ]
+  %getelementptr = getelementptr [1024 x i32], ptr addrspace(3) getelementptr inbounds nuw (i8, ptr addrspace(3) @global, i32 8), i32 0, i32 0
+  %add = add i32 %phi, 1
+  %icmp = icmp eq i32 %phi, %arg
+  br i1 %icmp, label %bb2, label %bb1
+
+bb2:                                              ; preds = %bb1
+  ret void
+}
+

llvmbot · 2025-07-09T11:31:34Z

@llvm/pr-subscribers-backend-amdgpu

Author: None (macurtis-amd)

Changes

Cascaded GEP (i.e. GEP of GEP) are not handled when determining if it is ok to runtime unroll loops.

This change simply uses getUnderlyingObjects to look through cascaded GEPs.

Full diff: https://github.com/llvm/llvm-project/pull/147700.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp (+6-3)
(added) llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-runtime.ll (+44)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
index f693580929518..24f4df2aff9d1 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
@@ -216,10 +216,13 @@ void AMDGPUTTIImpl::getUnrollingPreferences(
         // a variable, most likely we will be unable to combine it.
         // Do not unroll too deep inner loops for local memory to give a chance
         // to unroll an outer loop for a more important reason.
-        if (LocalGEPsSeen > 1 || L->getLoopDepth() > 2 ||
-            (!isa<GlobalVariable>(GEP->getPointerOperand()) &&
-             !isa<Argument>(GEP->getPointerOperand())))
+        if (LocalGEPsSeen > 1 || L->getLoopDepth() > 2)
           continue;
+
+        const Value *V = getUnderlyingObject(GEP->getPointerOperand());
+        if (!isa<GlobalVariable>(V) && !isa<Argument>(V))
+          continue;
+
         LLVM_DEBUG(dbgs() << "Allow unroll runtime for loop:\n"
                           << *L << " due to LDS use.\n");
         UP.Runtime = UnrollRuntimeLocal;
diff --git a/llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-runtime.ll b/llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-runtime.ll
new file mode 100644
index 0000000000000..9414a1e24e542
--- /dev/null
+++ b/llvm/test/Transforms/LoopUnroll/AMDGPU/unroll-runtime.ll
@@ -0,0 +1,44 @@
+; RUN: opt -mtriple=amdgcn-unknown-amdhsa -passes=loop-unroll -S %s | FileCheck %s
+
+%struct.wombat = type { %struct.zot, i32, [16 x i32], [16 x i32], i32, i32, [16 x i32], i32 }
+%struct.zot = type { i32, i32, [1024 x i32] }
+
+@global = external addrspace(3) global %struct.wombat
+
+; Ensure that a cascaded GEP for local address space does not inhibit unrolling
+;
+; CHECK-LABEL: @unroll_when_cascaded_gep
+; CHECK: bb:
+; CHECK:   br {{.*}}, label %bb2.unr-lcssa, label %bb.new
+; CHECK: bb.new:
+; CHECK:   %unroll_iter = 
+; CHECK:   br label %bb1
+; CHECK: bb1:
+; CHECK:   br {{.*}}, label %bb2.unr-lcssa.loopexit, label %bb1
+; CHECK: bb2.unr-lcssa.loopexit:
+; CHECK:   br label %bb2.unr-lcssa
+; CHECK: bb2.unr-lcssa:
+; CHECK:   br {{.*}}, label %bb1.epil.preheader, label %bb2
+; CHECK: bb1.epil.preheader:
+; CHECK:   br label %bb1.epil
+; CHECK: bb1.epil:
+; CHECK:   br {{.*}}, label %bb1.epil, label %bb2.epilog-lcssa
+; CHECK: bb2.epilog-lcssa:
+; CHECK:   br label %bb2
+; CHECK: bb2:
+; CHECK:   ret void
+define amdgpu_kernel void @unroll_when_cascaded_gep(i32 %arg) {
+bb:
+  br label %bb1
+
+bb1:                                              ; preds = %bb1, %bb
+  %phi = phi i32 [ 0, %bb ], [ %add, %bb1 ]
+  %getelementptr = getelementptr [1024 x i32], ptr addrspace(3) getelementptr inbounds nuw (i8, ptr addrspace(3) @global, i32 8), i32 0, i32 0
+  %add = add i32 %phi, 1
+  %icmp = icmp eq i32 %phi, %arg
+  br i1 %icmp, label %bb2, label %bb1
+
+bb2:                                              ; preds = %bb1
+  ret void
+}
+

JonChesterfield

Yep, that's consistent. Thanks very much for running it to ground!

AMDGPU: Fix runtime unrolling when cascaded GEPs present

2697574

llvmbot added backend:AMDGPU llvm:transforms labels Jul 9, 2025

macurtis-amd requested review from arsenm and jayfoad July 9, 2025 11:32

JonChesterfield approved these changes Jul 9, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

AMDGPU: Fix runtime unrolling when cascaded GEPs present #147700

AMDGPU: Fix runtime unrolling when cascaded GEPs present #147700

macurtis-amd commented Jul 9, 2025

Uh oh!

llvmbot commented Jul 9, 2025

Uh oh!

llvmbot commented Jul 9, 2025

Uh oh!

JonChesterfield left a comment

Uh oh!

Uh oh!

AMDGPU: Fix runtime unrolling when cascaded GEPs present #147700

Are you sure you want to change the base?

AMDGPU: Fix runtime unrolling when cascaded GEPs present #147700

Conversation

macurtis-amd commented Jul 9, 2025

Uh oh!

llvmbot commented Jul 9, 2025

Uh oh!

llvmbot commented Jul 9, 2025

Uh oh!

JonChesterfield left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!