Commit 3099b7e

[Passes] Move LoopInterchange into optimization pipeline (#145503)
As mentioned in #145071, LoopInterchange should be part of the optimization pipeline rather than the simplification pipeline. This patch moves LoopInterchange into the optimization pipeline.

More context:

- By default, LoopInterchange attempts to improve data locality; however, it also takes increased vectorization opportunities into account. Given that, it is reasonable to run it as close to vectorization as possible.
- I looked into previous changes related to the placement of LoopInterchange, but couldn't find any strong motivation suggesting that it benefits other simplifications.
- In the tests I ran (including llvm-test-suite), removing LoopInterchange from the simplification pipeline did not affect other simplifications. Therefore, there doesn't seem to be much value in keeping it there.
- The new position reduces compile time for ThinLTO, probably because it only runs once per function in post-link optimization, rather than both in pre-link and post-link optimization.

I haven't encountered any cases where the positional difference affects optimization results, so please feel free to revert if you run into any issues.
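For readers unfamiliar with the transformation, the following is a minimal hand-written sketch of what interchanging a loop nest means at the source level. It is not output from LLVM, and the function names `incrementColumnFirst` and `incrementRowFirst` are hypothetical names chosen for this illustration. Both functions add 1 to every element of an `n` by `n` row-major array; only the loop order differs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before interchange (illustrative): the inner loop varies i, so successive
// accesses to a[i * n + j] are n elements apart in memory.
void incrementColumnFirst(std::vector<int>& a, std::size_t n) {
  assert(a.size() == n * n);
  for (std::size_t j = 0; j < n; ++j)
    for (std::size_t i = 0; i < n; ++i)
      a[i * n + j] += 1;
}

// After interchange (illustrative): the inner loop varies j, so successive
// accesses are adjacent in memory. A unit-stride inner loop is also the
// shape a loop vectorizer generally handles best.
void incrementRowFirst(std::vector<int>& a, std::size_t n) {
  assert(a.size() == n * n);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      a[i * n + j] += 1;
}
```

Both versions compute the same result because each iteration touches a distinct element, which is the kind of legality condition an interchange pass has to establish before swapping the loops.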
1 parent 03cfba4 commit 3099b7e

2 files changed: 52 additions and 3 deletions
llvm/lib/Passes/PassBuilderPipelines.cpp

Lines changed: 4 additions & 3 deletions

@@ -690,9 +690,6 @@ PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,

   LPM2.addPass(LoopDeletionPass());

-  if (PTO.LoopInterchange)
-    LPM2.addPass(LoopInterchangePass());
-
   // Do not enable unrolling in PreLinkThinLTO phase during sample PGO
   // because it changes IR to makes profile annotation in back compile
   // inaccurate. The normal unroller doesn't pay attention to forced full unroll
@@ -1547,6 +1544,10 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
   // this may need to be revisited once we run GVN before loop deletion
   // in the simplification pipeline.
   LPM.addPass(LoopDeletionPass());
+
+  if (PTO.LoopInterchange)
+    LPM.addPass(LoopInterchangePass());
+
   OptimizePM.addPass(createFunctionToLoopPassAdaptor(
       std::move(LPM), /*UseMemorySSA=*/false, /*UseBlockFrequencyInfo=*/false));

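The ThinLTO compile-time argument from the commit message can be sketched with a toy model. This is an illustration under stated assumptions, not LLVM code: pipelines are modelled as lists of pass names, the `Placement` enum and all `build*` and `count*` functions are hypothetical, the pass lists are heavily abbreviated, and ThinLTO is assumed to run the simplification pipeline in both the pre-link and post-link phases but the optimization pipeline only post-link.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy model only: a pipeline is an ordered list of pass names.
using Pipeline = std::vector<std::string>;

// Where LoopInterchange is scheduled (hypothetical switch for this sketch).
enum class Placement { Simplification, Optimization };

// Abbreviated stand-in for the function simplification pipeline.
Pipeline buildSimplification(Placement where) {
  Pipeline p = {"LoopDeletionPass"};
  if (where == Placement::Simplification)
    p.push_back("LoopInterchangePass");
  return p;
}

// Abbreviated stand-in for the module optimization pipeline.
Pipeline buildOptimization(Placement where) {
  Pipeline p = {"LoopRotatePass", "LoopDeletionPass"};
  if (where == Placement::Optimization)
    p.push_back("LoopInterchangePass");
  p.push_back("LoopDistributePass");
  p.push_back("LoopVectorizePass");
  return p;
}

// Assumption: pre-link runs simplification; post-link runs simplification
// followed by optimization.
Pipeline buildThinLTO(Placement where) {
  Pipeline all = buildSimplification(where);   // pre-link
  Pipeline post = buildSimplification(where);  // post-link simplification
  Pipeline opt = buildOptimization(where);     // post-link optimization
  all.insert(all.end(), post.begin(), post.end());
  all.insert(all.end(), opt.begin(), opt.end());
  return all;
}

// Number of times LoopInterchangePass appears in a pipeline.
long countInterchange(const Pipeline& p) {
  return static_cast<long>(
      std::count(p.begin(), p.end(), std::string("LoopInterchangePass")));
}
```

Under these assumptions, `countInterchange(buildThinLTO(Placement::Simplification))` counts two scheduled runs and `Placement::Optimization` counts one, which is consistent with the explanation the author offers for the compile-time reduction.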
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+; RUN: opt -passes='default<O3>' -enable-loopinterchange -disable-output \
+; RUN:     -disable-verify -verify-analysis-invalidation=0 \
+; RUN:     -debug-pass-manager=quiet %s 2>&1 | FileCheck %s
+
+; Test the position of LoopInterchange in the pass pipeline.
+
+; CHECK-NOT:  Running pass: LoopInterchangePass
+; CHECK:      Running pass: ControlHeightReductionPass
+; CHECK-NEXT: Running pass: LoopSimplifyPass
+; CHECK-NEXT: Running pass: LCSSAPass
+; CHECK-NEXT: Running pass: LoopRotatePass
+; CHECK-NEXT: Running pass: LoopDeletionPass
+; CHECK-NEXT: Running pass: LoopRotatePass
+; CHECK-NEXT: Running pass: LoopDeletionPass
+; CHECK-NEXT: Running pass: LoopInterchangePass
+; CHECK-NEXT: Running pass: LoopDistributePass
+; CHECK-NEXT: Running pass: InjectTLIMappings
+; CHECK-NEXT: Running pass: LoopVectorizePass
+
+
+define void @foo(ptr %a, i32 %n) {
+entry:
+  br label %for.i.header
+
+for.i.header:
+  %i = phi i32 [ 0, %entry ], [ %i.next, %for.i.latch ]
+  br label %for.j
+
+for.j:
+  %j = phi i32 [ 0, %for.i.header ], [ %j.next, %for.j ]
+  %tmp = mul i32 %i, %n
+  %offset = add i32 %tmp, %j
+  %idx = getelementptr inbounds i32, ptr %a, i32 %offset
+  %load = load i32, ptr %idx, align 4
+  %inc = add i32 %load, 1
+  store i32 %inc, ptr %idx, align 4
+  %j.next = add i32 %j, 1
+  %j.exit = icmp eq i32 %j.next, %n
+  br i1 %j.exit, label %for.i.latch, label %for.j
+
+for.i.latch:
+  %i.next = add i32 %i, 1
+  %i.exit = icmp eq i32 %i.next, %n
+  br i1 %i.exit, label %for.i.header, label %exit
+
+exit:
+  ret void
+}
