Commit e65108e

pcc and github-actions[bot] authored and committed
Automerge: MachineLICM: Merge logic for implicit and explicit definitions.
Anatoly Trosinenko found that when hasSideEffect was set to 0 in the definition of LOADgotAUTH, the MultiSource/Benchmarks/Ptrdist/ks/ks test from llvm-test-suite started to crash. The issue was traced down to the MachineLICM pass placing LOADgotAUTH right after an unrelated copy to x16, rewriting this code:

````
bb.0:
  renamable $x16 = COPY renamable $x12
  B %bb.1
bb.1:
  ...
  /* use $x16 */
  ...
  renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
  /* use $x20 */
  ...
````

into the following:

````
bb.0:
  renamable $x16 = COPY renamable $x12
  renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
  B %bb.1
bb.1:
  ...
  /* use $x16 */
  ...
  /* use $x20 */
  ...
````

The issue was caused by inconsistent logic between implicit and explicit operand definitions: the implicit side incorrectly skipped the RUDefs check for dead operands, so RuledOut was never set for the X16 operand.

Because there isn't really a semantic difference between implicit and explicit operands at this point, let's remove the isImplicit check and adjust the logic to do the same thing in both cases:

- For implicit operands, we now check and update RUDefs in the same way as explicit operands.
- For explicit operands, we now allow dead operands to be skipped.

Reviewers: arsenm, s-barannikov, atrosinenko
Reviewed By: arsenm, s-barannikov
Pull Request: llvm/llvm-project#147624
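The difference between the old and the new handling can be illustrated with a small stand-alone model. This is only a sketch, not the pass itself: `DefOperand`, `LoopState`, `recordDef`, `ruledOutOld` and `ruledOutNew` are hypothetical names, every register is treated as a single register unit, and the `RUDefs`/`RUClobbers` bookkeeping in `recordDef` is an assumption based on the source comment "Two defs is indicated by setting a PhysRegClobbers bit".

```cpp
#include <bitset>
#include <vector>

// Hypothetical, simplified stand-ins; these are not LLVM types.
struct DefOperand {
  unsigned Reg;    // modelled as exactly one register unit
  bool IsImplicit;
  bool IsDead;
};

struct LoopState {
  std::bitset<32> RUDefs;     // units already defined in (or live into) the loop
  std::bitset<32> RUClobbers; // units seen defined more than once
};

// Assumed bookkeeping: a second def of the same unit is a conflict and is
// remembered as a clobber. Returns true if this def conflicts.
bool recordDef(unsigned Reg, LoopState &S) {
  bool Conflict = S.RUDefs.test(Reg) || S.RUClobbers.test(Reg);
  if (S.RUDefs.test(Reg))
    S.RUClobbers.set(Reg);
  S.RUDefs.set(Reg);
  return Conflict;
}

// Old logic: implicit defs take an early exit and never consult RUDefs.
bool ruledOutOld(const std::vector<DefOperand> &Defs, LoopState &S) {
  bool RuledOut = false, SeenDef = false;
  for (const DefOperand &MO : Defs) {
    if (MO.IsImplicit) {
      S.RUClobbers.set(MO.Reg);
      if (!MO.IsDead)
        RuledOut = true; // non-dead implicit def: cannot be hoisted
      continue;          // dead implicit def: RUDefs is never checked
    }
    if (SeenDef)
      RuledOut = true;   // more than one explicit def
    else
      SeenDef = true;
    if (recordDef(MO.Reg, S))
      RuledOut = true;
  }
  return RuledOut;
}

// New logic: one path for both kinds of def; only the multiple-def
// restriction is skipped for dead operands.
bool ruledOutNew(const std::vector<DefOperand> &Defs, LoopState &S) {
  bool RuledOut = false, SeenDef = false;
  for (const DefOperand &MO : Defs) {
    if (!MO.IsDead) {
      if (SeenDef)
        RuledOut = true;
      else
        SeenDef = true;
    }
    if (recordDef(MO.Reg, S))
      RuledOut = true;
  }
  return RuledOut;
}
```

In this model, assuming a register that is live into the loop is already recorded in `RUDefs`, an instruction shaped like `$x2 = MOVi64imm 1024, implicit-def dead $x16` is accepted by `ruledOutOld` when x16 is in `RUDefs` but rejected by `ruledOutNew`. When only x12 is recorded, both versions accept it.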
2 parents e522ece + 8b9bbd9 commit e65108e

6 files changed

+150
-55
lines changed

llvm/lib/CodeGen/MachineLICM.cpp

Lines changed: 5 additions & 15 deletions
@@ -553,24 +553,14 @@ void MachineLICMImpl::ProcessMI(MachineInstr *MI, BitVector &RUDefs,
       continue;
     }

-    if (MO.isImplicit()) {
-      for (MCRegUnit Unit : TRI->regunits(Reg))
-        RUClobbers.set(Unit);
-      if (!MO.isDead())
-        // Non-dead implicit def? This cannot be hoisted.
+    // FIXME: For now, avoid instructions with multiple defs, unless it's dead.
+    if (!MO.isDead()) {
+      if (Def)
         RuledOut = true;
-      // No need to check if a dead implicit def is also defined by
-      // another instruction.
-      continue;
+      else
+        Def = Reg;
     }

-    // FIXME: For now, avoid instructions with multiple defs, unless
-    // it's a dead implicit def.
-    if (Def)
-      RuledOut = true;
-    else
-      Def = Reg;
-
     // If we have already seen another instruction that defines the same
     // register, then this is not safe. Two defs is indicated by setting a
     // PhysRegClobbers bit.

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=aarch64 -run-pass machinelicm -verify-machineinstrs -o - %s | FileCheck %s
+# RUN: llc -mtriple=aarch64 -passes machinelicm -o - %s | FileCheck %s
+
+---
+name: unsafe_to_move
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: unsafe_to_move
+  ; CHECK: bb.0:
+  ; CHECK-NEXT: successors: %bb.1(0x80000000)
+  ; CHECK-NEXT: liveins: $x0
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: $x16 = COPY killed $x0
+  ; CHECK-NEXT: B %bb.1
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: liveins: $x16
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: $x1 = COPY killed $x16
+  ; CHECK-NEXT: $x2 = MOVi64imm 1024, implicit-def dead $x16
+  ; CHECK-NEXT: $x16 = LDRXroX killed $x1, killed $x2, 0, 0
+  ; CHECK-NEXT: $xzr = SUBSXri $x16, 0, 0, implicit-def $nzcv
+  ; CHECK-NEXT: Bcc 1, %bb.1, implicit $nzcv
+  ; CHECK-NEXT: B %bb.2
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT: liveins: $x1
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: $x0 = COPY killed $x1
+  ; CHECK-NEXT: RET_ReallyLR
+  bb.0:
+    liveins: $x0
+    $x16 = COPY killed $x0
+    B %bb.1
+
+  bb.1:
+    liveins: $x16
+    $x1 = COPY killed $x16
+    /* MOVi64imm below mimics a pseudo instruction that doesn't have any */
+    /* unmodelled side effects, but uses x16 as a scratch register. */
+    $x2 = MOVi64imm 1024, implicit-def dead $x16
+    $x16 = LDRXroX killed $x1, killed $x2, 0, 0
+    $xzr = SUBSXri $x16, 0, 0, implicit-def $nzcv
+    Bcc 1, %bb.1, implicit $nzcv
+    B %bb.2
+
+  bb.2:
+    liveins: $x1
+    $x0 = COPY killed $x1
+    RET_ReallyLR
+...
+
+---
+name: dead_implicit_def
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: dead_implicit_def
+  ; CHECK: bb.0:
+  ; CHECK-NEXT: successors: %bb.1(0x80000000)
+  ; CHECK-NEXT: liveins: $x0
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: $x12 = COPY killed $x0
+  ; CHECK-NEXT: $x2 = MOVi64imm 1024, implicit-def dead $x16
+  ; CHECK-NEXT: B %bb.1
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: liveins: $x12, $x2
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: $x1 = COPY killed $x12
+  ; CHECK-NEXT: $x16 = LDRXroX killed $x1, $x2, 0, 0
+  ; CHECK-NEXT: $xzr = SUBSXri $x16, 0, 0, implicit-def $nzcv
+  ; CHECK-NEXT: Bcc 1, %bb.1, implicit $nzcv
+  ; CHECK-NEXT: B %bb.2
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT: liveins: $x1
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: $x0 = COPY killed $x1
+  ; CHECK-NEXT: RET_ReallyLR
+  bb.0:
+    liveins: $x0
+    $x12 = COPY killed $x0
+    B %bb.1
+
+  bb.1:
+    liveins: $x12
+    $x1 = COPY killed $x12
+    /* MOVi64imm below mimics a pseudo instruction that doesn't have any */
+    /* unmodelled side effects, but uses x16 as a scratch register. */
+    $x2 = MOVi64imm 1024, implicit-def dead $x16
+    $x16 = LDRXroX killed $x1, killed $x2, 0, 0
+    $xzr = SUBSXri $x16, 0, 0, implicit-def $nzcv
+    Bcc 1, %bb.1, implicit $nzcv
+    B %bb.2
+
+  bb.2:
+    liveins: $x1
+    $x0 = COPY killed $x1
+    RET_ReallyLR
+...

llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll

Lines changed: 1 addition & 1 deletion
@@ -4,10 +4,10 @@
 define amdgpu_kernel void @copy_to_reg_frameindex(ptr addrspace(1) %out, i32 %a, i32 %b, i32 %c) {
 ; CHECK-LABEL: copy_to_reg_frameindex:
 ; CHECK: ; %bb.0: ; %entry
+; CHECK-NEXT: s_cmp_lt_u32 0, 16
 ; CHECK-NEXT: ; implicit-def: $vgpr0
 ; CHECK-NEXT: .LBB0_1: ; %loop
 ; CHECK-NEXT: ; =>This Inner Loop Header: Depth=1
-; CHECK-NEXT: s_cmp_lt_u32 0, 16
 ; CHECK-NEXT: s_set_gpr_idx_on 0, gpr_idx(DST)
 ; CHECK-NEXT: v_mov_b32_e32 v0, 0
 ; CHECK-NEXT: s_set_gpr_idx_off

llvm/test/CodeGen/AMDGPU/mdt-preserving-crash.ll

Lines changed: 23 additions & 22 deletions
@@ -16,27 +16,28 @@ define protected amdgpu_kernel void @_RSENC_PRInit______________________________
 ; CHECK-NEXT: v_lshl_add_u32 v0, v0, 1, v0
 ; CHECK-NEXT: v_cmp_ne_u32_e32 vcc, s4, v0
 ; CHECK-NEXT: s_and_saveexec_b64 s[4:5], vcc
-; CHECK-NEXT: s_cbranch_execz .LBB0_12
+; CHECK-NEXT: s_cbranch_execz .LBB0_13
 ; CHECK-NEXT: ; %bb.1: ; %if.end15
 ; CHECK-NEXT: s_load_dword s4, s[8:9], 0x0
 ; CHECK-NEXT: s_waitcnt lgkmcnt(0)
 ; CHECK-NEXT: s_bitcmp1_b32 s4, 0
 ; CHECK-NEXT: s_cselect_b64 s[4:5], -1, 0
 ; CHECK-NEXT: s_and_b64 vcc, exec, s[4:5]
-; CHECK-NEXT: s_cbranch_vccnz .LBB0_12
-; CHECK-NEXT: .LBB0_2: ; %while.cond.i
-; CHECK-NEXT: ; =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: s_cbranch_vccnz .LBB0_13
+; CHECK-NEXT: ; %bb.2: ; %lor.lhs.false17
 ; CHECK-NEXT: s_cmp_eq_u32 s4, 0
-; CHECK-NEXT: s_cbranch_scc1 .LBB0_2
-; CHECK-NEXT: ; %bb.3: ; %if.end60
-; CHECK-NEXT: s_cbranch_execz .LBB0_11
-; CHECK-NEXT: ; %bb.4: ; %if.end5.i
-; CHECK-NEXT: s_cbranch_scc0 .LBB0_11
-; CHECK-NEXT: ; %bb.5: ; %if.end5.i314
-; CHECK-NEXT: s_cbranch_scc0 .LBB0_11
-; CHECK-NEXT: ; %bb.6: ; %if.end5.i338
-; CHECK-NEXT: s_cbranch_scc0 .LBB0_11
-; CHECK-NEXT: ; %bb.7: ; %if.end5.i362
+; CHECK-NEXT: .LBB0_3: ; %while.cond.i
+; CHECK-NEXT: ; =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: s_cbranch_scc1 .LBB0_3
+; CHECK-NEXT: ; %bb.4: ; %if.end60
+; CHECK-NEXT: s_cbranch_execz .LBB0_12
+; CHECK-NEXT: ; %bb.5: ; %if.end5.i
+; CHECK-NEXT: s_cbranch_scc0 .LBB0_12
+; CHECK-NEXT: ; %bb.6: ; %if.end5.i314
+; CHECK-NEXT: s_cbranch_scc0 .LBB0_12
+; CHECK-NEXT: ; %bb.7: ; %if.end5.i338
+; CHECK-NEXT: s_cbranch_scc0 .LBB0_12
+; CHECK-NEXT: ; %bb.8: ; %if.end5.i362
 ; CHECK-NEXT: v_mov_b32_e32 v0, 0
 ; CHECK-NEXT: s_getpc_b64 s[4:5]
 ; CHECK-NEXT: s_add_u32 s4, s4, _RSENC_gDcd_______________________________@rel32@lo+1157
@@ -46,23 +47,23 @@ define protected amdgpu_kernel void @_RSENC_PRInit______________________________
 ; CHECK-NEXT: buffer_store_byte v0, v0, s[0:3], 0 offen
 ; CHECK-NEXT: s_waitcnt vmcnt(1)
 ; CHECK-NEXT: buffer_store_byte v1, off, s[0:3], 0 offset:257
-; CHECK-NEXT: s_cbranch_scc0 .LBB0_11
-; CHECK-NEXT: ; %bb.8: ; %if.end5.i400
+; CHECK-NEXT: s_cbranch_scc0 .LBB0_12
+; CHECK-NEXT: ; %bb.9: ; %if.end5.i400
 ; CHECK-NEXT: flat_load_ubyte v0, v[0:1]
 ; CHECK-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
 ; CHECK-NEXT: v_cmp_eq_u16_e32 vcc, 0, v0
 ; CHECK-NEXT: s_and_b64 exec, exec, vcc
-; CHECK-NEXT: s_cbranch_execz .LBB0_11
-; CHECK-NEXT: ; %bb.9: ; %if.then404
+; CHECK-NEXT: s_cbranch_execz .LBB0_12
+; CHECK-NEXT: ; %bb.10: ; %if.then404
 ; CHECK-NEXT: s_movk_i32 s4, 0x1000
-; CHECK-NEXT: .LBB0_10: ; %for.body564
+; CHECK-NEXT: .LBB0_11: ; %for.body564
 ; CHECK-NEXT: ; =>This Inner Loop Header: Depth=1
 ; CHECK-NEXT: s_sub_i32 s4, s4, 32
 ; CHECK-NEXT: s_cmp_lg_u32 s4, 0
-; CHECK-NEXT: s_cbranch_scc1 .LBB0_10
-; CHECK-NEXT: .LBB0_11: ; %UnifiedUnreachableBlock
+; CHECK-NEXT: s_cbranch_scc1 .LBB0_11
+; CHECK-NEXT: .LBB0_12: ; %UnifiedUnreachableBlock
 ; CHECK-NEXT: ; divergent unreachable
-; CHECK-NEXT: .LBB0_12: ; %UnifiedReturnBlock
+; CHECK-NEXT: .LBB0_13: ; %UnifiedReturnBlock
 ; CHECK-NEXT: s_endpgm
 entry:
   %runtimeVersionCopy = alloca [128 x i8], align 16, addrspace(5)

llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll

Lines changed: 2 additions & 2 deletions
@@ -71,6 +71,7 @@ define void @test1(ptr nocapture noundef writeonly %dst, i32 noundef signext %i_
 ; RV32-NEXT: or t1, t1, t3
 ; RV32-NEXT: andi t1, t1, 1
 ; RV32-NEXT: slli t2, t2, 1
+; RV32-NEXT: csrwi vxrm, 0
 ; RV32-NEXT: j .LBB0_10
 ; RV32-NEXT: .LBB0_9: # %for.cond1.for.cond.cleanup3_crit_edge.us
 ; RV32-NEXT: # in Loop: Header=BB0_10 Depth=1
@@ -93,7 +94,6 @@ define void @test1(ptr nocapture noundef writeonly %dst, i32 noundef signext %i_
 ; RV32-NEXT: li t3, 0
 ; RV32-NEXT: neg t4, t2
 ; RV32-NEXT: and t4, t4, a6
-; RV32-NEXT: csrwi vxrm, 0
 ; RV32-NEXT: li t6, 0
 ; RV32-NEXT: li t5, 0
 ; RV32-NEXT: vsetvli s0, zero, e8, m2, ta, ma
@@ -471,6 +471,7 @@ define void @test1(ptr nocapture noundef writeonly %dst, i32 noundef signext %i_
 ; RV64-NEXT: or t4, t4, t5
 ; RV64-NEXT: andi t4, t4, 1
 ; RV64-NEXT: mv t5, a0
+; RV64-NEXT: csrwi vxrm, 0
 ; RV64-NEXT: j .LBB0_6
 ; RV64-NEXT: .LBB0_5: # %for.cond1.for.cond.cleanup3_crit_edge.us
 ; RV64-NEXT: # in Loop: Header=BB0_6 Depth=1
@@ -493,7 +494,6 @@ define void @test1(ptr nocapture noundef writeonly %dst, i32 noundef signext %i_
 ; RV64-NEXT: slli t6, t0, 28
 ; RV64-NEXT: sub t6, t6, t1
 ; RV64-NEXT: and t6, t6, a6
-; RV64-NEXT: csrwi vxrm, 0
 ; RV64-NEXT: mv s0, a2
 ; RV64-NEXT: mv s1, a4
 ; RV64-NEXT: mv s2, t5

llvm/test/CodeGen/X86/ins_subreg_coalesce-3.ll

Lines changed: 16 additions & 15 deletions
@@ -22,40 +22,41 @@ define void @FontChange(i1 %foo) nounwind {
 ; CHECK-LABEL: FontChange:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: testb $1, %dil
-; CHECK-NEXT: je .LBB0_9
+; CHECK-NEXT: je .LBB0_10
 ; CHECK-NEXT: .p2align 4
 ; CHECK-NEXT: .LBB0_1: # %bb366
 ; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
 ; CHECK-NEXT: testb $1, %dil
 ; CHECK-NEXT: jne .LBB0_1
 ; CHECK-NEXT: # %bb.2: # %bb428
 ; CHECK-NEXT: testb $1, %dil
-; CHECK-NEXT: je .LBB0_9
+; CHECK-NEXT: je .LBB0_10
+; CHECK-NEXT: # %bb.3:
+; CHECK-NEXT: cmpb $0, 0
 ; CHECK-NEXT: .p2align 4
-; CHECK-NEXT: .LBB0_3: # %bb650
+; CHECK-NEXT: .LBB0_4: # %bb650
 ; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
-; CHECK-NEXT: cmpb $0, 0
-; CHECK-NEXT: je .LBB0_3
-; CHECK-NEXT: # %bb.4: # %bb662
+; CHECK-NEXT: je .LBB0_4
+; CHECK-NEXT: # %bb.5: # %bb662
 ; CHECK-NEXT: movl 0, %eax
 ; CHECK-NEXT: movl %eax, %ecx
 ; CHECK-NEXT: andl $57344, %ecx # imm = 0xE000
 ; CHECK-NEXT: cmpl $8192, %ecx # imm = 0x2000
-; CHECK-NEXT: jne .LBB0_9
-; CHECK-NEXT: # %bb.5: # %bb4884
+; CHECK-NEXT: jne .LBB0_10
+; CHECK-NEXT: # %bb.6: # %bb4884
 ; CHECK-NEXT: andl $7168, %eax # imm = 0x1C00
 ; CHECK-NEXT: cmpl $1024, %eax # imm = 0x400
-; CHECK-NEXT: jne .LBB0_9
-; CHECK-NEXT: # %bb.6: # %bb4932
+; CHECK-NEXT: jne .LBB0_10
+; CHECK-NEXT: # %bb.7: # %bb4932
 ; CHECK-NEXT: testb $1, %dil
-; CHECK-NEXT: jne .LBB0_9
-; CHECK-NEXT: # %bb.7: # %bb4940
+; CHECK-NEXT: jne .LBB0_10
+; CHECK-NEXT: # %bb.8: # %bb4940
 ; CHECK-NEXT: movl 0, %eax
 ; CHECK-NEXT: cmpl $160, %eax
-; CHECK-NEXT: je .LBB0_9
-; CHECK-NEXT: # %bb.8: # %bb4940
+; CHECK-NEXT: je .LBB0_10
+; CHECK-NEXT: # %bb.9: # %bb4940
 ; CHECK-NEXT: cmpl $159, %eax
-; CHECK-NEXT: .LBB0_9: # %bb4897
+; CHECK-NEXT: .LBB0_10: # %bb4897
 ; CHECK-NEXT: retq
 entry:
   br i1 %foo, label %bb298, label %bb49