Commit 2137ff0

[SYCL] Always do internalization in sycl-post-link (#14976)
We already had the `Internalize` pass being launched as part of the module cleanup phase in `sycl-post-link`, but it was only invoked when shared libraries/dynamic linking were enabled. However, other features need this internalization too; one (and the only one, at least for now) is virtual functions.

When virtual functions are used in a program, it may be necessary to dynamically link several device images containing SYCL kernels together. The problem is that the ITT instrumentation functions added to modules have external linkage: they cause multiple-definition errors when we try to link two kernels together, because both of them are instrumented.

This problem is solved by running the `Internalize` pass unconditionally, but the criteria for what we can internalize depend on whether shared libraries/dynamic linking support is enabled:

- in the regular flow, we can internalize any symbol that is not considered to be an entry point, because all device images are self-contained and there are no dependencies between them.
- when dynamic linking is enabled, we should not internalize functions that can be imported/exported, even if they are not considered module entry points.

Tests were updated where necessary to expect or ignore linkage changes of some functions.

The InvokeSIMD pass had to be updated: ESIMD is handled through a two-level device code split, meaning two cleanup phases, each of which could mistakenly drop some compiler-generated functions unless they are properly marked in LLVM IR.
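The two internalization criteria described above can be modelled without LLVM. The sketch below is only an illustration of the decision rule: `FnInfo` and its fields are hypothetical stand-ins, and the real predicate in this commit queries LLVM function attributes and `canBeImportedFunction` instead.

```cpp
#include <cassert>

// Hypothetical summary of a function; not an LLVM type and not part of the
// commit. Each flag is treated as an opaque input to the decision rule.
struct FnInfo {
  bool isEntryPoint;       // kernel or other module entry point
  bool canBeImported;      // may be imported/exported between device images
  bool calledIndirectly;   // e.g. a virtual function or an invoke_simd helper
};

// Returns true when the symbol must keep its external linkage.
bool mustPreserve(const FnInfo &F, bool supportDynamicLinking) {
  // With dynamic linking, only importable/exportable functions stay visible.
  if (supportDynamicLinking)
    return F.canBeImported;
  // Regular flow: modules are self-contained, so only entry points and
  // functions reachable through indirect calls need to stay visible.
  return F.isEntryPoint || F.calledIndirectly;
}
```

Under this model an instrumentation hook that is neither an entry point nor importable is internalized in both modes, which is what removes the multiple-definition conflict when two instrumented device images are linked together.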
1 parent 3b91b0b commit 2137ff0

22 files changed: +305 -87 lines changed

llvm/lib/SYCLLowerIR/LowerInvokeSimd.cpp

Lines changed: 5 additions & 0 deletions
@@ -415,6 +415,11 @@ bool processInvokeSimdCall(CallInst *InvokeSimd,
     CallInst *TheTformedCall = cast<CallInst>(VMap[TheCall]);
     TheTformedCall->setCalledFunction(SimdF);
     fixFunctionName(NewHelper);
+    // When we will do ESIMD split, that helper will be moved into ESIMD module
+    // where it has no uses. To prevent it being internalized and killed by DCE
+    // during post-split cleanup, we need to add this attribute and set proper
+    // linkage.
+    NewHelper->addFnAttr("referenced-indirectly");
   }
 
   // 3. Clone and transform __builtin_invoke_simd call:

llvm/lib/SYCLLowerIR/ModuleSplitter.cpp

Lines changed: 37 additions & 7 deletions
@@ -660,11 +660,36 @@ void ModuleDesc::restoreLinkageOfDirectInvokeSimdTargets() {
   }
 }
 
-// Predicate for Internalize pass.
-bool mustPreserveGV(const GlobalValue &GV) {
-  if (const Function *F = dyn_cast<Function>(&GV))
-    if (!canBeImportedFunction(*F))
-      return false;
+// Predicate for Internalize pass. The pass is very aggressive and essentially
+// tries to internalize absolutely everything. This function serves as "input
+// from a linker" that tells the pass what must be preserved in order to make
+// the transformation safe.
+static bool mustPreserveGV(const GlobalValue &GV) {
+  if (const Function *F = dyn_cast<Function>(&GV)) {
+    // When dynamic linking is supported, we internalize everything that can
+    // not be imported which also means that there is no point of having it
+    // visible outside of the current module.
+    if (SupportDynamicLinking)
+      return canBeImportedFunction(*F);
+
+    // Otherwise, we are being even more aggressive: SYCL modules are expected
+    // to be self-contained, meaning that they have no external dependencies.
+    // Therefore, we can internalize every function that is not an entry point.
+    // One exception here is virtual functions: when they are in use, modules
+    // are not self-contained anymore and some device images have to be linked
+    // at runtime to resolve all symbols.
+    // Functions marked with referenced-indirectly attribute are another
+    // exception: that attribute was originally introduced for function pointers
+    // and even though its main usage was deprecated and dropped, it is still
+    // used in invoke_simd (but that use needs to be revisited).
+    return F->hasFnAttribute("sycl-entry-point") ||
+           F->hasFnAttribute("indirectly-callable") ||
+           F->hasFnAttribute("referenced-indirectly");
+  }
+
+  // Otherwise, we don't have enough information about a global and picking a
+  // safe side saying that all other globals must be preserved (we should have
+  // cleaned up unused globals during dependency graph analysis already).
   return true;
 }
 
@@ -687,12 +712,17 @@ void ModuleDesc::cleanup() {
     F.setLinkage(GlobalValue::LinkageTypes::ExternalLinkage);
   }
 
+  // Callback for internalize can't be a lambda with captures, so we propagate
+  // necessary information through the module itself.
+  if (!SupportDynamicLinking)
+    for (Function *F : EntryPoints.Functions)
+      F->addFnAttr("sycl-entry-point");
+
   ModuleAnalysisManager MAM;
   MAM.registerPass([&] { return PassInstrumentationAnalysis(); });
   ModulePassManager MPM;
   // Do cleanup.
-  if (SupportDynamicLinking)
-    MPM.addPass(InternalizePass(mustPreserveGV));
+  MPM.addPass(InternalizePass(mustPreserveGV));
   MPM.addPass(GlobalDCEPass());           // Delete unreachable globals.
   MPM.addPass(StripDeadDebugInfoPass());  // Remove dead debug info.
   MPM.addPass(StripDeadPrototypesPass()); // Remove dead func decls.
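
The interaction between tagging, internalization and dead-code elimination in `cleanup()` can be illustrated with a toy model. This is a simplified sketch, not the LLVM implementation: `ToyFunction`, `ToyModule` and all helper names are hypothetical, and the dead-code step only looks at a stored use count, whereas the real `GlobalDCEPass` computes reachability over the whole module.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for LLVM IR entities; every name here is hypothetical.
struct ToyFunction {
  std::string name;
  bool external;                 // true: external linkage, false: internal
  int uses;                      // number of remaining references
  std::set<std::string> attrs;   // string function attributes
};
using ToyModule = std::vector<ToyFunction>;

// Capture-free predicate: everything it needs is read from the function
// itself, mirroring how the commit passes information through attributes.
static bool mustPreserve(const ToyFunction &F) {
  return F.attrs.count("sycl-entry-point") > 0 ||
         F.attrs.count("indirectly-callable") > 0 ||
         F.attrs.count("referenced-indirectly") > 0;
}

// Step 1: tag entry points so that the predicate can recognize them.
void tagEntryPoints(ToyModule &M, const std::set<std::string> &EntryPoints) {
  for (ToyFunction &F : M)
    if (EntryPoints.count(F.name))
      F.attrs.insert("sycl-entry-point");
}

// Step 2: give internal linkage to everything the predicate does not protect.
void internalize(ToyModule &M, bool (*Pred)(const ToyFunction &)) {
  for (ToyFunction &F : M)
    if (!Pred(F))
      F.external = false;
}

// Step 3: drop internal functions that nothing references any more.
void removeDead(ToyModule &M) {
  M.erase(std::remove_if(M.begin(), M.end(),
                         [](const ToyFunction &F) {
                           return !F.external && F.uses == 0;
                         }),
          M.end());
}

// Runs the three steps in order and returns the cleaned-up module.
ToyModule cleanup(ToyModule M, const std::set<std::string> &EntryPoints) {
  tagEntryPoints(M, EntryPoints);
  internalize(M, mustPreserve);
  removeDead(M);
  return M;
}

// Convenience lookup used to inspect the result.
bool contains(const ToyModule &M, const std::string &Name) {
  return std::any_of(M.begin(), M.end(),
                     [&](const ToyFunction &F) { return F.name == Name; });
}
```

In this model, a helper that has no uses in its module survives cleanup only when it carries one of the protecting attributes, which is the situation the `referenced-indirectly` marking in `LowerInvokeSimd.cpp` is meant to handle.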

llvm/test/SYCLLowerIR/ESIMD/lower_invoke_simd.ll

Lines changed: 7 additions & 3 deletions
@@ -74,18 +74,18 @@ define linkonce_odr dso_local x86_regcallcc <16 x float> @SIMD_CALL_HELPER(ptr n
 
 ;---- Check that original SIMD_CALL_HELPER retained, because there are
 ;---- invoke_simd calls where simd target can't be inferred.
-; CHECK: define {{.*}} <16 x float> @SIMD_CALL_HELPER(ptr {{.*}}%{{.*}}, <16 x float> %{{.*}}) #[[HELPER_ATTRS:[0-9]+]] !sycl_explicit_simd !0 !intel_reqd_sub_group_size !1
+; CHECK: define weak_odr {{.*}} <16 x float> @SIMD_CALL_HELPER(ptr {{.*}}%{{.*}}, <16 x float> %{{.*}}) #[[HELPER_ATTRS:[0-9]+]] !sycl_explicit_simd !0 !intel_reqd_sub_group_size !1
 ; CHECK: %{{.*}} = call x86_regcallcc <16 x float> %{{.*}}(<16 x float> %{{.*}})
 ; CHECK: }
 
 ;---- Optimized version for the SIMD_CALLEE call
-; CHECK: define {{.*}} <16 x float> @[[NAME1]](<16 x float> %{{.*}}) #[[HELPER_ATTRS]]
+; CHECK: define weak_odr {{.*}} <16 x float> @[[NAME1]](<16 x float> %{{.*}}) #[[HELPER_ATTRS1:[0-9]+]]
 ; Verify that indirect call is converted to direct
 ; CHECK: %{{.*}} = call x86_regcallcc <16 x float> @SIMD_CALLEE(<16 x float> %{{.*}})
 ; CHECK: }
 
 ;---- Optimized version for the ANOTHER_SIMD_CALLEE call
-; CHECK: define {{.*}} <16 x float> @[[NAME2]](<16 x float> %{{.*}}) #[[HELPER_ATTRS]]
+; CHECK: define weak_odr {{.*}} <16 x float> @[[NAME2]](<16 x float> %{{.*}}) #[[HELPER_ATTRS1]]
 ; Verify that indirect call is converted to direct
 ; CHECK: %{{.*}} = call x86_regcallcc <16 x float> @ANOTHER_SIMD_CALLEE(<16 x float> %{{.*}})
 ; CHECK: }
@@ -95,6 +95,10 @@ declare dso_local x86_regcallcc noundef float @_Z33__regcall3____builtin_invoke_
 ; Check that VCStackCall attribute is added to the invoke_simd target functions:
 attributes #0 = { "sycl-module-id"="invoke_simd.cpp" }
 ; CHECK: attributes #[[HELPER_ATTRS]] = { "VCStackCall" "sycl-module-id"="invoke_simd.cpp" }
+; If we transformed the helper, then it should receive "referenced-indirectly"
+; attribute so it is not dropped after Internalize + DCE in post-split module
+; cleanup
+; CHECK: attributes #[[HELPER_ATTRS1]] = { "VCStackCall" "referenced-indirectly" "sycl-module-id"="invoke_simd.cpp" }
 
 !0 = !{}
 !1 = !{i32 16}

llvm/test/tools/sycl-post-link/device-code-split/auto-module-split-1.ll

Lines changed: 6 additions & 6 deletions
@@ -34,8 +34,8 @@ entry:
   ret void
 }
 
-; CHECK-TU1: define dso_local spir_func void @{{.*}}foo{{.*}}()
-; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
+; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
+; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
 
 ; CHECK-TU1: call spir_func i32 @{{.*}}bar{{.*}}(i32 1)
 
@@ -73,8 +73,8 @@ entry:
   ret void
 }
 
-; CHECK-TU1: define dso_local spir_func void @{{.*}}foo1{{.*}}()
-; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
+; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
+; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
 
 ; Function Attrs: nounwind
 define dso_local spir_func void @_Z4foo1v() {
@@ -97,8 +97,8 @@ entry:
   ret void
 }
 
-; CHECK-TU1-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
-; CHECK-TU0: define dso_local spir_func void @{{.*}}foo2{{.*}}()
+; CHECK-TU1-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
+; CHECK-TU0: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
 
 ; Function Attrs: nounwind
 define dso_local spir_func void @_Z4foo2v() {

llvm/test/tools/sycl-post-link/device-code-split/auto-module-split-2.ll

Lines changed: 6 additions & 6 deletions
@@ -38,8 +38,8 @@ entry:
   ret void
 }
 
-; CHECK-TU1: define dso_local spir_func void @{{.*}}foo{{.*}}()
-; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
+; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
+; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
 
 ; CHECK-TU1: call spir_func i32 @{{.*}}bar{{.*}}(i32 1)
 
@@ -77,8 +77,8 @@ entry:
   ret void
 }
 
-; CHECK-TU1: define dso_local spir_func void @{{.*}}foo1{{.*}}()
-; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
+; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
+; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
 
 ; Function Attrs: nounwind
 define dso_local spir_func void @_Z4foo1v() {
@@ -101,8 +101,8 @@ entry:
   ret void
 }
 
-; CHECK-TU1-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
-; CHECK-TU0: define dso_local spir_func void @{{.*}}foo2{{.*}}()
+; CHECK-TU1-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
+; CHECK-TU0: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
 
 ; Function Attrs: nounwind
 define dso_local spir_func void @_Z4foo2v() {

llvm/test/tools/sycl-post-link/device-code-split/auto-module-split-3.ll

Lines changed: 4 additions & 4 deletions
@@ -33,13 +33,13 @@
 ;
 ; CHECK-TU0-IR: @_ZL2GV = internal addrspace(1) constant
 ; CHECK-TU0-IR: define dso_local spir_kernel void @_ZTSZ4mainE11TU1_kernel0
-; CHECK-TU0-IR: define dso_local spir_func i32 @_Z4foo1v
+; CHECK-TU0-IR: define {{.*}} spir_func i32 @_Z4foo1v
 ; CHECK-TU0-IR: define dso_local spir_kernel void @_ZTSZ4mainE11TU1_kernel1
-; CHECK-TU0-IR: define dso_local spir_func void @_Z4foo2v
+; CHECK-TU0-IR: define {{.*}} spir_func void @_Z4foo2v
 ;
 ; CHECK-TU1-IR: define dso_local spir_kernel void @_ZTSZ4mainE10TU0_kernel
-; CHECK-TU1-IR: define dso_local spir_func void @_Z3foov
-; CHECK-TU1-IR: define dso_local spir_func i32 @_Z4foo3v
+; CHECK-TU1-IR: define {{.*}} spir_func void @_Z3foov
+; CHECK-TU1-IR: define {{.*}} spir_func i32 @_Z4foo3v
 
 target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
 target triple = "spir64-unknown-linux"

llvm/test/tools/sycl-post-link/device-code-split/auto-module-split-func-ptr.ll

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@
 ; CHECK-IR0: define dso_local spir_kernel void @kernel2
 ;
 ; CHECK-IR1: @_Z2f1iTable = weak global ptr @_Z2f1i
-; CHECK-IR1: define dso_local spir_func i32 @_Z2f1i
+; CHECK-IR1: define {{.*}} i32 @_Z2f1i
 ; CHECK-IR1: define weak_odr dso_local spir_kernel void @kernel1
 
 target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64"

llvm/test/tools/sycl-post-link/device-code-split/basic-module-split.ll

Lines changed: 6 additions & 6 deletions
@@ -33,8 +33,8 @@ entry:
   ret void
 }
 
-; CHECK-TU1: define dso_local spir_func void @{{.*}}foo{{.*}}()
-; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
+; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
+; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
 
 ; CHECK-TU1: call spir_func i32 @{{.*}}bar{{.*}}(i32 1)
 
@@ -72,8 +72,8 @@ entry:
   ret void
 }
 
-; CHECK-TU1: define dso_local spir_func void @{{.*}}foo1{{.*}}()
-; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
+; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
+; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
 
 ; Function Attrs: nounwind
 define dso_local spir_func void @_Z4foo1v() {
@@ -96,8 +96,8 @@ entry:
   ret void
 }
 
-; CHECK-TU1-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
-; CHECK-TU0: define dso_local spir_func void @{{.*}}foo2{{.*}}()
+; CHECK-TU1-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
+; CHECK-TU0: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
 
 ; Function Attrs: nounwind
 define dso_local spir_func void @_Z4foo2v() {

llvm/test/tools/sycl-post-link/device-code-split/complex-indirect-call-chain.ll

Lines changed: 2 additions & 2 deletions
@@ -72,12 +72,12 @@
 ; CHECK0-DAG: define spir_func void @BAZ
 
 ; CHECK1-DAG: define spir_kernel void @kernel_B
-; CHECK1-DAG: define spir_func i32 @foo
+; CHECK1-DAG: define {{.*}}spir_func i32 @foo
 ; CHECK1-DAG: define spir_func i32 @bar
 ; CHECK1-DAG: define spir_func void @BAZ
 
 ; CHECK2-DAG: define spir_kernel void @kernel_A
-; CHECK2-DAG: define spir_func void @baz
+; CHECK2-DAG: define {{.*}}spir_func void @baz
 
 target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64"
 target triple = "spir64-unknown-unknown"

llvm/test/tools/sycl-post-link/device-code-split/one-kernel-per-module.ll

Lines changed: 9 additions & 9 deletions
@@ -39,9 +39,9 @@ entry:
   ret void
 }
 
-; CHECK-MODULE2: define dso_local spir_func void @{{.*}}foo{{.*}}()
-; CHECK-MODULE1-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
-; CHECK-MODULE0-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
+; CHECK-MODULE2: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
+; CHECK-MODULE1-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
+; CHECK-MODULE0-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
 
 ; CHECK-MODULE2: call spir_func i32 @{{.*}}bar{{.*}}(i32 1)
 
@@ -82,9 +82,9 @@ entry:
   ret void
 }
 
-; CHECK-MODULE2-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
-; CHECK-MODULE1: define dso_local spir_func void @{{.*}}foo1{{.*}}()
-; CHECK-MODULE0-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
+; CHECK-MODULE2-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
+; CHECK-MODULE1: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
+; CHECK-MODULE0-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
 
 ; Function Attrs: nounwind
 define dso_local spir_func void @_Z4foo1v() {
@@ -109,9 +109,9 @@ entry:
   ret void
 }
 
-; CHECK-MODULE2-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
-; CHECK-MODULE1-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
-; CHECK-MODULE0: define dso_local spir_func void @{{.*}}foo2{{.*}}()
+; CHECK-MODULE2-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
+; CHECK-MODULE1-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
+; CHECK-MODULE0: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
 
 ; Function Attrs: nounwind
 define dso_local spir_func void @_Z4foo2v() {