Commit a9fada6

[SYCL] Fix NVPTX compilation with the new offload driver (#19039)
A typical SYCL compilation for NVPTX with the default `nvptx(64)?-nvidia-cuda` triple compiles for the older SM_50 architecture and relies on forward compatibility and JIT compilation to run on newer devices. Compilation for NVPTX with the new offload driver therefore depends on generating a fat binary that contains the textual PTX as well as the compiled object. This is a kind of LTO, though it is not officially considered one. A recent pulldown broke the generation of textual assembly with the `-S` flag, so a compiled ELF binary was later passed to `ptxas`, which predictably failed. Fixes #18432.
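
The failure mode described in the commit message, a compiled ELF object reaching `ptxas` where textual PTX was expected, can be illustrated with a small format check. This is a sketch only and not code from the commit: `looksLikeELF` and `looksLikeTextualPTX` are hypothetical helpers, and the PTX heuristic (a `.version` directive as the first token after whitespace and `//` comments) is an assumption that should cover typical compiler-generated PTX.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: true if the buffer starts with the 4-byte ELF magic
// (0x7F 'E' 'L' 'F').
bool looksLikeELF(std::string_view Buf) {
  return Buf.size() >= 4 && Buf[0] == '\x7f' && Buf[1] == 'E' &&
         Buf[2] == 'L' && Buf[3] == 'F';
}

// Hypothetical helper: true if the first token after leading whitespace and
// `//` line comments is the `.version` directive. This is a heuristic, not a
// PTX parser.
bool looksLikeTextualPTX(std::string_view Buf) {
  std::size_t I = 0;
  while (I < Buf.size()) {
    if (std::isspace(static_cast<unsigned char>(Buf[I]))) {
      ++I; // skip whitespace
    } else if (Buf.substr(I, 2) == "//") {
      std::size_t End = Buf.find('\n', I); // skip to the end of the comment
      if (End == std::string_view::npos)
        return false; // the buffer ends inside a comment
      I = End + 1;
    } else {
      break; // first real token
    }
  }
  return Buf.substr(I, 8) == ".version";
}
```

A wrapper could run such a check on the intermediate file before invoking `ptxas` and report a clear error when it finds an ELF object instead of assembly.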
1 parent b0523f4 commit a9fada6

5 files changed: +3 -10 lines changed

clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Lines changed: 2 additions & 2 deletions
@@ -1691,7 +1691,7 @@ Expected<StringRef> linkDevice(ArrayRef<StringRef> InputFiles,
   case Triple::ppc64:
   case Triple::ppc64le:
   case Triple::systemz:
-    return generic::clang(InputFiles, Args);
+    return generic::clang(InputFiles, Args, IsSYCLKind);
   case Triple::spirv32:
   case Triple::spirv64:
   case Triple::spir:
@@ -1724,7 +1724,7 @@ Expected<StringRef> linkDevice(ArrayRef<StringRef> InputFiles,
     return generic::clang(InputFiles, Args, IsSYCLKind);
   default:
     if (Triple.str() == "native_cpu" && IsSYCLKind)
-      return generic::clang(InputFiles, Args);
+      return generic::clang(InputFiles, Args, IsSYCLKind);
 
     return createStringError(Triple.getArchName() +
                              " linking is not supported");
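
Both hunks make the same change: call sites that omitted the third argument to `generic::clang` now forward `IsSYCLKind`. Since the two-argument calls compiled before the change, the parameter is presumably defaulted (or the function overloaded), so the old calls silently took the non-SYCL path. The sketch below illustrates that pitfall with a hypothetical stand-in, `buildLinkArgs`. The decision to append `-S` for SYCL links targeting NVPTX is an assumption drawn from the commit message, not copied from `ClangLinkerWrapper.cpp`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a link helper whose last parameter is defaulted.
// Callers that forget the argument compile fine but get IsSYCLKind == false.
std::vector<std::string> buildLinkArgs(const std::string &Triple,
                                       bool IsSYCLKind = false) {
  std::vector<std::string> Args = {"clang", "--target=" + Triple};
  // Assumption: SYCL links for NVPTX need textual PTX, so request assembly
  // output with -S. Other combinations keep the default object output.
  if (IsSYCLKind && Triple.rfind("nvptx", 0) == 0)
    Args.push_back("-S");
  return Args;
}

// True if the argument list asks the compiler to stop at assembly.
bool requestsAssembly(const std::vector<std::string> &Args) {
  return std::find(Args.begin(), Args.end(), "-S") != Args.end();
}
```

With this shape, `buildLinkArgs("nvptx64-nvidia-cuda")` omits `-S` even in a SYCL link, while `buildLinkArgs("nvptx64-nvidia-cuda", true)` includes it. Making such a parameter non-defaulted is one way to have the compiler reject call sites that forget it.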

sycl/test-e2e/DeviceImageDependencies/NewOffloadDriver/free_function_kernels.cpp

Lines changed: 1 addition & 2 deletions
@@ -1,7 +1,6 @@
 // Ensure -fsycl-allow-device-dependencies can work with free function kernels.

-// REQUIRES: aspect-usm_shared_allocations, pdtracker
-// PDTRACKER: https://github.com/intel/llvm/issues/18432
+// REQUIRES: aspect-usm_shared_allocations
 // RUN: %{build} -o %t.out --offload-new-driver -fsycl-allow-device-image-dependencies
 // RUN: %{run} %t.out

sycl/test-e2e/NewOffloadDriver/multisource.cpp

Lines changed: 0 additions & 2 deletions
@@ -5,8 +5,6 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
-// REQUIRES: pdtracker
-// PDTRACKER: https://github.com/intel/llvm/issues/18432
 // Separate kernel sources and host code sources
 // Test with `--offload-new-driver`
 // RUN: %{build} --offload-new-driver -c -o %t.kernel.o -DINIT_KERNEL -DCALC_KERNEL

sycl/test-e2e/NewOffloadDriver/split-per-source-main.cpp

Lines changed: 0 additions & 2 deletions
@@ -1,5 +1,3 @@
-// REQUIRES: pdtracker
-// PDTRACKER: https://github.com/intel/llvm/issues/18432
 // RUN: %{build} -Wno-error=unused-command-line-argument -fsycl-device-code-split=per_source -I %S/Inputs -o %t.out %S/Inputs/split-per-source-second-file.cpp \
 // RUN: --offload-new-driver -fsycl-dead-args-optimization
 // RUN: %{run} %t.out

sycl/test-e2e/NewOffloadDriver/sycl-external-with-optional-features.cpp

Lines changed: 0 additions & 2 deletions
@@ -1,5 +1,3 @@
-// REQUIRES: pdtracker
-// PDTRACKER: https://github.com/intel/llvm/issues/18432
 // Test with `--offload-new-driver`
 // RUN: %{build} -DSOURCE1 --offload-new-driver -c -o %t1.o
 // RUN: %{build} -DSOURCE2 --offload-new-driver -c -o %t2.o
