Commit 9a1612a

Merge from 'sycl' to 'sycl-web'
Author: iclsrc
2 parents 23c3b27 + 62c36e9 commit 9a1612a

46 files changed, 380 additions and 397 deletions
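
Every hunk shown on this page replaces the `cl::sycl::` spelling with `sycl::`. As a rough illustration of how both spellings can refer to the same entities during such a migration, here is a minimal sketch in standard C++. The `sycl` namespace below is a mock, and the alias inside `cl` is a hypothetical compatibility shim written for this note; nothing here is taken from the real SYCL headers or from this commit.

```cpp
#include <string>

// Mock stand-in for the real SYCL headers: everything lives in `sycl`.
namespace sycl {
struct queue {
  std::string name() const { return "sycl::queue"; }
};
} // namespace sycl

// Hypothetical compatibility alias: makes the old `cl::sycl::` spelling
// refer to the same entities as the new `sycl::` one.
namespace cl {
namespace sycl = ::sycl;
} // namespace cl

// The old-style and new-style spellings name the same type.
inline std::string old_spelling() { return cl::sycl::queue{}.name(); }
inline std::string new_spelling() { return sycl::queue{}.name(); }
```

With an alias like this in place, documentation and code can move to the shorter spelling while older code keeps compiling.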

clang/include/clang/Basic/AttrDocs.td

Lines changed: 2 additions & 2 deletions
@@ -378,7 +378,7 @@ outlining job:
 
 int foo(int x) { return ++x; }
 
-using namespace cl::sycl;
+using namespace sycl;
 queue Q;
 buffer<int, 1> a(range<1>{1024});
 Q.submit([&](handler& cgh) {
@@ -3790,7 +3790,7 @@ cannot be optimized out due to reachability analysis or by any other
 optimization.
 
 This attribute allows to pass name and address of the function to a special
-``cl::sycl::intel::get_device_func_ptr`` API call which extracts the device
+``sycl::intel::get_device_func_ptr`` API call which extracts the device
 function pointer for the specified function.
 
 .. code-block:: c++

llvm/lib/SYCLLowerIR/LowerWGScope.cpp

Lines changed: 1 addition & 1 deletion
@@ -780,7 +780,7 @@ PreservedAnalyses SYCLLowerWGScopePass::run(Function &F,
 I = I->getNextNode()) {
 auto *AllocaI = dyn_cast<AllocaInst>(I);
 // Allocas marked with "work_item_scope" are those originating from
-// cl::sycl::private_memory<T> variables, which must be in private memory.
+// sycl::private_memory<T> variables, which must be in private memory.
 // No shadows/materialization is needed for them because they can be
 // updated only within PFWIs
 if (AllocaI && !AllocaI->getMetadata(WI_SCOPE_MD))

sycl/doc/EnvironmentVariables.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ compiler and runtime.
 | Environment variable | Values | Description |
 | -------------------- | ------ | ----------- |
 | `SYCL_BE` (deprecated) | `PI_OPENCL`, `PI_LEVEL_ZERO`, `PI_CUDA` | Force SYCL RT to consider only devices of the specified backend during the device selection. We are planning to deprecate `SYCL_BE` environment variable in the future. The specific grace period is not decided yet. Please use the new env var `SYCL_DEVICE_FILTER` instead. |
-| `SYCL_DEVICE_TYPE` (deprecated) | CPU, GPU, ACC, HOST | Force SYCL to use the specified device type. If unset, default selection rules are applied. If set to any unlisted value, this control has no effect. If the requested device type is not found, a `cl::sycl::runtime_error` exception is thrown. If a non-default device selector is used, a device must satisfy both the selector and this control to be chosen. This control only has effect on devices created with a selector. We are planning to deprecate `SYCL_DEVICE_TYPE` environment variable in the future. The specific grace period is not decided yet. Please use the new env var `SYCL_DEVICE_FILTER` instead. |
+| `SYCL_DEVICE_TYPE` (deprecated) | CPU, GPU, ACC, HOST | Force SYCL to use the specified device type. If unset, default selection rules are applied. If set to any unlisted value, this control has no effect. If the requested device type is not found, a `sycl::runtime_error` exception is thrown. If a non-default device selector is used, a device must satisfy both the selector and this control to be chosen. This control only has effect on devices created with a selector. We are planning to deprecate `SYCL_DEVICE_TYPE` environment variable in the future. The specific grace period is not decided yet. Please use the new env var `SYCL_DEVICE_FILTER` instead. |
 | `SYCL_DEVICE_FILTER` | `backend:device_type:device_num` | See Section [`SYCL_DEVICE_FILTER`](#sycl_device_filter) below. |
 | `SYCL_DEVICE_ALLOWLIST` | See [below](#sycl_device_allowlist) | Filter out devices that do not match the pattern specified. `BackendName` accepts `host`, `opencl`, `level_zero` or `cuda`. `DeviceType` accepts `host`, `cpu`, `gpu` or `acc`. `DeviceVendorId` accepts uint32_t in hex form (`0xXYZW`). `DriverVersion`, `PlatformVersion`, `DeviceName` and `PlatformName` accept regular expression. Special characters, such as parenthesis, must be escaped. DPC++ runtime will select only those devices which satisfy provided values above and regex. More than one device can be specified using the piping symbol "\|".|
 | `SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING` | Any(\*) | Disables automatic rounding-up of `parallel_for` invocation ranges. |
@@ -107,7 +107,7 @@ variables in production code.</span>
 | `SYCL_DEVICELIB_INHIBIT_NATIVE` | String of device library extensions (separated by a whitespace) | Do not rely on device native support for devicelib extensions listed in this option. |
 | `SYCL_PROGRAM_COMPILE_OPTIONS` | String of valid OpenCL compile options | Override compile options for all programs. |
 | `SYCL_PROGRAM_LINK_OPTIONS` | String of valid OpenCL link options | Override link options for all programs. |
-| `SYCL_USE_KERNEL_SPV` | Path to the SPIR-V binary | Load device image from the specified file. If runtime is unable to read the file, `cl::sycl::runtime_error` exception is thrown.|
+| `SYCL_USE_KERNEL_SPV` | Path to the SPIR-V binary | Load device image from the specified file. If runtime is unable to read the file, `sycl::runtime_error` exception is thrown.|
 | `SYCL_DUMP_IMAGES` | Any(\*) | Dump device image binaries to file. Control has no effect if `SYCL_USE_KERNEL_SPV` is set. |
 | `SYCL_HOST_UNIFIED_MEMORY` | Integer | Enforce host unified memory support or lack of it for the execution graph builder. If set to 0, it is enforced as not supported by all devices. If set to 1, it is enforced as supported by all devices. |
 | `SYCL_CACHE_TRACE` | Any(\*) | If the variable is set, messages are sent to std::cerr when caching events or non-blocking failures happen (e.g. unable to access cache item file). |
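
The table above gives the value of `SYCL_DEVICE_FILTER` as `backend:device_type:device_num`. The following is a minimal sketch of splitting such a value into its three fields. It assumes only the full three-field form shown in the table; the names `DeviceFilter` and `parse_filter` are hypothetical and are not part of the DPC++ runtime, which may accept other forms this sketch rejects.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for the three fields named in the table.
struct DeviceFilter {
  std::string backend;
  std::string device_type;
  int device_num;
};

// Splits "backend:device_type:device_num" on ':' and validates each field.
// Returns std::nullopt for anything that is not exactly three non-empty
// fields with a short, all-digit device number.
inline std::optional<DeviceFilter> parse_filter(const std::string &Value) {
  std::vector<std::string> Parts;
  std::stringstream Stream(Value);
  std::string Part;
  while (std::getline(Stream, Part, ':'))
    Parts.push_back(Part);

  if (Parts.size() != 3 || Parts[0].empty() || Parts[1].empty())
    return std::nullopt;

  const std::string &Num = Parts[2];
  if (Num.empty() || Num.size() > 6 ||
      Num.find_first_not_of("0123456789") != std::string::npos)
    return std::nullopt;

  return DeviceFilter{Parts[0], Parts[1], std::stoi(Num)};
}
```

For example, `parse_filter("level_zero:gpu:0")` yields the fields `level_zero`, `gpu` and `0`, while a value with fewer than three fields yields an empty result in this sketch.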

sycl/doc/FAQ.md

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ C:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt\crtdbg.h(607,26
 > beyond those explicitly mentioned as usable in kernels in this spec.
 
 Replace usage of STD built-ins with SYCL-defined math built-ins. Please, note
-that you have to explicitly specify built-in namespace (i.e. `cl::sycl::fmin`).
+that you have to explicitly specify built-in namespace (i.e. `sycl::fmin`).
 The full list of SYCL math built-ins is provided in section 4.13.3 of the
 specification.
 

sycl/doc/GetStartedGuide.md

Lines changed: 23 additions & 23 deletions
@@ -557,29 +557,29 @@ Creating a file `simple-sycl-app.cpp` with the following C++/SYCL code:
 
 int main() {
 // Creating buffer of 4 ints to be used inside the kernel code
-cl::sycl::buffer<cl::sycl::cl_int, 1> Buffer(4);
+sycl::buffer<sycl::cl_int, 1> Buffer(4);
 
 // Creating SYCL queue
-cl::sycl::queue Queue;
+sycl::queue Queue;
 
 // Size of index space for kernel
-cl::sycl::range<1> NumOfWorkItems{Buffer.size()};
+sycl::range<1> NumOfWorkItems{Buffer.size()};
 
 // Submitting command group(work) to queue
-Queue.submit([&](cl::sycl::handler &cgh) {
+Queue.submit([&](sycl::handler &cgh) {
 // Getting write only access to the buffer on a device
-auto Accessor = Buffer.get_access<cl::sycl::access::mode::write>(cgh);
+auto Accessor = Buffer.get_access<sycl::access::mode::write>(cgh);
 // Executing kernel
 cgh.parallel_for<class FillBuffer>(
-NumOfWorkItems, [=](cl::sycl::id<1> WIid) {
+NumOfWorkItems, [=](sycl::id<1> WIid) {
 // Fill buffer with indexes
-Accessor[WIid] = (cl::sycl::cl_int)WIid.get(0);
+Accessor[WIid] = (sycl::cl_int)WIid.get(0);
 });
 });
 
 // Getting read only access to the buffer on the host.
 // Implicit barrier waiting for queue to complete the work.
-const auto HostAccessor = Buffer.get_access<cl::sycl::access::mode::read>();
+const auto HostAccessor = Buffer.get_access<sycl::access::mode::read>();
 
 // Check the results
 bool MismatchFound = false;
@@ -704,36 +704,36 @@ SYCL_BE=PI_CUDA ./simple-sycl-app-cuda.exe
 ```
 
 **NOTE**: DPC++/SYCL developers can specify SYCL device for execution using
-device selectors (e.g. `cl::sycl::cpu_selector`, `cl::sycl::gpu_selector`,
+device selectors (e.g. `sycl::cpu_selector`, `sycl::gpu_selector`,
 [Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.md)) as
 explained in following section [Code the program for a specific
 GPU](#code-the-program-for-a-specific-gpu).
 
 ### Code the program for a specific GPU
 
-To specify OpenCL device SYCL provides the abstract `cl::sycl::device_selector`
+To specify OpenCL device SYCL provides the abstract `sycl::device_selector`
 class which the can be used to define how the runtime should select the best
 device.
 
-The method `cl::sycl::device_selector::operator()` of the SYCL
-`cl::sycl::device_selector` is an abstract member function which takes a
+The method `sycl::device_selector::operator()` of the SYCL
+`sycl::device_selector` is an abstract member function which takes a
 reference to a SYCL device and returns an integer score. This abstract member
 function can be implemented in a derived class to provide a logic for selecting
 a SYCL device. SYCL runtime uses the device for with the highest score is
-returned. Such object can be passed to `cl::sycl::queue` and `cl::sycl::device`
+returned. Such object can be passed to `sycl::queue` and `sycl::device`
 constructors.
 
-The example below illustrates how to use `cl::sycl::device_selector` to create
+The example below illustrates how to use `sycl::device_selector` to create
 device and queue objects bound to Intel GPU device:
 
 ```c++
 #include <sycl/sycl.hpp>
 
 int main() {
-class NEOGPUDeviceSelector : public cl::sycl::device_selector {
+class NEOGPUDeviceSelector : public sycl::device_selector {
 public:
-int operator()(const cl::sycl::device &Device) const override {
-using namespace cl::sycl::info;
+int operator()(const sycl::device &Device) const override {
+using namespace sycl::info;
 
 const std::string DeviceName = Device.get_info<device::name>();
 const std::string DeviceVendor = Device.get_info<device::vendor>();
@@ -744,9 +744,9 @@ int main() {
 
 NEOGPUDeviceSelector Selector;
 try {
-cl::sycl::queue Queue(Selector);
-cl::sycl::device Device(Selector);
-} catch (cl::sycl::invalid_parameter_error &E) {
+sycl::queue Queue(Selector);
+sycl::device Device(Selector);
+} catch (sycl::invalid_parameter_error &E) {
 std::cout << E.what() << std::endl;
 }
 }
@@ -757,10 +757,10 @@ The device selector below selects an NVIDIA device only, and won't execute if
 there is none.
 
 ```c++
-class CUDASelector : public cl::sycl::device_selector {
+class CUDASelector : public sycl::device_selector {
 public:
-int operator()(const cl::sycl::device &Device) const override {
-using namespace cl::sycl::info;
+int operator()(const sycl::device &Device) const override {
+using namespace sycl::info;
 const std::string DriverVersion = Device.get_info<device::driver_version>();
 
 if (Device.is_gpu() && (DriverVersion.find("CUDA") != std::string::npos)) {
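
The GetStartedGuide text in the hunks above describes selectors as objects whose `operator()` returns an integer score, with the runtime choosing the highest-scoring device. The sketch below mimics that scoring idea in plain C++ so it can be compiled without a SYCL toolchain. `MockDevice`, `MockSelector`, `IntelGpuSelector` and `pick_device` are all hypothetical names invented for this note; the real types are `sycl::device` and `sycl::device_selector`, and the real selection logic lives in the SYCL runtime.

```cpp
#include <string>
#include <vector>

// Mock device description; the real type is sycl::device.
struct MockDevice {
  std::string name;
  bool is_gpu;
};

// Mock of the abstract selector: derived classes return an integer score.
struct MockSelector {
  virtual ~MockSelector() = default;
  virtual int operator()(const MockDevice &Device) const = 0;
};

// Gives Intel-named GPUs a positive score and everything else -1.
struct IntelGpuSelector : MockSelector {
  int operator()(const MockDevice &Device) const override {
    if (Device.is_gpu && Device.name.find("Intel") != std::string::npos)
      return 50;
    return -1;
  }
};

// Returns the index of the highest-scoring device, or -1 if no device
// received a non-negative score. Ties go to the earliest device.
inline int pick_device(const std::vector<MockDevice> &Devices,
                       const MockSelector &Selector) {
  int BestIndex = -1;
  int BestScore = -1;
  for (int I = 0; I < static_cast<int>(Devices.size()); ++I) {
    const int Score = Selector(Devices[I]);
    if (Score > BestScore) {
      BestScore = Score;
      BestIndex = I;
    }
  }
  return BestIndex;
}
```

Returning a negative score for unwanted devices, as the mock does, keeps them from ever being picked, which mirrors the "won't execute if there is none" behaviour described for the CUDA selector.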

sycl/doc/MultiTileCardWithLevelZero.md

Lines changed: 2 additions & 2 deletions
@@ -46,8 +46,8 @@ The root-device in such cases can be partitioned to sub-devices, each correspond
 ``` C++
 try {
 vector<device> SubDevices = RootDevice.create_sub_devices<
-cl::sycl::info::partition_property::partition_by_affinity_domain>(
-cl::sycl::info::partition_affinity_domain::next_partitionable);
+sycl::info::partition_property::partition_by_affinity_domain>(
+sycl::info::partition_affinity_domain::next_partitionable);
 }
 ```
 

sycl/doc/design/CompilerAndRuntimeDesign.md

Lines changed: 4 additions & 4 deletions
@@ -90,7 +90,7 @@ work:
 int foo(int x) { return ++x; }
 int bar(int x) { throw std::exception{"CPU code only!"}; }
 ...
-using namespace cl::sycl;
+using namespace sycl;
 queue Q;
 buffer<int, 1> a{range<1>{1024}};
 Q.submit([&](handler& cgh) {
@@ -103,17 +103,17 @@ Q.submit([&](handler& cgh) {
 ```
 
 In this example, the compiler needs to compile the lambda expression passed
-to the `cl::sycl::handler::parallel_for` method, as well as the function `foo`
+to the `sycl::handler::parallel_for` method, as well as the function `foo`
 called from the lambda expression for the device.
 
 The compiler must also ignore the `bar` function when we compile the
 "device" part of the single source code, as it's unused inside the device
 portion of the source code (the contents of the lambda expression passed to the
-`cl::sycl::handler::parallel_for` and any function called from this lambda
+`sycl::handler::parallel_for` and any function called from this lambda
 expression).
 
 The current approach is to use the SYCL kernel attribute in the runtime to
-mark code passed to `cl::sycl::handler::parallel_for` as "kernel functions".
+mark code passed to `sycl::handler::parallel_for` as "kernel functions".
 The runtime library can't mark foo as "device" code - this is a compiler
 job: to traverse all symbols accessible from kernel functions and add them to
 the "device part" of the code marking them with the new SYCL device attribute.

sycl/doc/design/KernelParameterPassing.md

Lines changed: 4 additions & 4 deletions
@@ -60,7 +60,7 @@ int main()
 
 myQueue.submit([&](handler &cgh) {
 auto outAcc = outBuf.get_access<access::mode::write>(cgh);
-cgh.parallel_for<class Worker>(num_items, [=](cl::sycl::id<1> index) {
+cgh.parallel_for<class Worker>(num_items, [=](sycl::id<1> index) {
 outAcc[index] = i + s.m;
 });
 });
@@ -192,7 +192,7 @@ are copied into the array within the local capture object.
 
 myQueue.submit([&](handler &cgh) {
 auto outAcc = outBuf.get_access<access::mode::write>(cgh);
-cgh.parallel_for<class Worker>(num_items, [=](cl::sycl::id<1> index) {
+cgh.parallel_for<class Worker>(num_items, [=](sycl::id<1> index) {
 outAcc[index] = array[index.get(0)];
 });
 });
@@ -264,7 +264,7 @@ of each accessor array element in ascending index value.
 in_buffer2.get_access<access::mode::read>(cgh)};
 auto outAcc = out_buffer.get_access<access::mode::write>(cgh);
 
-cgh.parallel_for<class Worker>(num_items, [=](cl::sycl::id<1> index) {
+cgh.parallel_for<class Worker>(num_items, [=](sycl::id<1> index) {
 outAcc[index] = inAcc[0][index] + inAcc[1][index];
 });
 });
@@ -356,7 +356,7 @@ in a manner similar to other instances of accessor arrays.
 };
 auto outAcc = out_buffer.get_access<access::mode::write>(cgh);
 
-cgh.parallel_for<class Worker>(num_items, [=](cl::sycl::id<1> index) {
+cgh.parallel_for<class Worker>(num_items, [=](sycl::id<1> index) {
 outAcc[index] = s.m + s.inAcc[0][index] + s.inAcc[1][index];
 });
 });

sycl/doc/design/LinkedAllocations.md

Lines changed: 8 additions & 8 deletions
@@ -9,33 +9,33 @@ Instead, memory is allocated in each context whenever the SYCL memory object
 is first accessed there:
 
 ```
-cl::sycl::buffer<int, 1> buf{cl::sycl::range<1>(1)}; // No allocation here
+sycl::buffer<int, 1> buf{sycl::range<1>(1)}; // No allocation here
 
-cl::sycl::queue q;
-q.submit([&](cl::sycl::handler &cgh){
+sycl::queue q;
+q.submit([&](sycl::handler &cgh){
 // First access to buf in q's context: allocate memory
-auto acc = buf.get_access<cl::sycl::access::mode::read_write>(cgh);
+auto acc = buf.get_access<sycl::access::mode::read_write>(cgh);
 ...
 });
 
 // First access to buf on host (assuming q is not host): allocate memory
-auto acc = buf.get_access<cl::sycl::access::mode::read_write>();
+auto acc = buf.get_access<sycl::access::mode::read_write>();
 ```
 
 In the DPCPP execution graph these allocations are represented by allocation
-command nodes (`cl::sycl::detail::AllocaCommand`). A finished allocation
+command nodes (`sycl::detail::AllocaCommand`). A finished allocation
 command means that the associated memory object is ready for its first use in
 that context, but for host allocation commands it might be the case that no
 actual memory allocation takes place: either because it is possible to reuse the
 data pointer provided by the user:
 
 ```
 int val;
-cl::sycl::buffer<int, 1> buf{&val, cl::sycl::range<1>(1)};
+sycl::buffer<int, 1> buf{&val, sycl::range<1>(1)};
 
 // An alloca command is created, but it does not allocate new memory: &val
 // is reused instead.
-auto acc = buf.get_access<cl::sycl::access::mode::read_write>();
+auto acc = buf.get_access<sycl::access::mode::read_write>();
 ```
 
 Or because a mapped host pointer obtained from a native device memory object

sycl/doc/design/OptionalDeviceFeatures.md

Lines changed: 5 additions & 5 deletions
@@ -403,9 +403,9 @@ name. The format looks like this:
 
 ```
 !intel_types_that_use_aspects = !{!0, !1, !2}
-!0 = !{!"class.cl::sycl::detail::half_impl::half", i32 8}
-!1 = !{!"class.cl::sycl::amx_type", i32 9}
-!2 = !{!"class.cl::sycl::other_type", i32 8, i32 9}
+!0 = !{!"class.sycl::detail::half_impl::half", i32 8}
+!1 = !{!"class.sycl::amx_type", i32 9}
+!2 = !{!"class.sycl::other_type", i32 8, i32 9}
 ```
 
 The value of the `!intel_types_that_use_aspects` metadata is a list of unnamed
@@ -415,8 +415,8 @@ starts with a string giving the name of the type which is followed by a list of
 `i32` constants where each constant is a value from `enum class aspect` telling
 the numerical value of an aspect from the type's
 `[[sycl_detail::uses_aspects()]]` attribute. In the example above, the type
-`cl::sycl::detail::half_impl::half` uses an aspect whose numerical value is
-`8` and the type `cl::sycl::other_type` uses two aspects `8` and `9`.
+`sycl::detail::half_impl::half` uses an aspect whose numerical value is
+`8` and the type `sycl::other_type` uses two aspects `8` and `9`.
 
 **NOTE**: The reason we choose this representation is because LLVM IR does not
 allow metadata to be attached directly to types. This representation works
