Commit 051b0c0

Merge branch 'main' into mem_obj_proposal
2 parents 8706202 + bffad0c commit 051b0c0

62 files changed

+15084
-1463
lines changed

SECURITY.md

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,8 @@
11
# Security Policy
22

3+
Intel is committed to rapidly addressing security vulnerabilities affecting our customers
4+
and providing clear guidance on the solution, impact, severity and mitigation.
5+
36
## Report a Vulnerability
47

58
Please report security issues or vulnerabilities to the [Intel Security Center].

include/ur.py

Lines changed: 462 additions & 22 deletions
Large diffs are not rendered by default.

include/ur_api.h

Lines changed: 1696 additions & 435 deletions
Large diffs are not rendered by default.

include/ur_ddi.h

Lines changed: 367 additions & 24 deletions
Large diffs are not rendered by default.

scripts/core/CONTRIB.rst

Lines changed: 2 additions & 1 deletion
@@ -95,7 +95,8 @@ To submit a pull request to Unified Runtime, you must first create your own
9595
personal fork of the project and submit your changes to a branch. By convention
9696
we name our branches ``<your_name>/<short_description>``, where the description
9797
indicates the intent of your change. You can then raise a pull request
98-
targeting ``oneapi-src/unified-runtime:main``.
98+
targeting ``oneapi-src/unified-runtime:main``. Please add the *experimental*
99+
label to your pull request.
99100

100101
When making changes to the specification you *must* commit all changes to files
101102
in the repository as a result of `Generating Source`_.

scripts/core/EXP-BINDLESS-IMAGES.rst

Lines changed: 136 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,136 @@
1+
2+
<%
3+
OneApi=tags['$OneApi']
4+
x=tags['$x']
5+
X=x.upper()
6+
%>
7+
8+
.. _exp-bindless-images:
9+
10+
================================================================================
11+
Bindless Images
12+
================================================================================
13+
14+
.. warning::
15+
16+
Experimental features:
17+
18+
* May be replaced, updated, or removed at any time.
19+
* Do not require maintaining API/ABI stability of their own additions over
20+
time.
21+
* Do not require conformance testing of their own additions.
22+
23+
================================================================================
24+
Terminology
25+
================================================================================
26+
For the purposes of this document, a bindless image is one which provides
27+
access to the underlying data via image reference handles. At the application
28+
level, this allows the user to implement programs where the number of images
29+
is not known at compile-time, and store all handles to images -- irrespective
30+
of varying formats and layouts -- in some container, e.g. a dynamic array.
31+
32+
================================================================================
33+
Motivation
34+
================================================================================
35+
The `DPC++ bindless images extension <https://github.com/intel/llvm/pull/8307>`_
36+
has sought to provide the flexibility of bindless images at the SYCL
37+
application level. This extension has been implemented using the CUDA backend of
38+
the DPC++ PI. With the movement to migrate from PI to the Unified Runtime in
39+
DPC++, as seen in `Port CUDA plugin to Unified Runtime
40+
<https://github.com/intel/llvm/pull/9512/>`_, the Unified Runtime's support for
41+
this experimental feature would enable the DPC++ bindless images extension to be
42+
migrated to UR without issue.
43+
44+
================================================================================
45+
Overview
46+
================================================================================
47+
48+
In this document, we propose the following experimental additions to the Unified
49+
Runtime:
50+
51+
* Bindless images support
52+
53+
* Sampled images
54+
* Unsampled images
55+
* Mipmaps
56+
* USM backed images
57+
58+
* Interoperability support
59+
60+
* External memory
61+
* Semaphores
62+
63+
================================================================================
64+
API
65+
================================================================================
66+
67+
--------------------------------------------------------------------------------
68+
Definitions
69+
--------------------------------------------------------------------------------
70+
71+
* ${x}_exp_sampler_mip_properties_t
72+
73+
The following definitions will be implementation-dependent:
74+
75+
* ${x}_exp_image_handle_t
76+
* ${x}_exp_image_mem_handle_t
77+
* ${x}_exp_interop_mem_handle_t
78+
* ${x}_exp_interop_semaphore_handle_t
79+
80+
--------------------------------------------------------------------------------
81+
Enums
82+
--------------------------------------------------------------------------------
83+
84+
* ${x}_device_info_t
85+
* ${x}_command_t
86+
* ${x}_exp_image_copy_flags_t
87+
88+
--------------------------------------------------------------------------------
89+
Interface
90+
--------------------------------------------------------------------------------
91+
92+
* USM
93+
* ${x}USMPitchedAllocExp
94+
95+
* Bindless Images
96+
* ${x}BindlessImagesUnsampledImageHandleDestroyExp
97+
* ${x}BindlessImagesSampledImageHandleDestroyExp
98+
* ${x}BindlessImagesImageAllocateExp
99+
* ${x}BindlessImagesImageFreeExp
100+
* ${x}BindlessImagesUnsampledImageCreateExp
101+
* ${x}BindlessImagesSampledImageCreateExp
102+
* ${x}BindlessImagesImageCopyExp
103+
* ${x}BindlessImagesImageGetInfoExp
104+
* ${x}BindlessImagesMipmapGetLevelExp
105+
* ${x}BindlessImagesMipmapFreeExp
106+
107+
* Interop
108+
* ${x}BindlessImagesImportOpaqueFDExp
109+
* ${x}BindlessImagesMapExternalArrayExp
110+
* ${x}BindlessImagesReleaseInteropExp
111+
* ${x}BindlessImagesImportExternalSemaphoreOpaqueFDExp
112+
* ${x}BindlessImagesDestroyExternalSemaphoreExp
113+
* ${x}BindlessImagesWaitExternalSemaphoreExp
114+
* ${x}BindlessImagesSignalExternalSemaphoreExp
115+
116+
117+
================================================================================
118+
Changelog
119+
================================================================================
120+
121+
+-----------+------------------------+
122+
| Revision | Changes |
123+
+===========+========================+
124+
| 1 | Initial Draft |
125+
+-----------+------------------------+
126+
127+
================================================================================
128+
Contributors
129+
================================================================================
130+
131+
* Isaac Ault `isaac.ault@codeplay.com <isaac.ault@codeplay.com>`_
132+
* Duncan Brawley `duncan.brawley@codeplay.com <duncan.brawley@codeplay.com>`_
133+
* Przemek Malon `przemek.malon@codeplay.com <przemek.malon@codeplay.com>`_
134+
* Chedy Najjar `chedy.najjar@codeplay.com <chedy.najjar@codeplay.com>`_
135+
* Sean Stirling `sean.stirling@codeplay.com <sean.stirling@codeplay.com>`_
136+
* Peter Zuzek `peter@codeplay.com <peter@codeplay.com>`_

scripts/core/EXP-COMMAND-BUFFER.rst

Lines changed: 133 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,133 @@
1+
2+
<%
3+
OneApi=tags['$OneApi']
4+
x=tags['$x']
5+
X=x.upper()
6+
%>
7+
.. _experimental-command-buffer:
8+
9+
==============
10+
Command-Buffer
11+
==============
12+
13+
.. warning::
14+
15+
Experimental features:
16+
17+
* May be replaced, updated, or removed at any time.
18+
* Do not require maintaining API/ABI stability of their own additions over
19+
time.
20+
* Do not require conformance testing of their own additions.
21+
22+
23+
A command-buffer represents a series of commands for execution on a command
24+
queue. Many adapters support this kind of construct either natively or through
25+
extensions, but they are not available to use directly. Typically their use is
26+
abstracted through the existing Core APIs, for example when calling
27+
${x}EnqueueKernelLaunch the adapter may both append the kernel command to a
28+
command-buffer-like construct and also submit that command-buffer to a queue for
29+
execution. These types of structures allow for batching of commands to improve
30+
host launch latency, but without direct control it falls to the adapter
31+
implementation to implement automatic batching of commands.
32+
33+
This experimental feature exposes command-buffers in the Unified Runtime API
34+
directly, allowing applications explicit control over the enqueue and execution
35+
of commands to batch commands as required for optimal performance.
36+
37+
Querying Command-Buffer Support
38+
===============================
39+
40+
Support for command-buffers can be queried for a given device/adapter by using
41+
the device info query with ${X}_DEVICE_INFO_EXTENSIONS. Adapters supporting this
42+
experimental feature will report the string "ur_exp_command_buffer" in the
43+
returned list of supported extensions.
44+
45+
.. hint::
46+
The macro ${X}_COMMAND_BUFFER_EXTENSION_STRING_EXP is defined for the string
47+
returned from extension queries for this feature. Since the actual string
48+
may be subject to change it is safer to use this macro when querying for
49+
support for this experimental feature.
50+
51+
.. parsed-literal::
52+
53+
// Retrieve length of extension string
54+
size_t returnedSize;
55+
${x}DeviceGetInfo(hDevice, ${X}_DEVICE_INFO_EXTENSIONS, 0, nullptr,
56+
&returnedSize);
57+
58+
// Retrieve extension string
59+
std::unique_ptr<char[]> returnedExtensions(new char[returnedSize]);
60+
${x}DeviceGetInfo(hDevice, ${X}_DEVICE_INFO_EXTENSIONS, returnedSize, returnedExtensions.get(), nullptr);
61+
62+
std::string_view ExtensionsString(returnedExtensions.get());
63+
bool CmdBufferSupport =
64+
ExtensionsString.find(${X}_COMMAND_BUFFER_EXTENSION_STRING_EXP)
65+
!= std::string_view::npos;
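
A plain substring search can report a false positive when another extension name merely contains the queried one. The sketch below shows one way to match whole tokens instead. It assumes the extension string is a list of names separated by single spaces, which the text above does not state, and `hasExtension` is a hypothetical helper, not part of the Unified Runtime API.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns true only if `name` appears as a complete
// token in `extensions`. Assumes tokens are separated by single spaces.
bool hasExtension(std::string_view extensions, std::string_view name) {
    if (name.empty()) {
        return false;
    }
    std::size_t pos = 0;
    while (pos <= extensions.size()) {
        // Find the end of the current token (next space or end of string).
        std::size_t end = extensions.find(' ', pos);
        if (end == std::string_view::npos) {
            end = extensions.size();
        }
        if (extensions.substr(pos, end - pos) == name) {
            return true;
        }
        pos = end + 1; // step past the separator
    }
    return false;
}
```

With this helper, a string containing only a longer name that starts with the queried name does not count as support, whereas `find` would accept it.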
66+
67+
Command-Buffer Creation
68+
=======================
69+
70+
Command-Buffers are tied to a specific ${x}_context_handle_t and
71+
${x}_device_handle_t. ${x}CommandBufferCreateExp optionally takes a descriptor
72+
to provide additional properties for how the command-buffer should be
73+
constructed. There are currently no unique members defined for
74+
${x}_exp_command_buffer_desc_t; however, they may be added in the future.
75+
76+
Command-buffers are reference counted and can be retained and released by
77+
calling ${x}CommandBufferRetainExp and ${x}CommandBufferReleaseExp respectively.
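
The retain/release pattern described above can be illustrated with a toy model. `ToyCommandBuffer`, `toyCreate`, `toyRetain`, and `toyRelease` are hypothetical stand-ins invented for this sketch, not the Unified Runtime entry points. The assumption that creation holds one reference and that the object is destroyed when the count reaches zero follows common reference-counting practice and is not confirmed by the text above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a reference-counted command-buffer object.
struct ToyCommandBuffer {
    std::uint32_t refCount = 1; // assumed: creation holds one reference
};

// Creates an object owned by its reference count.
ToyCommandBuffer *toyCreate() { return new ToyCommandBuffer{}; }

// Adds one reference, e.g. when a second owner stores the handle.
void toyRetain(ToyCommandBuffer *cb) { ++cb->refCount; }

// Drops one reference. Returns true if this call destroyed the object,
// after which the pointer must not be used again.
bool toyRelease(ToyCommandBuffer *cb) {
    if (--cb->refCount == 0) {
        delete cb;
        return true;
    }
    return false;
}
```

In this model every retain must be balanced by exactly one release, plus one final release for the reference held since creation.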
78+
79+
Appending Commands
80+
==================
81+
82+
Commands can be appended to a command-buffer by calling any of the
83+
command-buffer append functions. Typically these closely mimic the existing
84+
enqueue functions in the Core API in terms of their command-specific parameters.
85+
However, they differ in that they take a command-buffer handle instead of a
86+
queue handle, and the dependencies and return parameters are sync-points instead
87+
of event handles.
88+
89+
Currently only the following commands are supported:
90+
91+
* ${x}CommandBufferAppendKernelLaunchExp
92+
* ${x}CommandBufferAppendMemcpyUSMExp
93+
* ${x}CommandBufferAppendMembufferCopyExp
94+
* ${x}CommandBufferAppendMembufferCopyRectExp
95+
96+
It is planned to eventually support any command type from the Core API which can
97+
actually be appended to the equivalent adapter native constructs.
98+
99+
Sync-Points
100+
===========
101+
102+
A sync-point is a value, returned from a command-buffer append function call,
103+
which represents a command inside of a command-buffer. Sync-points can
104+
optionally be passed to later append calls to define execution dependencies on
105+
other commands within the command-buffer.
106+
107+
Sync-points are unique and valid for use only within the command-buffer they
108+
were obtained from.
109+
110+
.. parsed-literal::
111+
// Append a memcpy with no sync-point dependencies
112+
${x}_exp_command_buffer_sync_point_t syncPoint;
113+
114+
${x}CommandBufferAppendMemcpyUSMExp(hCommandBuffer, pDst, pSrc, size, 0, nullptr, &syncPoint);
115+
116+
// Append a kernel launch with syncPoint as a dependency, ignore returned
117+
// sync-point
118+
${x}CommandBufferAppendKernelLaunchExp(hCommandBuffer, hKernel, workDim, pGlobalWorkOffset, pGlobalWorkSize, pLocalWorkSize, 1, &syncPoint, nullptr);
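
To make the dependency rules concrete, here is a toy model of sync-points. It assumes a sync-point is simply the index of a previously appended command, which is an illustrative choice; adapters may represent sync-points differently. `ToyCommandBuffer`, `ToyCommand`, and `ToySyncPoint` are hypothetical names, and the out-parameter may be null to mirror the "ignore returned sync-point" case in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using ToySyncPoint = std::uint32_t; // assumed representation: command index

struct ToyCommand {
    std::string name;
    std::vector<ToySyncPoint> deps; // commands that must finish first
};

class ToyCommandBuffer {
public:
    // Appends a command that depends on every sync-point in `waitList`.
    // Returns false, appending nothing, if a dependency does not refer to a
    // command already appended to this buffer. This is an index-range check
    // only: it cannot detect a sync-point from another buffer that happens
    // to be in range.
    bool append(const std::string &name,
                const std::vector<ToySyncPoint> &waitList,
                ToySyncPoint *outSyncPoint) {
        for (ToySyncPoint sp : waitList) {
            if (sp >= commands_.size()) {
                return false;
            }
        }
        commands_.push_back(ToyCommand{name, waitList});
        if (outSyncPoint != nullptr) {
            *outSyncPoint = static_cast<ToySyncPoint>(commands_.size() - 1);
        }
        return true;
    }

    std::size_t size() const { return commands_.size(); }

private:
    std::vector<ToyCommand> commands_;
};
```

Rejecting an unknown sync-point in the toy corresponds loosely to the `ERROR_INVALID_COMMAND_BUFFER_SYNC_POINT_EXP` result added to common.yml in this commit; the exact conditions under which an adapter returns it are not specified here.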
119+
120+
Enqueueing Command-Buffers
121+
==========================
122+
123+
Command-buffers are submitted for execution on a ${x}_queue_handle_t with an
124+
optional list of dependent events. An event is returned which tracks the
125+
execution of the command-buffer, and will be complete when all appended commands
126+
have finished executing. It is adapter specific whether command-buffers can be
127+
enqueued or executed simultaneously, and submissions may be serialized.
128+
129+
.. parsed-literal::
130+
${x}_event_handle_t executionEvent;
131+
132+
${x}CommandBufferEnqueueExp(hCommandBuffer, hQueue, 0, nullptr,
133+
&executionEvent);

scripts/core/INTRO.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -107,7 +107,7 @@ The following design philosophies are adopted to reduce Host-side overhead:
107107

108108
- All API functions return ${x}_result_t
109109

110-
+ This enumeration contains error codes for the Level Zero APIs and validation layers
110+
+ This enumeration contains error codes for the Unified Runtime APIs and validation layers
111111
+ This allows for a consistent pattern on the application side for catching errors; especially when validation layer(s) are enabled
112112

113113
Multithreading and Concurrency

scripts/core/common.yml

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -256,6 +256,15 @@ etors:
256256
- name: ERROR_ADAPTER_SPECIFIC
257257
desc: "An adapter specific warning/error has been reported and can be retrieved
258258
via the urGetLastResult entry point."
259+
- name: ERROR_INVALID_COMMAND_BUFFER_EXP
260+
value: "0x1000"
261+
desc: "Invalid Command-Buffer"
262+
- name: ERROR_INVALID_COMMAND_BUFFER_SYNC_POINT_EXP
263+
value: "0x1001"
264+
desc: "Sync point is not valid for the command-buffer"
265+
- name: ERROR_INVALID_COMMAND_BUFFER_SYNC_POINT_WAIT_LIST_EXP
266+
value: "0x1002"
267+
desc: "Sync point wait list is invalid"
259268
- name: ERROR_UNKNOWN
260269
value: "0x7ffffffe"
261270
desc: "Unknown or internal error"
@@ -318,6 +327,10 @@ etors:
318327
desc: $x_queue_native_desc_t
319328
- name: DEVICE_PARTITION_PROPERTIES
320329
desc: $x_device_partition_properties_t
330+
- name: EXP_COMMAND_BUFFER_DESC
331+
desc: $x_exp_command_buffer_desc_t
332+
- name: EXP_SAMPLER_MIP_PROPERTIES
333+
desc: $x_exp_sampler_mip_properties_t
321334
- name: MEM_OBJ_PROPERTIES
322335
desc: $x_mem_obj_properties_t
323336
--- #--------------------------------------------------------------------------

scripts/core/device.yml

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -383,6 +383,56 @@ etors:
383383
desc: "[$x_bool_t] Return true if the device supports enqueuing commands to read and write pipes from the host."
384384
- name: MAX_REGISTERS_PER_WORK_GROUP
385385
desc: "[uint32_t] The maximum number of registers available per block."
386+
- name: IP_VERSION
387+
desc: "[uint32_t] The device IP version. The meaning of the device IP version is implementation-defined, but newer devices should have a higher version than older devices."
388+
- name: BINDLESS_IMAGES_SUPPORT_EXP
389+
value: "0x2000"
390+
desc: "[$x_bool_t] returns true if the device supports the creation of bindless images"
391+
- name: BINDLESS_IMAGES_1D_USM_SUPPORT_EXP
392+
value: "0x2001"
393+
desc: "[$x_bool_t] returns true if the device supports the creation of 1D bindless images backed by USM"
394+
- name: BINDLESS_IMAGES_2D_USM_SUPPORT_EXP
395+
value: "0x2002"
396+
desc: "[$x_bool_t] returns true if the device supports the creation of 2D bindless images backed by USM"
397+
- name: BINDLESS_IMAGES_3D_USM_SUPPORT_EXP
398+
value: "0x2003"
399+
desc: "[$x_bool_t] returns true if the device supports the creation of 3D bindless images backed by USM"
400+
- name: IMAGE_PITCH_ALIGN_EXP
401+
value: "0x2004"
402+
desc: "[uint32_t] returns the required alignment of the pitch between two rows of an image in bytes"
403+
- name: MAX_IMAGE_LINEAR_WIDTH_EXP
404+
value: "0x2005"
405+
desc: "[size_t] returns the maximum linear width allowed for images allocated using USM"
406+
- name: MAX_IMAGE_LINEAR_HEIGHT_EXP
407+
value: "0x2006"
408+
desc: "[size_t] returns the maximum linear height allowed for images allocated using USM"
409+
- name: MAX_IMAGE_LINEAR_PITCH_EXP
410+
value: "0x2007"
411+
desc: "[size_t] returns the maximum linear pitch allowed for images allocated using USM"
412+
- name: MIPMAP_SUPPORT_EXP
413+
value: "0x2008"
414+
desc: "[$x_bool_t] returns true if the device supports allocating mipmap resources"
415+
- name: MIPMAP_ANISOTROPY_SUPPORT_EXP
416+
value: "0x2009"
417+
desc: "[$x_bool_t] returns true if the device supports sampling mipmap images with anisotropic filtering"
418+
- name: MIPMAP_MAX_ANISOTROPY_EXP
419+
value: "0x200A"
420+
desc: "[uint32_t] returns the maximum anisotropic ratio supported by the device"
421+
- name: MIPMAP_LEVEL_REFERENCE_SUPPORT_EXP
422+
value: "0x200B"
423+
desc: "[$x_bool_t] returns true if the device supports using images created from individual mipmap levels"
424+
- name: INTEROP_MEMORY_IMPORT_SUPPORT_EXP
425+
value: "0x200C"
426+
desc: "[$x_bool_t] returns true if the device supports importing external memory resources"
427+
- name: INTEROP_MEMORY_EXPORT_SUPPORT_EXP
428+
value: "0x200D"
429+
desc: "[$x_bool_t] returns true if the device supports exporting internal memory resources"
430+
- name: INTEROP_SEMAPHORE_IMPORT_SUPPORT_EXP
431+
value: "0x200E"
432+
desc: "[$x_bool_t] returns true if the device supports importing external semaphore resources"
433+
- name: INTEROP_SEMAPHORE_EXPORT_SUPPORT_EXP
434+
value: "0x200F"
435+
desc: "[$x_bool_t] returns true if the device supports exporting internal event resources"
386436
--- #--------------------------------------------------------------------------
387437
type: function
388438
desc: "Retrieves various information about device"
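
As a worked example of how a pitch alignment such as the one reported by `IMAGE_PITCH_ALIGN_EXP` is typically applied, the sketch below rounds a row's byte width up to the next multiple of the alignment. `alignedRowPitch` is a hypothetical helper, and the numbers in the note that follows are made up for illustration; how an adapter actually consumes the value is not specified in this diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the smallest multiple of `pitchAlign` that is
// greater than or equal to the unpadded row size in bytes.
// `pitchAlign` must be non-zero.
std::size_t alignedRowPitch(std::size_t widthInPixels,
                            std::size_t bytesPerPixel,
                            std::size_t pitchAlign) {
    assert(pitchAlign != 0);
    const std::size_t rowBytes = widthInPixels * bytesPerPixel;
    // Integer round-up: add (align - 1), then truncate to a multiple.
    return (rowBytes + pitchAlign - 1) / pitchAlign * pitchAlign;
}
```

For a 100-pixel row with 4 bytes per pixel and an assumed alignment of 32 bytes, the unpadded row is 400 bytes and the aligned pitch is 416 bytes. A row that is already a multiple of the alignment is returned unchanged.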
