|
| 1 | + |
| 2 | +<% |
| 3 | + OneApi=tags['$OneApi'] |
| 4 | + x=tags['$x'] |
| 5 | + X=x.upper() |
| 6 | +%> |
| 7 | +.. _experimental-command-buffer: |
| 8 | + |
| 9 | +============== |
| 10 | +Command-Buffer |
| 11 | +============== |
| 12 | + |
| 13 | +.. warning:: |
| 14 | + |
| 15 | + Experimental features: |
| 16 | + |
| 17 | + * May be replaced, updated, or removed at any time. |
| 18 | + * Do not require maintaining API/ABI stability of their own additions over |
| 19 | + time. |
| 20 | + * Do not require conformance testing of their own additions. |
| 21 | + |
| 22 | + |
| 23 | +A command-buffer represents a series of commands for execution on a command |
| 24 | +queue. Many adapters support this kind of construct either natively or through |
| 25 | +extensions, but it is not exposed for direct use. Typically its use is |
| 26 | +abstracted through the existing Core APIs. For example, when calling |
| 27 | +${x}EnqueueKernelLaunch the adapter may both append the kernel command to a |
| 28 | +command-buffer-like construct and also submit that command-buffer to a queue for |
| 29 | +execution. These structures allow commands to be batched to reduce host |
| 30 | +launch latency, but without direct control the adapter implementation is |
| 31 | +responsible for batching commands automatically. |
| 32 | + |
| 33 | +This experimental feature exposes command-buffers in the Unified Runtime API |
| 34 | +directly, allowing applications explicit control over the enqueue and execution |
| 35 | +of commands to batch commands as required for optimal performance. |
| 36 | + |
| 37 | +Querying Command-Buffer Support |
| 38 | +=============================== |
| 39 | + |
| 40 | +Support for command-buffers can be queried for a given device/adapter by using |
| 41 | +the device info query with ${X}_DEVICE_INFO_EXTENSIONS. Adapters supporting this |
| 42 | +experimental feature will report the string "ur_exp_command_buffer" in the |
| 43 | +returned list of supported extensions. |
| 44 | + |
| 45 | +.. hint:: |
| 46 | + The macro ${X}_COMMAND_BUFFER_EXTENSION_STRING_EXP is defined for the string |
| 47 | + returned from extension queries for this feature. Since the actual string |
| 48 | + may be subject to change it is safer to use this macro when querying for |
| 49 | + support for this experimental feature. |
| 50 | + |
| 51 | +.. parsed-literal:: |
| 52 | + |
| 53 | + // Retrieve length of extension string |
| 54 | + size_t returnedSize; |
| 55 | + ${x}DeviceGetInfo(hDevice, ${X}_DEVICE_INFO_EXTENSIONS, 0, nullptr, |
| 56 | + &returnedSize); |
| 57 | + |
| 58 | + // Retrieve extension string |
| 59 | + std::unique_ptr<char[]> returnedExtensions(new char[returnedSize]); |
| 60 | + ${x}DeviceGetInfo(hDevice, ${X}_DEVICE_INFO_EXTENSIONS, returnedSize, returnedExtensions.get(), nullptr); |
| 61 | + |
| 62 | + std::string_view ExtensionsString(returnedExtensions.get()); |
| 63 | + bool CmdBufferSupport = |
| 64 | + ExtensionsString.find(${X}_COMMAND_BUFFER_EXTENSION_STRING_EXP) |
| 65 | + != std::string_view::npos; |
| 66 | + |
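The substring search above would also report support if a longer extension name merely contained the queried string. A minimal sketch of an exact-token check follows, assuming the extensions string is a space-separated list (the separator is an assumption, not something stated here). `hasExtension` is a hypothetical helper, not part of the Unified Runtime API.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of the Unified Runtime API): returns true only
// if `name` appears as a complete token in `extensions`.
// Assumption: extension names in the list are separated by spaces.
bool hasExtension(std::string_view extensions, std::string_view name) {
  if (name.empty()) {
    return false;
  }
  std::size_t pos = 0;
  while (pos <= extensions.size()) {
    // Find the end of the current token (next space or end of string).
    std::size_t end = extensions.find(' ', pos);
    if (end == std::string_view::npos) {
      end = extensions.size();
    }
    if (extensions.substr(pos, end - pos) == name) {
      return true;
    }
    pos = end + 1;
  }
  return false;
}
```

The result of the second device info query could be passed as `extensions`, with the extension string macro as `name`.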
| 67 | +Command-Buffer Creation |
| 68 | +======================= |
| 69 | + |
| 70 | +Command-buffers are tied to a specific ${x}_context_handle_t and |
| 71 | +${x}_device_handle_t. ${x}CommandBufferCreateExp optionally takes a descriptor |
| 72 | +to provide additional properties for how the command-buffer should be |
| 73 | +constructed. There are currently no unique members defined for |
| 74 | +${x}_exp_command_buffer_desc_t, but some may be added in the future. |
| 75 | + |
| 76 | +Command-buffers are reference counted and can be retained and released by |
| 77 | +calling ${x}CommandBufferRetainExp and ${x}CommandBufferReleaseExp respectively. |
| 78 | + |
| 79 | +Appending Commands |
| 80 | +================== |
| 81 | + |
| 82 | +Commands can be appended to a command-buffer by calling any of the |
| 83 | +command-buffer append functions. Typically these closely mimic the existing |
| 84 | +enqueue functions in the Core API in terms of their command-specific parameters. |
| 85 | +However, they differ in that they take a command-buffer handle instead of a |
| 86 | +queue handle, and the dependencies and return parameters are sync-points instead |
| 87 | +of event handles. |
| 88 | + |
| 89 | +Currently only the following commands are supported: |
| 90 | + |
| 91 | +* ${x}CommandBufferAppendKernelLaunchExp |
| 92 | +* ${x}CommandBufferAppendMemcpyUSMExp |
| 93 | +* ${x}CommandBufferAppendMembufferCopyExp |
| 94 | +* ${x}CommandBufferAppendMembufferCopyRectExp |
| 95 | + |
| 96 | +The plan is to eventually support any command type from the Core API that can |
| 97 | +be appended to the equivalent adapter-native constructs. |
| 98 | + |
| 99 | +Sync-Points |
| 100 | +=========== |
| 101 | + |
| 102 | +A sync-point is a value that represents a command inside a command-buffer and |
| 103 | +is returned from command-buffer append function calls. Sync-points can |
| 104 | +optionally be passed to later append calls to define execution dependencies on |
| 105 | +other commands within the command-buffer. |
| 106 | + |
| 107 | +Sync-points are unique and valid for use only within the command-buffer they |
| 108 | +were obtained from. |
| 109 | + |
| 110 | +.. parsed-literal:: |
| 111 | + // Append a memcpy with no sync-point dependencies |
| 112 | + ${x}_exp_command_buffer_sync_point_t syncPoint; |
| 113 | + |
| 114 | + ${x}CommandBufferAppendMemcpyUSMExp(hCommandBuffer, pDst, pSrc, size, 0, nullptr, &syncPoint); |
| 115 | + |
| 116 | + // Append a kernel launch with syncPoint as a dependency, ignore returned |
| 117 | + // sync-point |
| 118 | + ${x}CommandBufferAppendKernelLaunchExp(hCommandBuffer, hKernel, workDim, pGlobalWorkOffset, pGlobalWorkSize, pLocalWorkSize, 1, &syncPoint, nullptr); |
| 119 | + |
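To illustrate the dependency semantics only, the following toy model treats a sync-point as the index of a command within its own command-buffer and rejects dependencies that do not refer to an already-appended command. `ToyCommandBuffer`, `ToyCommand`, `SyncPoint` and `append` are hypothetical names chosen for illustration. The sketch makes no claim about how any adapter implements sync-points or how the real sync-point type is represented.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy model only: every name here is hypothetical, not Unified Runtime API.
using SyncPoint = std::uint32_t;

struct ToyCommand {
  std::string name;
  std::vector<SyncPoint> deps; // commands that must finish before this one
};

class ToyCommandBuffer {
public:
  // Appends a command and returns its sync-point, or std::nullopt if any
  // dependency does not name a command already appended to this buffer.
  std::optional<SyncPoint> append(std::string name,
                                  std::vector<SyncPoint> deps = {}) {
    for (SyncPoint dep : deps) {
      if (dep >= commands.size()) {
        return std::nullopt;
      }
    }
    commands.push_back({std::move(name), std::move(deps)});
    return static_cast<SyncPoint>(commands.size() - 1);
  }

  const std::vector<ToyCommand> &getCommands() const { return commands; }

private:
  std::vector<ToyCommand> commands;
};
```

In this model, the example above corresponds to appending a copy with no dependencies and then appending a kernel launch whose dependency list holds the sync-point returned for the copy.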
| 120 | +Enqueueing Command-Buffers |
| 121 | +========================== |
| 122 | + |
| 123 | +Command-buffers are submitted for execution on a ${x}_queue_handle_t with an |
| 124 | +optional list of events to wait on. An event is returned which tracks the |
| 125 | +execution of the command-buffer, and completes when all appended commands |
| 126 | +have finished executing. It is adapter-specific whether command-buffers can be |
| 127 | +enqueued or executed simultaneously, and submissions may be serialized. |
| 128 | + |
| 129 | +.. parsed-literal:: |
| 130 | + ${x}_event_handle_t executionEvent; |
| 131 | + |
| 132 | + ${x}CommandBufferEnqueueExp(hCommandBuffer, hQueue, 0, nullptr, |
| 133 | + &executionEvent); |