|
| 1 | +===================== |
| 2 | +Clang Offload Wrapper |
| 3 | +===================== |
| 4 | + |
| 5 | +.. contents:: |
| 6 | + :local: |
| 7 | + |
| 8 | +.. _clang-offload-wrapper: |
| 9 | + |
| 10 | +Introduction |
| 11 | +============ |
| 12 | + |
| 13 | +This tool is used in the OpenMP offloading toolchain to embed device code |
| 14 | +objects (usually ELF) into a wrapper host LLVM IR (bitcode) file. The wrapper |
| 15 | +host IR is then assembled and linked with host code objects to generate the |
| 16 | +executable binary. See :ref:`image-binary-embedding-execution` for more details. |
| 17 | + |
| 18 | +Usage |
| 19 | +===== |
| 20 | + |
| 21 | +This tool can be used as follows: |
| 22 | + |
| 23 | +.. code-block:: console |
| 24 | + |
| 25 | + $ clang-offload-wrapper -help |
| 26 | + OVERVIEW: A tool to create a wrapper bitcode for offload target binaries. |
| 27 | + Takes offload target binaries as input and produces bitcode file containing |
| 28 | + target binaries packaged as data and initialization code which registers |
| 29 | + target binaries in offload runtime. |
| 30 | + USAGE: clang-offload-wrapper [options] <input files> |
| 31 | + OPTIONS: |
| 32 | + Generic Options: |
| 33 | + --help - Display available options (--help-hidden for more) |
| 34 | + --help-list - Display list of available options (--help-list-hidden for more) |
| 35 | + --version - Display the version of this program |
| 36 | + clang-offload-wrapper options: |
| 37 | + -o=<filename> - Output filename |
| 38 | + --target=<triple> - Target triple for the output module |
| 39 | + |
| 40 | +Example |
| 41 | +======= |
| 42 | + |
| 43 | +.. code-block:: console |
| 44 | + |
| 45 | + clang-offload-wrapper -target host-triple -o host-wrapper.bc gfx90a-binary.out |
| 46 | + |
| 47 | +.. _openmp-device-binary_embedding: |
| 48 | + |
| 49 | +OpenMP Device Binary Embedding |
| 50 | +============================== |
| 51 | + |
| 52 | +Various structures and functions used in the wrapper host IR form the interface |
| 53 | +between the executable binary and the OpenMP runtime. |
| 54 | + |
| 55 | +Enum Types |
| 56 | +---------- |
| 57 | + |
| 58 | +:ref:`table-offloading-declare-target-flags` lists the flags that can be set |
| 59 | +on offloading entries. |
| 60 | + |
| 61 | + .. table:: Offloading Declare Target Flags Enum |
| 62 | + :name: table-offloading-declare-target-flags |
| 63 | + |
| 64 | + +-------------------------+-------+------------------------------------------------------------------+ |
| 65 | + | Name | Value | Description | |
| 66 | + +=========================+=======+==================================================================+ |
| 67 | + | OMP_DECLARE_TARGET_LINK | 0x01 | Mark the entry as having a 'link' attribute (w.r.t. link clause) | |
| 68 | + +-------------------------+-------+------------------------------------------------------------------+ |
| 69 | + | OMP_DECLARE_TARGET_CTOR | 0x02 | Mark the entry as being a global constructor | |
| 70 | + +-------------------------+-------+------------------------------------------------------------------+ |
| 71 | + | OMP_DECLARE_TARGET_DTOR | 0x04 | Mark the entry as being a global destructor | |
| 72 | + +-------------------------+-------+------------------------------------------------------------------+ |
| 73 | + |
| 74 | +Structure Types |
| 75 | +--------------- |
| 76 | + |
| 77 | +:ref:`table-tgt_offload_entry`, :ref:`table-tgt_device_image`, and |
| 78 | +:ref:`table-tgt_bin_desc` are the structures used in the wrapper host IR. |
| 79 | + |
| 80 | + .. table:: __tgt_offload_entry structure |
| 81 | + :name: table-tgt_offload_entry |
| 82 | + |
| 83 | + +---------+------------+------------------------------------------------------------------------------------+ |
| 84 | + | Type | Identifier | Description | |
| 85 | + +=========+============+====================================================================================+ |
| 86 | + | void* | addr | Address of global symbol within device image (function or global) | |
| 87 | + +---------+------------+------------------------------------------------------------------------------------+ |
| 88 | + | char* | name | Name of the symbol | |
| 89 | + +---------+------------+------------------------------------------------------------------------------------+ |
| 90 | + | size_t | size | Size of the entry info (0 if it is a function) | |
| 91 | + +---------+------------+------------------------------------------------------------------------------------+ |
| 92 | + | int32_t | flags | Flags associated with the entry (see :ref:`table-offloading-declare-target-flags`) | |
| 93 | + +---------+------------+------------------------------------------------------------------------------------+ |
| 94 | + | int32_t | reserved | Reserved, to be used by the runtime library. | |
| 95 | + +---------+------------+------------------------------------------------------------------------------------+ |
| 96 | + |
| 97 | + .. table:: __tgt_device_image structure |
| 98 | + :name: table-tgt_device_image |
| 99 | + |
| 100 | + +----------------------+--------------+----------------------------------------+ |
| 101 | + | Type | Identifier | Description | |
| 102 | + +======================+==============+========================================+ |
| 103 | + | void* | ImageStart | Pointer to the target code start | |
| 104 | + +----------------------+--------------+----------------------------------------+ |
| 105 | + | void* | ImageEnd | Pointer to the target code end | |
| 106 | + +----------------------+--------------+----------------------------------------+ |
| 107 | + | __tgt_offload_entry* | EntriesBegin | Begin of table with all target entries | |
| 108 | + +----------------------+--------------+----------------------------------------+ |
| 109 | + | __tgt_offload_entry* | EntriesEnd | End of table (non inclusive) | |
| 110 | + +----------------------+--------------+----------------------------------------+ |
| 111 | + |
| 112 | + .. table:: __tgt_bin_desc structure |
| 113 | + :name: table-tgt_bin_desc |
| 114 | + |
| 115 | + +----------------------+------------------+------------------------------------------+ |
| 116 | + | Type | Identifier | Description | |
| 117 | + +======================+==================+==========================================+ |
| 118 | + | int32_t | NumDeviceImages | Number of device types supported | |
| 119 | + +----------------------+------------------+------------------------------------------+ |
| 120 | + | __tgt_device_image* | DeviceImages | Array of device images (1 per dev. type) | |
| 121 | + +----------------------+------------------+------------------------------------------+ |
| 122 | + | __tgt_offload_entry* | HostEntriesBegin | Begin of table with all host entries | |
| 123 | + +----------------------+------------------+------------------------------------------+ |
| 124 | + | __tgt_offload_entry* | HostEntriesEnd | End of table (non inclusive) | |
| 125 | + +----------------------+------------------+------------------------------------------+ |
| 126 | + |
| 127 | +Global Variables |
| 128 | +---------------- |
| 129 | + |
| 130 | +:ref:`table-global-variables` lists the global variables, along with their |
| 131 | +types and their explicit ELF sections, that are used to store device images |
| 132 | +and related symbols. |
| 133 | + |
| 134 | + .. table:: Global Variables |
| 135 | + :name: table-global-variables |
| 136 | + |
| 137 | + +--------------------------------+---------------------+-------------------------+---------------------------------------------------+ |
| 138 | + | Variable | Type | ELF Section | Description | |
| 139 | + +================================+=====================+=========================+===================================================+ |
| 140 | + | __start_omp_offloading_entries | __tgt_offload_entry | .omp_offloading_entries | Begin symbol for the offload entries table. | |
| 141 | + +--------------------------------+---------------------+-------------------------+---------------------------------------------------+ |
| 142 | + | __stop_omp_offloading_entries | __tgt_offload_entry | .omp_offloading_entries | End symbol for the offload entries table. | |
| 143 | + +--------------------------------+---------------------+-------------------------+---------------------------------------------------+ |
| 144 | + | __dummy.omp_offloading.entry | __tgt_offload_entry | .omp_offloading_entries | Dummy zero-sized object in the offload entries | |
| 145 | + | | | | section to force linker to define begin/end | |
| 146 | + | | | | symbols defined above. | |
| 147 | + +--------------------------------+---------------------+-------------------------+---------------------------------------------------+ |
| 148 | + | .omp_offloading.device_image | __tgt_device_image | .omp_offloading_entries | ELF device code object of the first image. | |
| 149 | + +--------------------------------+---------------------+-------------------------+---------------------------------------------------+ |
| 150 | + | .omp_offloading.device_image.N | __tgt_device_image | .omp_offloading_entries | ELF device code object of the (N+1)th image. | |
| 151 | + +--------------------------------+---------------------+-------------------------+---------------------------------------------------+ |
| 152 | + | .omp_offloading.device_images | __tgt_device_image | .omp_offloading_entries | Array of images. | |
| 153 | + +--------------------------------+---------------------+-------------------------+---------------------------------------------------+ |
| 154 | + | .omp_offloading.descriptor | __tgt_bin_desc | .omp_offloading_entries | Binary descriptor object (see details below). | |
| 155 | + +--------------------------------+---------------------+-------------------------+---------------------------------------------------+ |
| 156 | + |
| 157 | + |
| 158 | +Binary Descriptor for Device Images |
| 159 | +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 160 | + |
| 161 | +This object is passed to the offloading runtime at program startup and it |
| 162 | +describes all device images available in the executable or shared library. It |
| 163 | +is defined as follows: |
| 164 | + |
| 165 | +.. code-block:: c |
| 166 | + |
| 167 | + __attribute__((visibility("hidden"))) |
| 168 | + extern __tgt_offload_entry *__start_omp_offloading_entries; |
| 169 | + __attribute__((visibility("hidden"))) |
| 170 | + extern __tgt_offload_entry *__stop_omp_offloading_entries; |
| 171 | + static const char Image0[] = { <Bufs.front() contents> }; |
| 172 | + ... |
| 173 | + static const char ImageN[] = { <Bufs.back() contents> }; |
| 174 | + static const __tgt_device_image Images[] = { |
| 175 | + { |
| 176 | + Image0, /*ImageStart*/ |
| 177 | + Image0 + sizeof(Image0), /*ImageEnd*/ |
| 178 | + __start_omp_offloading_entries, /*EntriesBegin*/ |
| 179 | + __stop_omp_offloading_entries /*EntriesEnd*/ |
| 180 | + }, |
| 181 | + ... |
| 182 | + { |
| 183 | + ImageN, /*ImageStart*/ |
| 184 | + ImageN + sizeof(ImageN), /*ImageEnd*/ |
| 185 | + __start_omp_offloading_entries, /*EntriesBegin*/ |
| 186 | + __stop_omp_offloading_entries /*EntriesEnd*/ |
| 187 | + } |
| 188 | + }; |
| 189 | + static const __tgt_bin_desc BinDesc = { |
| 190 | + sizeof(Images) / sizeof(Images[0]), /*NumDeviceImages*/ |
| 191 | + Images, /*DeviceImages*/ |
| 192 | + __start_omp_offloading_entries, /*HostEntriesBegin*/ |
| 193 | + __stop_omp_offloading_entries /*HostEntriesEnd*/ |
| 194 | + }; |
| 195 | + |
| 196 | +Global Constructor and Destructor |
| 197 | +--------------------------------- |
| 198 | + |
| 199 | +The global constructor (``.omp_offloading.descriptor_reg()``) registers the |
| 200 | +library of images with the runtime by calling the ``__tgt_register_lib()`` |
| 201 | +function. The constructor is explicitly placed in the ``.text.startup`` section. |
| 202 | +Similarly, the global destructor |
| 203 | +(``.omp_offloading.descriptor_unreg()``) calls ``__tgt_unregister_lib()`` to |
| 204 | +unregister the library and is also placed in the ``.text.startup`` section. |
| 205 | + |
| 206 | +.. _image-binary-embedding-execution: |
| 207 | + |
| 208 | +Image Binary Embedding and Execution for OpenMP |
| 209 | +=============================================== |
| 210 | + |
| 211 | +For each offloading target, device ELF code objects are generated by the |
| 212 | +``clang``, ``opt``, ``llc``, and ``lld`` pipeline. These code objects are then |
| 213 | +passed to ``clang-offload-wrapper``. |
| 214 | + |
| 215 | + * At compile time, the ``clang-offload-wrapper`` tool takes the following |
| 216 | + actions: |
| 217 | + * It embeds the ELF code objects for the device into the host code (see |
| 218 | + :ref:`openmp-device-binary_embedding`). |
| 219 | + * At execution time: |
| 220 | + * The global constructor runs and registers the device images. |
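
As a rough illustration of how the binary descriptor and the global constructor described in this patch fit together, the following C sketch builds a descriptor over two placeholder images and registers it from a constructor. It is not the code emitted by ``clang-offload-wrapper``. The names ``demo_register_lib`` and ``demo_unregister_lib`` are hypothetical stand-ins for ``__tgt_register_lib()`` and ``__tgt_unregister_lib()``, the image bytes and entry names are made up, and an ordinary array replaces the linker-defined ``__start_omp_offloading_entries`` and ``__stop_omp_offloading_entries`` symbols. The sketch also assumes a GCC- or Clang-compatible compiler for ``__attribute__((constructor))`` and ``__attribute__((destructor))``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Struct layouts follow the structure tables in the patch. */
typedef struct {
  void *addr;       /* address of the symbol (function or global) */
  char *name;       /* name of the symbol */
  size_t size;      /* size of the entry (0 for a function) */
  int32_t flags;    /* declare-target flags */
  int32_t reserved; /* reserved for the runtime library */
} __tgt_offload_entry;

typedef struct {
  void *ImageStart;
  void *ImageEnd;
  __tgt_offload_entry *EntriesBegin;
  __tgt_offload_entry *EntriesEnd; /* non-inclusive */
} __tgt_device_image;

typedef struct {
  int32_t NumDeviceImages;
  __tgt_device_image *DeviceImages;
  __tgt_offload_entry *HostEntriesBegin;
  __tgt_offload_entry *HostEntriesEnd; /* non-inclusive */
} __tgt_bin_desc;

/* Assumption: placeholder bytes instead of real device code objects. */
static char Image0[] = {0x7f, 'E', 'L', 'F'};
static char Image1[] = {0x7f, 'E', 'L', 'F', 0x02};

/* Assumption: a plain array stands in for the linker-defined entries table. */
static char demo_kernel_id; /* hypothetical kernel entry address */
static int demo_counter;    /* hypothetical offloaded global */
static __tgt_offload_entry HostEntries[] = {
    {&demo_kernel_id, "demo_kernel", 0, 0, 0},
    {&demo_counter, "demo_counter", sizeof(demo_counter), 0, 0},
};
#define NUM_ENTRIES (sizeof(HostEntries) / sizeof(HostEntries[0]))

static __tgt_device_image Images[] = {
    {Image0, Image0 + sizeof(Image0), HostEntries, HostEntries + NUM_ENTRIES},
    {Image1, Image1 + sizeof(Image1), HostEntries, HostEntries + NUM_ENTRIES},
};

static __tgt_bin_desc BinDesc = {
    (int32_t)(sizeof(Images) / sizeof(Images[0])), /* NumDeviceImages */
    Images,                                        /* DeviceImages */
    HostEntries,                                   /* HostEntriesBegin */
    HostEntries + NUM_ENTRIES,                     /* HostEntriesEnd */
};

/* Bookkeeping used by the stub runtime below. */
static int32_t registered_images = 0;
static size_t registered_bytes = 0;

/* Hypothetical stand-in for __tgt_register_lib(): counts images and bytes. */
static void demo_register_lib(__tgt_bin_desc *desc) {
  assert(desc != NULL);
  for (int32_t i = 0; i < desc->NumDeviceImages; ++i) {
    const __tgt_device_image *img = &desc->DeviceImages[i];
    registered_bytes += (size_t)((char *)img->ImageEnd - (char *)img->ImageStart);
  }
  registered_images += desc->NumDeviceImages;
}

/* Hypothetical stand-in for __tgt_unregister_lib(). */
static void demo_unregister_lib(__tgt_bin_desc *desc) {
  assert(desc != NULL);
  registered_images -= desc->NumDeviceImages;
}

/* Runs before main(), like the wrapper's descriptor_reg constructor. */
__attribute__((constructor)) static void descriptor_reg(void) {
  demo_register_lib(&BinDesc);
}

/* Runs at program exit, like the wrapper's descriptor_unreg destructor. */
__attribute__((destructor)) static void descriptor_unreg(void) {
  demo_unregister_lib(&BinDesc);
}
```

When this is linked into a program, the constructor has already run by the time ``main`` starts, so ``registered_images`` equals the number of entries in ``Images`` without any explicit call from user code. This mirrors the execution-time step in the list above.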