Releases: intel/llvm
DPC++ daily 2022-07-06
[SYCL][L0] Pass non-0 PI event to PI API calls (#6398) Signed-off-by: Sergey V Maslov <sergey.v.maslov@intel.com>
DPC++ daily 2022-07-05
[SYCL][HIP][PI] Multiple HIP streams per SYCL queue (#6325) Closely mimics the functionality of CUDA plugin #6102
DPC++ daily 2022-07-04
[SYCL][HIP][PI] Multiple HIP streams per SYCL queue (#6325) Closely mimics the functionality of CUDA plugin #6102
DPC++ daily 2022-07-03
sycl-nightly/20220703 [SYCL] Remove experimental/builtins.hpp from sycl.hpp due to C++17 (#…
DPC++ daily 2022-07-02
sycl-nightly/20220702 [SYCL] Remove experimental/builtins.hpp from sycl.hpp due to C++17 (#…
DPC++ daily 2022-07-01
[SYCL] Throw exception for pipe extension on host (#6385) The implementation of the pipes extension currently uses a failing assert when pipe operations are done on host. This commit changes these assertions into throwing a SYCL exception, which both allows for failure recovery and makes the failures independent of whether assertions are enabled. Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
DPC++ daily 2022-06-30
[SYCL] Use std::ignore for all unused args in bfloat builtins (#6381) Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
DPC++ daily 2022-06-29
sycl-nightly/20220629 [SYCL][FPGA][NFC] Refactor [[intel::num_simd_work_items()]] attribute…
oneAPI DPC++ Compiler 2022-06
New features
SYCL Compiler
- Added -fcuda-prec-sqrt frontend compiler option which enables a higher precision version of sqrt in the device code. [ebf9ea8]
- Added support for local memory accessors for the HIP backend. [58508ba]
- Added initial support of -lname processing when searching for fat static libraries. [35e32d8] [a33f9c8]
- Added -fsycl-fp32-prec-sqrt flag which enables correctly rounded sycl::sqrt. [5c8b7e7]
- Added support for [[intel::loop_count()]] attribute. [c536e76]
- Added support for passing driver options to JIT compiler and linker. [1c93bfe]
- Added default argument support for work_group_size_hint attribute. [0cff80e]
- Added support for float and double exchange and compare exchange atomic operations in CUDA libclc. [1d84c99]
- Added --ffast-math support for CUDA libclc. [0f0c5d1]
- Added support for software atomics (except for the ones using system scope) for lower sm versions of CUDA architecture. Enabled SYCL_USE_NATIVE_FP_ATOMICS by default. [7bc8447]
- Added support for the global offset for AMDGPU. [2dc3c06]
- Added support for asynchronous barrier for CUDA backend sm 80+. [6770421]
- Added -f[no-]sycl-device-lib-jit-link option to control JIT linking of SYCL device libraries. [dfb37a8] [c946286]
- Added support for the new FPGA attribute [[intel::fpga_pipeline(N)]] for loop pipelining. [92aadf3]
- Added assert support for Windows NVPTX. [f29b498]
- Added support for sycl_ext_oneapi_properties extension. [87f60f6][1984e74][a2583ec][cdf561a][d2982c6][35c2e00]
SYCL Library
- Added support for Nvidia MMA for bf16, mixed precision int ((u)int8/int32), and mixed precision float (half/float). [5373362]
- Added a mode for the Level Zero plugin where only the last command in each batch yields a host-visible event. Enabled this mode by default. [c6b7b8e]
- Added an option to query for atomic scope capabilities for the CUDA backend. Updated returns for atomics memory order capabilities. [43a4192]
- Added support for an experimental Level Zero API for host pointer import into USM. The feature can be enabled using the SYCL_USM_HOSTPTR_IMPORT environment variable. [844d7b6]
- Added support for the wi_element for bf16 type. [9f2b7bd]
- Added complex support for the reduce and scan group algorithms. [90a4dc7]
- Added support for SYCL 2020 methods in the group class. [73d59ce]
- Added SYCL_RT_WARNING_LEVEL environment variable which controls the amount of warnings and performance hints the runtime library may print. [2741010]
- Added tanh (for floats/halfs) and exp2 (for halfs) native definitions for CUDA backend. [250c498]
- Added bf16 builtins for fma, fmin, fmax and fmax on CUDA backend. [62651dd]
- Added support for USM buffer location properties which allow specifying the memory location in which a device USM allocation should reside. [12c988a]
- Added support for buffer_location property to the sycl::buffer. [9808525]
- Added single_task support for ESIMD_EMULATOR backend. [2331160]
- Added support for SVM 1,2,4-elements gather/scatter for ESIMD. [e200720]
- Added support for bf16 builtins operating on storage types for CUDA backend. [413a9ef]
- Added support for backend_version device property for CUDA backend. [4b1a4bc]
- Added support for round-robin submissions to multiple compute CCS for the Level Zero backend. Disabled by default, can be controlled using SYCL_PI_LEVEL_ZERO_USE_COMPUTE_ENGINE. [a836c87]
- Added support for buffer migration for contexts with multiple devices in the Level Zero plugin. [7baf152]
- Added a mode where the Level Zero plugin uses immediate command-lists instead of standard command-lists. This mode is disabled by default and can be enabled using the SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS environment variable. [b9cb1d1]
- Added support for sycl::get_native(sycl::buffer) for OpenCL and CUDA backends. [8b3c8c4]
- Added reduction overloads accepting span. [863383b]
- Added LSC support for ESIMD_EMULATOR backend. [b78bf00]
- Added half type support for __esimd_convertvector_to/from. [0bfffd6]
- Added buffer_allocator SYCL 2020 conformant variant. [53430c8]
- Added support for the USM buffer location property in malloc_shared. [6e89821] [9f61c8e][8c4d9a5]
- Added support for the USM buffer location property in malloc_host. [2c7caab]
- Added experimental context and device interoperability support for CUDA. [f0df89a]
- Added support for memory intrinsics for the ESIMD_EMULATOR plugin. [1a8f501]
- Added support for named barrier APIs for ESIMD. [1df0038]
- Added support for DPAS API for ESIMD. [5881938]
- Added support for LSC memory access APIs for ESIMD. [4bd50e7]
- Added support for the invoke_simd feature. [4072557][8471ff3][8c7bb45][62afb59][3e1c1bf]
- Added support for info::device::atomic64 for OpenCL and Level Zero backends. [8feb558]
- Added support for sycl_ext_oneapi_usm_device_read_only extension [644c614][58c9d3a]
- Added support for mapping/unmapping operations for ESIMD_EMULATOR plugin. [bc0579a]
- Added support for make_buffer API for the Level Zero backend. [7c49984]
- Added interoperability support for HIP backend. [e06d1b5]
- Added missing +-*/ operations for half. [059efbc]
- Introduced new environment variable SYCL_PI_CUDA_MAX_LOCAL_MEM_SZ to control the max local memory allowed to be allocated per kernel on CUDA backend. [2e24304]
- Added ext_intel_global_host_space in accordance with sycl_ext_intel_usm_address_spaces extension. [7a2f44b]
- Added aspect for bfloat16. [f84fc32]
- Introduced "Intel math functions" device library with support of type cast util functions for float, double and integer type. [a310952]
- Added bfloat16 support for joint_matrix [6ac62ab]
Documentation
- Added sycl_ext_oneapi_complex_algorithms extension [7ae7ca8]
- Added a design document for sycl_ext_oneapi_device_global extension [8c22ef1]
- Added a design document for sycl_ext_oneapi_properties extension [912572f]
- Added new sycl_ext_oneapi_free_function_queries proposal. [7a93a49]
- Added sycl_ext_oneapi_group_load_store extension. [85ccdc0]
- Added validation rules to the SPIR-V extension SPV_INTEL_global_variable_decorations. [dfaa070]
- Added SYCL_INTEL_buffer_location extension to support buffer_location property for USM allocations. [962417d] [36a9ee2]
- Added sycl_ext_oneapi_named_sub_group_sizes extension proposal which aims to simplify the process of using sub-groups. [4f3d7e1]
- Added experimental latency control API into SYCL_INTEL_data_flow_pipes. [5224f78]
- Added sycl_ext_oneapi_auto_local_range extension proposal. [cb4e702]
- Added SYCL 2020 spec constants design doc. [8ec9755]
- Added sycl_ext_oneapi_queue_status_query extension proposal. [b6143e5]
- Added initial version of sycl_ext_oneapi_invoke_simd and sycl_ext_oneapi_uniform extensions proposal. [a37ca84]
- Added the sycl_ext_oneapi_annotated_arg extension proposal for applying properties on kernel arguments. [caa696f]
- Added sycl_ext_oneapi_cuda_async_barrier extension for CUDA backend. [6770421]
- Added bfloat16 support to the fma, fmin, fmax and fabs SYCL floating point math functions in the sycl_ext_oneapi_bfloat16 extension. [c76ef5c]
- Added initial version of sycl_ext_oneapi_root_group extension proposal. [b59cd43]
Tools
- Implemented property set generation for device globals in the sycl-post-link. Added the `--device-gl...
DPC++ daily 2022-06-28
sycl-nightly/20220628 [SYCL] Reset signalled command list if there no available command lis…