From 3d13dcfcf8dc471738676e436897cb2ddeb32cfb Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 23 Jul 2024 14:34:07 +0100 Subject: [PATCH 01/16] Create 2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md adds, "Enabling performance portability on the LiGen drug discovery pipeline" --- ...ty-on-the-ligen-drug-discovery-pipeline.md | 33 +++++++++++++++++++ 1 file changed, 33 insertions(+) create mode 100644 content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md diff --git a/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md b/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md new file mode 100644 index 0000000..9afc74f --- /dev/null +++ b/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md @@ -0,0 +1,33 @@ +--- +contributor: max +date: '2024-04-16T08:08:10.490000+00:00' +title: 'Enabling performance portability on the LiGen drug discovery pipeline' +external_url: https://www.sciencedirect.com/science/article/pii/S0167739X24001195 +authors: + - name: Luigi Crisci + - name: Lorenzo Carpentieri + - name: Biagio Cosenza + - name: Leon Bogdanović + - name: Gianmarco Accordi + - name: Davide Gadioli + - name: Emanuele Vitali + - name: Gianluca Palermo + - name: Andrea Rosario Beccari +tags: + - gpu + - sycl + - hpc + - cuda + - hip +--- + +In recent years, there has been a growing interest in developing high-performance implementations of drug discovery processing software. To target modern GPU architectures, +such applications are mostly written in proprietary languages such as CUDA or HIP. 
However, with the increasing heterogeneity of modern HPC systems and the availability of +accelerators from multiple hardware vendors, it has become critical to be able to efficiently execute drug discovery pipelines on multiple large-scale computing systems, +with the ultimate goal of working on urgent computing scenarios. This article presents the challenges of migrating LiGen, an industrial drug discovery software pipeline, +from CUDA to the SYCL programming model, an industry standard based on C++ that enables heterogeneous computing. We perform a structured analysis of the performance portability +of the SYCL LiGen platform, focusing on different aspects of the approach from different perspectives. First, we analyze the performance portability provided by the high-level semantics +of SYCL, including the most recent group algorithms and subgroups of SYCL 2020. Second, we analyze how low-level aspects such as kernel occupancy and register pressure affect +the performance portability of the overall application. The experimental evaluation is performed on two different versions of LiGen, implementing two different parallelization +patterns, by comparing them with a manually optimized CUDA version, and by evaluating performance portability using both known and ad hoc metrics. The results show that, +thanks to the combination of high-level SYCL semantics and some manual tuning, LiGen achieves native-comparable performance on NVIDIA, while also running on AMD GPUs. 
From 7494ab28c327128d6d6e1ea9e17e585c686c70f2 Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 23 Jul 2024 14:41:20 +0100 Subject: [PATCH 02/16] Create 2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md Adds, "Open SYCL on heterogeneous GPU systems: A case of study" --- ...terogeneous-gpu-systems-a-case-of-study.md | 33 +++++++++++++++++++ 1 file changed, 33 insertions(+) create mode 100644 content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md diff --git a/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md b/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md new file mode 100644 index 0000000..cc71f2a --- /dev/null +++ b/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md @@ -0,0 +1,33 @@ +--- +contributor: max +date: '2023-10-01T08:08:10.490000+00:00' +title: 'Open SYCL on heterogeneous GPU systems: A case of study' +external_url: https://arxiv.org/ftp/arxiv/papers/2310/2310.06947.pdf +authors: + - name: Rocío Carratalá-Sáez + - name: Francisco J. Andújar + - name: Yuri Torres + - name: Arturo Gonzalez-Escribano + - name: Diego R. Llanos +tags: + - gpu + - cuda + - hpc + - hip +--- + +Computational platforms for high-performance scientific applications are becoming more heterogeneous, including hardware +accelerators such as multiple GPUs. Applications in a wide variety of scientific fields require an efficient and careful +management of the computational resources of this type of hardware to obtain the best possible performance. However, +there are currently different GPU vendors, architectures and families that can be found in heterogeneous clusters or +machines. 
Programming with the vendor provided languages or frameworks, and optimizing for specific devices, may become +cumbersome and compromise portability to other systems. To overcome this problem, several proposals for high-level +heterogeneous programming have appeared, trying to reduce the development effort and increase functional and performance +portability, specifically when using GPU hardware accelerators. This paper evaluates the SYCL programming model, using the +Open SYCL compiler, from two different perspectives: The performance it offers when dealing with single or multiple GPU devices +from the same or different vendors, and the development effort required to implement the code. We use as case of study the Finite Time Lyapunov Exponent +calculation over two real-world scenarios and compare the performance and the development effort of its Open SYCL-based version against +the equivalent versions that use CUDA or HIP. Based on the experimental results, we observe that the use of SYCL does not lead to a +remarkable overhead in terms of the GPU kernels execution time. In general terms, the Open SYCL development effort for the host code +is lower than that observed with CUDA or HIP. Moreover, the SYCL version can take advantage of both CUDA and AMD GPU devices simultaneously +much easier than directly using the vendor-specific programming solutions. 
From bef505c2c83f226b71bbc06fecc5cbd7730da65f Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 23 Jul 2024 16:23:35 +0100 Subject: [PATCH 03/16] Create 2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md Adds, "A Performance-Portable SYCL Implementation of CRK-HACC for Exascale" --- ...implementation-of-crk-hacc-for-exascale.md | 29 +++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md diff --git a/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md b/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md new file mode 100644 index 0000000..3fc7e98 --- /dev/null +++ b/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md @@ -0,0 +1,29 @@ +--- +contributor: max +date: '2023-10-24T08:08:10.490000+00:00' +title: 'A Performance-Portable SYCL Implementation of CRK-HACC for Exascale' +external_url: https://arxiv.org/pdf/2310.16122 +authors: + - name: Esteban M. Rangel + - name: S. John Pennycook + - name: Adrian Pope + - name: Nicholas Frontiere + - name: Zhiqiang Ma + - name: Varsha Madananth +tags: + - sycl + - hpc + - cuda + - hip + - heterogeneous-programming +--- + +The first generation of exascale systems will include a variety of machine architectures, featuring GPUs from multiple vendors. +As a result, many developers are interested in adopting portable programming models to avoid maintaining multiple versions of +their code. It is necessary to document experiences with such programming models to assist developers in understanding the advantages and +disadvantages of different approaches. 
+To this end, this paper evaluates the performance portability of a SYCL implementation of a large-scale cosmology application (CRK-HACC) +running on GPUs from three different vendors: AMD, Intel, and NVIDIA. We detail the process of migrating the original code from CUDA to +SYCL and show that specializing kernels for specific targets can greatly improve performance portability without significantly impacting +programmer productivity. The SYCL version of CRK-HACC achieves a performance portability of 0.96 with a code divergence of almost 0, +demonstrating that SYCL is a viable programming model for performance-portable applications. From 1fcfc0bfd64325792d0de65bbfc57c5b33861016 Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 23 Jul 2024 16:33:13 +0100 Subject: [PATCH 04/16] Create lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md Adds, "Lessons Learned Migrating CUDA to SYCL: A HEP Case Study with ROOT RDataFrame" --- ...l-a-hep-case-study-with-root-rdataframe.md | 26 +++++++++++++++++++ 1 file changed, 26 insertions(+) create mode 100644 content/research_papers/2024/lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md diff --git a/content/research_papers/2024/lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md b/content/research_papers/2024/lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md new file mode 100644 index 0000000..2c54cbd --- /dev/null +++ b/content/research_papers/2024/lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md @@ -0,0 +1,26 @@ +--- +contributor: max +date: '2024-01-24T08:08:10.490000+00:00' +title: 'Lessons Learned Migrating CUDA to SYCL: A HEP Case Study with ROOT RDataFrame' +external_url: https://arxiv.org/pdf/2310.16122 +authors: + - name: Jolly Chen + - name: Monica Dessole + - name: Ana Lucia Varbanescu +tags: + - sycl + - gpu + - cuda + - 
heterogeneous-programming +--- + +The world’s largest particle accelerator, located at CERN, produces petabytes of data that need to be analysed efficiently, +to study the fundamental structures of our universe. ROOT is an open-source C++ data analysis framework, developed for this +purpose. Its high-level data analysis interface, RDataFrame, currently only supports CPU parallelism. Given the increasing +heterogeneity in computing facilities, it becomes crucial to efficiently support GPGPUs to take advantage of the available +resources. SYCL allows for a single-source implementation, which enables support for different architectures. In this paper, +we describe a CUDA implementation and the migration process to SYCL, focusing on a core high energy physics operation in +RDataFrame – histogramming. We detail the challenges that we faced when integrating SYCL into a large and complex code base. +Furthermore, we perform an extensive comparative performance analysis of two SYCL compilers, AdaptiveCpp and DPC++, and the +reference CUDA implementation. We highlight the performance bottlenecks that we encountered, and the methodology used to detect +these. Based on our findings, we provide actionable insights for developers of SYCL applications. 
From c7d4e93da5c375fb95684c6e84b21f8d788de017 Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 23 Jul 2024 16:40:43 +0100 Subject: [PATCH 05/16] Rename lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md to 2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md --- ...grating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename content/research_papers/2024/{lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md => 2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md} (100%) diff --git a/content/research_papers/2024/lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md b/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md similarity index 100% rename from content/research_papers/2024/lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md rename to content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md From 66151158245c3be12c085e49c964ab0f57070a2b Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 23 Jul 2024 16:45:02 +0100 Subject: [PATCH 06/16] Create 2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows Adds, "XFLUIDS: A SYCL-based unified cross-architecture heterogeneous simulation solver for compressible reacting flows" --- ...ion-solver-for-compressible-reacting-flows | 30 +++++++++++++++++++ 1 file changed, 30 insertions(+) create mode 100644 content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows diff --git 
a/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows b/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows new file mode 100644 index 0000000..4f5fca7 --- /dev/null +++ b/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows @@ -0,0 +1,30 @@ +--- +contributor: max +date: '2024-03-09T08:08:10.490000+00:00' +title: 'XFLUIDS: A SYCL-based unified cross-architecture heterogeneous simulation solver for compressible reacting flows' +external_url: https://arxiv.org/abs/2403.05910 +authors: + - name: Jinlong Li + - name: Shucheng Pan +tags: + - sycl + - hpc + - cuda + - hip + - heterogeneous-programming +--- + +We present a cross-architecture high-order heterogeneous Navier-Stokes simulation solver, XFluids, for compressible +reacting multicomponent flows on different platforms. The multi-component reacting flows are ubiquitous in many scientific +and engineering applications, while their numerical simulations are usually time-consuming to capture the underlying +multiscale features. Although heterogeneous accelerated computing is significantly beneficial for large-scale simulations +of these flows, effective utilization of various heterogeneous accelerators with different architectures and programming +models in the market remains a challenge. To address this, we develop XFluids by SYCL, to perform acceleration directly +targeted to different devices, without translating any source code. A variety of optimization techniques have been proposed +to increase the computational performance of XFluids, including adaptive range assignment, partial eigensystem reconstruction, +hotspot device function optimizations, etc. 
This solver has been open-sourced, and tested on multiple GPUs from different mainstream +vendors, indicating high portability. Through various benchmark cases, the accuracy of XFluids is demonstrated, with approximately +no efficiency loss compared to existing GPU programming models, such as CUDA and HIP. In addition, the MPI library is used to extend +the solver to multi-GPU platforms, with the GPU-enabled MPI supported. With this, the weak scaling of XFluids for multi-GPU devices +is larger than 95% for 1024 GPUs. Finally, we simulate both the inert and reactive multicomponent shock-bubble interaction problems +with high-resolution meshes, to investigate the reacting effects on the mixing, vortex stretching, and shape deformation of the bubble evolution. From 17756f08a4286312dfb50b57c1aa8a6d1154d85d Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 23 Jul 2024 16:45:39 +0100 Subject: [PATCH 07/16] Update 2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md Fixes link --- ...rating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md b/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md index 2c54cbd..20c1f61 100644 --- a/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md +++ b/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md @@ -2,7 +2,7 @@ contributor: max date: '2024-01-24T08:08:10.490000+00:00' title: 'Lessons Learned Migrating CUDA to SYCL: A HEP Case Study with ROOT RDataFrame' -external_url: https://arxiv.org/pdf/2310.16122 +external_url: https://arxiv.org/pdf/2401.13310 authors: - name: 
Jolly Chen - name: Monica Dessole From 401b6fa68b37a6c25e76d54abd16a7611efb5f1c Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 23 Jul 2024 16:46:12 +0100 Subject: [PATCH 08/16] Rename 2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows to 2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md adds .md to file name --- ...ogeneous-simulation-solver-for-compressible-reacting-flows.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename content/research_papers/2024/{2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows => 2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md} (100%) diff --git a/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows b/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md similarity index 100% rename from content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows rename to content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md From 4572066c110610335d1de4686bd2273bfb951ae7 Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 23 Jul 2024 16:52:50 +0100 Subject: [PATCH 09/16] Create 2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md Adds, "Preliminary report: Initial evaluation of StdPar implementations on AMD GPUs for HPC" --- 
...par-implementations-on-amd-gpus-for-hpc.md | 26 +++++++++++++++++++ 1 file changed, 26 insertions(+) create mode 100644 content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md diff --git a/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md b/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md new file mode 100644 index 0000000..31d51dd --- /dev/null +++ b/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md @@ -0,0 +1,26 @@ +--- +contributor: max +date: '2024-01-05T08:08:10.490000+00:00' +title: 'Preliminary report: Initial evaluation of StdPar implementations on AMD GPUs for HPC' +external_url: https://arxiv.org/pdf/2401.02680 +authors: + - name: Wei-Chen Lin + - name: Simon McIntosh-Smith + - name: Tom Deakin +tags: + - sycl + - gpu + - hip +--- + +Recently, AMD platforms have not supported offloading C++17 PSTL (StdPar) programs to the GPU. Our previous work highlights +how StdPar is able to achieve good performance across NVIDIA and Intel GPU platforms. In that work, we acknowledged AMD’s past +effort such as HCC, which unfortunately is deprecated and does not support newer hardware platforms. Recent developments by AMD, Codeplay, +and AdaptiveCpp (previously known as hipSYCL or OpenSYCL) have enabled multiple paths for StdPar programs to run on AMD GPUs. +This informal report discusses our experiences and evaluation of currently available StdPar implementations for AMD GPUs. +We conduct benchmarks using our suite of HPC mini-apps with ports in many heterogeneous programming models, including StdPar. 
+We then compare the performance of StdPar, using all available StdPar compilers, to contemporary heterogeneous programming models +supported on AMD GPUs: HIP, OpenCL, Thrust, Kokkos, OpenMP, SYCL. Where appropriate, we discuss issues encountered and workarounds +applied during our evaluation. Finally, the StdPar model discussed in this report largely depends on Unified Shared Memory (USM) performance +and very few AMD GPUs have proper support for this feature. As such, this report demonstrates a proof-of-concept host-side userspace pagefault +solution for models that use the HIP API. We discuss performance improvements achieved with our solution using the same set of benchmarks. From f9fffa6a9b392cd052865f5b443c6f2897c2dd8e Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 13 Aug 2024 09:17:11 +0100 Subject: [PATCH 10/16] oneapi construction kit news image for article --- .../2024-07-30-oneapi-construction-kit.webp | Bin 0 -> 13950 bytes 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 static/images/news/2024-07-30-oneapi-construction-kit.webp diff --git a/static/images/news/2024-07-30-oneapi-construction-kit.webp b/static/images/news/2024-07-30-oneapi-construction-kit.webp new file mode 100644 index 0000000000000000000000000000000000000000..dd3339c812b366ef1c75c676fd2ec6c89e577eb2 GIT binary patch literal 13950 zcmb`u1ymMW+c!RU(B0DA-67o}0@4W5ozfsFDM%xYq?D8(B`ruuBOxj+DJrS-{U7wG z=Q+=L-|zpe_03vy&u^}M_1=5-wPEI#ilU;o5CG^a$ZG0oiWr~+0KfsB6L4T23aH4* zX?Ed&w*XF;xtp646c+%TT|C`yDaq0p7#h)_YywCC8h`;n0Tgpf4>uW2%^O$TzphvK z&+%;rfLz_Z;k?rIhyMQy;aORGSONe<6XdqA^00IU@pAxx=UBSAdjbIR6`kJ4)9ng_ zCqU>9Itb$ED{T7{=UicnpSbo1kIpSQkf#>_P%+KT-E9E?dmNm3|S6zv;348#Xt$_^s32-1ayA84D;08kV>BaC0*E z`*HLC^2gc93(VJ#g9-e_v~yR!1zrikyuEdFRlCBNAl`AYxTyhRd=MYogVFuOP;@&l zWnB=1{7?!@4>>TG05OPptjy)rKuiH*MHi==SN(lOx3IBSQ~@zAh<)unRdhgn4aCvb z9tzrj@#MR^-ulZHw9&?0PUbKAfVn$}e%e~_a@4(|U+LesvsU>_KMdR7PDdZ4g98-C 
z;BBvW6U5jc=JIgVzVZk30VeKmCwE2v!Q<|A>&kxykbcYBMM)OKU@S0i8+XO4*su6w zoIODFBX(H6oxRGHESLvaou{3SGKj(azk@+hpFmG!QgDc-aerThW zwY&m|!PsG&)-JkN@m}dUcJq|IvIWY*(cPSW%ngtZr?Ylax}wv8Sir+u`!89zo~OIc z6&>^wZsX>ud6h%ZPq@E>xvD0JL4V+hzzskSkOyc0FYvSgTmc8besrg42fX{)q6nA+ z?tm>|4RHQN`6Z$Mqs0aM)du{5&wvZazwdupHwSfEg8n*z=T-Z^(|=L_aLR!3-uO$_>z73u zoL}X!32KXxto_uf-xahAq{_rEMBTXQ!AWb7JAD7o-Lm9pISVveLcbP49K~<`MKNM+IiB*fGdJEjf#sU7Y7X=51#-4T&*ux zJOFT@^<&+D5D)*Mg)IVrL?;4)X#GP|?F4{l!r=68`9ot70RTJ<02pYo^m6z9sSkd2 zK?0u=IN(ep1*iaefCY?)4-f*b1JZy3pbBUJI)EW?2OO_<;5c^&e1HHT6o>?3fkfaD zkOkxcML-!)1=ImeKpW5v^Z`S_I4}*o2S?8u@Cn!jz5!Wuq zyYL|RLwFXv4BiOug};F>!*}7ANLWY|NbE>rNGeDMNVZ5mNRdcsNF_)ONUy-Tw1)Hz z85x-vnHgCaoQsCY4#@YAA0p=>*C2NxPav-%AEBV2kfN}oNT6t-n4@^2M51J(RHAgE zOrorz9HU~OQls*rDxw;px}e@iO+&3f?LeJG-9SAVfJT!%KU)^i$#j%E=wWH6e}{T2&*G&F6%fOlueM$ zjxC#Qj2+4@#BR@?%Ra$@#39Pz!coXE%ZbS;!|BcWlyjMjnCk}DeXb_1FWd~=M%+o< z{oI#4f;>(<#XRqM@p)Bw@AJOk-REQDv*gR>o8rgfSL6@oZ|2_C3!cr5T%kU&sf zFiP;H;DwN|kh@T|(3UWRu$gd<@T>@-h^ELxkpWR8QCZPo(KgW&F(ENeu{yD@*Ez2{ zUaz>mDb6fzEnY0XDnTbR${BuOJ_CYdj}Bt$SCB26!CC0!!DA;T);AX6#x zMV3d_UA95?yPTL@pj?+cOkPPoMt)QQUqM$POJPBgPSIBJsp77Zpwd01E@dQTHRUAb zSrtkZOOvVx(tOVsvy{;r653YsP}ck;XG7%qG4jgLla8INfPC#W6KE ztv5q5(>JRyJ2%%bFEl^0P_f9d*tL|iOt<`GC2f^zwPh_~oou~fBVm(lvuP_~n_|0V zCuNsrw_`77pJl)2pzM(6@ZC|vvD6Xar0-PYjOuLW+~R`o;^@-rO6BV7I^o9d7U{Nh zSM2ViySwgc?j;@&4-=1OPeQPeANFGLitt+Ymh{f{KK9Y`srSY6b@3hcWAl69x8bko z|KuM0p5?ul0ki?30n33hfdxU3Ad8@v!Suo5!D}H(A!VWHp^l*=_j&Fo-#-k~4|@?# z86Fb88le>NG!i?~J#secdQ@ICJlZ~b^a1~a%mzLdml21E zw}~H35K4HQh@9w>IFlrmRF;gF?4P`vqMp*2N|PFsdicon(O{ZDT3$MOx>x#A#*K{T zOoq(F%=0X}tm$mo?3x^koCi7IbFFeGA4@;3&ZEqW%{$Gv&wpE>RM1$+RG3zTT;yG} zUaVK#|3u_Tc?nrbY{^BbOX*VCt+L*7;qvkdii(7%(5GHcwBms{6Pw{7?8OVgL{dklJ}dbN5-U#Yzs=u_6hv6 z9FQ1j8x$LCc`fp~X-H_OaaeG;VMK7GVN__eaZGrud0ce-#f11o$E4I`_Z#^)eN!q^ zL(?~>CuVeK-p(4&F1@vSyE*4HxA)Hb-P!w)4@e(k=5goK7AO`97g-l;mV}ntm*tm- zSF~5&uUf8t`snfTd@Xz(Ydvj)dZT=kZ?kPnacg}0_V(H*w@+s~k)QEC=X_!L^8Bmx 
z*Wq1*-L*aUJ;eUQ1JZ-iLxIDWN18_q-<-dleUCXNJuW*DIq5&uKixRGY|JYYv-5`T| zi2R!u7S0xbHT}os=bjTJ{qp;tS1=lWez17@#}(Y!(+Ge&A>1TLE`)6oz5ta>QqI4@ z_=KL0&eJ!#79Y*l<=moXD_4t-t@F%u`T8VZQxpQ@EaehbS|8u0eo?e_ouu>e{vE!> z9z}+@=nrOg1kIOE{1gTjvwb)B`clt znWyEEENZSFcbrBEzxqBU+QHpX(BGxGFYxd*D4I?!s6__<_R(w9pRPfAfUB>3?OoQiz&j<^F##W(|`#@nZp9`%fRNVVZENT8?~&pKeHHmIM+02jyQJ z%VjlMRVMPAk|4ILtxzEgB3^~_6kY}N4`qKuiPk3)9|_Ekj1_;6Y)IMA8>HO%ldFGo zh&D>|l3)5^I@*l{mR!t(ENc#$m(yJTOO}!d`qI(k7~6KfJYU(rU8})G_3EXNw^-E; zgFF9IH`EiS*&|+Ly7tXSdwQ9uXL*>LK2@xECGkH-L9EHJZSgI!em+s^EM`CIiE@DB z0{MZ#EXvgJ*8`!{gv7qJ|IFr}1Q2U7F1_mat4A9W-?jCNZ^ zndqLZNQbIW50gDuP_JZC%gaNL3SfmbgnCsxKA!lFI+wgFLZrj zIZ>A2E1|6qeq4vRXh5z`5Y>rq5MerxaCuV`)+2wL)VBA|3TJxCT~C@uc6<(^@>qw#mKTb+Hfqm&wJ5ncC*h`n8cAbz9~gp@(mfZ=o*Z z*NOPFhoX%QIS}ucarTV@eoPgoq+bxm51s88IgbxiT1caeO;47#&X3Iz$&~)-ft%T#CRBucPR5fcR`m5x_@CPM-x0L#l#GtC6?PaZ@UPe2qYlpu^A*4 zyS^tDZLbE1toCJ$!Q9^HpUg;W{R;bU&y(Vo9l^VqpG4OZ_U0ZA)}m6rOS~%(P#_@} zdHrGB*CgoU`10mXt@kST{pgIJP%3>@yluL8o~MLRdIIg0)hbh>s&uOrftI{)4|b;Y zD=4z2*Czh&G~@88Is}LjN`2=Q%F@`Q;Ablc49qP|?TV9$$R#S(%Mcex9+YW2#Ju#J zX`Q0GK9}tle%kI7jHr*W4L#>E=>ATfjgqgxHhhOUUX`9Y`suaMGz2}1#+=>?bXC)L zol)=K9&*>Rx?E>u3Oc_H%mmepQ&fscZEBaAzwxk-9YWuiGVtyRKFFlD_-+x!Wba~*EOM~@9OL7XMAEL~isf>GIYPkW{_M!>vFYie zgD7>3y{NZ%#fB#M;ar99C*x3RNdh65<)b|{!Zv@Nn?HI@26Yi3(t5ss*%{1BvfJAY zp`BkmW9D#}I}JjWCX>N+UBQ1?H6&KpvmV(H#9OFSNT3PZzF)IAtx6TxZP9;j@~qlI z4fjm*NsEASd-lj!;@_UcaJ`6t#2Q3a>c-kvjRdIgj(odX{sg##aK~bNO1v89SFxkg zaOb&pm?oU2XrO5hl0J^I_v9hT)nEv@=yXX9rg$YaKXvMd^fp& zCe`{BF+`K~ruOEvT*f2n)T2Nuo=`kxlSf)&aJC^{=;UhDlu*wv8;mP~dnqv=T8WJ8 zy5>Cv7vbt!mEoseXzv9Y z;cVY0P^6Bfc$FV`V{nbT4u##!KMo);cETyRSMZWzQ(Va!aYWMuCyz1}CEq=)1S2R}%iz;+*+xm8oK%%-RG#%*; zR7ZxztichJYjz`hn?ug+=bDW9sUG`WAP}s?B}Iwn5ADi*p3;$I`M2LrN6gkB4_lhL z7@qu9b$}1wjQiv})=f*ZU+{8T?t`nch@zTC(G(~m}OEm1WBm!oB};$Qyqei=v zOjz{YkR$tZg3NI1{?ZuD^*qF6eO}8zu;0tzkFkq%X5yc zj$Ny&lsqI%6-Fv-twLbiDAWp;h*G`0>ZXC;Q|Z_E{J9jCMcE{+EpKK`;kwAby?VND zqH0H%@JpJKAXZN=`jG!N$N&C>)LAe>vKjgy+4xNL8mC$vFZA^*|Br+0v)TW>+WxL5 
ztvL^#y{^bL52bmk(0^ZsSF6ic?fbq&5DCIIoVWIGgY;i2PC3+#{~>}u>i_5B=6^5d zepi?OvbO!73ex{8k!yT!R?fC57XPbwAK42$!gh`QTILa88Yq{2#S{Vnmxw!tA+}?s zT`%9+-WCVV;5^+>`Aodn<`LG;`T4mw&T1acENiYU_RZg@7 z3Q5%=MO5x7qLepSh@u#2_v~?;5Vj@q9wwf-(au6|wn zgwfu46!>>&8Q`J(8|$2E<5!24?d`mkv^HN6$er~<4uaq^Hf3Uzh1idNwCPV0x4yQ(F{~&{Hp3Bm+``Uo^mi12|8}7zP_97 z8LC(GkwNdfV1gWUUFu{PL!aCiQjS}JC01UuR>RZO2g@JEOOu4~RJn&XPqr8^g5vy0 zgL_c{NN}STbLC=4DH6_QcPc6!YC-1s(fd$r;Ter%oU^aBcQX-Nf_-(<@qyzjW4t|< zv6+scM``jh#WJ{GrZMUqLOw__bHy|>Xd>13Exb1A#0mBWCR-aJu{r)ryGCM!d~Zqa zjOhor4`GaJ6?DdbrQV>80D#<*B&@w->?*BkqLB0#pFZ*Bbcjo_E(b+EYt9Sof{qq3 zP44zMWaN|Qj+Rzn+?@HOAo$K$13?rUk-S#pq4Ks%Z-81*i28Y*RjbF@x%yqDvE2*U zaS)dobJEJ~ekc z$x?5(V%&x>R{MSTtaS1HN0GW7!L8#*x|74^hl`%Ad5N{8#d-Hh|^RZ zyxJsV7t1h}vIA^+G>mUIzderc#uu+g_curz$=Y9)S7tLHAfa(@zzHR=Bc$qraCXRAKuLZ4HiBy^=UHq8(S$)^X z@QM#y)R3En)9^adfH-xJ=EMfX&QbG5m1j~6ksHd?Nnp&&cb_o0iKt$X-rz&VemdpK zJ!R!(f}E=7llMx`Y&aHa_^Uft7l74)n(Pla)73MYb}yC}7SNPfbj$9Dc9>Ffr7Fo2 zS&QAC!EH>hh&)T7v)8b?qkNtGwVk#YcG9>NRp=A96;G23vCBR=W2L)fg%5P14D=GR zV^0w*Va7Rh=e2CT`}rXb-07r*Z1yw(g(&n)+V7{*S?AetGz;rX^O9dns*>M*2VO*e}!)m^AAaXC;WB9`$I-A<@5axw+e5c0}ienOAhyIBoZ<1EBqRur1 zf!9wc*(q+ZA7>C6#Q4=jw~M}uDo`z28E28JXq)ety;cBhZUfl8dd=$Wj zDT^z()kM7*cI=kc(ztf8+v(ZzM9nW}bQ>&p-&#gB4#U?1MZ@tA2xV%rHQSxsE$*#l z&d{Y^-mP5hz#z-C>Ps*+Fi-6b)1T(@@^jeR`Y@2)=n=<-&Qw#Y^E@_*)sT4^i^@li@RxY@GR<$i598A2jchb>KoYyv#&*dQSHn~*r0dszNs}$w zlbtBENE?dCStQRSTm>-LR=esmBb&oNJjAPJG&|O{cw`_x{B91hw7q_e#~W0X9ATb* z9P!X9VBx*_7`p13SQ_~Tb0O81c;_qoV(3oa>TSwMzrY7vfda(v);y|P+NyTXy)LbB zobSVKeQaS(6gZ50Gw}E}zd7=2RAtJLjUuSH*MlCfsdkEl_%B-vbGRdy7$J}k*>7E# zZ1pg{c(%AFyznCa)>=o!PxpNaQ=l`_w_)5p(Xe-9o+uZ2o0X(y1*@udLgt$62eUbR z6Z<+W`;@!gWcdlHgI#=H0Ko0>IUgMjsm*4l(zi=~>XJ|N(buVC@Cu0bG}ksW_2;QGN40d7dTFC?G%-l3 zzLy($>Dx-KxgChA-lLMiS9eD*EE!BbTIfa(?d1=T9nt*o*^B9E zn7#3OQ!?Q(g@k_TE6J)iF;ckv0=x|zirHE;MqNUSUOipc$o--DT1DU5q*^h)DQT3v zSY2~Vo|6usWp_%qD`O=I$%>6yqZP+`cFMJ2?09pmV?b14fCmO6HS5#QZu6yNU{|kS zo8mp4-XLAdfBo9+d5Ba$mgx7bYjP72Bk+ycXfoNGr|b8@znb>Al=t7$!LXfrevz9s 
zVC+K7#*XGr+V;4+o7cfccN>#IY$-VO_M8AVLzq>c!~60hk+7wjtkF!hU7FNwQH9Vh z>E>IbYiL9>Et0`K`>wf%_Hxf6)=$H+k+X8E0w!Zp4Y49o=vFGdSd6QQD!F3I4{jT7N+r|nnYYRhVj%I&gZgP#UM&Z67RQ_E?GtOA0QO(8Vb1VG>@siuU#0qWV0dE8S8ee-3db^4^&OminVYR&)?w(m#U69S{Xtx+%uozwfe zdor~OrD-rM;l+2WIVLsX%Qt=$i)%G{O|myaXg*S|#RrS#?k1$xJHaoSBj_`tN44Azs|;UQ@b9#fPqOgS@-w%%bmm*OQV+#~U=A z4F!hE{$A;p$L1&Z6PQ!nL#oP}KHw`=8?bOHvpjI0dEry&kD#BIk>9opi%m%LhhPDl zh>Boi7IbvkQwG{HmR*5W(OLW(}vZTxgGIUBfy?#?V9&Cj_|DQM(`WmbpPn zkb3!%tAr8%Eh9ET-ZIBbP8zk7Y7YiHr$(+%%Ql-)q0XS zUj0G0#(DHnH_~2}ncd(uoq9K9Zk}XNA8E{V?Tp=UBOMh8z5Ugv^sY&h-rd`7C0{hQq+u z%x3k`JrBMIJGd9jM&U`w6d(N$AIxwC7|$R*j%p2vH*pe&Lpv@z^uFBF;dkSYuT|Vi zQeiN|zNL_oE1_L3gFtsx%@kouf8SHs22i1;M{bZ|DcRA?PPy5H zIak+@PhAG3-%{M}Mr#$+zW?0nu`~Lr)vQip!+j?5O?rZ&mmPdByz9~ z=lf=gFSc043KVlMCHBojGm_}`%B{^bQJr7TpgnrXknBv=;XWfCl5aWAG(L=EJF?f{ z#Wv19>@Nr465Gd;<1^fU$z$i-^jcH(wn*5^nz@3Z>P7$42VM^dan~?+M$aDNK2by4 z^sw$2_!a?+L-%}1uux}12t|pkO2xrBquCDa z#vR;!m3NlDoSaV|7vZ6ql`$@@;vFUhy)%-aR_yOqyz3irLt!9sSDjdGdl*3(I*fkV zkq2xx4%`)Ubrb=T{fpO0JFAlX5+ihLgIG1CHqR?xDAw#Was~AqA+qs5mOGXK=_bR^ zpS+%-evqWaKYK>KZ&!$q>nI_e!2J>CI&5O`na4|5)tdbNtl{&Ir*+59``XD_a+P}4^jNB{4g5t+d1)$Fa#hBHLXzdW(&S<_)n-(ZLKO)T$ zyt7^^`4ahC#Lbr@XAE+OMVMl#!YxbQ*OM;Lvalxl!q4&=i~(7rMSmTmmqi)~sNU;L z{JiPe)|^DvG7I6aYE86e&epv<69plF7W*gVi3=xjlNs;3%}?*BjHgRYd{Aj>Jc*L- zoy67WY_t0hV zyvcmZzJ(W<2$b5|52JcJJgR@ZuI{0zcnf@YCYv=ZhCy})UMe*oxr-4oLzv&9tM9&v zJkdqUpKl)0cx+GD!}q$bC!JT0Negl_g9!gEZe7|c4ZllVzV7?B4YH9$Yc6TO9edHb#C z>HCb+xas;(RTnyq+_(KN63KZTOytY>-PeX`1Vr|*a$4iEP6y?h*iTy!Mod@_9&z@@ zuXMib zu)f67J)v{Ceg1CJQz~-m>09fa*7Teh)?yQyP8rY>EQI7$W~RE4=@^Z*Il+k+Q_zIGmoz zW>I!w_JFw6NX2j=LTuHMq{K2D55L`XfLSL`spulZ(Z{XA(&XT)@#%CCmUev|!B8If z=M%Ak&Ha3znRfN{Svm2b5t_SUhGw6O<9kpd3ccbM6SFXGHq|{j(S0w)UICQ14#-As zc=MhlKzC;LxrdjYE4l6D1tneU8T(qJBs5y_R>Zs8*z<%->HLe|aK{^sP5h+Y^oDjSgmf_~Ux3xVr#>Jqq<%MzOu%aPXm)Iew)18@-_VQM$LXoZw)m@k@2RxwUeQ1Q zS_`va7mO#Z<2*>Z7|P!jhdSmj55ZEh{Pv{0U0>iWzZqA+NB2r70g;j{rnJ^6%j zB)D5gOv|+iYs;jLNh)TBhN?K9#FgOMNQOnkR(X>xVMD~sBggshY4(FW41_3Gn+RHM 
zdR=UuHvf8&0NOPN}F+k`oW zVLNjjhp7*rWce&ZnCfSc&5*3!6ECwzM&;@rnzY%ZNne*zol_tz_Vfzw>o@E(=F#x- zk4|RhcIs2r5=L%4R~1p>t>S-=w*T(=hLI^D^{6JBTg=o{Ps{WZDZl06EdNtwjxQs% z9|ssKGE#^saz_9Wh#J)krdO9LYwsj=V>N6fy2wpazhU_+R{9$+7)|wY8SN)<`7^9- zHG~wgGn?CnU=mu3F=*%r3tVfpEyfdNtJ5qfD%cLnTq4!Q?lP@8p5!s&igeEo7JHmn zT~}XT*3-Kys@~?&m}&5aOT+E5#J;2UA_nhXb;5$mRLPhluC~oH{M9#==pEAI(J-~U zUIH7Hv!bHvDl5eGJ4Gjgs7Ps%TV0DnDeVy^ORmLJcC$(`?u4jBm&woJ(sRYdFoSYk z5r^!BIHUynof}>b5#_<~k*S4{qyihorV8Cj=I2TthkRq>jg3^g)sq#BjA3J|cV(l6 z(=d?I{YU3FEmAFnX-}8$6WYsVb9cAfyu9O`?CdaosWs(;PVlNeCx`m3F5u5ib~aTm znz3pVa_gh1CzG5&y}ahg@&&x)O;*7=_59c_CFbgj8VEmElaOT*!F3*5U39;$&dXC7 zmS%a%Ua8}5ibr4k9x)An*}=DNtsZTTGw&FrLElCbDb-ivMZb(^f%*ltF>R_q~QtP*c8>m>QGDQ=1-TmaKE$n3P)i}GQxVYs3&Ur=%;+Gr9+gNohA!Os8G^McT+8gY(H`^mZU9Vy z=Dp6W8~MIF*QLi6N7LeMrm+lBWvq?%k{r|e35gw@ZoO)z=(6xIo569cxv|~_#kAR_ z8*fz8Mwg;fY7<6+94fk?Uyq9fcCMq?D}>bO^2Eui^yV_hKDx!=8@|3FKpY+9i#*a) zJo2d;q7y<<4V%bFI_1HT+^FFmd`WyoAWR7J#dANfkZsDjhme79Pyc)eN)kf; z8!JLse!Q0i0lx+UqU?dI*Q~GJle+q<;_65Q&#U{NS6Ff+fCAou-$MYt`Osh7@X37G zFAQA(+W;ADIhCsqOSCQBt*u==>|8w!)HP|kK~3VmV!qC9&eophG``MGE*@gO5_INP zt`^o}Aig4V)6x83@pP1+yAqOfb+@Jw;^N`rp_4?Xp`j6Xx3UqtC9n9GICv*PXXokZ zCdSR}8``Z7pxc9(N?25ovu{^@|f i?Eby+FF|qctI+?Jil2G@K?M^ni7w9lyVE4mf&UM7Aey%T literal 0 HcmV?d00001 From a85810464999912bae56becd67bf0fe7059e2fda Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 13 Aug 2024 09:18:43 +0100 Subject: [PATCH 11/16] Create 2024-07-30-oneapi-construction-kit-4.0-brings-risc-v-host-cpu-support.md --- ...on-kit-4.0-brings-risc-v-host-cpu-support.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) create mode 100644 content/news/2024/2024-07-30-oneapi-construction-kit-4.0-brings-risc-v-host-cpu-support.md diff --git a/content/news/2024/2024-07-30-oneapi-construction-kit-4.0-brings-risc-v-host-cpu-support.md b/content/news/2024/2024-07-30-oneapi-construction-kit-4.0-brings-risc-v-host-cpu-support.md new file mode 100644 index 0000000..9350f2b --- 
/dev/null +++ b/content/news/2024/2024-07-30-oneapi-construction-kit-4.0-brings-risc-v-host-cpu-support.md @@ -0,0 +1,17 @@ +--- +contributor: max +date: '2024-07-30T14:10:22.153253' +external_url: https://www.phoronix.com/news/oneAPI-Construction-Kit-4.0 +title: 'oneAPI Construction Kit 4.0 Brings RISC-V Host CPU Support' +image: ../../../static/images/news/2024-07-30-oneapi-construction-kit.webp +tags: + - oneapi + - sycl + - hpc + - portability +--- + +Last year the oneAPI Construction Kit was introduced by Intel-owned Codeplay Software +for bringing SYCL to new hardware even for hardware outside of Intel's offerings. One +of the early targets of this oneAPI Construction Kit support was for RISC-V processors +and now with today's release of oneAPI Construction Kit 4.0 there is finally RISC-V host CPU support. From 52b84d84feae3606cf12c7656f588988020b3802 Mon Sep 17 00:00:00 2001 From: Max <133135930+codeplaymax@users.noreply.github.com> Date: Tue, 13 Aug 2024 09:21:35 +0100 Subject: [PATCH 12/16] Create 2024-07-05-bring-your-code-to-riscv-accelerators-with-sycl.md --- ...ur-code-to-riscv-accelerators-with-sycl.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) create mode 100644 content/videos/2024/2024-07-05-bring-your-code-to-riscv-accelerators-with-sycl.md diff --git a/content/videos/2024/2024-07-05-bring-your-code-to-riscv-accelerators-with-sycl.md b/content/videos/2024/2024-07-05-bring-your-code-to-riscv-accelerators-with-sycl.md new file mode 100644 index 0000000..6b76bc0 --- /dev/null +++ b/content/videos/2024/2024-07-05-bring-your-code-to-riscv-accelerators-with-sycl.md @@ -0,0 +1,19 @@ +--- +contributor: max +date: '2024-07-05T14:16:16.441310' +title: 'Bring your code to RISC-V accelerators with SYCL' +external_url: https://www.youtube.com/watch?v=Jw_nEtYi2-k +type: presentation +tags: + - hpc + - risc-v +--- + +This talk will show attendees how to overcome proprietary code with RISC-V and SYCL. 
+They will learn how they can achieve code portability and adopt RISC-V hardware without +losing their existing work, for greater productivity. + +The talk will also highlight the ongoing research into pioneering applications for RISC-V, +funded by the EU Horizon programme. AERO and SYCLOPS are two such projects. AERO seeks +to enable the future heterogeneous EU cloud infrastructure, while SYCLOPS will bring +together the RISC-V and SYCL standards together into a single software stack for the first time. From dcac92e2272ad9498546b7a8a3fb275b327bad57 Mon Sep 17 00:00:00 2001 From: Scott Straughan Date: Tue, 13 Aug 2024 11:46:22 +0100 Subject: [PATCH 13/16] Tidied line length. --- ...-construction-kit-4.0-brings-risc-v-host-cpu-support.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/content/news/2024/2024-07-30-oneapi-construction-kit-4.0-brings-risc-v-host-cpu-support.md b/content/news/2024/2024-07-30-oneapi-construction-kit-4.0-brings-risc-v-host-cpu-support.md index 9350f2b..b74a09f 100644 --- a/content/news/2024/2024-07-30-oneapi-construction-kit-4.0-brings-risc-v-host-cpu-support.md +++ b/content/news/2024/2024-07-30-oneapi-construction-kit-4.0-brings-risc-v-host-cpu-support.md @@ -11,7 +11,6 @@ tags: - portability --- -Last year the oneAPI Construction Kit was introduced by Intel-owned Codeplay Software -for bringing SYCL to new hardware even for hardware outside of Intel's offerings. One -of the early targets of this oneAPI Construction Kit support was for RISC-V processors -and now with today's release of oneAPI Construction Kit 4.0 there is finally RISC-V host CPU support. +Last year the oneAPI Construction Kit was introduced by Intel-owned Codeplay Software for bringing SYCL to new hardware +even for hardware outside of Intel's offerings. One of the early targets of this oneAPI Construction Kit support was for +RISC-V processors and now with today's release of oneAPI Construction Kit 4.0 there is finally RISC-V host CPU support. 
From caa742435af07e96fe8b79c6b092bd59c7587c77 Mon Sep 17 00:00:00 2001 From: Scott Straughan Date: Tue, 13 Aug 2024 11:46:30 +0100 Subject: [PATCH 14/16] Fixed typos. --- ...pen-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md b/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md index cc71f2a..f509271 100644 --- a/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md +++ b/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md @@ -16,12 +16,12 @@ tags: - hip --- -Computational platforms for high-performance scientific applications are becoming more heterogenous, including hardware -accelerators such as multiple GPUs. Applications in a wide variety of scientific fields require an efcient and careful +Computational platforms for high-performance scientific applications are becoming more heterogeneous, including hardware +accelerators such as multiple GPUs. Applications in a wide variety of scientific fields require an efficient and careful management of the computational resources of this type of hardware to obtain the best possible performance. However, there are currently different GPU vendors, architectures and families that can be found in heterogeneous clusters or machines. Programming with the vendor provided languages or frameworks, and optimizing for specific devices, may become -cumbersome and compromise porta-bility to other systems. To overcome this problem, several proposals for high-level +cumbersome and compromise portability to other systems. 
To overcome this problem, several proposals for high-level heterogeneous programming have appeared, trying to reduce the development effort and increase functional and performance portability, specifically when using GPU hardware accelerators. This paper evaluates the SYCL programming model, using the Open SYCL compiler, from two different perspectives: The performance it offers when dealing with single or multiple GPU devices From 632fbc3259d3d66ae53d7f85207189a920329908 Mon Sep 17 00:00:00 2001 From: Scott Straughan Date: Tue, 13 Aug 2024 11:53:19 +0100 Subject: [PATCH 15/16] Fixed line length overflowing and some typos. --- ...terogeneous-gpu-systems-a-case-of-study.md | 31 ++++++++++--------- ...implementation-of-crk-hacc-for-exascale.md | 20 ++++++------ ...par-implementations-on-amd-gpus-for-hpc.md | 24 +++++++------- ...l-a-hep-case-study-with-root-rdataframe.md | 21 +++++++------ ...-solver-for-compressible-reacting-flows.md | 28 +++++++++-------- ...ty-on-the-ligen-drug-discovery-pipeline.md | 25 +++++++++------ ...ur-code-to-riscv-accelerators-with-sycl.md | 13 ++++---- 7 files changed, 87 insertions(+), 75 deletions(-) diff --git a/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md b/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md index f509271..2ece7fb 100644 --- a/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md +++ b/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md @@ -16,18 +16,19 @@ tags: - hip --- -Computational platforms for high-performance scientific applications are becoming more heterogeneous, including hardware -accelerators such as multiple GPUs. Applications in a wide variety of scientific fields require an efficient and careful -management of the computational resources of this type of hardware to obtain the best possible performance. 
However, -there are currently different GPU vendors, architectures and families that can be found in heterogeneous clusters or -machines. Programming with the vendor provided languages or frameworks, and optimizing for specific devices, may become -cumbersome and compromise portability to other systems. To overcome this problem, several proposals for high-level -heterogeneous programming have appeared, trying to reduce the development effort and increase functional and performance -portability, specifically when using GPU hardware accelerators. This paper evaluates the SYCL programming model, using the -Open SYCL compiler, from two different perspectives: The performance it offers when dealing with single or multiple GPU devices -from the same or different vendors, and the development effort required to implement the code. We use as case of study the Finite Time Lyapunov Exponent -calculation over two real-world scenarios and compare the performance and the development effort of its Open SYCL-based version against -the equivalent versions that use CUDA or HIP. Based on the experimental results, we observe that the use of SYCL does not lead to a -remarkable overhead in terms of the GPU kernels execution time. In general terms, the Open SYCL development effort for the host code -is lower than that observed with CUDA or HIP. Moreover, the SYCL version can take advantage of both CUDA and AMD GPU devices simultaneously -much easier than directly using the vendor-specific programming solutions. +Computational platforms for high-performance scientific applications are becoming more heterogeneous, including hardware +accelerators such as multiple GPUs. Applications in a wide variety of scientific fields require an efficient and careful +management of the computational resources of this type of hardware to obtain the best possible performance. 
However, +there are currently different GPU vendors, architectures and families that can be found in heterogeneous clusters or +machines. Programming with the vendor provided languages or frameworks, and optimizing for specific devices, may become +cumbersome and compromise portability to other systems. To overcome this problem, several proposals for high-level +heterogeneous programming have appeared, trying to reduce the development effort and increase functional and performance +portability, specifically when using GPU hardware accelerators. This paper evaluates the SYCL programming model, using +the Open SYCL compiler, from two different perspectives: The performance it offers when dealing with single or multiple +GPU devices from the same or different vendors, and the development effort required to implement the code. We use as +case of study the Finite Time Lyapunov Exponent calculation over two real-world scenarios and compare the performance +and the development effort of its Open SYCL-based version against the equivalent versions that use CUDA or HIP. Based on +the experimental results, we observe that the use of SYCL does not lead to a remarkable overhead in terms of the GPU +kernels execution time. In general terms, the Open SYCL development effort for the host code is lower than that observed +with CUDA or HIP. Moreover, the SYCL version can take advantage of both CUDA and AMD GPU devices simultaneously much +easier than directly using the vendor-specific programming solutions. 
diff --git a/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md b/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md index 3fc7e98..3f70819 100644 --- a/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md +++ b/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md @@ -18,12 +18,14 @@ tags: - heterogeneous-programming --- -The first generation of exascale systems will include a variety of machine architectures, featuring GPUs from multiple vendors. -As a result, many developers are interested in adopting portable pro- gramming models to avoid maintaining multiple versions of -their code. It is necessary to document experiences with such program- ming models to assist developers in understanding the advantages and -disadvantages of different approaches. -To this end, this paper evaluates the performance portability of a SYCL implementation of a large-scale cosmology application (CRK-HACC) -running on GPUs from three different vendors: AMD, Intel, and NVIDIA. We detail the process of migrating the original code from CUDA to -SYCL and show that specializing kernels for specific targets can greatly improve performance portability with- out significantly impacting -programmer productivity. The SYCL version of CRK-HACC achieves a performance portability of 0.96 with a code divergence of almost 0, -demonstrating that SYCL is a viable programming model for performance-portable applications. +The first generation of exascale systems will include a variety of machine architectures, featuring GPUs from multiple +vendors. As a result, many developers are interested in adopting portable programming models to avoid maintaining +multiple versions of their code. 
It is necessary to document experiences with such programming models to assist +developers in understanding the advantages and disadvantages of different approaches. + +To this end, this paper evaluates the performance portability of a SYCL implementation of a large-scale cosmology +application (CRK-HACC) running on GPUs from three different vendors: AMD, Intel, and NVIDIA. We detail the process of +migrating the original code from CUDA to SYCL and show that specializing kernels for specific targets can greatly +improve performance portability without significantly impacting programmer productivity. The SYCL version of CRK-HACC +achieves a performance portability of 0.96 with a code divergence of almost 0, demonstrating that SYCL is a viable +programming model for performance-portable applications. diff --git a/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md b/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md index 31d51dd..5b0f80b 100644 --- a/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md +++ b/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md @@ -13,14 +13,16 @@ tags: - hip --- -Recently, AMD platforms have not supported offloading C++17 PSTL (StdPar) programs to the GPU. Our previous work highlights -how StdPar is able to achieve good performance across NVIDIA and Intel GPU platforms. In that work, we acknowledged AMD’s past -effort such as HCC, which unfortunately is deprecated and does not support newer hardware platforms. Recent developments by AMD, Codeplay, -and AdaptiveCpp (previously known as hipSYCL or OpenSYCL) have enabled multiple paths for StdPar programs to run on AMD GPUs. 
-This informal report discusses our experiences and evaluation of currently available StdPar implementations for AMD GPUs. -We conduct benchmarks using our suite of HPC mini-apps with ports in many heterogeneous programming models, including StdPar. -We then compare the performance of StdPar, using all available StdPar compilers, to contemporary heterogeneous programming models -supported on AMD GPUs: HIP, OpenCL, Thrust, Kokkos, OpenMP, SYCL. Where appropriate, we discuss issues encountered and workarounds -applied during our evaluation. Finally, the StdPar model discussed in this report largely depends on Unified Shared Memory (USM) performance -and very few AMD GPUs have proper support for this feature. As such, this report demonstrates a proof-of-concept host-side userspace pagefault -solution for models that use the HIP API. We discuss performance improvements achieved with our solution using the same set of benchmarks. +Recently, AMD platforms have not supported offloading C++17 PSTL (StdPar) programs to the GPU. Our previous work +highlights how StdPar is able to achieve good performance across NVIDIA and Intel GPU platforms. In that work, we +acknowledged AMD’s past effort such as HCC, which unfortunately is deprecated and does not support newer hardware +platforms. Recent developments by AMD, Codeplay, and AdaptiveCpp (previously known as hipSYCL or OpenSYCL) have enabled +multiple paths for StdPar programs to run on AMD GPUs. This informal report discusses our experiences and evaluation of +currently available StdPar implementations for AMD GPUs. We conduct benchmarks using our suite of HPC mini-apps with +ports in many heterogeneous programming models, including StdPar. We then compare the performance of StdPar, using all +available StdPar compilers, to contemporary heterogeneous programming models supported on AMD GPUs: HIP, OpenCL, Thrust, +Kokkos, OpenMP, SYCL. 
Where appropriate, we discuss issues encountered and workarounds applied during our evaluation. +Finally, the StdPar model discussed in this report largely depends on Unified Shared Memory (USM) performance and very +few AMD GPUs have proper support for this feature. As such, this report demonstrates a proof-of-concept host-side +userspace pagefault solution for models that use the HIP API. We discuss performance improvements achieved with our +solution using the same set of benchmarks. diff --git a/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md b/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md index 20c1f61..0c3d711 100644 --- a/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md +++ b/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md @@ -14,13 +14,14 @@ tags: - heterogeneous-programming --- -The world’s largest particle accelerator, located at CERN, produces petabytes of data that need to be analysed efficiently, -to study the fundamental structures of our universe. ROOT is an open-source C++ data analysis framework, developed for this -purpose. Its high- level data analysis interface, RDataFrame, currently only supports CPU parallelism. Given the increasing -heterogeneity in computing facilities, it becomes crucial to efficiently support GPGPUs to take advantage of the available -resources. SYCL allows for a single-source implementation, which enables support for different architectures. In this paper, -we describe a CUDA implementation and the migration process to SYCL, focusing on a core high energy physics operation in -RDataFrame – histogramming. We detail the challenges that we faced when integrating SYCL into a large and complex code base. 
-Furthermore, we perform an extensive comparative performance analysis of two SYCL compilers, AdaptiveCpp and DPC++, and the -reference CUDA implementation. We highlight the performance bottlenecks that we encountered, and the methodology used to detect -these. Based on our findings, we provide actionable insights for developers of SYCL applications. +The world’s largest particle accelerator, located at CERN, produces petabytes of data that need to be analysed +efficiently, to study the fundamental structures of our universe. ROOT is an open-source C++ data analysis framework, +developed for this purpose. Its high-level data analysis interface, RDataFrame, currently only supports CPU parallelism. +Given the increasing heterogeneity in computing facilities, it becomes crucial to efficiently support GPGPUs to take +advantage of the available resources. SYCL allows for a single-source implementation, which enables support for +different architectures. In this paper, we describe a CUDA implementation and the migration process to SYCL, focusing on +a core high energy physics operation in RDataFrame – histogramming. We detail the challenges that we faced when +integrating SYCL into a large and complex code base. Furthermore, we perform an extensive comparative performance +analysis of two SYCL compilers, AdaptiveCpp and DPC++, and the reference CUDA implementation. We highlight the +performance bottlenecks that we encountered, and the methodology used to detect these. Based on our findings, we provide +actionable insights for developers of SYCL applications. 
diff --git a/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md b/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md index 4f5fca7..844e2d7 100644 --- a/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md +++ b/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md @@ -15,16 +15,18 @@ tags: --- We present a cross-architecture high-order heterogeneous Navier-Stokes simulation solver, XFluids, for compressible -reacting multicomponent flows on different platforms. The multi-component reacting flows are ubiquitous in many scientific -and engineering applications, while their numerical simulations are usually time-consuming to capture the underlying -multiscale features. Although heterogeneous accelerated computing is significantly beneficial for large-scale simulations -of these flows, effective utilization of various heterogeneous accelerators with different architectures and programming -models in the market remains a challenge. To address this, we develop XFluids by SYCL, to perform acceleration directly -targeted to different devices, without translating any source code. A variety of optimization techniques have been proposed -to increase the computational performance of XFluids, including adaptive range assignment, partial eigensystem reconstruction, -hotspot device function optimizations, etc. This solver has been open-sourced, and tested on multiple GPUs from different mainstream -vendors, indicating high portability. 
Through various benchmark cases, the accuracy of XFluids is demonstrated, with approximately -no efficiency loss compared to existing GPU programming models, such as CUDA and HIP. In addition, the MPI library is used to extend -the solver to multi-GPU platforms, with the GPU-enabled MPI supported. With this, the weak scaling of XFluids for multi-GPU devices -is larger than 95% for 1024 GPUs. Finally, we simulate both the inert and reactive multicomponent shock-bubble interaction problems -with high-resolution meshes, to investigate the reacting effects on the mixing, vortex stretching, and shape deformation of the bubble evolution. +reacting multi-component flows on different platforms. The multi-component reacting flows are ubiquitous in many +scientific and engineering applications, while their numerical simulations are usually time-consuming to capture the +underlying multiscale features. Although heterogeneous accelerated computing is significantly beneficial for large-scale +simulations of these flows, effective utilization of various heterogeneous accelerators with different architectures and +programming models in the market remains a challenge. To address this, we develop XFluids by SYCL, to perform +acceleration directly targeted to different devices, without translating any source code. A variety of optimization +techniques have been proposed to increase the computational performance of XFluids, including adaptive range assignment, +partial eigen-system reconstruction, hotspot device function optimizations, etc. This solver has been open-sourced, and +tested on multiple GPUs from different mainstream vendors, indicating high portability. Through various benchmark cases, +the accuracy of XFluids is demonstrated, with approximately no efficiency loss compared to existing GPU programming +models, such as CUDA and HIP. In addition, the MPI library is used to extend the solver to multi-GPU platforms, with the +GPU-enabled MPI supported. 
With this, the weak scaling of XFluids for multi-GPU devices is larger than 95% for 1024 +GPUs. Finally, we simulate both the inert and reactive multi-component shock-bubble interaction problems with +high-resolution meshes, to investigate the reacting effects on the mixing, vortex stretching, and shape deformation of +the bubble evolution. diff --git a/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md b/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md index 9afc74f..30f4f77 100644 --- a/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md +++ b/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md @@ -21,13 +21,18 @@ tags: - hip --- -In recent years, there has been a growing interest in developing high-performance implementations of drug discovery processing software. To target modern GPU architectures, -such applications are mostly written in proprietary languages such as CUDA or HIP. However, with the increasing heterogeneity of modern HPC systems and the availability of -accelerators from multiple hardware vendors, it has become critical to be able to efficiently execute drug discovery pipelines on multiple large-scale computing systems, -with the ultimate goal of working on urgent computing scenarios. This article presents the challenges of migrating LiGen, an industrial drug discovery software pipeline, -from CUDA to the SYCL programming model, an industry standard based on C++ that enables heterogeneous computing. We perform a structured analysis of the performance portability -of the SYCL LiGen platform, focusing on different aspects of the approach from different perspectives. 
First, we analyze the performance portability provided by the high-level semantics -of SYCL, including the most recent group algorithms and subgroups of SYCL 2020. Second, we analyze how low-level aspects such as kernel occupancy and register pressure affect -the performance portability of the overall application. The experimental evaluation is performed on two different versions of LiGen, implementing two different parallelization -patterns, by comparing them with a manually optimized CUDA version, and by evaluating performance portability using both known and ad hoc metrics. The results show that, -thanks to the combination of high-level SYCL semantics and some manual tuning, LiGen achieves native-comparable performance on NVIDIA, while also running on AMD GPUs. +In recent years, there has been a growing interest in developing high-performance implementations of drug discovery +processing software. To target modern GPU architectures, such applications are mostly written in proprietary languages +such as CUDA or HIP. However, with the increasing heterogeneity of modern HPC systems and the availability of +accelerators from multiple hardware vendors, it has become critical to be able to efficiently execute drug discovery +pipelines on multiple large-scale computing systems, with the ultimate goal of working on urgent computing scenarios. +This article presents the challenges of migrating LiGen, an industrial drug discovery software pipeline, from CUDA to +the SYCL programming model, an industry standard based on C++ that enables heterogeneous computing. We perform a +structured analysis of the performance portability of the SYCL LiGen platform, focusing on different aspects of the +approach from different perspectives. First, we analyze the performance portability provided by the high-level semantics +of SYCL, including the most recent group algorithms and subgroups of SYCL 2020. 
Second, we analyze how low-level aspects +such as kernel occupancy and register pressure affect the performance portability of the overall application. The +experimental evaluation is performed on two different versions of LiGen, implementing two different parallelization +patterns, by comparing them with a manually optimized CUDA version, and by evaluating performance portability using both +known and ad hoc metrics. The results show that, thanks to the combination of high-level SYCL semantics and some manual +tuning, LiGen achieves native-comparable performance on NVIDIA, while also running on AMD GPUs. diff --git a/content/videos/2024/2024-07-05-bring-your-code-to-riscv-accelerators-with-sycl.md b/content/videos/2024/2024-07-05-bring-your-code-to-riscv-accelerators-with-sycl.md index 6b76bc0..9358678 100644 --- a/content/videos/2024/2024-07-05-bring-your-code-to-riscv-accelerators-with-sycl.md +++ b/content/videos/2024/2024-07-05-bring-your-code-to-riscv-accelerators-with-sycl.md @@ -9,11 +9,10 @@ tags: - risc-v --- -This talk will show attendees how to overcome proprietary code with RISC-V and SYCL. -They will learn how they can achieve code portability and adopt RISC-V hardware without -losing their existing work, for greater productivity. +This talk will show attendees how to overcome proprietary code with RISC-V and SYCL. They will learn how they can +achieve code portability and adopt RISC-V hardware without losing their existing work, for greater productivity. -The talk will also highlight the ongoing research into pioneering applications for RISC-V, -funded by the EU Horizon programme. AERO and SYCLOPS are two such projects. AERO seeks -to enable the future heterogeneous EU cloud infrastructure, while SYCLOPS will bring -together the RISC-V and SYCL standards together into a single software stack for the first time. +The talk will also highlight the ongoing research into pioneering applications for RISC-V, funded by the EU Horizon +programme. 
AERO and SYCLOPS are two such projects. AERO seeks to enable the future heterogeneous EU cloud +infrastructure, while SYCLOPS will bring together the RISC-V and SYCL standards together into a single software stack +for the first time. From 6a00cc4c04da6ba7dd58cbcfd1b587912886e47f Mon Sep 17 00:00:00 2001 From: Scott Straughan Date: Tue, 13 Aug 2024 11:53:45 +0100 Subject: [PATCH 16/16] Removed needless timezone offset. --- ...01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md | 2 +- ...nce-portable-sycl-implementation-of-crk-hacc-for-exascale.md | 2 +- ...-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md | 2 +- ...rating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md | 2 +- ...geneous-simulation-solver-for-compressible-reacting-flows.md | 2 +- ...formance-portability-on-the-ligen-drug-discovery-pipeline.md | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-) diff --git a/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md b/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md index 2ece7fb..b7f0c04 100644 --- a/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md +++ b/content/research_papers/2023/2023-10-01-open-sycl-on-heterogeneous-gpu-systems-a-case-of-study.md @@ -1,6 +1,6 @@ --- contributor: max -date: '2023-10-01T08:08:10.490000+00:00' +date: '2023-10-01T08:08:10.490000' title: 'Open SYCL on heterogeneous GPU systems: A case of study' external_url: https://arxiv.org/ftp/arxiv/papers/2310/2310.06947.pdf authors: diff --git a/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md b/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md index 3f70819..749c39b 100644 --- a/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md +++ 
b/content/research_papers/2023/2023-10-24-a-performance-portable-sycl-implementation-of-crk-hacc-for-exascale.md @@ -1,6 +1,6 @@ --- contributor: max -date: '2023-10-24T08:08:10.490000+00:00' +date: '2023-10-24T08:08:10.490000' title: 'A Performance-Portable SYCL Implementation of CRK-HACC for Exascale' external_url: https://arxiv.org/pdf/2310.16122 authors: diff --git a/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md b/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md index 5b0f80b..5955b9d 100644 --- a/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md +++ b/content/research_papers/2024/2024-01-05-preliminary-report-initial-evaluation-of-stdpar-implementations-on-amd-gpus-for-hpc.md @@ -1,6 +1,6 @@ --- contributor: max -date: '2024-01-05T08:08:10.490000+00:00' +date: '2024-01-05T08:08:10.490000' title: 'Preliminary report: Initial evaluation of StdPar implementations on AMD GPUs for HPC' external_url: https://arxiv.org/pdf/2401.02680 authors: diff --git a/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md b/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md index 0c3d711..c962ce8 100644 --- a/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md +++ b/content/research_papers/2024/2024-01-24-lessons-learned-migrating-cuda-to-sycl-a-hep-case-study-with-root-rdataframe.md @@ -1,6 +1,6 @@ --- contributor: max -date: '2024-01-24T08:08:10.490000+00:00' +date: '2024-01-24T08:08:10.490000' title: 'Lessons Learned Migrating CUDA to SYCL: A HEP Case Study with ROOT RDataFrame' external_url: https://arxiv.org/pdf/2401.13310 authors: diff --git 
a/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md b/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md index 844e2d7..3ff3da2 100644 --- a/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md +++ b/content/research_papers/2024/2024-03-09-xfluids-a-sycl-based-unified-cross-architecture-heterogeneous-simulation-solver-for-compressible-reacting-flows.md @@ -1,6 +1,6 @@ --- contributor: max -date: '2024-03-09T08:08:10.490000+00:00' +date: '2024-03-09T08:08:10.490000' title: 'XFLUIDS: A SYCL-based unified cross-architecture heterogeneous simulation solver for compressible reacting flows' external_url: https://arxiv.org/abs/2403.05910 authors: diff --git a/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md b/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md index 30f4f77..84bf470 100644 --- a/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md +++ b/content/research_papers/2024/2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md @@ -1,6 +1,6 @@ --- contributor: max -date: '2024-04-16T08:08:10.490000+00:00' +date: '2024-04-16T08:08:10.490000' title: 'Enabling performance portability on the LiGen drug discovery pipeline' external_url: https://www.sciencedirect.com/science/article/pii/S0167739X24001195 authors: