Commit 3d13dcf

Create 2024-04-16-enabling-performance-portability-on-the-ligen-drug-discovery-pipeline.md
adds, "Enabling performance portability on the LiGen drug discovery pipeline"
1 parent b5c9826 commit 3d13dcf

1 file changed: +33 -0 lines changed

@@ -0,0 +1,33 @@
+---
+contributor: max
+date: '2024-04-16T08:08:10.490000+00:00'
+title: 'Enabling performance portability on the LiGen drug discovery pipeline'
+external_url: https://www.sciencedirect.com/science/article/pii/S0167739X24001195
+authors:
+- name: Luigi Crisci
+- name: Lorenzo Carpentieri
+- name: Biagio Cosenza
+- name: Leon Bogdanović
+- name: Gianmarco Accordi
+- name: Davide Gadioli
+- name: Emanuele Vitali
+- name: Gianluca Palermo
+- name: Andrea Rosario Beccari
+tags:
+- gpu
+- sycl
+- hpc
+- cuda
+- hip
+---
+
+In recent years, there has been a growing interest in developing high-performance implementations of drug discovery processing software. To target modern GPU architectures,
+such applications are mostly written in proprietary languages such as CUDA or HIP. However, with the increasing heterogeneity of modern HPC systems and the availability of
+accelerators from multiple hardware vendors, it has become critical to be able to efficiently execute drug discovery pipelines on multiple large-scale computing systems,
+with the ultimate goal of working on urgent computing scenarios. This article presents the challenges of migrating LiGen, an industrial drug discovery software pipeline,
+from CUDA to the SYCL programming model, an industry standard based on C++ that enables heterogeneous computing. We perform a structured analysis of the performance portability
+of the SYCL LiGen platform, focusing on different aspects of the approach from different perspectives. First, we analyze the performance portability provided by the high-level semantics
+of SYCL, including the most recent group algorithms and subgroups of SYCL 2020. Second, we analyze how low-level aspects such as kernel occupancy and register pressure affect
+the performance portability of the overall application. The experimental evaluation is performed on two different versions of LiGen, implementing two different parallelization
+patterns, by comparing them with a manually optimized CUDA version, and by evaluating performance portability using both known and ad hoc metrics. The results show that,
+thanks to the combination of high-level SYCL semantics and some manual tuning, LiGen achieves native-comparable performance on NVIDIA, while also running on AMD GPUs.
