Add vectorisation support (AVX, OpenMP SIMD) #827

toretto-uk · 2025-08-17T17:05:15Z

Overview

This PR improves vectorisation in the performance-critical collision-step routines (e.g., CalculateDensityAndMomentum, CalculateFeq). It adds two approaches: compiler-guided vectorisation via OpenMP SIMD directives, and explicit 256-bit AVX intrinsics.
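
To illustrate the difference between the two approaches, here is a minimal sketch of a density-and-momentum reduction over the distribution values of a single lattice site. It is not the code from this PR: the `Moments` struct, the function names, and the flat-array layout (`f` for the distributions, `cx`/`cy`/`cz` for the lattice direction components stored as doubles) are assumptions made for illustration. The AVX path is guarded by `__AVX__` and falls back to the OpenMP SIMD path when the compiler is not targeting AVX.

```cpp
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical result type; HemeLB's real routines use their own signatures.
struct Moments {
  double density;
  double mx, my, mz;
};

// Approach 1: compiler-guided vectorisation with an OpenMP SIMD reduction.
// Without -fopenmp or -fopenmp-simd the pragma is ignored and the loop
// is left to the compiler's normal optimisation.
inline Moments MomentsOmpSimd(const double* f, const double* cx,
                              const double* cy, const double* cz,
                              std::size_t q) {
  double rho = 0.0, px = 0.0, py = 0.0, pz = 0.0;
#pragma omp simd reduction(+ : rho, px, py, pz)
  for (std::size_t i = 0; i < q; ++i) {
    rho += f[i];
    px += f[i] * cx[i];
    py += f[i] * cy[i];
    pz += f[i] * cz[i];
  }
  return {rho, px, py, pz};
}

#if defined(__AVX__)
// Sum the four double lanes of a 256-bit register.
inline double HorizontalSum(__m256d v) {
  __m128d lo = _mm256_castpd256_pd128(v);
  __m128d hi = _mm256_extractf128_pd(v, 1);
  __m128d pair = _mm_add_pd(lo, hi);
  __m128d high = _mm_unpackhi_pd(pair, pair);
  return _mm_cvtsd_f64(_mm_add_sd(pair, high));
}
#endif

// Approach 2: explicit 256-bit AVX intrinsics, four doubles per iteration,
// with a scalar loop for the remaining q % 4 directions.
inline Moments MomentsAvx(const double* f, const double* cx,
                          const double* cy, const double* cz,
                          std::size_t q) {
#if defined(__AVX__)
  __m256d rho = _mm256_setzero_pd();
  __m256d px = _mm256_setzero_pd();
  __m256d py = _mm256_setzero_pd();
  __m256d pz = _mm256_setzero_pd();
  std::size_t i = 0;
  for (; i + 4 <= q; i += 4) {
    const __m256d fi = _mm256_loadu_pd(f + i);  // unaligned load
    rho = _mm256_add_pd(rho, fi);
    px = _mm256_add_pd(px, _mm256_mul_pd(fi, _mm256_loadu_pd(cx + i)));
    py = _mm256_add_pd(py, _mm256_mul_pd(fi, _mm256_loadu_pd(cy + i)));
    pz = _mm256_add_pd(pz, _mm256_mul_pd(fi, _mm256_loadu_pd(cz + i)));
  }
  Moments m{HorizontalSum(rho), HorizontalSum(px), HorizontalSum(py),
            HorizontalSum(pz)};
  for (; i < q; ++i) {  // scalar tail
    m.density += f[i];
    m.mx += f[i] * cx[i];
    m.my += f[i] * cy[i];
    m.mz += f[i] * cz[i];
  }
  return m;
#else
  // Compiler is not targeting AVX: reuse the OpenMP SIMD path.
  return MomentsOmpSimd(f, cx, cy, cz, q);
#endif
}
```

The OpenMP SIMD variant stays close to the scalar loop and leaves instruction selection to the compiler. The intrinsics variant fixes the vector width and the handling of the tail by hand, which makes its behaviour less dependent on the compiler but ties the code to x86.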

Both approaches are controlled by build options and are disabled by default: -DHEMELB_USE_AVX=ON/OFF enables explicit 256-bit AVX vectorisation, and -DHEMELB_USE_OPENMP_SIMD=ON/OFF enables OpenMP SIMD.
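
As a rough sketch of how such build options are commonly consumed in C++ sources, the snippet below selects a code path with the preprocessor. It assumes that each CMake option is forwarded to the compiler as a preprocessor definition of the same name, and that AVX takes precedence when both are enabled; neither assumption is confirmed by this PR. The function name `SelectedCollisionKernel` is hypothetical.

```cpp
#include <string>

// Assumption: HEMELB_USE_AVX and HEMELB_USE_OPENMP_SIMD reach the compiler
// as preprocessor definitions with the same names as the CMake options.
// Hypothetical helper reporting which collision code path was compiled in.
inline std::string SelectedCollisionKernel() {
#if defined(HEMELB_USE_AVX)
  return "avx";          // explicit 256-bit AVX intrinsics
#elif defined(HEMELB_USE_OPENMP_SIMD)
  return "openmp-simd";  // compiler-guided vectorisation
#else
  return "baseline";     // whatever the build selects with both options OFF
#endif
}
```

With both options left at their default of OFF, this helper would report the baseline path.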

Results

Across all systems and compilers tested, the AVX version consistently provides the best performance and scalability, outperforming the default SSE3 version. With GNU compilers on both ARCHER2 and Cirrus, the OpenMP SIMD version brings only modest gains over the non-vectorised version and performs worse than the explicit SSE3 version. With Cray compilers on ARCHER2, however, it matches the performance of the AVX version while offering better code maintainability and portability across platforms.

Note: Compiling HemeLB with the current Cray compilers (cce/16.0.1) on ARCHER2 required the following minor workarounds:

For the full performance comparison, see the plots below.

ARCHER2

Figure 1: Vectorisation: speedup for the retina dataset (40,000 time steps) on ARCHER2 using GNU compilers, 128 execution units per node.

Figure 2: Vectorisation: speedup for the retina dataset (40,000 time steps) on ARCHER2 using Cray compilers, 128 execution units per node.

Cirrus

Figure 3: Vectorisation: speedup for the retina dataset (40,000 time steps) on Cirrus using GNU compilers, 128 execution units per node.

toretto-uk added 3 commits August 12, 2025 21:50

Add vectorisation support (AVX, OpenMP SIMD)

e69b446

Update compile_options.yml

8efd2ca

Update CMakeOptions.md

2bc087a

toretto-uk marked this pull request as ready for review September 16, 2025 07:39
