
Building on NERSC Perlmutter

Cameron Smith edited this page Nov 16, 2021 · 10 revisions

The following was tested with master@eac504c0.

environment setup

With the default cuda/11.3 module, compilation hangs on Omega_h_adj.cpp; cuda/11.1.1 does not have this problem. cuda 11.2 cannot be used either, because a regression in that release prevents it from building Omega_h.

module load cmake/3.20.5
module load PrgEnv-gnu
module load cpe-cuda
module swap cuda cuda/11.1.1
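Since the build outcome depends on which cuda module ends up loaded, it can help to check the version before configuring. The following is a minimal sketch, not part of Omega_h or the NERSC modules: the function name `cuda_version_ok` is hypothetical, and the accepted and rejected versions are only those reported on this page for PrgEnv-gnu.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a CUDA version string using only the
# results reported on this page (PrgEnv-gnu builds).
#   returns 0: reported to work
#   returns 1: reported to fail
#   returns 2: not tested on this page
cuda_version_ok() {
  local ver="$1"
  case "$ver" in
    11.1|11.1.*) return 0 ;;  # cuda/11.1.1 builds without the hang
    11.2|11.2.*) return 1 ;;  # regression prevents building Omega_h
    11.3|11.3.*) return 1 ;;  # hang while compiling Omega_h_adj.cpp
    *)           return 2 ;;  # unknown, no claim made here
  esac
}

# Usage: read the release number from nvcc if it is on PATH.
# nvcc prints a line such as "Cuda compilation tools, release 11.1, V11.1.105".
if command -v nvcc >/dev/null 2>&1; then
  ver="$(nvcc --version | sed -n 's/.*release \([0-9.]*\),.*/\1/p')"
  cuda_version_ok "$ver" || echo "warning: cuda $ver is not known to build Omega_h" >&2
fi
```

The three-way return code keeps "known bad" separate from "untested", so a script can warn on one and stop on the other.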

build and test

cmake ../omega_h \
  -DCMAKE_INSTALL_PREFIX=$PWD/install \
  -DBUILD_SHARED_LIBS=off \
  -DOmega_h_USE_Kokkos=off \
  -DOmega_h_USE_CUDA=on \
  -DOmega_h_CUDA_ARCH=80 \
  -DOmega_h_USE_MPI=on \
  -DBUILD_TESTING=on \
  -DCMAKE_CXX_COMPILER=CC \
  -DCMAKE_CUDA_FLAGS="-I$MPICH_DIR/include"

To fix the following error:

In file included from /global/homes/z/zhangc20/xgcm/omega_h/src/Omega_h_comm.hpp:6,
                 from /global/homes/z/zhangc20/xgcm/omega_h/src/Omega_h_cmdline.hpp:8,
                 from /global/homes/z/zhangc20/xgcm/omega_h/src/Omega_h_compare.hpp:6,
                 from /global/homes/z/zhangc20/xgcm/omega_h/src/Omega_h_adapt.hpp:7,
                 from /global/homes/z/zhangc20/xgcm/omega_h/src/Omega_h_amr.hpp:4,
                 from /global/homes/z/zhangc20/xgcm/omega_h/src/Omega_h_amr.cpp:1:
/global/homes/z/zhangc20/xgcm/omega_h/src/Omega_h_mpi.h:35:10: fatal error: mpi.h: No such file or directory
   35 | #include <mpi.h>

add the following flag (already included in the cmake command above):

-DCMAKE_CUDA_FLAGS="-I$MPICH_DIR/include"

as suggested here.
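The flag only helps if `$MPICH_DIR` is set and actually contains `mpi.h`. A small pre-flight check can catch an empty or wrong value before cmake runs. This is a sketch under assumptions: the function name `mpi_include_flag` is hypothetical, and it assumes the header sits at `$MPICH_DIR/include/mpi.h`, as the flag above implies.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: print the -I flag for an MPICH install
# directory, or fail if mpi.h is not where the flag would point.
mpi_include_flag() {
  local dir="${1:-}"
  if [ -z "$dir" ] || [ ! -f "$dir/include/mpi.h" ]; then
    echo "mpi.h not found under '${dir}/include'" >&2
    return 1
  fi
  printf -- '-I%s/include\n' "$dir"
}

# Usage: report the cmake option only when the header is really there.
if flag="$(mpi_include_flag "${MPICH_DIR:-}")"; then
  echo "pass to cmake: -DCMAKE_CUDA_FLAGS=\"$flag\""
fi
```

If the check fails, the likely cause is that the MPI module is not loaded in the current shell.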

make -j8
ctest

For packages that do not use the Cray compiler wrappers (as is the case for omega_h), this NERSC page suggests adding logic after find_package(MPI) to define a foo-mpi target.

Issues

Building with PrgEnv-nvidia/8.2.0 (Nvidia HPC SDK 21.7) and the default cuda/11.3 results in a hang during the compilation of Omega_h_adj.cpp. Switching the cuda module to cuda/11.1.1 or cuda/11.0.3 results in the following compatibility errors:

In file included from /global/homes/c/cwsmith/omegahHackathon/omega_h/src/Omega_h_library.cpp:31:0:
/opt/nvidia/hpc_sdk/Linux_x86_64/21.7/compilers/include/xmmintrin.h:25:2: error: #error "This version of <xmmintrin.h> does not work with this compiler."
 #error "This version of <xmmintrin.h> does not work with this compiler."