# AthenaPK scaling instructions
- Assumes a Power9 node with 4x NVIDIA V100 GPUs
- Recommended environment: Spectrum MPI and the GCC host compiler (see the module sketch below)
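
The exact environment setup is system specific; the following is a minimal sketch of loading the recommended toolchain via environment modules, where all module names are assumptions to adapt to your machine:

```bash
# Hypothetical module setup for a Power9 + V100 system; the module names
# are assumptions -- check `module avail` on your machine.
module load gcc           # GCC host compiler
module load spectrum-mpi  # Spectrum MPI (CUDA-aware)
module load cuda          # CUDA toolkit for the V100 GPUs
module load cmake         # CMake for configuring the build
```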
## Build instructions
```bash
# get source
git clone https://gitlab.com/theias/hpc/jmstone/athena-parthenon/athenapk.git athenaPK
cd athenaPK
# change to branch for scaling test
git checkout pgrete/pack-in-one
# get submodules (mainly Kokkos and Parthenon)
git submodule init
git submodule update
# Configure and build, reusing the Summit machine file (same architecture)
mkdir build-cuda-mpi && cd build-cuda-mpi
cmake -DMACHINE_CFG=$(pwd)/../external/parthenon/cmake/machinecfg/Summit.cmake ..
make -j8 athenaPK
```
Alternatively, configure and build from the repository root using the RZAnsel toolchain file:
```bash
cmake -S. -B build -DCMAKE_TOOLCHAIN_FILE=$(pwd)/external/parthenon/cmake/machinecfg/RZAnsel.cmake
cmake --build build
```
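
Before running the scaling tests it may be worth a quick single-rank smoke test. The sketch below reuses the `build-cuda-mpi` directory and the advection input from the scaling section, on a small mesh, just to confirm the binary runs on a GPU:

```bash
# Single-rank smoke test; paths assume the build-cuda-mpi recipe above.
cd build-cuda-mpi
./src/athenaPK -i ../inputs/advection_3d.in \
  parthenon/mesh/nx1=64 parthenon/mesh/nx2=64 parthenon/mesh/nx3=64 \
  parthenon/meshblock/nx1=64 parthenon/meshblock/nx2=64 parthenon/meshblock/nx3=64 \
  parthenon/time/nlim=10 parthenon/mesh/refinement=none
```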
## Scaling instructions
### Static, uniform mesh scaling
- For static meshes we'll use a workload of 256^3 cells per GPU
- Adjust the launch command as needed, e.g., use the `-M "-gpu"` parameter of `jsrun` instead of the `MY_SPECTRUM_OPTIONS` environment variable (see the sketch after this list)
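
For reference, a `jsrun`-based launch of the 4-GPU case might look like the sketch below; the resource-set layout (`-n 4 -a 1 -g 1 -c 1`) is an assumption to adapt to the site's recommendations, with `-M "-gpu"` enabling CUDA-aware MPI in place of `MY_SPECTRUM_OPTIONS`:

```bash
# Hypothetical jsrun launch: 4 resource sets with 1 rank + 1 GPU each;
# "-M -gpu" is passed through to Spectrum MPI for CUDA-aware MPI.
jsrun -n 4 -a 1 -g 1 -c 1 -M "-gpu" ./src/athenaPK -i ../inputs/advection_3d.in \
  parthenon/meshblock/nx1=256 parthenon/meshblock/nx2=256 parthenon/meshblock/nx3=256 \
  parthenon/mesh/nx1=512 parthenon/mesh/nx2=512 parthenon/mesh/nx3=256 \
  parthenon/time/nlim=10 parthenon/mesh/refinement=none
```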
The runs below launch through Spectrum MPI's `mpirun`, which honors the `MY_SPECTRUM_OPTIONS` environment variable:

```bash
# enable CUDA-aware MPI
export MY_SPECTRUM_OPTIONS="--gpu"
# make Kokkos assign the node's GPUs round-robin to the local ranks
export KOKKOS_NUM_DEVICES=4
cd build-cuda-mpi
# meshblock size and mesh dimensions
export MB=256
export MX=256
export MY=256
export MZ=256
# 1 GPU
mpirun -n 1 ./src/athenaPK -i ../inputs/advection_3d.in parthenon/meshblock/nx1=$MB parthenon/meshblock/nx2=$MB parthenon/meshblock/nx3=$MB parthenon/time/nlim=10 parthenon/mesh/nx1=$MX parthenon/mesh/nx2=$MY parthenon/mesh/nx3=$MZ parthenon/mesh/refinement=none
# should be about 2.2e8 zone-cycles/wsec_step
# 2 GPUs
export MX=512
mpirun -n 2 ./src/athenaPK -i ../inputs/advection_3d.in parthenon/meshblock/nx1=$MB parthenon/meshblock/nx2=$MB parthenon/meshblock/nx3=$MB parthenon/time/nlim=10 parthenon/mesh/nx1=$MX parthenon/mesh/nx2=$MY parthenon/mesh/nx3=$MZ parthenon/mesh/refinement=none
# should be about 4.4e8 zone-cycles/wsec_step
# 4 GPUs
export MY=512
mpirun -n 4 ./src/athenaPK -i ../inputs/advection_3d.in parthenon/meshblock/nx1=$MB parthenon/meshblock/nx2=$MB parthenon/meshblock/nx3=$MB parthenon/time/nlim=10 parthenon/mesh/nx1=$MX parthenon/mesh/nx2=$MY parthenon/mesh/nx3=$MZ parthenon/mesh/refinement=none
# should be about 8.6e8 zone-cycles/wsec_step
# Test with overdecomposition (8 meshblocks per GPU)
export MB=128
mpirun -n 4 ./src/athenaPK -i ../inputs/advection_3d.in parthenon/meshblock/nx1=$MB parthenon/meshblock/nx2=$MB parthenon/meshblock/nx3=$MB parthenon/time/nlim=10 parthenon/mesh/nx1=$MX parthenon/mesh/nx2=$MY parthenon/mesh/nx3=$MZ parthenon/mesh/refinement=none
# should be about 9.5e8 zone-cycles/wsec_step
# And much more overdecomposition (512 meshblocks per GPU)
export MB=32
mpirun -n 4 ./src/athenaPK -i ../inputs/advection_3d.in parthenon/meshblock/nx1=$MB parthenon/meshblock/nx2=$MB parthenon/meshblock/nx3=$MB parthenon/time/nlim=10 parthenon/mesh/nx1=$MX parthenon/mesh/nx2=$MY parthenon/mesh/nx3=$MZ parthenon/mesh/refinement=none
# should be about 2.2e8 zone-cycles/wsec_step
# And now with process<->GPU overdecomposition (requires MPS, see the sketch
# below): 32 ranks on a single host sharing 4 GPUs
mpirun -n 32 ./src/athenaPK -i ../inputs/advection_3d.in parthenon/meshblock/nx1=$MB parthenon/meshblock/nx2=$MB parthenon/meshblock/nx3=$MB parthenon/time/nlim=10 parthenon/mesh/nx1=$MX parthenon/mesh/nx2=$MY parthenon/mesh/nx3=$MZ parthenon/mesh/refinement=none
# should be about 3.2e8 zone-cycles/wsec_step
```
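
The last run oversubscribes each GPU with 8 MPI ranks, which requires the CUDA Multi-Process Service (MPS) so the ranks can share a GPU efficiently. A minimal sketch for starting MPS on the node follows; the pipe/log directories are arbitrary choices, and batch systems often provide their own MPS hooks instead:

```bash
# Start the CUDA MPS control daemon before launching the oversubscribed run;
# the directories below are arbitrary choices.
export CUDA_MPS_PIPE_DIRECTORY=/tmp/mps-pipe
export CUDA_MPS_LOG_DIRECTORY=/tmp/mps-log
mkdir -p $CUDA_MPS_PIPE_DIRECTORY $CUDA_MPS_LOG_DIRECTORY
nvidia-cuda-mps-control -d
# ... run the 32-rank test ...
# shut MPS down afterwards
echo quit | nvidia-cuda-mps-control
```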
### To be continued...