This case study examines the performance of HBM2 memory. The GPU I will be using is an AMD Radeon VII, compared against an NVIDIA Tesla P40.
The examples are written in CUDA and are converted to HIP for AMD GPUs with hipify.
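For orientation, here is a minimal sketch of the kind of CUDA microbenchmark a memory-bandwidth study typically relies on. It is not taken from this repository's examples: the names `copy_kernel` and `effective_gbps`, the 256 MiB buffer size, and the launch configuration are all assumptions chosen for illustration. It uses only basic CUDA runtime calls (allocation, events, kernel launch), which is the kind of code hipify is designed to translate.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical copy kernel: grid-stride loop, one float read and one float written per step.
__global__ void copy_kernel(const float* __restrict__ in, float* __restrict__ out, size_t n) {
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        out[i] = in[i];
}

// Effective bandwidth in GB/s: each byte of the buffer is read once and written once.
double effective_gbps(size_t bytes, float ms) {
    return 2.0 * (double)bytes / ((double)ms * 1e-3) / 1e9;
}

int main() {
    const size_t n = (size_t)1 << 26;        // assumed size: 64 Mi floats = 256 MiB per buffer
    const size_t bytes = n * sizeof(float);
    float *in = nullptr, *out = nullptr;
    if (cudaMalloc(&in, bytes) != cudaSuccess || cudaMalloc(&out, bytes) != cudaSuccess) {
        fprintf(stderr, "device allocation failed\n");
        return 1;
    }
    cudaMemset(in, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    copy_kernel<<<1024, 256>>>(in, out, n);  // warm-up launch, not timed
    cudaEventRecord(start);
    copy_kernel<<<1024, 256>>>(in, out, n);  // timed launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    if (cudaGetLastError() != cudaSuccess) {
        fprintf(stderr, "kernel launch failed\n");
        return 1;
    }

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds
    printf("copy: %.3f ms, %.1f GB/s\n", ms, effective_gbps(bytes, ms));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

A single timed launch is noisy; averaging over several launches and varying the buffer size would give a steadier figure.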
Build with the target that matches the GPU you have available:
# Compile with nvcc
make all-cuda
# Compile with hip
make all-hip