Currently, most RAPIDS benchmarking of KvikIO focuses on single-process, multi-threaded, single-GPU IO pipelining. We should add KvikIO benchmarks to cover the single-process, multi-threaded, multi-GPU case as well.
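
As a starting point for discussion, here is a minimal sketch of how a single-process, multi-threaded, multi-GPU benchmark could be structured: one thread per device, each reading its own contiguous slice of the file. All names here (`read_on_device`, `run_benchmark`, `make_test_file`) are hypothetical and not part of KvikIO. The default reader is a host-memory stand-in so the harness runs without GPUs; a real benchmark would presumably swap in a reader that selects the device and reads into device memory through KvikIO.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def read_on_device(device_id, path, offset, nbytes):
    """Stand-in for a per-GPU read; returns the number of bytes read.

    Hypothetical placeholder: device_id is unused here. A real benchmark
    would select the GPU and read into device memory through KvikIO
    instead of reading into host memory.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        return len(f.read(nbytes))


def run_benchmark(path, num_devices, reader=read_on_device):
    """Read one contiguous slice per device, one thread per device,
    all inside a single process. Returns (bytes per device, seconds)."""
    file_size = os.path.getsize(path)
    chunk = -(-file_size // num_devices)  # ceiling division
    tasks = [
        (dev, path, dev * chunk, max(0, min(chunk, file_size - dev * chunk)))
        for dev in range(num_devices)
    ]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_devices) as pool:
        per_device = list(pool.map(lambda t: reader(*t), tasks))
    elapsed = time.perf_counter() - start
    return per_device, elapsed


def make_test_file(nbytes):
    """Create a temporary file filled with random bytes; returns its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(nbytes))
    return path
```

Aggregate throughput can then be reported as `sum(per_device) / elapsed`, and comparing runs with `num_devices=1` against larger values would show how well the pipeline scales across GPUs.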