ParCoreLab
diff --git a/INSTALL b/INSTALL
Lines changed: 111 additions & 0 deletions
diff --git a/LICENSE b/LICENSE
Lines changed: 7 additions & 0 deletions
diff --git a/NEWS b/NEWS
Lines changed: 23 additions & 0 deletions
diff --git a/README b/README
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,111 @@
+Installation Instructions
+*************************
+
+    Copyright 2025 Koç University and Simula Research Laboratory
+
+    Copying and distribution of this file, with or without
+    modification, are permitted provided the copyright notice and this
+    notice are preserved.
+
+
+Software dependencies
+=====================
+
+To install aCG, the following minimum software dependencies must be satisfied:
+
+* CMake 3.12 or newer
+
+* A C/C++ compiler compatible with C++17
+
+* For NVIDIA GPUs: NVIDIA CUDA Compiler (NVCC), cuBLAS and cuSPARSE from CUDA Toolkit 11.6 or newer
+
+* For AMD GPUs: HIP C++ compiler, hipBLAS and hipSPARSE from ROCm 6.0.0 or newer
+
+* A GPU-aware MPI library (e.g., HPC-X from NVIDIA HPC SDK, or Cray MPICH)
+
+The following optional software packages may be needed to enable some features:
+
+* METIS 5.1.0 is needed to partition matrices when using multiple GPUs
+
+* NVIDIA Collective Communications Library (NCCL) version 2.18.5 or
+  newer is needed to use NCCL-based communication for NVIDIA GPUs
+
+* ROCm Collective Communications Library (RCCL) version 2.18.3 or
+  newer is needed to use RCCL-based communication for AMD GPUs
+
+* NVSHMEM version 2.10.0 or newer is needed to use CPU- or
+  GPU-initiated one-sided communication for NVIDIA GPUs
+
+* PETSc 3.17 or newer with CUDA or HIP support enabled is needed to
+  use PETSc's CG/pipelined CG solvers
+
+* zlib is needed to use gzip-compressed Matrix Market files as input
+
+* OpenMP, if supported by the compiler, can be enabled to use multiple
+  threads to speed up some preprocessing steps
+
+
+Basic Installation
+==================
+
+The CMake build system is used to install aCG.  A basic installation
+can be performed by first creating a build directory, e.g., 'build/':
+
+ $ mkdir build
+ $ cd build
+
+From the build directory, the command
+
+ $ cmake ../cuda
+
+configures the build of the CUDA application for NVIDIA GPUs, or
+
+ $ cmake ../hip
+
+configures the build of the HIP application for AMD GPUs.
+
+Once the cmake configuration is finished, use `make' to compile the
+application:
+
+ $ make
+
+
+Installation options
+====================
+
+The usual options used by CMake to manage installation are supported.
+In addition, the following options are useful for aCG:
+
+* CMAKE_BUILD_TYPE should be set to Release to enable optimisations
+  when conducting performance benchmarks.
+
+* CMAKE_CUDA_ARCHITECTURES can be used to set the CUDA architecture
+  when compiling CUDA kernels. By default, the following architectures
+  are included: 70 (Volta), 75 (Turing), 80 (Ampere) and 90 (Hopper).
+
+* CMAKE_HIP_FLAGS can be used to set the GPU architecture for which
+  HIP kernels are compiled.  For example, for AMD Instinct MI250X, it
+  is recommended to set -DCMAKE_HIP_FLAGS="--offload-arch=gfx90a".
+
+* ACG_ENABLE_PROFILING can be set to enable detailed CUDA/HIP-based
+  event profiling. This can be used to print detailed information
+  about time spent in different GPU kernels for some of the CG
+  solvers.
+
+* IDXSIZE can be set to 64 to enable the use of 64-bit integers to
+  index matrix rows and columns. This may be needed for matrices with
+  more than 2 billion rows/columns.
+
+Other options are used to specify locations of third-party libraries:
+
+* METIS_DIR is used to specify the location of the METIS library.
+  Alternatively, METIS_INCLUDE_DIR and METIS_LIB_DIR can be set to
+  directories containing the METIS header files and library,
+  respectively.
+
+* NCCL_HOME or NCCL_ROOT is used to specify the location of the NCCL
+  installation. Either may also be set as an environment variable.
+
+* NVSHMEM_DIR is used to specify the location of the NVSHMEM
+  installation. The environment variables NVSHMEM_HOME or
+  NVSHMEM_PREFIX may also be used.
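
Reviewer note on the INSTALL options above: a minimal sketch of how the documented options might be combined into one configure command, assuming a CUDA build for an Ampere GPU (architecture 80) with 64-bit indices. METIS_PREFIX and the path /opt/metis are hypothetical placeholders, not names used by aCG, and the -D spelling for IDXSIZE and METIS_DIR is assumed by analogy with the CMAKE_HIP_FLAGS example in INSTALL. The script only prints the command (a dry run) so that it can be inspected before use.

```shell
#!/bin/sh
# Sketch: assemble aCG configure options for a CUDA build.
# METIS_PREFIX is a hypothetical install path; adjust it to the local system.
METIS_PREFIX=/opt/metis

# Collect the options as positional parameters (POSIX sh has no arrays).
set -- \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_CUDA_ARCHITECTURES=80 \
  -DIDXSIZE=64 \
  -DMETIS_DIR="$METIS_PREFIX"

# Dry run: build the command line as a string and print it.
configure_cmd="cmake ../cuda $*"
echo "$configure_cmd"
```

To configure for real, the same options could be passed from the build directory as cmake ../cuda "$@", followed by make as described under Basic Installation. For a HIP build, ../cuda would become ../hip and the architecture would be selected through CMAKE_HIP_FLAGS instead of CMAKE_CUDA_ARCHITECTURES.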
@@ -0,0 +1,7 @@
+Copyright 2025 Koç University and Simula Research Laboratory
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,23 @@
+* Noteworthy changes in release 1.0.0 (2025-04-26)
+
+This is the initial release of aCG, which provides GPU-accelerated
+iterative linear solvers based on the conjugate gradient (CG) method.
+The solvers support NVIDIA and AMD GPUs, and multi-GPU systems with
+GPU-aware MPI, NCCL/RCCL or NVSHMEM.
+
+For NVIDIA GPUs, CUDA implementations of CG and pipelined CG are
+provided. Communication is performed by the host CPU using GPU-aware
+MPI, NCCL or NVSHMEM, or by the GPU using NVSHMEM device-initiated
+communication.
+
+For AMD GPUs, HIP implementations of CG and pipelined CG are provided.
+Communication is performed by the host CPU using GPU-aware MPI or
+RCCL.  A single-GPU HIP version of a monolithic, device-side CG solver
+is also provided.
+
+--
+Copyright 2025 Koç University and Simula Research Laboratory
+
+Copying and distribution of this file, with or without modification,
+are permitted provided the copyright notice and this notice are
+preserved.
@@ -0,0 +1,26 @@
+This is the README file for aCG, a suite of GPU-accelerated iterative
+linear solvers based on the conjugate gradient (CG) method, supporting
+NVIDIA and AMD GPUs, as well as multi-GPU systems with GPU-aware MPI,
+NCCL, RCCL or NVSHMEM.
+
+For NVIDIA GPUs, CUDA implementations of CG and pipelined CG are
+provided. Communication is performed by the host CPU using GPU-aware
+MPI, NCCL or NVSHMEM, or by the GPU using NVSHMEM device-initiated
+communication.
+
+For AMD GPUs, HIP implementations of CG and pipelined CG are provided.
+Communication is performed by the host CPU with GPU-aware MPI or RCCL.
+A single-GPU HIP version of a monolithic, device-side CG solver is
+also provided.
+
+See the file INSTALL for instructions on how to build and install.
+
+aCG is free software, available under a permissive software license.
+See the file LICENSE for copying conditions.
+
+--
+Copyright 2025 Koç University and Simula Research Laboratory
+
+Copying and distribution of this file, with or without modification,
+are permitted provided the copyright notice and this notice are
+preserved.