Lanczos-Lib is a high-performance library implementing the classical and block Lanczos algorithms on both CPU and GPU in C and Fortran, optimized for Intel CPUs and NVIDIA GPUs.
- MATLAB Prototypes: Provided for deeper insight into the Lanczos algorithm.
- Intel CPU Implementation: Utilizes Intel MKL for high-performance numerical routines.
- Reproducibility Control: Implements Intel Conditional Numerical Reproducibility (CNR) for bitwise-consistent results across runs. Automatically detects and sets the optimal reproducibility mode.
- NVIDIA GPU Implementation: Utilizes NVIDIA cuBLAS for high-performance numerical routines.
- [fortran] (TBC)
- [eigen-system and solver]??? (TBC)
- Make
- Intel MKL (implemented with intel-oneapi-hpc-toolkit-2025.1.0.666)
- Intel C and Fortran compilers (tested with intel-oneapi-hpc-toolkit-2025.1.0.666)
- NVIDIA HPC SDK (tested with nvhpc_2025_253_Linux_x86_64_cuda_12.8, Driver 570.133.07)
- NVIDIA cuBLAS library (implemented with nvhpc_2025_253_Linux_x86_64_cuda_12.8)
- Clone the repository
git clone https://github.com/taiwan-jjl/Lanczos-Lib.git
cd Lanczos-Lib
- Configure (edit makefile if needed)
# Compile and output object files and executables
make
# Clean build artifacts
make clean
- main.c is the tutorial for initializing and using the following Lanczos functions.
If the input A matrix is stored in the "full-matrix" scheme, use the "GEXX" (general matrix) BLAS operations.
If the input A matrix is stored in the "upper or lower triangular" scheme, use the "SYXX" (symmetric matrix) BLAS operations.
basic_lanczos_cpu :
void basic_lanczos_cpu(
const double* A,
double* nu,
double* omega,
double* alpha,
double* beta,
const int A_dim,
const double Lanczos_stop_crit,
const int Lanczos_stop_check_freq,
int* Lanczos_iter
)
Output: none.
Input:
- A: "A" matrix in Lanczos algorithm. A pointer to a 1d array of double in heap memory.
- nu: "nu" vector in Lanczos algorithm. A pointer to a 1d array of double in heap memory.
- omega: "omega" vector in Lanczos algorithm. A pointer to a 1d array of double in heap memory.
- alpha: "alpha" scalar in Lanczos algorithm. A pointer to a 1d array of double in heap memory.
- beta: "beta" scalar in Lanczos algorithm. A pointer to a 1d array of double in heap memory.
- A_dim: The dimension n of an n-by-n A matrix. An integer in stack memory.
- Lanczos_stop_crit: The Lanczos stopping criterion, a truncation threshold checked against the "beta" value to avoid numerical breakdown. The default value is 10 times DBL_EPSILON. DBL_EPSILON is the machine epsilon for "double" on the host machine, i.e. the gap between 1.0 and the next representable value. A double-precision floating-point value in stack memory.
- Lanczos_stop_check_freq: How often, in Lanczos iterations, "Lanczos_stop_crit" is checked. The default value is 0, which means every iteration. It is a trade-off between performance and correctness. An integer in stack memory.
- Lanczos_iter: A return value from the "basic_lanczos_cpu" function. It reports how many full Lanczos iterations were executed. An integer in stack memory.
basic_lanczos_gpu :
void basic_lanczos_gpu(
const double* A,
double* nu,
double* omega,
double* alpha,
double* beta,
const int A_dim,
const double Lanczos_stop_crit,
const int Lanczos_stop_check_freq,
int* Lanczos_iter
)
Output: none.
Input:
- A: "A" matrix in Lanczos algorithm. A pointer to a 1d array of double in device memory.
- nu: "nu" vector in Lanczos algorithm. A pointer to a 1d array of double in device memory.
- omega: "omega" vector in Lanczos algorithm. A pointer to a 1d array of double in device memory.
- alpha: "alpha" scalar in Lanczos algorithm. A pointer to a 1d array of double in device memory.
- beta: "beta" scalar in Lanczos algorithm. A pointer to a 1d array of double in device memory.
- A_dim: The dimension n of an n-by-n A matrix. An integer in host memory.
- Lanczos_stop_crit: The Lanczos stopping criterion, a truncation threshold checked against the "beta" value to avoid numerical breakdown. The default value is 10 times DBL_EPSILON. DBL_EPSILON is the machine epsilon for "double" on the host machine, i.e. the gap between 1.0 and the next representable value. A double-precision floating-point value in host memory.
- Lanczos_stop_check_freq: How often, in Lanczos iterations, "Lanczos_stop_crit" is checked. The default value is 0, which means every iteration. It is a trade-off between performance and correctness. An integer in host memory.
- Lanczos_iter: A return value from the "basic_lanczos_gpu" function. It reports how many full Lanczos iterations were executed. An integer in host memory.
- More error detection and handling code. Currently, it is omitted for code readability and simplicity.
- SYMV (symmetric matrix-vector product) issue on GPU:
- More initial vector (nu) methods.
- Advanced memory allocation method in the GPU version.