Hello,
This issue is a continuation of #1561, with more details. We use BiCGStab to solve a quasi-tridiagonal pds problem with different executors (Serial, OMP and CUDA), and we observe a numerical issue with OMP. In our test bench built on those solvers (which involves more than just Ginkgo), we measure an error of about 1e-16 with Serial or CUDA, but ~1e-10 with OMP, even with `OMP_NUM_THREADS=1`.
The problem disappears just by replacing the OMP executor with a Serial one.
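Since the test bench involves more than just Ginkgo, a solver-independent residual check may help compare the solutions returned by the two executors. Below is a minimal sketch in plain C++ (no Ginkgo calls). The function name `relative_residual` and the stencil values (4 on the diagonal, 1 on the off-diagonals and corners) are hypothetical placeholders for illustration, not the entries of the linked matrix.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Relative residual ||b - A x||_2 / ||b||_2 for a quasi-tridiagonal matrix A,
// i.e. a tridiagonal matrix plus the two corner entries A(0, n-1) and A(n-1, 0).
// ASSUMPTION: constant stencil (diag, off); these values are hypothetical and
// must be replaced by the entries of the actual matrix.
// Requires n >= 3 so that the left and right neighbours are distinct.
double relative_residual(const std::vector<double>& x,
                         const std::vector<double>& b,
                         double diag = 4.0, double off = 1.0)
{
    const std::size_t n = x.size();
    double res2 = 0.0;  // squared norm of the residual b - A x
    double b2 = 0.0;    // squared norm of the right-hand side
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t prev = (i + n - 1) % n;  // wraps around: corner entry
        const std::size_t next = (i + 1) % n;      // wraps around: corner entry
        const double Ax = diag * x[i] + off * (x[prev] + x[next]);
        const double r = b[i] - Ax;
        res2 += r * r;
        b2 += b[i] * b[i];
    }
    return std::sqrt(res2 / b2);
}
```

Calling this on the solution vectors copied back from the Serial and OMP runs would show whether the 1e-16 vs ~1e-10 gap is already present in the solver output or appears later in the test bench.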
I have linked the problematic matrix and rhs. Note that the size of the matrix has an impact (e.g. there is no problem with n=10 or n=100).
Please let me know if it is enough for you to reproduce the problem.
I tried with gcc and clang, with Ginkgo 1.7.0 and current develop branch.
Valgrind does not reveal anything, and both cases (with the serial or omp executor) "converge" in 3 iterations.
Issue from our side: CExA-project/ddc#332
Problematic Ginkgo apply call (all Ginkgo stuff is in this file): https://github.com/CExA-project/ddc/blob/cc2942283213cc700b3bd8c09fb00346621997e3/include/ddc/kernels/splines/matrix_sparse.hpp#L207
(`create_gko_exec<Kokkos::Serial>()` is a `gko::ReferenceExecutor::create();` while `gko_exec` is a `gko::OmpExecutor::create();`.)
Regards