From f5527744e637561759f14fcbdf4071af2ede77af Mon Sep 17 00:00:00 2001
From: yasahi-hpc
Date: Tue, 30 Sep 2025 18:48:02 +0900
Subject: [PATCH 1/2] fix syr routine name

Signed-off-by: yasahi-hpc
---
 docs/source/API/batched/dense/batched_syr.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/API/batched/dense/batched_syr.rst b/docs/source/API/batched/dense/batched_syr.rst
index fa0964fc6e..69abe3d1a6 100644
--- a/docs/source/API/batched/dense/batched_syr.rst
+++ b/docs/source/API/batched/dense/batched_syr.rst
@@ -20,7 +20,7 @@ Perform a symmetric rank-1 update of matrix :math:`A` by vector :math:`x` with s
     A &= A + \alpha (x * x^H) \: \text{(if ArgTrans == KokkosBatched::Trans::ConjTranspose)}
     \end{align}
 
-1. If ``ArgTrans == KokkosBatched::Trans::Transpose``, this operation is equivalent to the BLAS routine ``SSYR`` (``CSYR``) or ``DGER`` (``ZSYR``) for single or double precision for real (complex) matrix.
+1. If ``ArgTrans == KokkosBatched::Trans::Transpose``, this operation is equivalent to the BLAS routine ``SSYR`` (``CSYR``) or ``DSYR`` (``ZSYR``) for single or double precision for real (complex) matrix.
 2. If ``ArgTrans == KokkosBatched::Trans::ConjTranspose``, this operation is equivalent to the BLAS routine ``CHER`` or ``ZHER`` for single or double precision for complex matrix.
 
From a6d653b244463d3d14788c747f950325a2ef5cf0 Mon Sep 17 00:00:00 2001
From: yasahi-hpc
Date: Tue, 30 Sep 2025 18:48:36 +0900
Subject: [PATCH 2/2] Add batched serial tbsv docs

Signed-off-by: yasahi-hpc
---
 docs/source/API/batched/dense-index.rst       |  3 +-
 .../source/API/batched/dense/batched_tbsv.rst | 66 +++++++++++++++++++
 2 files changed, 68 insertions(+), 1 deletion(-)
 create mode 100644 docs/source/API/batched/dense/batched_tbsv.rst

diff --git a/docs/source/API/batched/dense-index.rst b/docs/source/API/batched/dense-index.rst
index d11f2b60f8..daf725cf24 100644
--- a/docs/source/API/batched/dense-index.rst
+++ b/docs/source/API/batched/dense-index.rst
@@ -5,6 +5,7 @@ API: Batched Dense (DLA)
    :maxdepth: 2
    :hidden:
 
+   dense/batched_tbsv
    dense/batched_ger
    dense/batched_syr
    dense/batched_getrf
@@ -196,7 +197,7 @@ BLAS 2
      - `TeamTrsv`
      - `TeamVectorTrsv`
    * - TBSV
-     - `SerialTbsv`
+     - :doc:`SerialTbsv <dense/batched_tbsv>`
      - --
      - --
    * - TPSV
diff --git a/docs/source/API/batched/dense/batched_tbsv.rst b/docs/source/API/batched/dense/batched_tbsv.rst
new file mode 100644
index 0000000000..84b128e702
--- /dev/null
+++ b/docs/source/API/batched/dense/batched_tbsv.rst
@@ -0,0 +1,66 @@
+KokkosBatched::Tbsv
+###################
+
+Defined in header: :code:`KokkosBatched_Tbsv.hpp`
+
+.. code:: c++
+
+    template <typename ArgUplo, typename ArgTrans, typename ArgDiag, typename ArgAlgo>
+    struct SerialTbsv {
+      template <typename AViewType, typename XViewType>
+      KOKKOS_INLINE_FUNCTION static int invoke(const AViewType &A, const XViewType &X, const int k);
+    };
+
+
+Solves a system of the linear equations :math:`A \cdot X = B` or :math:`A^T \cdot X = B` or :math:`A^H \cdot X = B` where :math:`A` is an n-by-n unit or non-unit, upper or lower triangular band matrix with :math:`(k + 1)` diagonals.
+
+1. For a real band matrix :math:`A`, this solves a system of the linear equations :math:`A \cdot X = B` or :math:`A^T \cdot X = B`.
+   This operation is equivalent to the BLAS routine ``STBSV`` or ``DTBSV`` for single or double precision.
+
+2. For a complex band matrix :math:`A`, this solves a system of the linear equations :math:`A \cdot X = B` or :math:`A^T \cdot X = B` or :math:`A^H \cdot X = B`.
+   This operation is equivalent to the BLAS routine ``CTBSV`` or ``ZTBSV`` for single or double precision.
+
+.. note::
+
+   No test for singularity or near-singularity is included in this routine. Such tests must be performed before calling this routine.
+
+Parameters
+==========
+
+:A: Input view containing the upper or lower triangular band matrix. See `LAPACK reference `_ for the band storage format.
+:X: Input/output view containing the right-hand side on input and the solution on output.
+:k: The number of superdiagonals or subdiagonals within the band of :math:`A`. :math:`k >= 0`
+
+
+Type Requirements
+-----------------
+
+- ``ArgUplo`` must be one of the following:
+   - ``KokkosBatched::Uplo::Upper`` for upper triangular solve
+   - ``KokkosBatched::Uplo::Lower`` for lower triangular solve
+
+- ``ArgTrans`` must be one of the following:
+   - ``KokkosBatched::Trans::NoTranspose`` to solve a system :math:`A \cdot X = B`
+   - ``KokkosBatched::Trans::Transpose`` to solve a system :math:`A^T \cdot X = B`
+   - ``KokkosBatched::Trans::ConjTranspose`` to solve a system :math:`A^H \cdot X = B`
+
+- ``ArgDiag`` must be one of the following:
+   - ``KokkosBatched::Diag::Unit`` for the unit triangular matrix :math:`A`
+   - ``KokkosBatched::Diag::NonUnit`` for the non-unit triangular matrix :math:`A`
+
+- ``ArgAlgo`` must be ``KokkosBatched::Algo::tbsv::Unblocked`` for the unblocked algorithm
+- ``AViewType`` must be a Kokkos `View `_ of rank 2 containing the band matrix A
+- ``XViewType`` must be a Kokkos `View `_ of rank 1 containing the right-hand side that satisfies
+   - ``std::is_same_v == true``
+
+Example
+=======
+
+.. literalinclude:: ../../../../../example/batched_solve/serial_tbsv.cpp
+   :language: c++
+
+output:
+
+.. code::
+
+   tbsv works correctly!
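
The band storage format and the solve described in the new ``batched_tbsv.rst`` page can be illustrated with a plain C++ sketch. This is not the KokkosBatched implementation and does not use Kokkos views. The names ``tbsv_lower_notrans_nonunit`` and ``solve_example`` are hypothetical, and only the lower, no-transpose, non-unit case is covered. It assumes the usual BLAS/LAPACK lower band layout, where row ``d`` of the band array holds the ``d``-th subdiagonal, stored here as a flat row-major ``std::vector``.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference routine (not part of KokkosBatched).
// Solves A * x = b in place for a lower triangular band matrix A with
// k subdiagonals and a non-unit diagonal. On entry x holds b, on exit x
// holds the solution.
//
// Band storage (assumed, row-major, (k + 1) rows by n columns):
//   ab[(i - j) * n + j] == A(i, j)  for j <= i <= min(n - 1, j + k)
// so row 0 of ab is the main diagonal and row d is the d-th subdiagonal.
void tbsv_lower_notrans_nonunit(const std::vector<double>& ab,
                                std::vector<double>& x, int k) {
  const int n = static_cast<int>(x.size());
  assert(k >= 0);
  assert(ab.size() == static_cast<std::size_t>(k + 1) * x.size());

  // Column-oriented forward substitution. As the documentation notes,
  // no singularity check is performed here either.
  for (int j = 0; j < n; ++j) {
    x[j] /= ab[j];  // divide by the diagonal entry A(j, j)
    const double xj = x[j];
    const int i_end = std::min(n - 1, j + k);
    for (int i = j + 1; i <= i_end; ++i) {
      x[i] -= xj * ab[(i - j) * n + j];  // eliminate A(i, j) * x[j]
    }
  }
}

// Builds a 4-by-4 bidiagonal system (k = 1) with 2 on the diagonal and
// 1 on the subdiagonal, then solves it for b = (2, 5, 8, 11).
std::vector<double> solve_example() {
  const int k = 1;
  const std::vector<double> ab = {
      2.0, 2.0, 2.0, 2.0,  // main diagonal
      1.0, 1.0, 1.0, 0.0   // first subdiagonal (last slot unused)
  };
  std::vector<double> x = {2.0, 5.0, 8.0, 11.0};
  tbsv_lower_notrans_nonunit(ab, x, k);
  return x;
}
```

In this sketch ``solve_example()`` overwrites the right-hand side with the solution, mirroring how ``X`` is documented as input/output. The upper, transposed, and unit-diagonal variants change the loop direction, the indexing, or skip the division, and are omitted here.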