@mgates3 mgates3 commented Mar 19, 2025

Clean up the SVD code, applying changes similar to those made to the eigenvalue code (#40).

Note that the 2-stage reduction goes to upper or lower triangular band (tb) form, not general band (gb) form. The LAPACK routine gbbrd takes a general band matrix that can have non-zero upper (ku > 0) and lower (kl > 0) bandwidths. Hence, this PR renames PLASMA's routines to tbbrd.
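To make the tb vs. gb distinction concrete, here is a minimal sketch in C that measures the bandwidths of a dense column-major matrix and classifies it. The names `bandwidths`, `classify_band`, and `band_kind_t` are hypothetical and exist only for illustration; they are not part of PLASMA or LAPACK.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical classification, for illustration only.
typedef enum {
    BandUpperTriangular,  // kl == 0: nothing below the diagonal
    BandLowerTriangular,  // ku == 0: nothing above the diagonal
    BandGeneral           // kl > 0 and ku > 0
} band_kind_t;

// Compute the lower (kl) and upper (ku) bandwidths of a dense
// n-by-n column-major matrix A with leading dimension lda.
void bandwidths(int n, const double *A, int lda, int *kl, int *ku)
{
    assert(lda >= n);
    *kl = 0;
    *ku = 0;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            if (A[i + (size_t)j * lda] != 0.0) {
                if (i - j > *kl) *kl = i - j;  // entry below the diagonal
                if (j - i > *ku) *ku = j - i;  // entry above the diagonal
            }
        }
    }
}

// A triangular band matrix has one of the two bandwidths equal to zero.
// A purely diagonal matrix (kl == ku == 0) is reported as upper here.
band_kind_t classify_band(int kl, int ku)
{
    if (kl == 0) return BandUpperTriangular;
    if (ku == 0) return BandLowerTriangular;
    return BandGeneral;
}
```

A matrix with only a main diagonal and one superdiagonal gives kl = 0, ku = 1 and is classified as upper triangular band, which is the kind of input the 2-stage reduction produces. Adding any entry below the diagonal makes it a general band matrix.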

Also add diagrams to the eigenvalue bulge chasing kernels.

Since both SVD and eig use proper atomic operations, remove the volatile variables. (See the discussion on atomics vs. volatile in the PLASMA style guide and in Scott Meyers's book, Effective Modern C++.)
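As a rough illustration of why atomics make volatile unnecessary, the sketch below uses C11 `<stdatomic.h>` to publish and wait on a progress counter. This is an assumption for illustration: the `progress_t` type and both functions are hypothetical, and PLASMA's actual code may use a different mechanism (for example, OpenMP atomic directives).

```c
#include <assert.h>
#include <stdatomic.h>

// Hypothetical progress record shared between tasks.
// Note there is no volatile qualifier: the atomic type already
// guarantees that each load and store really accesses memory.
typedef struct {
    atomic_int sweep_done;  // index of the last completed sweep
} progress_t;

// Producer: announce that `sweep` has finished. The release store makes
// all writes performed before it visible to a consumer that observes it.
void progress_publish(progress_t *p, int sweep)
{
    assert(sweep >= 0);
    atomic_store_explicit(&p->sweep_done, sweep, memory_order_release);
}

// Consumer: spin until at least `sweep` has finished, then return the
// value seen. The acquire load pairs with the release store above.
int progress_wait(progress_t *p, int sweep)
{
    int seen;
    while ((seen = atomic_load_explicit(&p->sweep_done,
                                        memory_order_acquire)) < sweep) {
        // busy-wait; a real implementation might yield here
    }
    return seen;
}
```

The point of the design is that volatile only prevents the compiler from eliding accesses; it provides neither atomicity nor ordering between threads, whereas the acquire/release pair provides both.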

'S', seed,
'N', Sigma_ref, mode, rcond,
dmax, kl, ku,
pack, Aband + nb, ldab, work);

@mgates3 mgates3 Mar 21, 2025

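This works fine with OpenBLAS 0.3.27 (using Netlib LAPACK), but fails with Intel MKL 2024.0.2: single-precision stbbrd with uplo=u reports "Parameter 4 was incorrect on entry to SLAROT" and fails its checks. I can disable checks in that case.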

################################################################################
# Test with OpenBLAS, using Netlib LAPACK.

sh methane build-openblas> ./run_tests.py tbbrd
Fri Mar 21 18:15:55 2025
./plasmatest stbbrd  --nb=64 --dim=100:500:100 --uplo=l,u
% PLASMA 24.8.7, OpenMP num threads 10,  OpenBLAS 0.3.27 , 2025-03-21 18:15:55
% input: ./plasmatest stbbrd --nb=64 --dim=100:500:100 --uplo=l,u

  Status      Error       Time    Gflop/s    uplo       n    nb
    pass   5.52e-09     0.0008     0.0000       l     100    64
    pass   2.95e-08     0.0028     0.0000       l     200    64
    pass   3.28e-08     0.0054     0.0000       l     300    64
    pass   3.99e-08     0.0081     0.0000       l     400    64
    pass   4.28e-08     0.0115     0.0000       l     500    64
    pass   5.00e-09     0.0008     0.0000       u     100    64
    pass   2.90e-08     0.0036     0.0000       u     200    64
    pass   3.22e-08     0.0080     0.0000       u     300    64
    pass   4.30e-08     0.0137     0.0000       u     400    64
    pass   4.23e-08     0.0184     0.0000       u     500    64

% All tests passed
pass
./plasmatest dtbbrd  --nb=64 --dim=100:500:100 --uplo=l,u
% PLASMA 24.8.7, OpenMP num threads 10,  OpenBLAS 0.3.27 , 2025-03-21 18:15:55
% input: ./plasmatest dtbbrd --nb=64 --dim=100:500:100 --uplo=l,u

  Status      Error       Time    Gflop/s    uplo       n    nb
    pass   1.05e-17     0.0011     0.0000       l     100    64
    pass   4.82e-17     0.0035     0.0000       l     200    64
    pass   6.41e-17     0.0070     0.0000       l     300    64
    pass   8.26e-17     0.0104     0.0000       l     400    64
    pass   8.59e-17     0.0150     0.0000       l     500    64
    pass   1.08e-17     0.0010     0.0000       u     100    64
    pass   4.91e-17     0.0046     0.0000       u     200    64
    pass   6.49e-17     0.0099     0.0000       u     300    64
    pass   8.08e-17     0.0155     0.0000       u     400    64
    pass   8.40e-17     0.0219     0.0000       u     500    64

% All tests passed
pass
./plasmatest ctbbrd  --nb=64 --dim=100:500:100 --uplo=l,u
% PLASMA 24.8.7, OpenMP num threads 10,  OpenBLAS 0.3.27 , 2025-03-21 18:15:56
% input: ./plasmatest ctbbrd --nb=64 --dim=100:500:100 --uplo=l,u

  Status      Error       Time    Gflop/s    uplo       n    nb
    pass   5.32e-09     0.0016     0.0000       l     100    64
    pass   3.11e-08     0.0044     0.0000       l     200    64
    pass   4.55e-08     0.0084     0.0000       l     300    64
    pass   5.56e-08     0.0128     0.0000       l     400    64
    pass   6.48e-08     0.0181     0.0000       l     500    64
    pass   5.56e-09     0.0012     0.0000       u     100    64
    pass   3.15e-08     0.0053     0.0000       u     200    64
    pass   4.85e-08     0.0112     0.0000       u     300    64
    pass   5.79e-08     0.0182     0.0000       u     400    64
    pass   6.24e-08     0.0264     0.0000       u     500    64

% All tests passed
pass
./plasmatest ztbbrd  --nb=64 --dim=100:500:100 --uplo=l,u
% PLASMA 24.8.7, OpenMP num threads 10,  OpenBLAS 0.3.27 , 2025-03-21 18:15:57
% input: ./plasmatest ztbbrd --nb=64 --dim=100:500:100 --uplo=l,u

  Status      Error       Time    Gflop/s    uplo       n    nb
    pass   1.24e-17     0.0016     0.0000       l     100    64
    pass   6.00e-17     0.0058     0.0000       l     200    64
    pass   7.32e-17     0.0117     0.0000       l     300    64
    pass   8.68e-17     0.0181     0.0000       l     400    64
    pass   9.71e-17     0.0250     0.0000       l     500    64
    pass   1.00e-17     0.0015     0.0000       u     100    64
    pass   5.95e-17     0.0065     0.0000       u     200    64
    pass   7.58e-17     0.0136     0.0000       u     300    64
    pass   8.89e-17     0.0224     0.0000       u     400    64
    pass   9.06e-17     0.0313     0.0000       u     500    64

% All tests passed
pass
--------------------------------------------------------------------------------

All routines passed.
Elapsed 3.35 sec
Fri Mar 21 18:15:58 2025


################################################################################
# Test with Intel MKL.

sh methane build> ./run_tests.py tbbrd
Fri Mar 21 18:16:07 2025
./plasmatest stbbrd  --nb=64 --dim=100:500:100 --uplo=l,u
% PLASMA 24.8.7, OpenMP num threads 10, Intel MKL 2024.0.2, 2025-03-21 18:16:07
% input: ./plasmatest stbbrd --nb=64 --dim=100:500:100 --uplo=l,u

  Status      Error       Time    Gflop/s    uplo       n    nb
    pass   5.54e-09     0.0011     0.0000       l     100    64
    pass   2.80e-08     0.0028     0.0000       l     200    64
    pass   3.58e-08     0.0055     0.0000       l     300    64
    pass   4.58e-08     0.0085     0.0000       l     400    64
    pass   4.70e-08     0.0117     0.0000       l     500    64
    pass   6.42e-09     0.0008     0.0000       u     100    64

Intel oneMKL ERROR: Parameter 4 was incorrect on entry to SLAROT.
#### cut 161545 duplicate lines ####
Intel oneMKL ERROR: Parameter 4 was incorrect on entry to SLAROT.
  FAILED   7.01e-03     0.0163     0.0000       u     500    64

% 4 tests failed
FAILED: exit code 4
./plasmatest dtbbrd  --nb=64 --dim=100:500:100 --uplo=l,u
% PLASMA 24.8.7, OpenMP num threads 10, Intel MKL 2024.0.2, 2025-03-21 18:16:08
% input: ./plasmatest dtbbrd --nb=64 --dim=100:500:100 --uplo=l,u

  Status      Error       Time    Gflop/s    uplo       n    nb
    pass   9.60e-18     0.0011     0.0000       l     100    64
    pass   5.29e-17     0.0031     0.0000       l     200    64
    pass   6.15e-17     0.0062     0.0000       l     300    64
    pass   8.12e-17     0.0097     0.0000       l     400    64
    pass   8.45e-17     0.0135     0.0000       l     500    64
    pass   1.04e-17     0.0009     0.0000       u     100    64
    pass   5.35e-17     0.0043     0.0000       u     200    64
    pass   6.30e-17     0.0094     0.0000       u     300    64
    pass   8.36e-17     0.0151     0.0000       u     400    64
    pass   8.80e-17     0.0209     0.0000       u     500    64

% All tests passed
pass
./plasmatest ctbbrd  --nb=64 --dim=100:500:100 --uplo=l,u
% PLASMA 24.8.7, OpenMP num threads 10, Intel MKL 2024.0.2, 2025-03-21 18:16:08
% input: ./plasmatest ctbbrd --nb=64 --dim=100:500:100 --uplo=l,u

  Status      Error       Time    Gflop/s    uplo       n    nb
    pass   6.03e-09     0.0014     0.0000       l     100    64
    pass   3.48e-08     0.0039     0.0000       l     200    64
    pass   5.00e-08     0.0073     0.0000       l     300    64
    pass   6.17e-08     0.0113     0.0000       l     400    64
    pass   7.33e-08     0.0154     0.0000       l     500    64
    pass   4.30e-09     0.0011     0.0000       u     100    64
    pass   3.43e-08     0.0051     0.0000       u     200    64
    pass   5.48e-08     0.0110     0.0000       u     300    64
    pass   5.91e-08     0.0171     0.0000       u     400    64
    pass   7.58e-08     0.0304     0.0000       u     500    64

% All tests passed
pass
./plasmatest ztbbrd  --nb=64 --dim=100:500:100 --uplo=l,u
% PLASMA 24.8.7, OpenMP num threads 10, Intel MKL 2024.0.2, 2025-03-21 18:16:09
% input: ./plasmatest ztbbrd --nb=64 --dim=100:500:100 --uplo=l,u

  Status      Error       Time    Gflop/s    uplo       n    nb
    pass   1.29e-17     0.0015     0.0000       l     100    64
    pass   5.38e-17     0.0053     0.0000       l     200    64
    pass   7.83e-17     0.0101     0.0000       l     300    64
    pass   8.72e-17     0.0164     0.0000       l     400    64
    pass   9.68e-17     0.0210     0.0000       l     500    64
    pass   1.51e-17     0.0014     0.0000       u     100    64
    pass   5.60e-17     0.0059     0.0000       u     200    64
    pass   8.63e-17     0.0128     0.0000       u     300    64
    pass   8.76e-17     0.0212     0.0000       u     400    64
    pass   9.63e-17     0.0272     0.0000       u     500    64

% All tests passed
pass
--------------------------------------------------------------------------------

1 routines FAILED: tbbrd
Elapsed 3.67 sec
Fri Mar 21 18:16:10 2025
