 OpenBLAS ChangeLog
+====================================================================
+Version 0.3.26
+ 2-Jan-2024
+
+general:
+- improved the version of openblas.pc that is created by the CMAKE build
+- fixed a CMAKE-specific build problem on older versions of MacOS
+- worked around linking problems on old versions of MacOS
+- corrected installation location of the lapacke_mangling header in CMAKE builds
+- added type declarations for complex variables to the MSVC-specific parts of the LAPACK header
+- significantly sped up ?GESV for small problem sizes by introducing a lower bound for multithreading
+- imported additions and corrections from the Reference-LAPACK project:
+ - added new LAPACK functions for truncated QR with pivoting (Reference-LAPACK PRs 891&941)
+ - handle miscalculation of minimum work array size in corner cases (Reference-LAPACK PR 942)
+ - fixed use of uninitialized variables in ?GEDMD and improved inline documentation (PR 959)
+ - fixed use of uninitialized variables (and consequential failures) in ?BBCSD (PR 967)
+ - added tests for the recently introduced Dynamic Mode Decomposition functions (PR 736)
+ - fixed several memory leaks in the LAPACK testsuite (PR 953)
+ - fixed counting of testsuite results by the Python script (PR 954)
+
+x86-64:
+- fixed computation of CASUM on SkylakeX and newer targets in the special
+ case that AVX512 is not supported by the compiler or operating environment
+- fixed potential undefined behaviour in the CASUM/ZASUM kernels for AVX512 targets
+- worked around a problem in the pre-AVX kernels for GEMV
+- sped up the thread management code on MS Windows
+
+arm64:
+- fixed building of the LAPACK testsuite with Xcode 15 on Apple M1 and newer
+- sped up the thread management code on MS Windows
+- sped up SGEMM and DGEMM on Neoverse V1 and N1
+- sped up ?DOT on SVE-capable targets
+- reduced the number of targets in DYNAMIC_ARCH builds by eliminating functionally equivalent ones
+- included support for Apple M1 and newer targets in DYNAMIC_ARCH builds
+
+power:
+- improved the SGEMM kernel for POWER10
+- fixed compilation with (very) old versions of gcc
+- fixed detection of old 32bit PPC targets in CMAKE-based builds
+- added autodetection of the POWERPC 7400 subtype
+- fixed CMAKE-based compilation for PPCG4 and PPC970 targets
+
+loongarch64:
+- added and improved optimized kernels for almost all BLAS functions
+
 ====================================================================
 Version 0.3.25
 12-Nov-2023