
Commit cddd35f

Merge pull request #4407 from martin-frbg/changelog0326
Update Changelog for 0.3.26
2 parents cdff44e + 03713bc commit cddd35f


1 file changed: +45 -0 lines changed


Changelog.txt

Lines changed: 45 additions & 0 deletions
@@ -1,4 +1,49 @@
 OpenBLAS ChangeLog
+====================================================================
+Version 0.3.26
+ 2-Jan-2024
+
+general:
+  - improved the version of openblas.pc that is created by the CMAKE build
+  - fixed a CMAKE-specific build problem on older versions of MacOS
+  - worked around linking problems on old versions of MacOS
+  - corrected installation location of the lapacke_mangling header in CMAKE builds
+  - added type declarations for complex variables to the MSVC-specific parts of the LAPACK header
+  - significantly sped up ?GESV for small problem sizes by introducing a lower bound for multithreading
+  - imported additions and corrections from the Reference-LAPACK project:
+    - added new LAPACK functions for truncated QR with pivoting (Reference-LAPACK PRs 891&941)
+    - handle miscalculation of minimum work array size in corner cases (Reference-LAPACK PR 942)
+    - fixed use of uninitialized variables in ?GEDMD and improved inline documentation (PR 959)
+    - fixed use of uninitialized variables (and consequential failures) in ?BBCSD (PR 967)
+    - added tests for the recently introduced Dynamic Mode Decomposition functions (PR 736)
+    - fixed several memory leaks in the LAPACK testsuite (PR 953)
+    - fixed counting of testsuite results by the Python script (PR 954)
+
+x86-64:
+  - fixed computation of CASUM on SkylakeX and newer targets in the special
+    case that AVX512 is not supported by the compiler or operating environment
+  - fixed potential undefined behaviour in the CASUM/ZASUM kernels for AVX512 targets
+  - worked around a problem in the pre-AVX kernels for GEMV
+  - sped up the thread management code on MS Windows
+
+arm64:
+  - fixed building of the LAPACK testsuite with Xcode 15 on Apple M1 and newer
+  - sped up the thread management code on MS Windows
+  - sped up SGEMM and DGEMM on Neoverse V1 and N1
+  - sped up ?DOT on SVE-capable targets
+  - reduced the number of targets in DYNAMIC_ARCH builds by eliminating functionally equivalent ones
+  - included support for Apple M1 and newer targets in DYNAMIC_ARCH builds
+
+power:
+  - improved the SGEMM kernel for POWER10
+  - fixed compilation with (very) old versions of gcc
+  - fixed detection of old 32bit PPC targets in CMAKE-based builds
+  - added autodetection of the POWERPC 7400 subtype
+  - fixed CMAKE-based compilation for PPCG4 and PPC970 targets
+
+loongarch64:
+  - added and improved optimized kernels for almost all BLAS functions
+
 ====================================================================
 Version 0.3.25
  12-Nov-2023
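
Note on the ?GESV entry: the changelog says small problems were sped up "by introducing a lower bound for multithreading". The general idea is that a solve below some size runs on one thread, because thread start-up and synchronisation would cost more than the arithmetic saves. The sketch below only illustrates that idea. The function name, the cutoff value, and the use of n*n as the size measure are all assumptions made for illustration and are not taken from the OpenBLAS sources.

```python
import os

# Hypothetical cutoff, chosen only for illustration.
# This is NOT the threshold used by OpenBLAS.
SMALL_PROBLEM_CUTOFF = 10_000


def choose_num_threads(n, max_threads=None):
    """Pick a thread count for an n-by-n solve (illustrative only).

    Problems whose matrix has fewer than SMALL_PROBLEM_CUTOFF elements
    stay on a single thread; larger ones may use all available threads.
    """
    if max_threads is None:
        max_threads = os.cpu_count() or 1
    matrix_elements = n * n  # crude proxy for the amount of work
    if matrix_elements < SMALL_PROBLEM_CUTOFF:
        return 1
    return max_threads
```

With these assumed values, an 8-by-8 system stays single-threaded while a 1000-by-1000 system is allowed to use every thread passed in as max_threads.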
