Commit 94cba8e

Merge pull request #3716 from martin-frbg/0321changes
Update Changelog for 0.3.21
2 parents 9f89b62 + 25ce2e2 commit 94cba8e

1 file changed: +82 -0

Changelog.txt

Lines changed: 82 additions & 0 deletions
@@ -1,4 +1,86 @@
 OpenBLAS ChangeLog
+====================================================================
+Version 0.3.21
+07-Aug-2022
+
+general:
+- Updated the included LAPACK to Reference-LAPACK release 3.10.1
+- when no Fortran compiler is available, OpenBLAS builds will now automatically
+  build LAPACK from an f2c-converted copy of LAPACK 3.9.0 unless the NO_LAPACK option
+  is specified
+- similarly added C versions of the BLAS and CBLAS tests
+- enabled building of the ReLAPACK GEMMT kernels when ReLAPACK is built
+- function LAPACKE_lsame is now annotated with the GCC attribute "const" to aid static analyzers
+- added USE_TLS to the list of options reported by the openblas_get_config() function
+- CMAKE builds now support the BUILD_TESTING keyword (to disable the LAPACK testsuite) of Reference-LAPACK
+- fixed CMAKE builds of the laswp_ncopy and neg_tcopy kernels
+- removed the build system requirements for PERL (while keeping the original perl scripts as backup)
+- handle building and running OpenBLAS on systems that report zero available cpu cores
+- added SYMBOLPREFIX/SYMBOLSUFFIX handling for LAPACK 3.10.0 functions added in 0.3.20
+- fixed linking of the utests on QNX
+- Added support for compilation with the Intel ifx compiler
+- Added support for compilation with the Fujitsu FCC compiler for Fugaku
+- Added support for compilation with the Cray C and Fortran compilers
+- reverted OpenMP threadpool behaviour in the exec_blas call to its state before 0.3.11, that is
+  the threadpool will no longer grow or shrink on demand as the overhead for this is too big at least with
+  GNU OpenMP. The adaptive behaviour introduced in 0.3.11 can still be requested at runtime by setting
+  the environment variable OMP_ADAPTIVE
+- worked around spurious STFSM/CTFSM errors reported by the LAPACK testsuite
+
+x86_64:
+- fixed determination of compiler support for AVX512 and removed the 0.3.19
+  workaround for building SKYLAKEX kernels on Sandybridge hardware
+- fixed compilation for the SKYLAKEX target with gcc 6
+- fixed compilation of the CooperLake SBGEMM kernel with LLVM
+- fixed compilation of the SkyLakeX small matrix GEMM kernels with LLVM or ICC
+- fixed compilation of some BFLOAT16 kernels with CMAKE
+- added support for the Zhaoxin/Centaur KH40000 cpu
+- fixed a potential crash in the ZSYMV kernel used for all targets except generic
+- fixed gmake compilation for DYNAMIC_ARCH with a DYNAMIC_LIST including ATOM
+- fixed compilation of LAPACKE with the INTEGER64 option on Windows
+- added support for cross-compiling to individual Intel or AMD targets using CMAKE
+  (previously only CORE2 supported, added targets are ATOM, PRESCOTT, NEHALEM, SANDYBRIDGE,
+  HASWELL,SKYLAKEX, COOPERLAKE, SAPPHIRERAPIDS, OPTERON, BARCELONA, BULLDOZER, PILEDRIVER,
+  STEAMROLLER,EXCAVATOR, ZEN)
+
+SPARC:
+- worked around an overflow error in the DNRM2 kernel
+
+POWER:
+- worked around an overflow error in the POWER6 DNRM2 kernel
+- fixed compilation on PPC440
+- fixed a performance regression in the level1 BLAS on POWER10
+- fixed the POWER10 ZGEMM kernel
+- fixed singlethreaded builds for POWER10
+- fixed compilation of the POWER10 DGEMV kernel with older gcc versions
+- enabled compilation of the BFLOAT16 kernels by default
+- enabled the small matrix kernels by default for DYNAMIC_ARCH builds
+- added a workaround for a miscompilation of the CDOT and ZDOT kernels by GCC 12
+
+- RISCV:
+- fixed cpu autodetection logic
+
+ARMV8:
+- added an SBGEMM kernel for Neoverse N2
+- worked around an overflow error in the DNRM2 kernel used on M1, NeoverseN1, ThunderX2T99
+- added support for ARM64 systems running MS Windows
+- added support for cross-compiling to the GENERIC ARMV8 target under CMAKE (Windows/MSVC)
+- fixed a performance regression in the generic ARMV8 DGEMM kernel introduced in 0.3.19
+- added initial support for the Apple M1 cpu under Linux
+- added initial support for the Phytium FT2000 cpu
+- added initial support for the Cortex A510, A710, X1 and X2 cpu
+- fixed an accidental mixup of cpu identifiers in the autodetection code introduced in 0.3.20
+- fixed linking of Apple M1 builds on macOS 12 and later with recent XCode
+- made Neoverse N2 available in DYNAMIC_ARCH builds
+
+MIPS,MIPS64:
+- worked around an overflow error in the DNRM2 kernel
+
+LOONGARCH64:
+- worked around an overflow error in the DNRM2 kernel
+- added preliminary support for the LOONGSON2K1000 cpu
+- added DYNAMIC_ARCH support
+
 ====================================================================
 Version 0.3.20
 20-Feb-2022
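
Note on the DNRM2 entries: several architecture sections in this changelog mention working around an overflow error in the DNRM2 kernel, without saying how. The sketch below is not OpenBLAS code and does not claim to show the actual fix. It only illustrates, under that assumption, why a textbook sum of squares can overflow in double precision and how a running scale factor (in the style of the classic reference BLAS DNRM2 loop) avoids it. The function names `naive_nrm2` and `scaled_nrm2` are hypothetical.

```python
import math


def naive_nrm2(x):
    """Textbook 2-norm: square root of the sum of squares."""
    total = 0.0
    for v in x:
        # v * v overflows to inf once |v| exceeds roughly 1.3e154
        total += v * v
    return math.sqrt(total)


def scaled_nrm2(x):
    """2-norm computed with a running scale so no square can overflow.

    Illustrative only: this mirrors the scale/ssq idea of the classic
    reference BLAS loop, not any specific OpenBLAS kernel.
    """
    scale = 0.0  # largest |v| seen so far
    ssq = 1.0    # sum of (v / scale)**2 over the nonzero elements seen so far
    for v in x:
        if v == 0.0:
            continue
        a = abs(v)
        if scale < a:
            # New maximum: rescale the accumulated sum to the new scale
            r = scale / a
            ssq = 1.0 + ssq * r * r
            scale = a
        else:
            r = a / scale
            ssq += r * r
    return scale * math.sqrt(ssq)
```

For a vector such as `[3e200, 4e200]`, the naive version returns infinity because each square exceeds the double-precision range, while the scaled version stays finite because every ratio it squares is at most 1.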
