1 | 1 | OpenBLAS ChangeLog
| 2 | +==================================================================== |
| 3 | +Version 0.3.14 |
| 4 | + 17-Mar-2021 |
| 5 | + |
| 6 | + common: |
| 7 | + * Fixed a race condition on thread shutdown in non-OpenMP builds |
| 8 | + * Fixed custom BUFFERSIZE option getting ignored in gmake builds |
| 9 | + * Fixed CMAKE compilation of the TRMM kernels for GENERIC platforms |
| 10 | + * Added CBLAS interfaces for CROTG, ZROTG, CSROT and ZDROT |
| 11 | + * Improved performance of OMATCOPY_RT across all platforms |
| 12 | + * Changed perl scripts to use env instead of a hardcoded /usr/bin/perl |
| 13 | + * Fixed potential misreading of the GCC compiler version in the build scripts |
| 14 | + * Fixed convergence problems in LAPACK complex GGEV/GGES (Reference-LAPACK #477) |
| 15 | + * Reduced the stacksize requirements for running the LAPACK testsuite (Reference-LAPACK #335) |
| 16 | + |
| 17 | + RISCV: |
| 18 | + * Fixed compilation on RISCV (missing entry in getarch) |
| 19 | + |
| 20 | + POWER: |
| 21 | + * Fixed compilation for DYNAMIC_ARCH with clang and with old gcc versions |
| 22 | + * Added support for compilation on FreeBSD/ppc64le |
| 23 | + * Added optimized POWER10 kernels for SSCAL, DSCAL, CSCAL, ZSCAL |
| 24 | + * Added optimized POWER10 kernels for SROT, DROT, CDOT, SASUM, DASUM |
| 25 | + * Improved SSWAP, DSWAP, CSWAP, ZSWAP performance on POWER10 |
| 26 | + * Improved SCOPY and CCOPY performance on POWER10 |
| 27 | + * Improved SGEMM and DGEMM performance on POWER10 |
| 28 | + * Added support for compilation with the NVIDIA HPC compiler |
| 29 | + |
| 30 | + x86_64: |
| 31 | + * Added an optimized bfloat16 GEMM kernel for Cooperlake |
| 32 | + * Added CPUID autodetection for Intel Rocket Lake and Tiger Lake cpus |
| 33 | + * Improved the performance of SASUM,DASUM,SROT,DROT on AMD Ryzen cpus |
| 34 | + * Added support for compilation with the NAG Fortran compiler |
| 35 | + * Fixed recognition of the AMD AOCC compiler |
| 36 | + * Fixed compilation for DYNAMIC_ARCH with clang on Windows |
| 37 | + * Added support for running the BLAS/CBLAS tests on Windows |
| 38 | + * Fixed signatures of the tls callback functions for Windows x64 |
| 39 | + * Fixed various issues with fma intrinsics support handling |
| 40 | + |
| 41 | + ARM: |
| 42 | + * Added support for embedded Cortex M targets via a new option EMBEDDED |
| 43 | + |
| 44 | + ARMV8: |
| 45 | + * Fixed the THUNDERX2T99 and NEOVERSEN1 DNRM2/ZNRM2 kernels for inputs with Inf |
| 46 | + * Added support for the DYNAMIC_LIST option |
| 47 | + * Added support for compilation with the NVIDIA HPC compiler |
| 48 | + * Added support for compiling with the NAG Fortran compiler |
| 49 | + |
2 | 50 | ====================================================================
3 | 51 | Version 0.3.13
4 | 52 | 12-Dec-2020