Commit c4c3d9e

Author: tingbo.liao
Merge remote-tracking branch 'refs/remotes/origin/develop' into develop
2 parents 0bea1cf + 37a4ca7 commit c4c3d9e

3 files changed

+71
-26
lines changed

Makefile.arm64

Lines changed: 27 additions & 0 deletions
@@ -351,4 +351,31 @@ endif
 
 endif
 
+else
+# NVIDIA HPC options necessary to enable SVE in the compiler
+ifeq ($(CORE), THUNDERX2T99)
+CCOMMON_OPT += -tp=thunderx2t99
+FCOMMON_OPT += -tp=thunderx2t99
+endif
+ifeq ($(CORE), NEOVERSEN1)
+CCOMMON_OPT += -tp=neoverse-n1
+FCOMMON_OPT += -tp=neoverse-n1
+endif
+ifeq ($(CORE), NEOVERSEV1)
+CCOMMON_OPT += -tp=neoverse-v1
+FCOMMON_OPT += -tp=neoverse-v1
+endif
+ifeq ($(CORE), NEOVERSEV2)
+CCOMMON_OPT += -tp=neoverse-v2
+FCOMMON_OPT += -tp=neoverse-v2
+endif
+ifeq ($(CORE), ARMV8SVE)
+CCOMMON_OPT += -tp=neoverse-v2
+FCOMMON_OPT += -tp=neoverse-v2
+endif
+ifeq ($(CORE), ARMV9SVE)
+CCOMMON_OPT += -tp=neoverse-v2
+FCOMMON_OPT += -tp=neoverse-v2
+endif
+
 endif

docs/install.md

Lines changed: 42 additions & 24 deletions
@@ -437,36 +437,54 @@ To then use the built OpenBLAS shared library in Visual Studio:
 [Qt Creator](http://qt.nokia.com/products/developer-tools/).
 
 
-#### Windows on Arm
-
-While OpenBLAS can be built with Microsoft VisualStudio (Community Edition or commercial), you would only be able to build for the GENERIC target
-that does not use optimized assembly kernels, also the stock VisualStudio lacks the Fortran compiler necessary for building the LAPACK component.
-It is therefore highly recommended to download the free LLVM compiler suite and use it to compile OpenBLAS outside of VisualStudio.
-
-The following tools needs to be installed to build for Windows on Arm (WoA):
-
-- LLVM for Windows on Arm.
-Find the latest LLVM build for WoA from [LLVM release page](https://releases.llvm.org/) - you want the package whose name ends in "woa64.exe".
-(This may not always be present in the very latest point release, as building and uploading the binaries takes time.)
-E.g: a LLVM 19 build for WoA64 can be found [here](https://github.com/llvm/llvm-project/releases/download/llvmorg-19.1.2/LLVM-19.1.2-woa64.exe).
-Run the LLVM installer and ensure that LLVM is added to the environment variable PATH. (If you do not want to add it to the PATH, you will need to specify
-both C and Fortran compiler to Make or CMake with their full path later on)
+### Windows on Arm
+
+A fully functional native OpenBLAS for WoA that can be built as both a static and dynamic library using LLVM toolchain and Visual Studio 2022. Before starting to build, make sure that you have installed Visual Studio 2022 on your ARM device, including the "Desktop Development with C++" component (that contains the cmake tool).
+(Note that you can use the free "Visual Studio 2022 Community Edition" for this task. In principle it would be possible to build with VisualStudio alone, but using
+the LLVM toolchain enables native compilation of the Fortran sources of LAPACK and of all the optimized assembly files, which VisualStudio cannot handle on its own)
+
+1. Clone OpenBLAS to your local machine and checkout to latest release of OpenBLAS (unless you want to build the latest development snapshot - here we are using the 0.3.28 release as the example, of course this exact version may be outdated by the time you read this)
+
+```cmd
+git clone https://github.com/OpenMathLib/OpenBLAS.git
+cd OpenBLAS
+git checkout v0.3.28
+```
+
+2. Install Latest LLVM toolchain for WoA:
+
+Download the Latest LLVM toolchain for WoA from [the Release page](https://github.com/llvm/llvm-project/releases/tag/llvmorg-19.1.5). At the time of writing, this is version 19.1.5 - be sure to select the latest release for which you can find a precompiled package whose name ends in "-woa64.exe" (precompiled packages
+usually lag a week or two behind their corresponding source release).
+Make sure to enable the option “Add LLVM to the system PATH for all the users”
+Note: Make sure that the path of LLVM toolchain is at the top of Environment Variables section to avoid conflicts between the set of compilers available in the system path
+
+3. Launch the Native Command Prompt for Windows ARM64:
+
+From the start menu search for “ARM64 Native Tools Command Prompt for Visual Studio 2022
+Alternatively open command prompt, run the following command to activate the environment:
+"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvarsarm64.bat"
+
+Navigate to the OpenBLAS source code directory and start building OpenBLAS by invoking Ninja:
+
+```cmd
+cd OpenBLAS
+mkdir build
+cd build
+
+cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release -DTARGET=ARMV8 -DBINARY=64 -DCMAKE_C_COMPILER=clang-cl -DCMAKE_C_COMPILER=arm64-pc-windows-msvc -DCMAKE_ASM_COMPILER=arm64-pc-windows-msvc -DCMAKE_Fortran_COMPILER=flang-new
 
-The following steps describe how to build the static library for OpenBLAS with either Make or CMake:
+ninja -j16
+```
+
+Note: You might want to include additional options in the cmake command here. For example, the default configuration only generates a static.lib version of the library. If you prefer a DLL, you can add -DBUILD_SHARED_LIBS=ON.
 
-1. Build OpenBLAS with Make:
+Note that it is also possible to use the same setup to build OpenBLAS with Make, if you prepare Makefiles over the CMake build for some reason:
 
-```bash
+```cmd
 $ make CC=clang-cl FC=flang-new AR="llvm-ar" TARGET=ARMV8 ARCH=arm64 RANLIB="llvm-ranlib" MAKE=make
 ```
 
-2. Build OpenBLAS with CMake
-```bash
-$ mkdir build
-$ cd build
-$ cmake .. -G Ninja -DCMAKE_C_COMPILER=clang-cl -DCMAKE_Fortran_COMPILER=flang-new -DTARGET=ARMV8 -DCMAKE_BUILD_TYPE=Release
-$ cmake --build .
-```
+
 
 #### Generating an import library
 

driver/others/memory.c

Lines changed: 2 additions & 2 deletions
@@ -2538,7 +2538,7 @@ static void *alloc_shm(void *address){
 }
 #endif
 
-#if defined OS_LINUX || defined OS_AIX || defined __sun__ || defined OS_WINDOWS
+#if ((defined ALLOC_HUGETLB) && (defined OS_LINUX || defined OS_AIX || defined __sun__ || defined OS_WINDOWS))
 
 static void alloc_hugetlb_free(struct release_t *release){
 
@@ -3254,7 +3254,7 @@ void blas_shutdown(void){
 #endif
 newmemory[pos].lock = 0;
 }
-free(newmemory);
+free((void*)newmemory);
 newmemory = NULL;
 memory_overflowed = 0;
 }
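
The two `driver/others/memory.c` changes can be tried in isolation with a minimal sketch like the one below. It is not the OpenBLAS code: `struct slot`, `table`, and every function name are hypothetical stand-ins, and `__linux__` / `_WIN32` take the place of the project's `OS_LINUX` / `OS_WINDOWS` macros. The sketch also assumes the table pointer is `volatile`-qualified, which the diff does not show. Under that assumption, passing it to `free()` directly would discard the qualifier implicitly, and the `(void *)` cast makes the conversion explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the table entry type; only a `lock` member
   is visible in the diff. */
struct slot {
    unsigned long lock;
    void *addr;
};

/* Assumption: the table pointer is volatile-qualified. */
static volatile struct slot *table = NULL;

/* Reports whether the huge-page path was compiled in. The condition mirrors
   the shape of the new guard: the feature macro AND a supported platform. */
int hugetlb_compiled_in(void) {
#if defined(ALLOC_HUGETLB) && (defined(__linux__) || defined(_WIN32))
    return 1;
#else
    return 0;
#endif
}

/* Allocates n entries and clears each lock. Returns 0 on success, -1 on
   allocation failure. */
int table_create(size_t n) {
    assert(n > 0);
    table = malloc(n * sizeof *table);
    if (table == NULL) return -1;
    for (size_t i = 0; i < n; i++) table[i].lock = 0;
    return 0;
}

/* Releases the table. free() takes a plain void *, so the cast drops the
   volatile qualifier explicitly instead of relying on an implicit
   conversion that compilers typically warn about. */
void table_destroy(void) {
    free((void *)table);
    table = NULL;
}

int table_is_empty(void) { return table == NULL; }
```

Building this without `-DALLOC_HUGETLB` makes `hugetlb_compiled_in()` return 0 on every platform, which is the effect the stricter `#if` aims for: the huge-page helper is only compiled when the feature is requested.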
