|
1 |
| -This page describes the Make-based build, which is the default/authoritative |
2 |
| -build method. Note that the OpenBLAS repository also supports building with |
3 |
| -CMake (not described here) - that generally works and is tested, however there |
4 |
| -may be small differences between the Make and CMake builds. |
| 1 | +!!! info "Supported build systems" |
| 2 | + |
| 3 | + This page describes the Make-based build, which is the
| 4 | + default/authoritative build method. The OpenBLAS repository also
| 5 | + supports building with CMake (not described here). That build generally
| 6 | + works and is tested, but there may be small differences between the Make
| 7 | + and CMake builds.
5 | 8 |
|
6 | 9 | !!! warning
|
7 | 10 | This page is made by someone who is not the developer and should not be considered as an official documentation of the build system. For getting the full picture, it is best to read the Makefiles and understand them yourself.
|
@@ -49,56 +52,78 @@ Makefile
|
49 | 52 |
|
50 | 53 | ## Important Variables
|
51 | 54 |
|
52 |
| -Most of the tunable variables are found in [Makefile.rule](https://github.com/xianyi/OpenBLAS/blob/develop/Makefile.rule), along with their detailed descriptions.<br/> |
53 |
| -Most of the variables are detected automatically in [Makefile.prebuild](https://github.com/xianyi/OpenBLAS/blob/develop/Makefile.prebuild), if they are not set in the environment. |
| 55 | +Most of the tunable variables are found in |
| 56 | +[Makefile.rule](https://github.com/xianyi/OpenBLAS/blob/develop/Makefile.rule), |
| 57 | +along with their detailed descriptions. |
54 | 58 |
|
55 |
| -### CPU related |
56 |
| -``` |
57 |
| -ARCH - Target architecture (eg. x86_64) |
58 |
| -TARGET - Target CPU architecture, in case of DYNAMIC_ARCH=1 means library will not be usable on less capable CPUs |
59 |
| -TARGET_CORE - TARGET_CORE will override TARGET internally during each cpu-specific cycle of the build for DYNAMIC_ARCH |
60 |
| -DYNAMIC_ARCH - For building library for multiple TARGETs (does not lose any optimizations, but increases library size) |
61 |
| -DYNAMIC_LIST - optional user-provided subset of the DYNAMIC_CORE list in Makefile.system |
62 |
| -``` |
| 59 | +Most of the variables are detected automatically in |
| 60 | +[Makefile.prebuild](https://github.com/xianyi/OpenBLAS/blob/develop/Makefile.prebuild), |
| 61 | +if they are not set in the environment. |
63 | 62 |
|
64 |
| -### Toolchain related |
65 |
| -``` |
66 |
| -CC - TARGET C compiler used for compilation (can be cross-toolchains) |
67 |
| -FC - TARGET Fortran compiler used for compilation (can be cross-toolchains, set NOFORTRAN=1 if used cross-toolchain has no fortran compiler) |
68 |
| -AR, AS, LD, RANLIB - TARGET toolchain helpers used for compilation (can be cross-toolchains) |
69 | 63 |
|
70 |
| -HOSTCC - compiler of build machine, needed to create proper config files for target architecture |
71 |
| -HOST_CFLAGS - flags for build machine compiler |
72 |
| -``` |
| 64 | +### CPU related |
73 | 65 |
|
74 |
| -### Library related |
75 |
| -``` |
76 |
| -BINARY - 32/64 bit library |
| 66 | +- `ARCH`: target architecture (e.g., `x86_64`).
| 67 | +- `DYNAMIC_ARCH`: build the library for multiple `TARGET`s (does not lose any
| 68 | + optimizations, but increases library size).
| 69 | +- `DYNAMIC_LIST`: optional user-provided subset of the `DYNAMIC_CORE` list in |
| 70 | + [Makefile.system](https://github.com/xianyi/OpenBLAS/blob/develop/Makefile.system). |
| 71 | +- `TARGET`: target CPU architecture. When `DYNAMIC_ARCH=1` is set, the
| 72 | + library will not be usable on CPUs less capable than `TARGET`.
| 73 | +- `TARGET_CORE`: override `TARGET` internally during each CPU-specific cycle of |
| 74 | + the build for `DYNAMIC_ARCH`. |
77 | 75 |
|
78 |
| -BUILD_SHARED - Create shared library |
79 |
| -BUILD_STATIC - Create static library |
80 | 76 |
|
81 |
| -QUAD_PRECISION - enable support for IEEE quad precision [ largely unimplemented leftover from GotoBLAS, do not use ] |
82 |
| -EXPRECISION - Obsolete option to use float80 of SSE on BSD-like systems |
83 |
| -INTERFACE64 - Build with 64bit integer representations to support large array index values [ incompatible with standard API ] |
| 77 | +### Toolchain related |
84 | 78 |
|
85 |
| -BUILD_SINGLE - build the single-precision real functions of BLAS [and optionally LAPACK] |
86 |
| -BUILD_DOUBLE - build the double-precision real functions |
87 |
| -BUILD_COMPLEX - build the single-precision complex functions |
88 |
| -BUILD_COMPLEX16 - build the double-precision complex functions |
89 |
| -(all four types are included in the build by default when none was specifically selected) |
| 79 | +- `CC`: target C compiler used for compilation (can be a cross-compiler).
| 80 | +- `FC`: target Fortran compiler used for compilation (can be a cross-compiler;
| 81 | + set `NOFORTRAN=1` if the cross-toolchain has no Fortran compiler).
| 82 | +- `AR`, `AS`, `LD`, `RANLIB`: target toolchain helpers used for compilation
| 83 | + (can come from a cross-toolchain).
| 84 | +- `HOSTCC`: compiler for the build machine, needed to create proper config
| 85 | + files for the target architecture.
| 86 | +- `HOST_CFLAGS`: flags for the build machine compiler.
90 | 87 |
|
91 |
| -BUILD_BFLOAT16 - build the "half precision brainfloat" real functions |
92 |
| - |
93 |
| -USE_THREAD - Use a multithreading backend (default to pthread) |
94 |
| -USE_LOCKING - implement locking for thread safety even when USE_THREAD is not set (so that the singlethreaded library can |
95 |
| - safely be called from multithreaded programs) |
96 |
| -USE_OPENMP - Use OpenMP as multithreading backend |
97 |
| -NUM_THREADS - define this to the maximum number of parallel threads you expect to need (defaults to the number of cores in the build cpu) |
98 |
| -NUM_PARALLEL - define this to the number of OpenMP instances that your code may use for parallel calls into OpenBLAS (default 1,see below) |
99 | 88 |
|
100 |
| -``` |
| 89 | +### Library related |
101 | 90 |
|
| 91 | +#### Library kind and bitness options |
| 92 | + |
| 93 | +- `BINARY`: whether to build a 32-bit or 64-bit library (default is `64`, set |
| 94 | + to `32` on a 32-bit platform). |
| 95 | +- `BUILD_SHARED`: create a shared library |
| 96 | +- `BUILD_STATIC`: create a static library |
| 97 | +- `INTERFACE64`: build with 64-bit (ILP64) integer representations to support |
| 98 | + large array index values (incompatible with the standard 32-bit integer (LP64) API). |
| 99 | + |
| 100 | +#### Data type options |
| 101 | + |
| 102 | +- `BUILD_SINGLE`: build the single-precision real functions of BLAS and (if |
| 103 | + it's built) LAPACK |
| 104 | +- `BUILD_DOUBLE`: build the double-precision real functions |
| 105 | +- `BUILD_COMPLEX`: build the single-precision complex functions |
| 106 | +- `BUILD_COMPLEX16`: build the double-precision complex functions |
| 107 | +- `BUILD_BFLOAT16`: build the "half precision brainfloat" real functions |
| 108 | +- `EXPRECISION`: obsolete option to use float80 of SSE on BSD-like systems |
| 109 | +- `QUAD_PRECISION`: enable support for IEEE quad precision (largely |
| 110 | + unimplemented leftover from GotoBLAS, do not use) |
| 111 | + |
| 112 | +By default, the single- and double-precision real and complex floating-point |
| 113 | +functions are included in the build, while the half- and extended-precision |
| 114 | +functions are not. |
| 115 | + |
| 116 | +#### Threading options |
| 117 | + |
| 118 | +- `USE_THREAD`: use a multithreading backend (defaults to `pthreads`).
| 119 | +- `USE_LOCKING`: implement locking for thread safety even when `USE_THREAD` is
| 120 | + not set (so that the single-threaded library can safely be called from
| 121 | + multithreaded programs).
| 122 | +- `USE_OPENMP`: use OpenMP as the multithreading backend.
| 123 | +- `NUM_THREADS`: define this to the maximum number of parallel threads you |
| 124 | + expect to need (defaults to the number of cores in the build CPU). |
| 125 | +- `NUM_PARALLEL`: define this to the number of OpenMP instances that your code |
| 126 | + may use for parallel calls into OpenBLAS (the default is `1`, see below). |
102 | 127 |
|
103 | 128 | OpenBLAS uses a fixed set of memory buffers internally, used for communicating
|
104 | 129 | and compiling partial results from individual threads. For efficiency, the
|
|