Commit ee72575

Add documentation on redistributing OpenBLAS
This touches on the following:

- build configurations
- naming of symbols, shared/static libraries and other build outputs like pkg-config and CMake files
- (in more detail) guidance on ILP64 builds

It tries to explain that, while this is only guidance and there may be reasons to deviate from that, for some build options there are best practices, and for some others there are choices to make. It also links to a number of well-maintained build recipes in order to help packagers of other distros make choices.

Closes gh-3798

[skip ci]
1 parent 7976def commit ee72575

1 file changed

+270
-0
lines changed

docs/distributing.md

Lines changed: 270 additions & 0 deletions
@@ -0,0 +1,270 @@
1+
# Guidance for redistributing OpenBLAS
2+
3+
*We note that this document contains recommendations only - packagers and other
4+
redistributors are in charge of how OpenBLAS is built and distributed in their
5+
systems, and may have good reasons to deviate from the guidance given on this
6+
page. These recommendations are aimed at general packaging systems, with a user
7+
base that typically is large, open source (or freely available at least), and
8+
neither behaves uniformly nor is directly connected to the packager.*
9+
10+
OpenBLAS has a large number of build-time options which can be used to change
11+
how it behaves at runtime, how artifacts or symbols are named, etc. Variation
12+
in build configuration can be necessary to achieve a given end goal within a
13+
distribution or as an end user. However, such variation can also make it more
14+
difficult to build on top of OpenBLAS and ship code or other packages in a way
15+
that works across many different distros. Here we provide guidance about the
16+
most important build options, what effects they may have when changed, and
17+
which ones to default to.
18+
19+
The Make and CMake build systems provide equivalent options and yield more or
20+
less the same artifacts, but not exactly (the CMake builds are still
21+
experimental). You can choose either one and the options will function in the
22+
same way; however, the CMake outputs may require some renaming. To review
23+
available build options, see `Makefile.rule` or `CMakeLists.txt` in the root of
24+
the repository.
25+
26+
Build options typically fall into two categories: (a) options that affect the
27+
user interface, such as library and symbol names or APIs that are made
28+
available, and (b) options that affect performance and runtime behavior, such
29+
as threading behavior or CPU architecture-specific code paths. The user
30+
interface options are more important to keep aligned between distributions,
31+
while for the performance-related options there are typically more reasons to
32+
make choices that deviate from the defaults.
33+
34+
Here are recommendations for user interface related packaging choices where it
35+
is not likely to be a good idea to deviate (typically these are the default
36+
settings):
37+
38+
1. Include CBLAS. The CBLAS interface is widely used and it doesn't affect
39+
binary size much, so don't turn it off.
40+
2. Include LAPACK and LAPACKE. The LAPACK interface is also widely used, and
41+
while it does make up a significant part of the binary size of the installed
42+
library, that does not outweigh the regression in usability when deviating
43+
from the default here.[^1]
44+
3. Always distribute the pkg-config (`.pc`) and CMake (`.cmake`) dependency
45+
detection files. These files are used by build systems when users want to
46+
link against OpenBLAS, and there is no benefit to leaving them out.
47+
4. Provide the LP64 interface by default, and if in addition to that you choose
48+
to provide an ILP64 interface build as well, use a symbol suffix to avoid
49+
symbol name clashes (see the next section).
50+
51+
[^1]: All major distributions do include LAPACK as of mid 2023 as far as we
52+
know. Older versions of Arch Linux did not, and that was known to cause
53+
problems.
54+
55+
56+
## ILP64 interface builds
57+
58+
The LP64 (32-bit integer) interface is the default build, and has
59+
well-established C and Fortran APIs as determined by the reference (Netlib)
60+
BLAS and LAPACK libraries. The ILP64 (64-bit integer) interface however does
61+
not have a standard API: symbol names and shared/static library names can be
62+
produced in multiple ways, and this tends to make it difficult to use.
63+
As of today there is an agreed-upon way of choosing names for OpenBLAS between
64+
a number of key users/redistributors, which is the closest thing to a standard
65+
that there is now. However, there is an ongoing standardization effort in the
66+
reference BLAS and LAPACK libraries, which differs from the current OpenBLAS
67+
agreed-upon convention. In this section we'll aim to explain both.
68+
69+
Those two methods are fairly similar, and have a key thing in common: *using a
70+
symbol suffix*. This is good practice; it is recommended that if you distribute
71+
an ILP64 build, you have it use a symbol suffix containing `64` in the name.
72+
This avoids potential symbol clashes when different packages which depend on
73+
OpenBLAS load both an LP64 and an ILP64 library into memory at the same time.
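
One way to sanity-check a packaged ILP64 library is to list its exported symbols and look for names that lack the suffix. The following is a minimal sketch only: the helper name `unsuffixed_symbols` is hypothetical, and it merely filters names read from standard input, so it makes no claim about which symbols a given OpenBLAS build actually exports.

```bash
# Hypothetical helper: print every symbol name read from stdin
# that does NOT end in the given suffix (for example `64_`).
unsuffixed_symbols() {
    local suffix="$1"
    # `|| true` keeps the exit status zero when every name carries the suffix.
    grep -v -- "${suffix}\$" || true
}

# Possible usage with GNU binutils `nm` (the library path is an assumption):
#   nm -D --defined-only /usr/lib64/libopenblas64_.so \
#       | awk '{print $NF}' | unsuffixed_symbols 64_
```

Any names it prints are candidates for a closer look, not necessarily errors.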
74+
75+
### The current OpenBLAS agreed-upon ILP64 convention
76+
77+
This convention comprises the shared library name and the symbol suffix in the
78+
shared library. The symbol suffix to use is `64_`, implying that the library
79+
name will be `libopenblas64_.so` and the symbols in that library end in `64_`.
80+
The central issue where this was discussed is
81+
[openblas#646](https://github.com/xianyi/OpenBLAS/issues/646), and adopters
82+
include Fedora, Julia, NumPy and SciPy - SuiteSparse already used it as well.
83+
84+
To build shared and static libraries with the currently recommended ILP64
85+
conventions with Make:
86+
```bash
$ make INTERFACE64=1 SYMBOLSUFFIX=64_
```
89+
90+
This will produce libraries named `libopenblas64_.so|a`, a pkg-config file
91+
named `openblas64.pc`, and CMake and header files.
92+
93+
Installing locally and inspecting the output will show a few more details:
94+
```bash
$ make install PREFIX=$PWD/../openblas/make64 INTERFACE64=1 SYMBOLSUFFIX=64_
$ tree . # output slightly edited down
.
├── include
│   ├── cblas.h
│   ├── f77blas.h
│   ├── lapacke_config.h
│   ├── lapacke.h
│   ├── lapacke_mangling.h
│   ├── lapacke_utils.h
│   ├── lapack.h
│   └── openblas_config.h
└── lib
    ├── cmake
    │   └── openblas
    │       ├── OpenBLASConfig.cmake
    │       └── OpenBLASConfigVersion.cmake
    ├── libopenblas64_.a
    ├── libopenblas64_.so
    └── pkgconfig
        └── openblas64.pc
```
117+
118+
A key point is the symbol names. These will equal the LP64 symbol names, then
119+
(for Fortran only) the compiler mangling, and then the `64_` symbol suffix.
120+
Hence to obtain the final symbol names, we need to take into account which
121+
Fortran compiler we are using. For the most common cases (e.g., gfortran, Intel
122+
Fortran, or Flang), that means appending a single underscore. In that case, the
123+
result is:
124+
125+
| base API name | binary symbol name | call from Fortran code | call from C code      |
|---------------|--------------------|------------------------|-----------------------|
| `dgemm`       | `dgemm_64_`        | `dgemm_64(...)`        | `dgemm_64_(...)`      |
| `cblas_dgemm` | `cblas_dgemm64_`   | n/a                    | `cblas_dgemm64_(...)` |
129+
130+
It is quite useful to have these symbol names be as uniform as possible across
131+
different packaging systems.
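
The naming rule described above (LP64 name, then Fortran mangling where applicable, then the `64_` suffix) can be written out as a small shell sketch. The function name `openblas64_symbol` and its arguments are hypothetical and exist only to illustrate the rule; the default mangling of a single underscore is the common case named in the text.

```bash
# Hypothetical helper: compose a symbol name under the `64_` convention.
#   $1 = base API name, $2 = "fortran" or "c",
#   $3 = Fortran compiler mangling (defaults to a single underscore).
openblas64_symbol() {
    local base="$1" kind="$2" mangling="${3:-_}"
    if [ "$kind" = "fortran" ]; then
        # LP64 name, then compiler mangling, then the suffix.
        printf '%s%s64_\n' "$base" "$mangling"
    else
        # C symbols are not mangled; the suffix is appended directly.
        printf '%s64_\n' "$base"
    fi
}

openblas64_symbol dgemm fortran     # prints dgemm_64_
openblas64_symbol cblas_dgemm c     # prints cblas_dgemm64_
```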
132+
133+
The equivalent build options with CMake are:
134+
```bash
$ mkdir build && cd build
$ cmake .. -DINTERFACE64=1 -DSYMBOLSUFFIX=64_ -DBUILD_SHARED_LIBS=ON -DBUILD_STATIC_LIBS=ON
$ cmake --build . -j
```
139+
140+
Note that the result is not 100% identical to the Make result. For example, the
141+
library name ends in `_64` rather than `64_` - it is recommended to rename them
142+
to match the Make library names (also update the `libsuffix` entry in
143+
`openblas64.pc` to match that rename).
144+
```bash
$ cmake --install . --prefix $PWD/../../openblas/cmake64
$ tree .
.
├── include
│   └── openblas64
│       ├── cblas.h
│       ├── f77blas.h
│       ├── lapacke_config.h
│       ├── lapacke_example_aux.h
│       ├── lapacke.h
│       ├── lapacke_mangling.h
│       ├── lapacke_utils.h
│       ├── lapack.h
│       ├── openblas64
│       │   └── lapacke_mangling.h
│       └── openblas_config.h
└── lib
    ├── cmake
    │   └── OpenBLAS64
    │       ├── OpenBLAS64Config.cmake
    │       ├── OpenBLAS64ConfigVersion.cmake
    │       ├── OpenBLAS64Targets.cmake
    │       └── OpenBLAS64Targets-noconfig.cmake
    ├── libopenblas_64.a
    ├── libopenblas_64.so -> libopenblas_64.so.0
    └── pkgconfig
        └── openblas64.pc
```
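
As a rough sketch of the `libsuffix` update mentioned above, the helper below rewrites that one line of a pkg-config file. The function name `set_pc_libsuffix` is hypothetical, and the value that the CMake build writes for `libsuffix` is not stated here, so the sketch simply replaces whatever is present. Renaming the library files themselves, and handling the versioned `.so` symlink, is deliberately not covered.

```bash
# Hypothetical helper: rewrite the `libsuffix=` line of a pkg-config file.
#   $1 = path to the .pc file, $2 = new suffix value.
set_pc_libsuffix() {
    local pc_file="$1" suffix="$2" tmp status
    tmp="$(mktemp)" || return 1
    # Edit into a temporary file, then copy the contents back so the
    # original file keeps its permissions.
    sed "s|^libsuffix=.*|libsuffix=${suffix}|" "$pc_file" > "$tmp" \
        && cat "$tmp" > "$pc_file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Possible usage from the install prefix shown in the tree above:
#   set_pc_libsuffix lib/pkgconfig/openblas64.pc 64_
```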
173+
174+
175+
### The upcoming standardized ILP64 convention
176+
177+
While the `64_` convention above got some adoption, it's slightly hacky and is
178+
implemented through the use of `objcopy`. An effort is ongoing for a more
179+
broadly adopted convention in the reference BLAS and LAPACK libraries, using
180+
(a) the `_64` suffix, and (b) applying that suffix _before_ rather than after
181+
Fortran compiler mangling. The central issue for this is
182+
[lapack#666](https://github.com/Reference-LAPACK/lapack/issues/666).
183+
184+
For the most common cases of compiler mangling (a single `_` appended), the end
185+
result will be:
186+
187+
| base API name | binary symbol name | call from Fortran code | call from C code      |
|---------------|--------------------|------------------------|-----------------------|
| `dgemm`       | `dgemm_64_`        | `dgemm_64(...)`        | `dgemm_64_(...)`      |
| `cblas_dgemm` | `cblas_dgemm_64`   | n/a                    | `cblas_dgemm_64(...)` |
191+
192+
For other compiler mangling schemes, replace the trailing `_` by the scheme in use.
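
To make the difference between the two conventions concrete, the sketch below builds the Fortran symbol name for either one. The function name `ilp64_fortran_symbol` is hypothetical, and the double-underscore mangling in the last two calls is an illustrative value only, chosen because the two conventions give identical results when the mangling is a single underscore.

```bash
# Hypothetical helper: Fortran symbol name under either ILP64 convention.
#   $1 = base API name, $2 = convention ("64_" or "_64"),
#   $3 = Fortran compiler mangling (defaults to a single underscore).
ilp64_fortran_symbol() {
    local base="$1" convention="$2" mangling="${3:-_}"
    case "$convention" in
        64_) printf '%s%s64_\n' "$base" "$mangling" ;;  # suffix after mangling
        _64) printf '%s_64%s\n' "$base" "$mangling" ;;  # suffix before mangling
        *)   echo "unknown convention: $convention" >&2; return 1 ;;
    esac
}

ilp64_fortran_symbol dgemm 64_       # prints dgemm_64_
ilp64_fortran_symbol dgemm _64       # prints dgemm_64_
ilp64_fortran_symbol dgemm 64_ __    # prints dgemm__64_
ilp64_fortran_symbol dgemm _64 __    # prints dgemm_64__
```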
193+
194+
The shared library name for this `_64` convention should be `libopenblas_64.so`.
195+
196+
Note: it is not yet possible to produce an OpenBLAS build which employs this
197+
convention! Once reference BLAS and LAPACK with support for `_64` have been
198+
released, a future OpenBLAS release will support it. For now, please use the
199+
older `64_` scheme and avoid using the name `libopenblas_64.so`; it should be
200+
considered reserved for future use of the `_64` standard as prescribed by
201+
reference BLAS/LAPACK.
202+
203+
204+
## Performance and runtime behavior related build options
205+
206+
For these options there are multiple reasonable or common choices.
207+
208+
### Threading related options
209+
210+
OpenBLAS can be built as a multi-threaded or single-threaded library, with the
211+
default being multi-threaded. It's expected that the default `libopenblas`
212+
library is multi-threaded; if you'd like to also distribute single-threaded
213+
builds, consider naming them `libopenblas_sequential`.
214+
215+
OpenBLAS can be built with pthreads or OpenMP as the threading model, with the
216+
default being pthreads. Both options are commonly used, and the choice here
217+
should not influence the shared library name. The choice will be captured by
218+
the `.pc` file. For example:
219+
```bash
$ pkg-config --libs openblas
-fopenmp -lopenblas

$ cat openblas.pc
...
openblas_config= ... USE_OPENMP=0 MAX_THREADS=24
```
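
Downstream build scripts sometimes need a single setting from the `openblas_config=` line. A minimal sketch of pulling one out is shown below; the helper name `openblas_config_value` is hypothetical, and it assumes the line is a space-separated list of `KEY=value` entries, as in the excerpt above.

```bash
# Hypothetical helper: read .pc file text on stdin and print the value
# of one KEY=value entry from its `openblas_config=` line.
openblas_config_value() {
    local key="$1"
    grep '^openblas_config=' \
        | tr ' ' '\n' \
        | sed -n "s/^${key}=//p"
}

# Possible usage (the file path is an assumption):
#   openblas_config_value MAX_THREADS < /usr/lib64/pkgconfig/openblas.pc
```

It prints nothing when the key is absent, and an empty line when the key is present with an empty value.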
227+
228+
The maximum number of threads users will be able to use is determined at build
229+
time by the `NUM_THREADS` build option. It defaults to 24, and there's a wide
230+
range of values that are reasonable to use (up to 256). 64 is a typical choice
231+
here; there is a memory footprint penalty that is linear in `NUM_THREADS`.
232+
Please see `Makefile.rule` for more details.
233+
234+
### CPU architecture related options
235+
236+
OpenBLAS contains a lot of CPU architecture-specific optimizations, hence when
237+
distributing to a user base with a variety of hardware, it is recommended to
238+
enable CPU architecture runtime detection. This will dynamically select
239+
optimized kernels for individual APIs. To do this, use the `DYNAMIC_ARCH=1`
240+
build option. This is usually done on all common CPU families, except when
241+
there are known issues.
242+
243+
In case the CPU architecture is known (e.g. you're building binaries for macOS
244+
M1 users), it is possible to specify the target architecture directly with the
245+
`TARGET=` build option.
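
The choice between the two options can be sketched as a tiny helper that emits the corresponding Make argument. The function name `openblas_arch_args` is hypothetical, and `HASWELL` is used purely as an example target name; the list of valid targets should be taken from the OpenBLAS sources.

```bash
# Hypothetical helper: choose between runtime CPU detection and a fixed target.
#   $1 = target name (optional); empty means "unknown hardware".
openblas_arch_args() {
    local target="${1:-}"
    if [ -n "$target" ]; then
        # The hardware is known: build for that architecture only.
        printf 'TARGET=%s\n' "$target"
    else
        # Mixed user base: select optimized kernels at runtime.
        printf 'DYNAMIC_ARCH=1\n'
    fi
}

openblas_arch_args            # prints DYNAMIC_ARCH=1
openblas_arch_args HASWELL    # prints TARGET=HASWELL

# In a source checkout this could feed a build, for example:
#   make "$(openblas_arch_args)"
```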
246+
247+
`DYNAMIC_ARCH` and `TARGET` are covered in more detail in the main `README.md`
248+
in this repository.
249+
250+
251+
## Real-world examples
252+
253+
OpenBLAS is likely to be distributed in one of these distribution models:
254+
255+
1. As a standalone package, or multiple packages, in a packaging ecosystem like
256+
a Linux distro, Homebrew, conda-forge or MSYS2.
257+
2. Vendored as part of a larger package, e.g. in Julia, NumPy, SciPy, or R.
258+
3. Locally, e.g. made available as a build on a single HPC cluster.
259+
260+
The guidance on this page is most important for models (1) and (2). These links
261+
to build recipes for a representative selection of packaging systems may be
262+
helpful as a reference:
263+
264+
- [Fedora](https://src.fedoraproject.org/rpms/openblas/blob/rawhide/f/openblas.spec)
265+
- [Debian](https://salsa.debian.org/science-team/openblas/-/blob/master/debian/rules)
266+
- [Homebrew](https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/openblas.rb)
267+
- [MSYS2](https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-openblas/PKGBUILD)
268+
- [conda-forge](https://github.com/conda-forge/openblas-feedstock/blob/main/recipe/build.sh)
269+
- [NumPy/SciPy](https://github.com/MacPython/openblas-libs/blob/main/tools/build_openblas.sh)
270+
- [Nixpkgs](https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/libraries/science/math/openblas/default.nix)
