v5.0.x: Fail configure if external hwloc >= v3.0.0 is found #11787

Merged
29 changes: 27 additions & 2 deletions config/opal_config_hwloc.m4
@@ -96,7 +96,7 @@ dnl
dnl only safe to call from OPAL_CONFIG_HWLOC, assumes variables from
dnl there are set.
AC_DEFUN([_OPAL_CONFIG_HWLOC_EXTERNAL], [
OPAL_VAR_SCOPE_PUSH([opal_hwloc_min_num_version opal_hwloc_min_version opal_hwloc_CPPFLAGS_save opal_hwloc_LDFLAGS_save opal_hwloc_LIBS_save opal_hwloc_external_support])
OPAL_VAR_SCOPE_PUSH([opal_hwloc_min_num_version opal_hwloc_min_version opal_hwloc_max_num_version opal_hwloc_CPPFLAGS_save opal_hwloc_LDFLAGS_save opal_hwloc_LIBS_save opal_hwloc_external_support])

OAC_CHECK_PACKAGE([hwloc],
[opal_hwloc],
@@ -118,7 +118,7 @@ AC_DEFUN([_OPAL_CONFIG_HWLOC_EXTERNAL], [
opal_hwloc_min_num_version=OMPI_HWLOC_NUMERIC_MIN_VERSION
opal_hwloc_min_version=OMPI_HWLOC_NUMERIC_MIN_VERSION
AS_IF([test "$opal_hwloc_external_support" = "yes"],
[AC_MSG_CHECKING([if external hwloc version is OMPI_HWLOC_MIN_VERSION or greater])
[AC_MSG_CHECKING([if external hwloc version is version OMPI_HWLOC_MIN_VERSION or greater])
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[#include <hwloc.h>
]], [[
#if HWLOC_API_VERSION < $opal_hwloc_min_num_version
@@ -130,6 +130,31 @@ AC_DEFUN([_OPAL_CONFIG_HWLOC_EXTERNAL], [
AC_MSG_WARN([external hwloc version is too old (OMPI_HWLOC_MIN_VERSION or later required)])
opal_hwloc_external_support="no"])])

# Ensure that we are not using Hwloc >= v3.x. Open MPI does not
# (yet) support Hwloc >= v3.x (which will potentially have ABI and
# API breakage compared to <= v2.x), and using it would lead to
# complicated failure cases. Hence, we just abort outright if we
# find an external Hwloc >= v3.x.
AS_IF([test "$opal_hwloc_external_support" = "yes"],
[AC_MSG_CHECKING([if external hwloc version is less than version 3.0.0])
opal_hwloc_max_num_version=0x00030000
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[#include <hwloc.h>
]], [[
#if HWLOC_API_VERSION >= $opal_hwloc_max_num_version
#error "hwloc API version is >= $opal_hwloc_max_num_version"
#endif
]])],
[AC_MSG_RESULT([yes])],
[AC_MSG_RESULT([no])
AC_MSG_WARN([External hwloc version is too new (less than v3.0.0 is required)])
dnl Yes, the URL below will be wrong for master
dnl builds. But this is "good enough" -- we're
dnl more concerned about getting the URL correct
dnl for end-user builds of official release Open
dnl MPI distribution tarballs.
AC_MSG_WARN([See https://docs.open-mpi.org/en/v$OMPI_MAJOR_VERSION.$OMPI_MINOR_VERSION.x/installing-open-mpi/required-support-libraries.html for more details])
AC_MSG_ERROR([Cannot continue])])])

AS_IF([test "$opal_hwloc_external_support" = "yes"],
[AC_CHECK_DECLS([HWLOC_OBJ_OSDEV_COPROC], [], [], [#include <hwloc.h>
])
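The two compile tests in this macro amount to a half-open range check on `HWLOC_API_VERSION`, which hwloc encodes as `(X << 16) + (Y << 8) + Z` for release X.Y.Z. The following is a minimal, self-contained sketch of that logic, not the code `configure` actually generates. All `example_`/`EXAMPLE_` names are hypothetical. The lower bound of 1.11.0 is an assumption standing in for `OMPI_HWLOC_NUMERIC_MIN_VERSION`, whose value does not appear in this diff.

```c
#include <assert.h>
#include <stdbool.h>

/* hwloc encodes release X.Y.Z as (X << 16) + (Y << 8) + Z. */
#define EXAMPLE_VERSION_NUM(major, minor, release) \
    (((unsigned)(major) << 16) + ((unsigned)(minor) << 8) + (unsigned)(release))

/* Exclusive upper bound, taken from the diff: 0x00030000 (v3.0.0). */
#define EXAMPLE_MAX_NUM_VERSION 0x00030000u

/* ASSUMPTION: stand-in for OMPI_HWLOC_NUMERIC_MIN_VERSION, which is
 * defined elsewhere in the Open MPI build system. */
#define EXAMPLE_MIN_NUM_VERSION EXAMPLE_VERSION_NUM(1, 11, 0)

/* Compile-time sanity check that the encoding matches the constant in the diff. */
static_assert(EXAMPLE_VERSION_NUM(3, 0, 0) == EXAMPLE_MAX_NUM_VERSION,
              "3.0.0 must encode to 0x00030000");

/* Decode the three components of an encoded API version. */
unsigned example_version_major(unsigned v)   { return v >> 16; }
unsigned example_version_minor(unsigned v)   { return (v >> 8) & 0xffu; }
unsigned example_version_release(unsigned v) { return v & 0xffu; }

/* True when min <= api_version < max: the first configure test rejects
 * versions below the minimum, the second rejects versions >= 3.0.0. */
bool example_version_supported(unsigned api_version)
{
    return api_version >= EXAMPLE_MIN_NUM_VERSION &&
           api_version < EXAMPLE_MAX_NUM_VERSION;
}
```

With these definitions, a v2.8.0 header (`0x00020800`) passes both tests, while `0x00030000` fails the second one, which is the case where `configure` now aborts.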
29 changes: 29 additions & 0 deletions docs/installing-open-mpi/required-support-libraries.rst
@@ -47,6 +47,35 @@ process. More on this below.
.. note:: The versions listed in this table are the *minimum* versions needed. In general, the Open MPI community recommends using more recent versions of both the :ref:`required support libraries <label-install-required-support-libraries>` and any other optional support libraries, because more recent versions typically include bug fixes, some of which affect Open MPI functionality. As a specific example, there is a known issue with `Hardware Locality <https://www.open-mpi.org/projects/hwloc/>`_ releases older than v2.8.0 on systems with Intel Ponte Vecchio accelerators. If you run Open MPI on such systems, you need to use Hwloc v2.8.0 or newer, or you will experience undefined behavior.
This effect is not unique to the Hardware Locality library, which is why the Open MPI community recommends using the most recent available versions of all support libraries.

.. danger:: As of |ompi_ver|, Open MPI does not yet support the
Hwloc v3.x series (which may not even be available at
the time of Open MPI |ompi_ver|'s release). Hwloc v3.x
is anticipated to break API and/or ABI compared to the
Hwloc v2.x series.

Open MPI will refuse to build if it finds an external
Hwloc installation that is >= v3.0.0 on the assumption
that other HPC applications and/or libraries may be
using it. Such a configuration could lead to obscure
and potentially confusing run-time failures of Open MPI
applications.

If Open MPI's ``configure`` script aborts because it
finds an Hwloc installation that is >= v3.0.0, you can
either ensure that Open MPI finds a < v3.0.0 Hwloc
installation (e.g., by changing the order of paths in
``LD_LIBRARY_PATH``), or force the use of Open MPI's
bundled Hwloc via:

.. code::

shell$ ./configure --with-hwloc=internal ...

Regardless, *it is critically important* that if an MPI
application |mdash| or any of its dependencies |mdash|
uses Hwloc, it uses the *same* Hwloc with which Open MPI
was compiled.

Library dependencies
--------------------

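The closing requirement in the danger block (an application and Open MPI must use the *same* Hwloc) can be partly guarded at run time. hwloc provides `hwloc_get_api_version()`, and its value can be compared against the `HWLOC_API_VERSION` seen at compile time. The sketch below isolates only the comparison so that it compiles without hwloc installed. `example_same_hwloc_major` is a hypothetical helper, not part of hwloc or Open MPI, and it detects only a major-version mismatch, not two different installations of the same series.

```c
#include <assert.h>
#include <stdbool.h>

/* The version encoding (X << 16) + (Y << 8) + Z needs at least 24 bits. */
static_assert(sizeof(unsigned) >= 4, "unsigned must hold an encoded hwloc version");

/* Hypothetical helper: returns true when the headers a program was built
 * against and the library loaded at run time share a major version.
 * In a real program the arguments would be HWLOC_API_VERSION and the
 * return value of hwloc_get_api_version(). */
bool example_same_hwloc_major(unsigned build_api_version,
                              unsigned runtime_api_version)
{
    return (build_api_version >> 16) == (runtime_api_version >> 16);
}
```

A program built against v2.8.0 headers but loading a v3.x library would fail this check and could exit with a clear message rather than hitting the obscure run-time failures the documentation warns about.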