Commit 2c865d9

Merge pull request #10996 from jjhursey/doc-rankfile
doc/mpirun: Fixup the rankfile documentation
2 parents af58732 + 325fd22 commit 2c865d9

3 files changed: 12 additions, 33 deletions

config/opal_setup_sphinx.m4

Lines changed: 2 additions & 2 deletions
@@ -69,8 +69,8 @@ AC_DEFUN([OPAL_SETUP_SPHINX],[
 [AC_MSG_WARN([*** You will not have documentation installed.])
 AC_MSG_WARN([*** See the following URL for more information:])
 dnl Note that we have to double escape the string below
-dnl so that the # it contains coesn't confuse the Autotools
-AC_MSG_WARN([[*** https://ompi.readthedocs.io/en/latest/developers/prerequisites.html#sphinx]])
+dnl so that the # it contains doesn't confuse the Autotools
+AC_MSG_WARN([[*** https://docs.open-mpi.org/en/main/developers/prerequisites.html#sphinx-and-therefore-python]])
 ])

 # If --enable-sphinx was specified and we did not find Sphinx,

docs/man-openmpi/man1/mpirun.1.rst

Lines changed: 9 additions & 31 deletions
@@ -357,6 +357,7 @@ For process binding:
 For rankfiles:

 * ``--rankfile <rankfile>``: Provide a rankfile file.
+(deprecated in favor of ``--map-by rankfile:file=FILE``)

 To manage standard I/O:

@@ -1106,41 +1107,18 @@ For example:
 shell$ cat myrankfile
 rank 0=aa slot=1:0-2
 rank 1=bb slot=0:0,1
-rank 2=cc slot=1-2
-shell$ mpirun -H aa,bb,cc,dd -rf myrankfile ./a.out
+rank 2=cc slot=2-3
+shell$ mpirun -H aa,bb,cc,dd --map-by rankfile:file=myrankfile ./a.out

 Means that:

 * Rank 0 runs on node aa, bound to logical socket 1, cores 0-2.
 * Rank 1 runs on node bb, bound to logical socket 0, cores 0 and 1.
-* Rank 2 runs on node cc, bound to logical cores 1 and 2.
+* Rank 2 runs on node cc, bound to logical cores 2 and 3.

-Rankfiles can alternatively be used to specify physical processor
-locations. In this case, the syntax is somewhat different. Sockets are
-no longer recognized, and the slot number given must be the number of
-the physical PU as most OS's do not assign a unique physical
-identifier to each core in the node. Thus, a proper physical rankfile
-looks something like the following:
+Note that only logical processor locations are supported. By default, the values specified are assumed to be cores. If you intend to specify hardware threads, then you must add the ``:hwtcpus`` qualifier to the ``--map-by`` command line option (e.g., ``--map-by rankfile:file=myrankfile:hwtcpus``).

-.. code::
-
-shell$ cat myphysicalrankfile
-rank 0=aa slot=1
-rank 1=bb slot=8
-rank 2=cc slot=6
-
-This means that
-
-* Rank 0 will run on node aa, bound to the core that contains physical
-PU 1
-* Rank 1 will run on node bb, bound to the core that contains physical
-PU 8
-* Rank 2 will run on node cc, bound to the core that contains physical
-PU 6
-
-Rankfiles are treated as logical by default, and the MCA parameter
-``rmaps_rank_file_physical`` must be set to 1 to indicate that the
-rankfile is to be considered as physical.
+If the binding specifications of any two ranks overlap, an error occurs. If you intend to allow processes to share the same logical processing unit, then you must pass the ``--bind-to :overload-allowed`` command line option to tell the runtime to ignore this check.

 The hostnames listed above are "absolute," meaning that actual
 resolveable hostnames are specified. However, hostnames can also be
@@ -1157,12 +1135,12 @@ hostnames, indexed from 0. For example:
 shell$ cat myrankfile
 rank 0=+n0 slot=1:0-2
 rank 1=+n1 slot=0:0,1
-rank 2=+n2 slot=1-2
-shell$ mpirun -H aa,bb,cc,dd -rf myrankfile ./a.out
+rank 2=+n2 slot=2-3
+shell$ mpirun -H aa,bb,cc,dd --map-by rankfile:file=myrankfile ./a.out

 All socket/core slot locations are specified as logical indexes.

-.. note:: The Open MPI v1.6 series used physical indexes.
+.. note:: The Open MPI v1.6 series used physical indexes. Starting in Open MPI v5.0, only logical indexes are supported and the ``rmaps_rank_file_physical`` MCA parameter is no longer recognized.

 You can use tools such as Hwloc's `lstopo(1)` to find the logical
 indexes of socket and cores.
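
To make the rankfile syntax and the overlap rule in the changes above concrete, here is a minimal Python sketch that parses lines of the form ``rank N=HOST slot=SPEC`` and reports pairs of ranks whose slot specifications intersect on the same host. It is an illustration only, not Open MPI's parser: the function names (`expand`, `parse_rankfile`, `overlapping_ranks`) are hypothetical, and the overlap test is a simplification that compares only entries written in the same form (``socket:cores`` against ``socket:cores``, or bare cores against bare cores), because mapping one form onto the other would require the node's topology.

```python
import re
from itertools import combinations

# Matches lines such as "rank 0=aa slot=1:0-2" (hypothetical helper, not Open MPI code).
LINE_RE = re.compile(r"^rank\s+(\d+)\s*=\s*(\S+)\s+slot\s*=\s*(\S+)$")


def expand(spec):
    """Expand a core list such as '0-2' or '0,1' into a set of integers."""
    cores = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        cores.update(range(int(lo), int(hi or lo) + 1))
    return cores


def parse_rankfile(text):
    """Return {rank: (host, {(socket_or_None, core), ...})} for each rankfile line."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized rankfile line: {raw!r}")
        rank, host, slot = int(match.group(1)), match.group(2), match.group(3)
        socket, sep, cores = slot.partition(":")
        if sep:
            # "socket:cores" form, e.g. "1:0-2"
            units = {(int(socket), core) for core in expand(cores)}
        else:
            # bare core list, e.g. "2-3"; no socket given
            units = {(None, core) for core in expand(slot)}
        entries[rank] = (host, units)
    return entries


def overlapping_ranks(entries):
    """List rank pairs on the same host whose slot sets intersect (simplified check)."""
    clashes = []
    for (r1, (h1, u1)), (r2, (h2, u2)) in combinations(sorted(entries.items()), 2):
        if h1 == h2 and u1 & u2:
            clashes.append((r1, r2))
    return clashes
```

Applied to the three-line ``myrankfile`` from the example, `overlapping_ranks` finds no clashes because each rank is on a different host. Two ranks placed on the same host with intersecting core lists would be reported, which is the situation the runtime rejects unless ``--bind-to :overload-allowed`` is passed.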

docs/news/news-v5.0.x.rst

Lines changed: 1 addition & 0 deletions
@@ -153,6 +153,7 @@ Open MPI version 5.0.0rc9
 - ompi/contrib: Removed ``libompitrace``.
 This library was incomplete and unmaintained. If needed, it
 is available in the v4/v4.1 series.
+- The rankfile format no longer supports physical processor locations. Only logical processor locations are supported.

 - HWLOC updates:
