
Commit fa7face

Merge pull request #11082 from drwootton/debug_doc_update
Move debugging section of OpenMPI faq to new section in user documentation
2 parents 53acf37 + e106598 commit fa7face

File tree

12 files changed: +660, -676 lines changed


docs/Makefile.am

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ IMAGE_SOURCE_FILES = \
     $(srcdir)/installing-open-mpi/required-support-libraries-dependency-graph.png
 RST_SOURCE_FILES = \
     $(srcdir)/*.rst \
+    $(srcdir)/app-debug/*.rst \
     $(srcdir)/building-apps/*.rst \
     $(srcdir)/developers/*.rst \
     $(srcdir)/faq/*.rst \

docs/app-debug/debug-options.rst

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
Open MPI Runtime Debugging Options
==================================

Open MPI has a series of MCA parameters for the MPI layer
itself that are designed to help with debugging.
These parameters :ref:`can be set <label-running-setting-mca-param-values>`
in the usual ways. MPI-level MCA parameters can be
displayed by invoking the following command:

.. code-block:: sh

   # Use "--level 9" to see all the MCA parameters
   # (the default is "--level 1"):
   shell$ ompi_info --param mpi all --level 9

Here is a summary of the debugging parameters for the MPI layer:

* ``mpi_param_check``: If set to true (any positive value), and when
  Open MPI is compiled with parameter checking enabled (the default),
  the parameters to each MPI function are passed through a series
  of correctness checks. Problems such as passing illegal values
  (e.g., NULL or ``MPI_DATATYPE_NULL`` or other "bad" values) will be
  discovered at run time and an MPI exception will be invoked (the
  default of which is to print a short message and abort the entire
  MPI job). If set to false, these checks are disabled, slightly
  increasing performance.

* ``mpi_show_handle_leaks``: If set to true (any positive value),
  Open MPI will display lists of any MPI handles that were not freed
  before :ref:`MPI_Finalize(3) <mpi_finalize>` (e.g., communicators,
  datatypes, requests, etc.).

* ``mpi_no_free_handles``: If set to true (any positive value), do not
  actually free MPI objects when their corresponding MPI "free"
  function is invoked (e.g., do not free communicators when
  :ref:`MPI_Comm_free(3) <mpi_comm_free>` is invoked). This can be
  helpful in tracking down applications that accidentally continue to
  use MPI handles after they have been freed.

* ``mpi_show_mca_params``: If set to true (any positive value), show a
  list of all MCA parameters and their values when MPI is initialized.
  This can be quite helpful for reproducibility of MPI applications.

* ``mpi_show_mca_params_file``: If set to a non-empty value, and if
  the value of ``mpi_show_mca_params`` is true, then output the list
  of MCA parameters to the filename value. If this parameter is an
  empty value, the list is sent to ``stderr``.

* ``mpi_abort_delay``: If nonzero, print out an identifying message
  when :ref:`MPI_Abort(3) <mpi_abort>` is invoked showing the hostname
  and PID of the process that invoked :ref:`MPI_Abort(3) <mpi_abort>`,
  and then delay that many seconds before exiting. A negative value
  means to delay indefinitely. This allows a user to manually come in
  and attach a debugger when an error occurs. Remember that the
  default MPI error handler |mdash| ``MPI_ERRORS_ABORT`` |mdash|
  invokes :ref:`MPI_Abort(3) <mpi_abort>`, so this parameter can be
  useful to discover problems identified by ``mpi_param_check``.

* ``mpi_abort_print_stack``: If nonzero, print out a stack trace (on
  supported systems) when :ref:`MPI_Abort(3) <mpi_abort>` is invoked.
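
For example, to enable some of these checks for a single run, the
parameters can be passed on the ``mpirun`` command line (a minimal
sketch; the application name ``my_mpi_app`` and the process count are
placeholders):

.. code-block:: sh

   # "1" means "true" for these boolean MCA parameters
   shell$ mpirun --mca mpi_param_check 1 \
          --mca mpi_show_handle_leaks 1 \
          -n 4 ./my_mpi_app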

docs/app-debug/debug-tools.rst

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
Parallel Debugging Tools
========================

There are two main categories of tools that can aid in
parallel debugging:

* **Debuggers:** Both serial and parallel debuggers are useful. Serial
  debuggers are what most programmers are used to (e.g., the GNU
  debugger, ``gdb``), while parallel debuggers can attach to all the
  individual processes in an MPI job simultaneously, treating the MPI
  application as a single entity. This can be an extremely powerful
  abstraction, allowing the user to control every aspect of the MPI
  job, manually replicate race conditions, etc. A small example of
  attaching a serial debugger to a single MPI process is sketched
  after this list.

* **Profilers:** Tools that analyze your usage of MPI and display
  statistics and meta information about your application's run. Some
  tools present the information "live" (as it occurs), while others
  collect the information and display it in a post-mortem analysis.
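
For instance, a simple use of a serial debugger is to attach ``gdb``
to one MPI process that is already running on the local node (a
minimal sketch; the application name ``my_mpi_app`` and the PID are
placeholders):

.. code-block:: sh

   # Start the job, find the PID of one local MPI process, and attach
   shell$ mpirun -n 4 ./my_mpi_app &
   shell$ ps -C my_mpi_app -o pid,args
   shell$ gdb -p <PID of one my_mpi_app process>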

docs/app-debug/index.rst

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
.. Open MPI Application Debugging

Debugging Open MPI Parallel Applications
========================================

Debugging a serial application involves solving problems such as
logic errors, uninitialized variables, storage overlays, and timing
problems.

Debugging a parallel application can be further complicated by
additional problems such as race conditions and asynchronous events,
as well as the need to understand the execution of multiple
application processes running simultaneously.

This section of the documentation describes some techniques that can
be useful for parallel debugging, some tools that can help, and some
Open MPI runtime options that can aid debugging.

.. toctree::
   :maxdepth: 1

   debug-tools
   debug-options
   serial-debug
   lost-output
   memchecker
   valgrind
   mpir-tools

docs/app-debug/lost-output.rst

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
Application Output Lost with Abnormal Termination
=================================================

There may be many reasons for application output to be lost when an
application terminates abnormally. The Open MPI Team strongly
encourages the use of tools (such as debuggers) whenever possible.

One of the reasons, however, may come from inside Open MPI itself. If
your application fails due to memory corruption, Open MPI may
subsequently fail to output an error message before terminating.
Open MPI attempts to aggregate error messages from multiple processes
in an attempt to show unique error messages only once (vs. once for
each MPI process |mdash| which can be unwieldy, especially when
running large MPI jobs).

However, this aggregation process requires allocating memory in the
MPI process when it displays the error message. If the process's
memory is already corrupted, Open MPI's attempt to allocate memory may
fail and the process will simply terminate, possibly silently. When
Open MPI does not attempt to aggregate error messages, most of its
setup work is done when the MPI library is initialized and no memory
is allocated during the "print the error" routine. It therefore
almost always successfully outputs error messages in real time |mdash|
but at the expense that you'll potentially see the same error message
for *each* MPI process that encountered the error.

Hence, the error message aggregation is *usually* a good thing, but
sometimes it can mask a real error. You can disable Open MPI's error
message aggregation with the ``opal_base_help_aggregate`` MCA
parameter. For example:

.. code-block:: sh

   shell$ mpirun --mca opal_base_help_aggregate 0 ...
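
The same parameter can also be set through the environment so that it
applies to every subsequent run in the shell session (a minimal
sketch using Open MPI's standard ``OMPI_MCA_`` environment-variable
prefix; the application name is a placeholder):

.. code-block:: sh

   # Equivalent to passing "--mca opal_base_help_aggregate 0" to mpirun
   shell$ export OMPI_MCA_opal_base_help_aggregate=0
   shell$ mpirun -n 4 ./my_app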

docs/app-debug/memchecker.rst

Lines changed: 193 additions & 0 deletions
@@ -0,0 +1,193 @@
Using Memchecker
================

The Memchecker functionality in Open MPI provides MPI semantic
checking for your application (as well as for the internals of Open
MPI), with the help of memory checking tools such as the ``memcheck``
component of `the Valgrind suite <https://www.valgrind.org/>`_.

/////////////////////////////////////////////////////////////////////////

Types of Errors Detected by Memchecker
--------------------------------------

Open MPI's Memchecker is based on the ``memcheck`` tool included with
Valgrind, so it inherits that tool's capabilities: it checks all reads
and writes of memory, and intercepts calls to
``malloc(3)``/``free(3)`` and C++'s ``new``/``delete`` operators.
Most importantly, Memchecker is able to detect user buffer errors in
both non-blocking and one-sided communications, e.g., reading or
writing to buffers of active non-blocking receive operations and
writing to buffers of active non-blocking send operations.

Here are some example problems that Memchecker can detect:

Accessing a buffer under the control of a non-blocking communication:

.. code-block:: c

   int buf;
   MPI_Irecv(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
   // The following line will produce a memchecker warning
   buf = 4711;
   MPI_Wait(&req, &status);

Wrong input parameters, e.g., wrong-sized send buffers:

.. code-block:: c

   char *send_buffer;
   send_buffer = malloc(5);
   memset(send_buffer, 0, 5);
   // The following line will produce a memchecker warning
   MPI_Send(send_buffer, 10, MPI_CHAR, 1, 0, MPI_COMM_WORLD);

Accessing a window in a one-sided communication:

.. code-block:: c

   MPI_Get(A, 10, MPI_INT, 1, 0, 1, MPI_INT, win);
   // The following line will produce a memchecker warning: the target
   // buffer of the active MPI_Get is accessed before the fence
   A[0] = 4711;
   MPI_Win_fence(0, win);

Uninitialized input buffers:

.. code-block:: c

   char *buffer;
   buffer = malloc(10);
   // The following line will produce a memchecker warning: the buffer
   // contents were never initialized
   MPI_Send(buffer, 10, MPI_CHAR, 1, 0, MPI_COMM_WORLD);

Usage of the uninitialized ``MPI_ERROR`` field of the ``MPI_Status``
structure (the MPI-1 standard defines the ``MPI_ERROR`` field to be
undefined for single-completion calls such as
:ref:`MPI_Wait(3) <mpi_wait>` or :ref:`MPI_Test(3) <mpi_test>`;
see MPI-1 p. 22):

.. code-block:: c

   MPI_Wait(&request, &status);
   // The following line will produce a memchecker warning
   if (status.MPI_ERROR != MPI_SUCCESS)
       return ERROR;

/////////////////////////////////////////////////////////////////////////

Building Open MPI with Memchecker Support
-----------------------------------------

To use Memchecker, you need Valgrind 3.2.0 or later, and an Open MPI
that was configured with the ``--enable-memchecker`` and
``--enable-debug`` flags.

.. note:: The Memchecker functionality is off by default, because it
   incurs a performance penalty.

When ``--enable-memchecker`` is specified, ``configure`` will check
for a recent enough Valgrind distribution. If found, Open MPI will
build Memchecker support.

For example:

.. code-block:: sh

   shell$ ./configure --prefix=/path/to/openmpi --enable-debug \
          --enable-memchecker --with-valgrind=/path/to/valgrind

You can check that Open MPI was built with Memchecker support by using
the :ref:`ompi_info(1) <man1-ompi_info>` command:

.. code-block:: sh

   # The exact version numbers shown may be different for your Open
   # MPI installation
   shell$ ompi_info | grep memchecker
        MCA memchecker: valgrind (MCA v1.0, API v1.0, Component v1.3)

If you do not see the "MCA memchecker: valgrind" line, you probably
did not configure and install Open MPI correctly.

/////////////////////////////////////////////////////////////////////////

Running an Open MPI Application with Memchecker
-----------------------------------------------

After Open MPI has been built and installed with Memchecker support,
simply run your application with Valgrind, e.g.:

.. code-block:: sh

   shell$ mpirun -n 2 valgrind ./my_app

If you enabled Memchecker, but you don't want to check the
application at this time, then just run your application as usual.
E.g.:

.. code-block:: sh

   shell$ mpirun -n 2 ./my_app
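
Note that Valgrind may also report warnings that originate from Open
MPI's own internals rather than from your application. Open MPI
installations typically ship a Valgrind suppression file that can be
passed to Valgrind to silence these (a minimal sketch; the exact path
is an assumption and may differ, so check under your Open MPI
installation prefix):

.. code-block:: sh

   # The suppression file path shown here is hypothetical; adjust it
   # to match your installation prefix
   shell$ mpirun -n 2 valgrind \
          --suppressions=/path/to/openmpi/share/openmpi/openmpi-valgrind.supp \
          ./my_app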

/////////////////////////////////////////////////////////////////////////

Application Performance Impacts Using Memchecker
------------------------------------------------

The configure option ``--enable-memchecker`` (together with
``--enable-debug``) *does* cause performance degradation, even when
not running under Valgrind. The following explains the mechanism and
may help in making the decision whether to provide a cluster-wide
installation with ``--enable-memchecker``.

There are two cases:

#. If run without Valgrind, the Valgrind ClientRequests (assembler
   instructions added to the normal execution path for checking) do
   not affect overall MPI performance. Valgrind ClientRequests are
   explained in detail `in Valgrind's documentation
   <https://valgrind.org/docs/manual/manual-core-adv.html>`_.
   In the case of x86-64, ClientRequests boil down to the following
   four rotate-left (ROL) and one exchange (XCHG) assembler
   instructions from ``valgrind.h``:

   .. code-block:: c

      #define __SPECIAL_INSTRUCTION_PREAMBLE                   \
                       "rolq $3,  %%rdi ; rolq $13, %%rdi\n\t" \
                       "rolq $61, %%rdi ; rolq $51, %%rdi\n\t"

   and

   .. We do not mark the code block below as "c" because the Sphinx C
      syntax highlighter fails to parse it as C and emits a warning.
      So we might as well just leave it as a plain verbatim block
      (i.e., not syntax highlighted).

   .. code-block::

      __asm__ volatile(__SPECIAL_INSTRUCTION_PREAMBLE           \
                       /* %RDX = client_request ( %RAX ) */     \
                       "xchgq %%rbx,%%rbx"                      \
                       : "=d" (_zzq_result)                     \
                       : "a" (&_zzq_args[0]), "0" (_zzq_default) \
                       : "cc", "memory"                         \
                       );

   for every single ClientRequest. When not running under Valgrind,
   these ClientRequest instructions do not change the arithmetic
   outcome (rotating a 64-bit register left by a total of 128 bits and
   exchanging a register with itself), except for the carry flag.

   The first request checks whether we are running under Valgrind.
   If we are not running under Valgrind, subsequent checks (i.e.,
   ClientRequests) are not performed.

#. If the application is run under Valgrind, performance is naturally
   reduced due to the Valgrind JIT and the checking tool employed.
   For the costs and overheads of Valgrind's Memcheck tool on the SPEC
   2000 benchmark, please see the excellent paper
   `Valgrind: A Framework for Heavyweight Dynamic Binary Instrumentation
   <https://valgrind.org/docs/valgrind2007.pdf>`_.
   For an evaluation of various internal implementation alternatives
   of shadow memory, please see
   `Building Workload Characterization Tools with Valgrind
   <https://valgrind.org/docs/iiswc2006.pdf>`_.
