Commit e106598

David Wootton authored and Jeff Squyres committed

Move debugging section of Open MPI FAQ to new section in user documentation

Clean up some text in debugging topics; fix review comments.

Signed-off-by: David Wootton <dwootton@us.ibm.com>
Co-authored-by: Jeff Squyres <jsquyres@cisco.com>

1 parent 0119b5a · commit e106598
File tree

12 files changed: +660 −676 lines changed

docs/Makefile.am

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ IMAGE_SOURCE_FILES = \
     $(srcdir)/installing-open-mpi/required-support-libraries-dependency-graph.png

 RST_SOURCE_FILES = \
     $(srcdir)/*.rst \
+    $(srcdir)/app-debug/*.rst \
     $(srcdir)/building-apps/*.rst \
     $(srcdir)/developers/*.rst \
     $(srcdir)/faq/*.rst \

docs/app-debug/debug-options.rst

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
Open MPI Runtime Debugging Options
==================================

Open MPI has a series of MCA parameters for the MPI layer itself that
are designed to help with debugging. These parameters
:ref:`can be set <label-running-setting-mca-param-values>` in the
usual ways. MPI-level MCA parameters can be displayed by invoking the
following command:

.. code-block:: sh

   # Use "--level 9" to see all the MCA parameters
   # (the default is "--level 1"):
   shell$ ompi_info --param mpi all --level 9

Here is a summary of the debugging parameters for the MPI layer:

* ``mpi_param_check``: If set to true (any positive value), and when
  Open MPI is compiled with parameter checking enabled (the default),
  the parameters to each MPI function are passed through a series of
  correctness checks. Problems such as passing illegal values (e.g.,
  NULL or ``MPI_DATATYPE_NULL`` or other "bad" values) will be
  discovered at run time and an MPI exception will be invoked (the
  default of which is to print a short message and abort the entire
  MPI job). If set to false, these checks are disabled, slightly
  increasing performance.

* ``mpi_show_handle_leaks``: If set to true (any positive value),
  Open MPI will display lists of any MPI handles that were not freed
  before :ref:`MPI_Finalize(3) <mpi_finalize>` (e.g., communicators,
  datatypes, requests, etc.).

* ``mpi_no_free_handles``: If set to true (any positive value), do not
  actually free MPI objects when their corresponding MPI "free"
  function is invoked (e.g., do not free communicators when
  :ref:`MPI_Comm_free(3) <mpi_comm_free>` is invoked). This can be
  helpful in tracking down applications that accidentally continue to
  use MPI handles after they have been freed.

* ``mpi_show_mca_params``: If set to true (any positive value), show a
  list of all MCA parameters and their values when MPI is initialized.
  This can be quite helpful for reproducibility of MPI applications.

* ``mpi_show_mca_params_file``: If set to a non-empty value, and if
  the value of ``mpi_show_mca_params`` is true, then output the list
  of MCA parameters to the file named by the value. If this parameter
  is an empty value, the list is sent to ``stderr``.

* ``mpi_abort_delay``: If nonzero, print out an identifying message
  when :ref:`MPI_Abort(3) <mpi_abort>` is invoked showing the hostname
  and PID of the process that invoked :ref:`MPI_Abort(3) <mpi_abort>`,
  and then delay that many seconds before exiting. A negative value
  means to delay indefinitely. This allows a user to manually attach a
  debugger when an error occurs. Remember that the default MPI error
  handler |mdash| ``MPI_ERRORS_ABORT`` |mdash| invokes
  :ref:`MPI_Abort(3) <mpi_abort>`, so this parameter can be useful to
  discover problems identified by ``mpi_param_check``.

* ``mpi_abort_print_stack``: If nonzero, print out a stack trace (on
  supported systems) when :ref:`MPI_Abort(3) <mpi_abort>` is invoked.
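None of these parameters require rebuilding the application. As a sketch (``./my_app`` and the specific values are placeholders), they can be set on the ``mpirun`` command line or exported using Open MPI's ``OMPI_MCA_`` environment-variable convention:

```shell
# Set a debugging parameter on the mpirun command line:
mpirun --mca mpi_show_handle_leaks 1 -n 4 ./my_app

# Or export it as an environment variable before launching,
# e.g., wait 30 seconds after MPI_Abort before exiting:
export OMPI_MCA_mpi_abort_delay=30
mpirun -n 4 ./my_app
```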

docs/app-debug/debug-tools.rst

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
Parallel Debugging Tools
========================

There are two main categories of tools that can aid in parallel
debugging:

* **Debuggers:** Both serial and parallel debuggers are useful. Serial
  debuggers are what most programmers are used to (e.g., the GNU
  debugger, ``gdb``), while parallel debuggers can attach to all the
  individual processes in an MPI job simultaneously, treating the MPI
  application as a single entity. This can be an extremely powerful
  abstraction, allowing the user to control every aspect of the MPI
  job, manually replicate race conditions, etc.

* **Profilers:** Tools that analyze your usage of MPI and display
  statistics and meta information about your application's run. Some
  tools present the information "live" (as it occurs), while others
  collect the information and display it in a post-mortem analysis.

docs/app-debug/index.rst

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
.. Open MPI Application Debugging

Debugging Open MPI Parallel Applications
========================================

Debugging a serial application involves solving problems such as
logic errors, uninitialized variables, storage overlays, and timing
problems.

Debugging a parallel application can be further complicated by
additional problems, such as race conditions and asynchronous events,
as well as the difficulty of understanding multiple application
processes executing simultaneously.

This section of the documentation describes some techniques that can
be useful for parallel debugging, some tools that can help, and some
Open MPI runtime options that can aid debugging.

.. toctree::
   :maxdepth: 1

   debug-tools
   debug-options
   serial-debug
   lost-output
   memchecker
   valgrind
   mpir-tools

docs/app-debug/lost-output.rst

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
Application Output Lost with Abnormal Termination
=================================================

There may be many reasons for application output to be lost when an
application terminates abnormally. The Open MPI Team strongly
encourages the use of tools (such as debuggers) whenever possible.

One of the reasons, however, may come from inside Open MPI itself. If
your application fails due to memory corruption, Open MPI may
subsequently fail to output an error message before terminating. Open
MPI attempts to aggregate error messages from multiple processes so
that unique error messages are shown only once (vs. once for each MPI
process |mdash| which can be unwieldy, especially when running large
MPI jobs).

However, this aggregation process requires allocating memory in the
MPI process when it displays the error message. If the process's
memory is already corrupted, Open MPI's attempt to allocate memory may
fail and the process will simply terminate, possibly silently. When
Open MPI does not attempt to aggregate error messages, most of its
setup work is done when the MPI library is initialized and no memory
is allocated during the "print the error" routine. It therefore almost
always successfully outputs error messages in real time |mdash| but at
the expense that you'll potentially see the same error message for
*each* MPI process that encountered the error.

Hence, the error message aggregation is *usually* a good thing, but
sometimes it can mask a real error. You can disable Open MPI's error
message aggregation with the ``opal_base_help_aggregate`` MCA
parameter. For example:

.. code-block:: sh

   shell$ mpirun --mca opal_base_help_aggregate 0 ...
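The same parameter can also be set through the environment, which is convenient under batch schedulers where editing the ``mpirun`` command line is awkward. A sketch using Open MPI's standard ``OMPI_MCA_`` environment-variable convention (``./my_app`` is a placeholder):

```shell
# Disable help-message aggregation for every subsequent mpirun
# invocation in this shell session:
export OMPI_MCA_opal_base_help_aggregate=0
mpirun ./my_app
```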

docs/app-debug/memchecker.rst

Lines changed: 193 additions & 0 deletions
@@ -0,0 +1,193 @@
Using Memchecker
================

The Memchecker functionality in Open MPI provides MPI semantic
checking for your application (as well as internals of Open MPI), with
the help of memory checking tools such as the ``memcheck`` component
of `the Valgrind suite <https://www.valgrind.org/>`_.

/////////////////////////////////////////////////////////////////////////

Types of Errors Detected by Memchecker
--------------------------------------

Open MPI's Memchecker is based on the ``memcheck`` tool included with
Valgrind, so it inherits all of that tool's capabilities: it checks
all reads and writes of memory, and intercepts calls to
``malloc(3)``/``free(3)`` and C++'s ``new``/``delete`` operators.
Most importantly, Memchecker is able to detect user buffer errors in
both non-blocking and one-sided communications, e.g., reading or
writing to buffers of active non-blocking receive operations and
writing to buffers of active non-blocking send operations.

Here are some example problems that Memchecker can detect:

Accessing a buffer under the control of non-blocking communication:

.. code-block:: c

   int buf;
   MPI_Irecv(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
   // The following line will produce a memchecker warning
   buf = 4711;
   MPI_Wait(&req, &status);

Wrong input parameters, e.g., wrong-sized send buffers:

.. code-block:: c

   char *send_buffer;
   send_buffer = malloc(5);
   memset(send_buffer, 0, 5);
   // The following line will produce a memchecker warning
   MPI_Send(send_buffer, 10, MPI_CHAR, 1, 0, MPI_COMM_WORLD);

Accessing a window in a one-sided communication:

.. code-block:: c

   MPI_Get(A, 10, MPI_INT, 1, 0, 1, MPI_INT, win);
   // The following line will produce a memchecker warning
   A[0] = 4711;
   MPI_Win_fence(0, win);

Uninitialized input buffers:

.. code-block:: c

   char *buffer;
   buffer = malloc(10);
   // The following line will produce a memchecker warning
   MPI_Send(buffer, 10, MPI_INT, 1, 0, MPI_COMM_WORLD);

Usage of the uninitialized ``MPI_ERROR`` field of an ``MPI_Status``
structure (the MPI-1 standard defines the ``MPI_ERROR`` field to be
undefined for single-completion calls such as
:ref:`MPI_Wait(3) <mpi_wait>` or :ref:`MPI_Test(3) <mpi_test>`; see
MPI-1 p. 22):

.. code-block:: c

   MPI_Wait(&request, &status);
   // The following line will produce a memchecker warning
   if (status.MPI_ERROR != MPI_SUCCESS)
       return ERROR;

/////////////////////////////////////////////////////////////////////////

Building Open MPI with Memchecker Support
-----------------------------------------

To use Memchecker, you need Valgrind 3.2.0 or later, and an Open MPI
that was configured with the ``--enable-memchecker`` and
``--enable-debug`` flags.

.. note:: The Memchecker functionality is off by default, because it
   incurs a performance penalty.

When ``--enable-memchecker`` is specified, ``configure`` will check
for a recent enough Valgrind distribution. If one is found, Open MPI
will build Memchecker support.

For example:

.. code-block:: sh

   shell$ ./configure --prefix=/path/to/openmpi --enable-debug \
          --enable-memchecker --with-valgrind=/path/to/valgrind

You can check that Open MPI was built with Memchecker support by using
the :ref:`ompi_info(1) <man1-ompi_info>` command:

.. code-block:: sh

   # The exact version numbers shown may be different for your Open
   # MPI installation
   shell$ ompi_info | grep memchecker
                MCA memchecker: valgrind (MCA v1.0, API v1.0, Component v1.3)

If you do not see the "MCA memchecker: valgrind" line, you probably
did not configure and install Open MPI correctly.

/////////////////////////////////////////////////////////////////////////

Running an Open MPI Application with Memchecker
-----------------------------------------------

After Open MPI has been built and installed with Memchecker support,
simply run your application with Valgrind, e.g.:

.. code-block:: sh

   shell$ mpirun -n 2 valgrind ./my_app

If you enabled Memchecker but you don't want to check the application
at this time, just run your application as usual, e.g.:

.. code-block:: sh

   shell$ mpirun -n 2 ./my_app

/////////////////////////////////////////////////////////////////////////

Application Performance Impacts Using Memchecker
------------------------------------------------

The configure option ``--enable-memchecker`` (together with
``--enable-debug``) *does* cause performance degradation, even when
not running under Valgrind. The following explains the mechanism and
may help in deciding whether to provide a cluster-wide installation
with ``--enable-memchecker``.

There are two cases:

#. If run without Valgrind, the Valgrind ClientRequests (assembler
   instructions added to the normal execution path for checking) do
   not affect overall MPI performance. Valgrind ClientRequests are
   explained in detail `in Valgrind's documentation
   <https://valgrind.org/docs/manual/manual-core-adv.html>`_.
   In the case of x86-64, ClientRequests boil down to the following
   four rotate-left (ROL) and one exchange (XCHG) assembler
   instructions from ``valgrind.h``:

   .. code-block:: c

      #define __SPECIAL_INSTRUCTION_PREAMBLE \
          "rolq $3,  %%rdi ; rolq $13, %%rdi\n\t" \
          "rolq $61, %%rdi ; rolq $51, %%rdi\n\t"

   and

   .. We do not mark the code block below as "c" because the Sphinx C
      syntax highlighter fails to parse it as C and emits a warning.
      So we might as well just leave it as a plain verbatim block
      (i.e., not syntax highlighted).

   .. code-block::

      __asm__ volatile(__SPECIAL_INSTRUCTION_PREAMBLE \
                       /* %RDX = client_request ( %RAX ) */ \
                       "xchgq %%rbx,%%rbx" \
                       : "=d" (_zzq_result) \
                       : "a" (&_zzq_args[0]), "0" (_zzq_default) \
                       : "cc", "memory" \
                       );

   for every single ClientRequest. When not running under Valgrind,
   these ClientRequest instructions do not change the arithmetic
   outcome (rotating a 64-bit register left by a total of 128 bits,
   then exchanging a register with itself), except for the carry flag.

   The first request checks whether we're running under Valgrind. If
   we're not running under Valgrind, subsequent checks (a.k.a.
   ClientRequests) are not performed.

#. If the application is run under Valgrind, performance is naturally
   reduced due to the Valgrind JIT and the checking tool employed. For
   the costs and overheads of Valgrind's Memcheck tool on the SPEC
   2000 benchmark, please see the excellent paper `Valgrind: A
   Framework for Heavyweight Dynamic Binary Instrumentation
   <https://valgrind.org/docs/valgrind2007.pdf>`_. For an evaluation
   of various internal implementation alternatives of shadow memory,
   please see `Building Workload Characterization Tools with Valgrind
   <https://valgrind.org/docs/iiswc2006.pdf>`_.
