@@ -36,24 +36,33 @@ places. If you have:
   there.
 
 .. note:: Because of spam, only subscribers to the mailing list are
-   allowed to post to the mailing list. Specifically: you must
-   subscribe to the mailing list before posting.
+   allowed to post to the mailing list. Specifically: **you must
+   subscribe to the mailing list before posting.**
 
-* If you have a run-time question or problem, see the :ref:`For
-  run-time problems <getting-help-run-time-label>` section below for
-  the content of what to include in your email.
 * If you have a compile-time question or problem, see the :ref:`For
-  compile-time problems <getting-help-compile-time-label>` section
+  problems building or installing Open MPI
+  <getting-help-compile-time-label>` section below for the content
+  of what to include in your email.
+
+* If you have problems launching your MPI or OpenSHMEM application
+  successfully, see the :ref:`For problems launching MPI or
+  OpenSHMEM applications <getting-help-launching-label>` section
+  below for the content of what to include in your email.
+
+* If you have other questions or problems about running your MPI or
+  OpenSHMEM application, see the :ref:`For problems running MPI or
+  OpenSHMEM applications <getting-help-running-label>` section
   below for the content of what to include in your email.
 
-.. note:: The mailing lists have **a 150 KB size limit on
-   messages** (this is a limitation of the mailing list web
-   archives). If attaching your files results in an email larger
-   than this, please try compressing it and/or posting it on the
-   web somewhere for people to download. A `Github Gist
-   <https://gist.github.com/>`_ or a `Pastebin
-   <https://pastebin.com/>`_ might be a good choice for posting
-   large text files.
+.. important:: The more information you include in your report, the
+   better. E-mails/bug reports simply stating, "It doesn't work!"
+   are not helpful; we need to know as much information about your
+   environment as possible in order to provide meaningful
+   assistance.
+
+   **The best way to get help** is to provide a "recipe" for
+   reproducing the problem. This will allow the Open MPI developers
+   to see the error for themselves, and therefore be able to fix it.
 
 .. important:: Please **use a descriptive "subject" line in your
    email!** Some Open MPI question-answering people decide whether
@@ -75,82 +84,152 @@ places. If you have:
   there.
 
 If you're unsure where to send your question, subscribe and send an
-email to the user's mailing list.
+email to the user's mailing list (i.e., option #1, above).
 
-.. _getting-help-run-time-label:
+.. _getting-help-compile-time-label:
 
-For run-time problems
----------------------
+For problems building or installing Open MPI
+--------------------------------------------
 
-Please provide *all* of the following information:
+If you cannot successfully configure, build, or install Open MPI,
+please provide *all* of the following information:
 
-.. important:: The more information you include in your report, the
-   better. E-mails/bug reports simply stating, "It doesn't work!"
-   are not helpful; we need to know as much information about your
-   environment as possible in order to provide meaningful assistance.
+#. The version of Open MPI that you're using.
 
-   **The best way to get help** is to provide a "recipe" for
-   reproducing the problem. This will allow the Open MPI developers
-   to see the error for themselves, and therefore be able to fix it.
+#. The stdout and stderr from running ``configure``.
 
-#. The version of Open MPI that you're using.
+#. All ``config.log`` files from the Open MPI build tree.
+
+#. Output from when you ran ``make V=1 all`` to build Open MPI.
+
+#. Output from when you ran ``make install`` to install Open MPI.
+
+The script below may be helpful to gather much of the above
+information (adjust as necessary for your specific environment):
+
+.. code-block:: bash
+
+   #!/usr/bin/env bash
+
+   set -euxo pipefail
+
+   # Make a directory for the output files
+   dir="`pwd`/ompi-output"
+   mkdir $dir
+
+   # Fill in the options you want to pass to configure here
+   options=""
+   ./configure $options 2>&1 | tee $dir/config.out
+   tar -cf - `find . -name config.log` | tar -x -C $dir -
+
+   # Build and install Open MPI
+   make V=1 all 2>&1 | tee $dir/make.out
+   make install 2>&1 | tee $dir/make-install.out
+
+   # Bundle up all of these files into a tarball
+   filename="ompi-output.tar.bz2"
+   tar -jcf $filename `basename $dir`
+   echo "Tarball $filename created"
+
+Then attach the resulting ``ompi-output.tar.bz2`` file to your report.
+
+.. caution:: The mailing lists have **a 150 KB size limit on
+   messages** (this is a limitation of the mailing list web archives).
+   If attaching the tarball makes your message larger than 150 KB, you
+   may need to post the tarball elsewhere and include a link to that
+   tarball in your mail to the list.
 
-#. The ``config.log`` file from the top-level Open MPI directory, if
-   available (**compress or post to a Github gist or Pastebin**).
+.. _getting-help-launching-label:
+
+For problems launching MPI or OpenSHMEM applications
+----------------------------------------------------
+
+If you cannot successfully launch simple applications across multiple
+nodes (e.g., the non-MPI ``hostname`` command, or the MPI "hello world"
+or "ring" sample applications in the ``examples/`` directory), please
+provide *all* of the information from the :ref:`For problems building
+or installing Open MPI <getting-help-compile-time-label>` section, and
+*all* of the following additional information:
 
 #. The output of the ``ompi_info --all`` command from the node where
-   you're invoking ``mpirun``.
-
-#. If you have questions or problems about process affinity /
-   binding, send the output from running the ``lstopo -v``
-   command from a recent version of `Hwloc
-   <https://www.open-mpi.org/projects/hwloc/>`_. *The detailed
-   text output is preferable to a graphical output.*
-
-#. If running on more than one node |mdash| especially if you're
-   having problems launching Open MPI processes |mdash| also include
-   the output of the ``ompi_info --version`` command **from each node
-   on which you're trying to run**.
-
-   #. If you are able to launch MPI processes, you can use
-      ``mpirun`` to gather this information. For example, if
-      the file ``my_hostfile.txt`` contains the hostnames of the
-      machines on which you are trying to run Open MPI
-      processes::
-
-         shell$ mpirun --map-by node --hostfile my_hostfile.txt --output tag ompi_info --version
-
-
-   #. If you cannot launch MPI processes, use some other mechanism
-      |mdash| such as ``ssh`` |mdash| to gather this information. For
-      example, if the file ``my_hostfile.txt`` contains the hostnames
-      of the machines on which you are trying to run Open MPI
-      processes:
-
-      .. code-block:: sh
-
-         # Bourne-style shell (e.g., bash, zsh, sh)
-         shell$ for h in `cat my_hostfile.txt`
-         > do
-         > echo "=== Hostname: $h"
-         > ssh $h ompi_info --version
-         > done
-
-      .. code-block:: sh
-
-         # C-style shell (e.g., csh, tcsh)
-         shell% foreach h (`cat my_hostfile.txt`)
-         foreach? echo "=== Hostname: $h"
-         foreach? ssh $h ompi_info --version
-         foreach? end
-
-#. A *detailed* description of what is failing. The more
-   details that you provide, the better. E-mails saying "My
-   application doesn't work!" will inevitably be answered with
-   requests for more information about *exactly what doesn't
-   work*; so please include as much information detailed in your
-   initial e-mail as possible. We strongly recommend that you
-   include the following information:
+   you are invoking :ref:`mpirun(1) <man1-mpirun>`.
+
+#. If you have questions or problems about process mapping or binding,
+   send the output from running the ``lstopo -v`` and ``lstopo --of
+   xml`` commands from a recent version of `Hwloc
+   <https://www.open-mpi.org/projects/hwloc/>`_.
+
+#. If running on more than one node, also include the output of the
+   ``ompi_info --version`` command **from each node on which you are
+   trying to run**.
+
+#. The output of running ``mpirun --map-by ppr:1:node --prtemca
+   plm_base_verbose 100 --prtemca rmaps_base_verbose 100 --display
+   alloc hostname``. Add in a ``--hostfile`` argument if needed for
+   your environment.
+
+The script below may be helpful to gather much of the above
+information (adjust as necessary for your specific environment).
+
+.. note:: It is safe to run this script after running the script from
+   the :ref:`building and installing
+   <getting-help-compile-time-label>` section.
+
+.. code-block:: bash
+
+   #!/usr/bin/env bash
+
+   set -euxo pipefail
+
+   # Make a directory for the output files
+   dir="`pwd`/ompi-output"
+   mkdir -p $dir
+
+   # Get installation and system information
+   ompi_info --all 2>&1 | tee $dir/ompi-info-all.out
+   lstopo -v | tee $dir/lstopo-v.txt
+   lstopo --of xml | tee $dir/lstopo.xml
+
+   # Have a text file "my_hostfile.txt" containing the hostnames on
+   # which you are trying to launch
+   for host in `cat my_hostfile.txt`; do
+       ssh $host ompi_info --version 2>&1 | tee $dir/ompi_info-version-$host.out
+       ssh $host lstopo -v | tee $dir/lstopo-v-$host.txt
+       ssh $host lstopo --of xml | tee $dir/lstopo-$host.xml
+   done
+
+   # Have a my_hostfile.txt file if needed for your environment, or
+   # remove the --hostfile argument altogether if not needed.
+   set +e
+   mpirun \
+       --hostfile my_hostfile.txt \
+       --map-by ppr:1:node \
+       --prtemca plm_base_verbose 100 \
+       --prtemca rmaps_base_verbose 100 \
+       --display alloc \
+       hostname 2>&1 | tee $dir/mpirun-hostname.out
+
+   # Bundle up all of these files into a tarball
+   filename="ompi-output.tar.bz2"
+   tar -jcf $filename `basename $dir`
+   echo "Tarball $filename created"
+
+.. _getting-help-running-label:
+
+For problems running MPI or OpenSHMEM applications
+--------------------------------------------------
+
+If you can successfully launch parallel MPI or OpenSHMEM applications,
+but the jobs fail during the run, please provide *all* of the
+information from the :ref:`For problems building or installing Open
+MPI <getting-help-compile-time-label>` section, *all* of the
+information from the :ref:`For problems launching MPI or OpenSHMEM
+applications <getting-help-launching-label>` section, and then *all*
+of the following additional information:
+
+#. A *detailed* description of what is failing. *The more details
+   that you provide, the better.* Please include at least the
+   following information:
 
    * The exact command used to run your application.
 
@@ -164,77 +243,21 @@ Please provide *all* of the following information:
      any required support libraries, such as libraries required
      for high-speed networks such as InfiniBand).
 
-#. Detailed information about your network:
+#. The source code of a short sample program (preferably in C or
+   Fortran) that exhibits the problem.
+
+#. If you are experiencing networking problems, include detailed
+   information about your network.
 
    .. error:: TODO Update link to IB FAQ entry.
 
    #. For RoCE- or InfiniBand-based networks, include the information
      :ref:`in this FAQ entry <faq-ib-troubleshoot-label>`.
 
-   #. For Ethernet-based networks (including RoCE-based networks,
+   #. For Ethernet-based networks (including RoCE-based networks),
      include the output of the ``ip addr`` command (or the legacy
      ``ifconfig`` command) on all relevant nodes.
 
      .. note:: Some Linux distributions do not put ``ip`` or
         ``ifconfig`` in the default ``PATH`` of normal users.
         Try looking for it in ``/sbin`` or ``/usr/sbin``.
-
-.. _getting-help-compile-time-label:
-
-For compile problems
---------------------
-
-Please provide *all* of the following information:
-
-.. important:: The more information you include in your report, the
-   better. E-mails/bug reports simply stating, "It doesn't work!"
-   are not helpful; we need to know as much information about your
-   environment as possible in order to provide meaningful assistance.
-
-   **The best way to get help** is to provide a "recipe" for
-   reproducing the problem. This will allow the Open MPI developers
-   to see the error for themselves, and therefore be able to fix it.
-
-#. The version of Open MPI that you're using.
-
-#. All output (both compilation output and run time output, including
-   all error messages).
-
-#. Output from when you ran ``./configure`` to configure Open MPI
-   (**compress or post to a GitHub gist or Pastebin!**).
-
-#. The ``config.log`` file from the top-level Open MPI directory
-   (**compress or post to a GitHub gist or Pastebin!**).
-
-#. Output from when you ran ``make V=1`` to build Open MPI (**compress
-   or post to a GitHub gist or Pastebin!**).
-
-#. Output from when you ran ``make install`` to install Open MPI
-   (**compress or post to a GitHub gist or Pastebin!**).
-
-To capture the output of the configure and make steps, you can use the
-script command or the following technique to capture all the files in
-a unique directory, suitable for tarring and compressing into a single
-file:
-
-.. code-block:: sh
-
-   # Bourne-style shell (e.g., bash, zsh, sh)
-   shell$ mkdir $HOME/ompi-output
-   shell$ ./configure {options} 2>&1 | tee $HOME/ompi-output/config.out
-   shell$ make all 2>&1 | tee $HOME/ompi-output/make.out
-   shell$ make install 2>&1 | tee $HOME/ompi-output/make-install.out
-   shell$ cd $HOME
-   shell$ tar jcvf ompi-output.tar.bz2 ompi-output
-
-.. code-block:: sh
-
-   # C-style shell (e.g., csh, tcsh)
-   shell% mkdir $HOME/ompi-output
-   shell% ./configure {options} |& tee $HOME/ompi-output/config.out
-   shell% make all |& tee $HOME/ompi-output/make.out
-   shell% make install |& tee $HOME/ompi-output/make-install.out
-   shell% cd $HOME
-   shell% tar jcvf ompi-output.tar.bz2 ompi-output
-
-Then attach the resulting ``ompi-output.tar.bz2`` file to your report.
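The revised text asks for ``ip addr``/``ifconfig`` output from every relevant node, but, unlike the build and launch sections, the diff does not include a helper script for that step. A minimal sketch in the same spirit (hypothetical, not part of the diff above; it assumes a ``my_hostfile.txt`` listing one hostname per line and password-less ``ssh`` to each node):

```shell
#!/usr/bin/env bash
# Hypothetical helper: gather per-node network information for an
# Open MPI problem report.  Skips the ssh loop if my_hostfile.txt
# is absent, so it can be run piecemeal.
set -u

# Make a directory for the output files
dir="$PWD/ompi-network-output"
mkdir -p "$dir"

if [ -f my_hostfile.txt ]; then
    while read -r host; do
        # "ip" sometimes lives outside a normal user's PATH; try the
        # sbin directories, then fall back to the legacy ifconfig.
        ssh "$host" 'PATH=$PATH:/sbin:/usr/sbin; ip addr || ifconfig' \
            > "$dir/network-$host.txt" 2>&1
    done < my_hostfile.txt
fi

# Bundle up the results into a tarball for the report
tar -jcf ompi-network-output.tar.bz2 "$(basename "$dir")"
echo "Tarball ompi-network-output.tar.bz2 created"
```

As with the other tarballs, the 150 KB mailing-list limit still applies: if the result is larger, post it elsewhere and include a link in your mail.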