@@ -73,146 +73,70 @@ to send bytes across different types underlying networks. The ``tcp``
 ``btl``, for example, sends messages across TCP-based networks; the
 ``ucx`` ``pml`` sends messages across InfiniBand-based networks.

+MCA parameter notes
+-------------------
+
 Each component typically has some tunable parameters that can be
-changed at run-time. Use the ``ompi_info`` command to check a component
-to see what its tunable parameters are. For example:
+changed at run-time. Use the :ref:`ompi_info(1) <man1-ompi_info>`
+command to check a component to see what its tunable parameters are.
+For example:

 .. code-block:: sh

    shell$ ompi_info --param btl tcp

 shows some of the parameters (and default values) for the ``tcp`` ``btl``
-component (use ``--level`` to show *all* the parameters; see below).
-
-Note that ``ompi_info`` only shows a small number a component's MCA
-parameters by default. Each MCA parameter has a "level" value from 1
-to 9, corresponding to the MPI-3 MPI_T tool interface levels. In Open
-MPI, we have interpreted these nine levels as three groups of three:
-
-#. End user / basic
-#. End user / detailed
-#. End user / all
-#. Application tuner / basic
-#. Application tuner / detailed
-#. Application tuner / all
-#. MPI/OpenSHMEM developer / basic
-#. MPI/OpenSHMEM developer / detailed
-#. MPI/OpenSHMEM developer / all
-
-Here's how the three sub-groups are defined:
-
-#. End user: Generally, these are parameters that are required for
-   correctness, meaning that someone may need to set these just to
-   get their MPI/OpenSHMEM application to run correctly.
-#. Application tuner: Generally, these are parameters that can be
-   used to tweak MPI application performance.
-#. MPI/OpenSHMEM developer: Parameters that either don't fit in the
-   other two, or are specifically intended for debugging /
-   development of Open MPI itself.
-
-Each sub-group is broken down into three classifications:
-
-#. Basic: For parameters that everyone in this category will want to
-   see.
-#. Detailed: Parameters that are useful, but you probably won't need
-   to change them often.
-#. All: All other parameters -- probably including some fairly
-   esoteric parameters.
-
-To see *all* available parameters for a given component, specify that
-ompi_info should use level 9:
-
-.. code-block:: sh
-
-   shell$ ompi_info --param btl tcp --level 9
-
-.. error:: TODO The following content seems redundant with the FAQ.
-   Additionally, information about how to set MCA params should be
-   prominently documented somewhere that is easy for users to find --
-   not buried here in the developer's section.
-
-These values can be overridden at run-time in several ways. At
-run-time, the following locations are examined (in order) for new
-values of parameters:
-
-#. ``PREFIX/etc/openmpi-mca-params.conf``:
-   This file is intended to set any system-wide default MCA parameter
-   values -- it will apply, by default, to all users who use this Open
-   MPI installation. The default file that is installed contains many
-   comments explaining its format.
-
-#. ``$HOME/.openmpi/mca-params.conf``:
-   If this file exists, it should be in the same format as
-   ``PREFIX/etc/openmpi-mca-params.conf``. It is intended to provide
-   per-user default parameter values.
-
-#. environment variables of the form ``OMPI_MCA_<name>`` set equal to a
-   ``VALUE``:
-
-   Where ``<name>`` is the name of the parameter. For example, set the
-   variable named ``OMPI_MCA_btl_tcp_frag_size`` to the value 65536
-   (Bourne-style shells):
-
-   .. code-block:: sh
-
-      shell$ OMPI_MCA_btl_tcp_frag_size=65536
-      shell$ export OMPI_MCA_btl_tcp_frag_size
-
-   .. error:: TODO Do we need content here about PMIx and PRTE env vars?
-
-#. the ``mpirun``/``oshrun`` command line: ``--mca NAME VALUE``
-
-   Where ``<name>`` is the name of the parameter. For example:
-
-   .. code-block:: sh
-
-      shell$ mpirun --mca btl_tcp_frag_size 65536 -n 2 hello_world_mpi
-
-   .. error:: TODO Do we need content here about PMIx and PRTE MCA vars
-      and corresponding command line switches?
-
-These locations are checked in order. For example, a parameter value
-passed on the ``mpirun`` command line will override an environment
-variable; an environment variable will override the system-wide
-defaults.
-
-Each component typically activates itself when relevant. For example,
-the usNIC component will detect that usNIC devices are present and
-will automatically be used for MPI communications. The Slurm
-component will automatically detect when running inside a Slurm job
-and activate itself. And so on.
-
-Components can be manually activated or deactivated if necessary, of
-course. The most common components that are manually activated,
-deactivated, or tuned are the ``btl`` components -- components that are
-used for MPI point-to-point communications on many types common
-networks.
-
-For example, to *only* activate the ``tcp`` and ``self`` (process loopback)
-components are used for MPI communications, specify them in a
-comma-delimited list to the ``btl`` MCA parameter:
-
-.. code-block:: sh
-
-   shell$ mpirun --mca btl tcp,self hello_world_mpi
-
-To add shared memory support, add ``sm`` into the command-delimited list
-(list order does not matter):
-
-.. code-block:: sh
-
-   shell$ mpirun --mca btl tcp,sm,self hello_world_mpi
-
-.. note:: There used to be a ``vader`` ``btl`` component for shared
-   memory support; it was renamed to ``sm`` in Open MPI v5.0.0,
-   but the alias ``vader`` still works as well.
-
-To specifically deactivate a specific component, the comma-delimited
-list can be prepended with a ``^`` to negate it:
-
-.. code-block:: sh
-
-   shell$ mpirun --mca btl ^tcp hello_mpi_world
-
-The above command will use any other ``btl`` component other than the
-``tcp`` component.
+component (use ``--all`` or ``--level 9`` to show *all* the parameters).
+
+Note that ``ompi_info`` (without ``--all`` or a specified level) only
+shows a small number of a component's MCA parameters by default. Each
+MCA parameter has a "level" value from 1 to 9, corresponding to the
+MPI-3 MPI_T tool interface levels. :ref:`See the LEVELS section in
+the ompi_info(1) man page <man1-ompi_info-levels>` for an explanation
+of the levels and how they correspond to Open MPI's code.
+
+Here are some rules of thumb to keep in mind when using Open MPI's levels:
+
+* Levels 1-3:
+
+  * These levels should contain only a few MCA parameters.
+  * Generally, only put MCA parameters in these levels that matter to
+    users who just need to *run* Open MPI applications (and don't
+    know/care anything about MPI). Examples (these are not
+    comprehensive):
+
+    * Selection of which network interfaces to use.
+    * Selection of which MCA components to use.
+    * Selective disabling of warning messages (e.g., show warning
+      message XYZ unless a specific MCA parameter is set, which
+      disables showing that warning message).
+    * Enabling additional stderr logging verbosity. This allows a
+      user to run with this logging enabled, and then use that output
+      to get technical assistance.
+
+* Levels 4-6:
+
+  * These levels should contain any other MCA parameters that are
+    useful to expose to end users.
+  * There is an expectation that "power users" will utilize these MCA
+    parameters |mdash| e.g., those who are trying to tune the system
+    and extract more performance.
+  * Here are some examples of MCA parameters suitable for these levels
+    (these are not comprehensive):
+
+    * When you could have hard-coded a constant size of a resource
+      (e.g., a resource pool size or buffer length), make it an MCA
+      parameter instead.
+    * When there are multiple different algorithms available for a
+      particular operation, code them all up and provide an MCA
+      parameter to let the user select between them.
+
+* Levels 7-9:
+
+  * Put any other MCA parameters here.
+  * It's ok for these MCA parameters to be esoteric and only relevant
+    to deep magic / the internals of Open MPI.
+  * There is little expectation of users using these MCA parameters.
+
+See :ref:`this section <label-running-setting-mca-param-values>` for
+details on how to set MCA parameters at run time.
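
The nine-level scheme discussed in this hunk (three groups of three) can be sketched as a small shell helper. This is only an illustration and not part of the change: `mca_level_label` is a hypothetical name that does not exist in Open MPI, and the labels are copied from the numbered list in the removed text.

```sh
#!/bin/sh
# Hypothetical helper (NOT part of Open MPI): map a level from 1 to 9
# to the "group / classification" label used in the removed text.
mca_level_label() {
    level=$1
    case "$level" in
        [1-9]) ;;
        *) echo "invalid level: $level" >&2; return 1 ;;
    esac

    # Integer division selects the group of three ...
    case $(( (level - 1) / 3 )) in
        0) group="End user" ;;
        1) group="Application tuner" ;;
        2) group="MPI/OpenSHMEM developer" ;;
    esac
    # ... and the remainder selects the classification within it.
    case $(( (level - 1) % 3 )) in
        0) class="basic" ;;
        1) class="detailed" ;;
        2) class="all" ;;
    esac

    echo "$group / $class"
}

mca_level_label 5   # prints "Application tuner / detailed"
```

Under this reading, levels 1-3, 4-6, and 7-9 line up with the three bullet groups in the added text, and `--level 9` on `ompi_info` corresponds to the last label, "MPI/OpenSHMEM developer / all".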