Commit 7f694b9

[doc] Document known MPICH issue about gethostbyname failing (#825)
1 parent 694ea8f commit 7f694b9

2 files changed: 48 additions, 0 deletions

.github/workflows/Documenter.yml

Lines changed: 4 additions & 0 deletions
@@ -26,6 +26,10 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
+      - uses: julia-actions/setup-julia@v1
+        with:
+          version: '1'
+      - uses: julia-actions/cache@v1
       - name: Install dependencies
         shell: julia --color=yes --project=docs/ {0}
         run: |

docs/src/knownissues.md

Lines changed: 44 additions & 0 deletions
@@ -63,6 +63,50 @@ export OMPI_MCA_coll_hcoll_enable="0"
 
 before starting the MPI process.
 
+## MPICH
+
+### `gethostbyname` failure in `internal_Init_thread`
+
+When your internal network stack/route is not correctly configured for the local loopback device, MPICH may fail to initialize with an error message which looks like the following:
+
+```
+Fatal error in internal_Init_thread: Other MPI error, error stack:
+internal_Init_thread(67)...........: MPI_Init_thread(argc=0x0, argv=0x0, required=2, provided=0x16db94160) failed
+MPII_Init_thread(234)..............:
+MPID_Init(67)......................:
+init_world(171)....................: channel initialization failed
+MPIDI_CH3_Init(84).................:
+MPID_nem_init(314).................:
+MPID_nem_tcp_init(175).............:
+MPID_nem_tcp_get_business_card(397):
+GetSockInterfaceAddr(370)..........: gethostbyname failed, bogon (errno 0)
+```
+
+A workaround is provided in the [documentation of the MOOSE framework](https://mooseframework.inl.gov/help/troubleshooting.html) and we report it here for reference:
+
+* obtain your hostname
+  ```console
+  $ hostname
+  mycoolname
+  ```
+* for both Linux and macOS systems, in your `/etc/hosts` file map the hostname you obtained at the previous step to the [localhost address `127.0.0.1`](https://en.wikipedia.org/wiki/Localhost), if not already present.
+  _**Note**_: this step requires root access, to modify the system configuration file `/etc/hosts`, if you don't have it talk to your system administrator.
+  For example, open the file `/etc/hosts` with `sudo` access with your favorite text editor (e.g. `sudo vi /etc/hosts`, or `sudo emacs /etc/hosts`) and add the line
+  ```
+  127.0.0.1 mycoolname
+  ```
+  to the end of the file
+* as an alternative to the previous step, only for macOS systems, run the command
+  ```
+  sudo scutil --set HostName mycoolname
+  ```
+  However it has been reported that this method may not always be effective.
+
+For further information see
+
+- [MPI.jl issue #824](https://github.com/JuliaParallel/MPI.jl/issues/824)
+- [MOOSE discussion #23610](https://github.com/idaholab/moose/discussions/23610)
+
 ## UCX
 
[UCX](https://www.openucx.org/) is a communication framework used by several MPI implementations.
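
The `/etc/hosts` workaround documented in this commit can be sanity-checked before launching MPI. The following is a minimal sketch in Python, not part of the commit or of MPI.jl. The function names `parse_hosts`, `is_mapped_to_loopback`, and `hostname_resolves` are hypothetical helpers introduced here for illustration. Python's `socket.gethostbyname` goes through the system resolver, so it only approximates the lookup MPICH performs in `GetSockInterfaceAddr`; a successful result is a hint that the configuration is sane, not a guarantee that MPICH will initialize.

```python
import socket


def parse_hosts(text):
    """Return {hostname: [addresses]} parsed from the contents of a hosts file."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name, []).append(address)
    return mapping


def is_mapped_to_loopback(hostname, hosts_text):
    """True if the hosts file content maps `hostname` to 127.0.0.1."""
    return "127.0.0.1" in parse_hosts(hosts_text).get(hostname, [])


def hostname_resolves(hostname=None):
    """True if the system resolver can resolve `hostname` (default: this machine's name)."""
    try:
        socket.gethostbyname(hostname or socket.gethostname())
        return True
    except OSError:  # socket.gaierror and socket.herror are subclasses of OSError
        return False
```

As a usage example, `hostname_resolves()` with no argument checks the name reported by `hostname`, and `is_mapped_to_loopback(socket.gethostname(), open("/etc/hosts").read())` reports whether the line suggested in the workaround is already present. If the first check returns `False`, the `gethostbyname failed` error shown in the diff is a plausible outcome.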
