2nd standby in my setup is continuously marked as unhealthy #1044
krishnakongara123 started this conversation in General
@dimitri - could you please share your insight on this issue?
Environment details
I installed the pg_auto_failover extension 2.1 from source and am running PostgreSQL 16 (community edition).
I have a multi-standby setup:
1 monitor + 1 primary + 2 secondaries.
```
[postgres@xl8xxxxrpgs3 ~]$ pg_autoctl show state
  Name |  Node |          Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
-------+-------+--------------------+----------------+--------------+---------------------+--------------------
node_1 |     1 | 11.254.118.18:1521 |   5: 0/542DE20 |   read-write |             primary |             primary
node_2 |     2 | 11.254.118.19:1521 |   5: 0/5000000 |    read-only |           secondary |           secondary
node_3 |     3 | 11.254.118.24:1521 |   5: 0/5000000 |    read-only |           secondary |           secondary
[postgres@xl8dt360rpgs3 ~]$
```
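In case it helps anyone comparing the nodes, here is a minimal Python sketch that parses the `pg_autoctl show state` table above and computes how far each node's reported LSN is behind the primary. The helper names (`lsn_to_int`, `parse_state`, `lag_behind_primary`) are made up for illustration, and the column layout is assumed to be exactly the one pasted above. Note that both secondaries report the same LSN, so this table alone does not distinguish node_3 from node_2; the value shown by the monitor is also only the last reported position.

```python
# Sketch only: parse the `pg_autoctl show state` text pasted above.
# Helper names are hypothetical; the 7-column layout is an assumption
# taken from this thread's output.

STATE = """\
  Name |  Node |          Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
-------+-------+--------------------+----------------+--------------+---------------------+--------------------
node_1 |     1 | 11.254.118.18:1521 |   5: 0/542DE20 |   read-write |             primary |             primary
node_2 |     2 | 11.254.118.19:1521 |   5: 0/5000000 |    read-only |           secondary |           secondary
node_3 |     3 | 11.254.118.24:1521 |   5: 0/5000000 |    read-only |           secondary |           secondary
"""


def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '0/542DE20' to a byte position."""
    hi, lo = lsn.strip().split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def parse_state(text: str) -> list[dict]:
    """Return one dict per node row, skipping the header and separator."""
    nodes = []
    for line in text.splitlines():
        cols = [c.strip() for c in line.split("|")]
        if len(cols) != 7 or not cols[1].isdigit():
            continue  # header, separator, or shell prompt line
        tli, lsn = cols[3].split(":")
        nodes.append({
            "name": cols[0],
            "host": cols[2],
            "tli": int(tli),
            "lsn": lsn_to_int(lsn),
            "reported": cols[5],
            "assigned": cols[6],
        })
    return nodes


def lag_behind_primary(nodes: list[dict]) -> dict[str, int]:
    """Bytes between the primary's reported LSN and each other node's."""
    primary = next(n for n in nodes if n["reported"] == "primary")
    return {n["name"]: primary["lsn"] - n["lsn"]
            for n in nodes if n is not primary}


if __name__ == "__main__":
    for name, lag in lag_behind_primary(parse_state(STATE)).items():
        print(f"{name}: {lag} bytes behind the primary's reported LSN")
```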
The problem is that my third node, the secondary at 11.254.118.24, keeps going down and coming back up, and I'm not sure why.
This is what I see in the PostgreSQL logs:
```
[postgres@xl8xxxxxrpgs3 log]$ more postgresql-2024-09-11_092605.log
2024-09-11 09:26:05.905 UTC [2368308] LOG: starting PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit
2024-09-11 09:26:05.910 UTC [2368308] LOG: listening on IPv4 address "0.0.0.0", port 1521
2024-09-11 09:26:05.913 UTC [2368308] LOG: could not create IPv6 socket for address "::": Address family not supported by protocol
2024-09-11 09:26:05.918 UTC [2368308] LOG: listening on Unix socket "/tmp/.s.PGSQL.1521"
2024-09-11 09:26:05.940 UTC [2368314] LOG: database system was shut down in recovery at 2024-09-11 09:26:05 UTC
2024-09-11 09:26:05.942 UTC [2368314] LOG: entering standby mode
2024-09-11 09:26:05.955 UTC [2368314] LOG: redo starts at 0/542DD38
2024-09-11 09:26:05.955 UTC [2368314] LOG: consistent recovery state reached at 0/542DE20
2024-09-11 09:26:05.955 UTC [2368314] LOG: invalid record length at 0/542DE20: expected at least 24, got 0
2024-09-11 09:26:05.956 UTC [2368308] LOG: database system is ready to accept read-only connections
2024-09-11 09:26:05.977 UTC [2368315] LOG: started streaming WAL from primary at 0/5000000 on timeline 5
2024-09-11 09:26:15.593 UTC [2368308] LOG: received fast shutdown request
2024-09-11 09:26:15.595 UTC [2368308] LOG: aborting any active transactions
2024-09-11 09:26:15.596 UTC [2368315] FATAL: terminating walreceiver process due to administrator command
2024-09-11 09:26:15.598 UTC [2368312] LOG: shutting down
2024-09-11 09:26:15.632 UTC [2368308] LOG: database system is shut down
[postgres@xl8xxxxrpgs3 log]$ more postgresql-2024-09-11_092615.log
2024-09-11 09:26:15.883 UTC [2368390] LOG: starting PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit
2024-09-11 09:26:15.889 UTC [2368390] LOG: listening on IPv4 address "0.0.0.0", port 1521
2024-09-11 09:26:15.892 UTC [2368390] LOG: could not create IPv6 socket for address "::": Address family not supported by protocol
2024-09-11 09:26:15.895 UTC [2368390] LOG: listening on Unix socket "/tmp/.s.PGSQL.1521"
2024-09-11 09:26:15.904 UTC [2368396] LOG: database system was shut down in recovery at 2024-09-11 09:26:15 UTC
2024-09-11 09:26:15.904 UTC [2368396] LOG: entering standby mode
2024-09-11 09:26:15.934 UTC [2368396] LOG: redo starts at 0/542DD38
2024-09-11 09:26:15.934 UTC [2368396] LOG: consistent recovery state reached at 0/542DE20
2024-09-11 09:26:15.934 UTC [2368396] LOG: invalid record length at 0/542DE20: expected at least 24, got 0
2024-09-11 09:26:15.934 UTC [2368390] LOG: database system is ready to accept read-only connections
2024-09-11 09:26:15.957 UTC [2368397] LOG: started streaming WAL from primary at 0/5000000 on timeline 5
[postgres@xl8dt360rpgs3 log]$
```
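To put a number on the restart cycle, here is a rough Python sketch that reads log lines in the format shown above and measures how long the standby stayed up between "ready to accept read-only connections" and the next "received fast shutdown request". The function names (`parse_log`, `uptimes_before_shutdown`) are hypothetical, and the regex assumes the log line prefix used in this thread (timestamp, `UTC`, PID in brackets, level). On the first log file above the standby was up for just under 10 seconds before being stopped.

```python
# Sketch only: measure how long the standby stays up per restart cycle.
# Function names are hypothetical; the regex assumes the log prefix
# seen in the logs pasted in this thread.
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) UTC \[(\d+)\] (\w+):\s+(.*)$"
)

SAMPLE = """\
2024-09-11 09:26:05.905 UTC [2368308] LOG: starting PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit
2024-09-11 09:26:05.942 UTC [2368314] LOG: entering standby mode
2024-09-11 09:26:05.956 UTC [2368308] LOG: database system is ready to accept read-only connections
2024-09-11 09:26:05.977 UTC [2368315] LOG: started streaming WAL from primary at 0/5000000 on timeline 5
2024-09-11 09:26:15.593 UTC [2368308] LOG: received fast shutdown request
2024-09-11 09:26:15.596 UTC [2368315] FATAL: terminating walreceiver process due to administrator command
2024-09-11 09:26:15.632 UTC [2368308] LOG: database system is shut down
"""


def parse_log(lines):
    """Yield (timestamp, pid, level, message) for each recognised line."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            yield ts, int(m.group(2)), m.group(3), m.group(4)


def uptimes_before_shutdown(lines):
    """Seconds between each 'ready to accept' and the next fast shutdown."""
    ready_at = None
    result = []
    for ts, _pid, _level, msg in parse_log(lines):
        if "ready to accept" in msg:
            ready_at = ts
        elif "received fast shutdown request" in msg and ready_at is not None:
            result.append((ts - ready_at).total_seconds())
            ready_at = None
    return result


if __name__ == "__main__":
    for seconds in uptimes_before_shutdown(SAMPLE.splitlines()):
        print(f"standby was up for {seconds:.3f} s before a fast shutdown")
```

Running this over several consecutive log files (concatenated in order) would show whether the interval is constant, which is what you would expect if something is restarting Postgres on a timer rather than Postgres crashing on its own.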
Questions
This part of the log file suggests that the shutdown is being triggered by pg_auto_failover itself, but why?
2024-09-11 09:26:15.596 UTC [2368315] FATAL: terminating walreceiver process due to administrator command
2024-09-11 09:26:15.598 UTC [2368312] LOG: shutting down
2024-09-11 09:26:15.632 UTC [2368308] LOG: database system is shut down