2nd standby in my setup is continuously marked as unhealthy #1044

krishnakongara123 · 2024-09-11T09:32:59Z

Environment details
I installed the pg_auto_failover extension 2.1 from source and am using community PostgreSQL 16.

I have a multi-standby setup:
1 monitor + 1 primary + 2 secondaries.

The problem is that my third node, the secondary at 11.254.118.24, keeps going down and coming back up, and I'm not sure why.
This is what I see in the PostgreSQL logs:

`[postgres@xl8xxxxxrpgs3 log]$ more postgresql-2024-09-11_092605.log
2024-09-11 09:26:05.905 UTC [2368308] LOG: starting PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit
2024-09-11 09:26:05.910 UTC [2368308] LOG: listening on IPv4 address "0.0.0.0", port 1521
2024-09-11 09:26:05.913 UTC [2368308] LOG: could not create IPv6 socket for address "::": Address family not supported by protocol
2024-09-11 09:26:05.918 UTC [2368308] LOG: listening on Unix socket "/tmp/.s.PGSQL.1521"
2024-09-11 09:26:05.940 UTC [2368314] LOG: database system was shut down in recovery at 2024-09-11 09:26:05 UTC
2024-09-11 09:26:05.942 UTC [2368314] LOG: entering standby mode
2024-09-11 09:26:05.955 UTC [2368314] LOG: redo starts at 0/542DD38
2024-09-11 09:26:05.955 UTC [2368314] LOG: consistent recovery state reached at 0/542DE20
2024-09-11 09:26:05.955 UTC [2368314] LOG: invalid record length at 0/542DE20: expected at least 24, got 0
2024-09-11 09:26:05.956 UTC [2368308] LOG: database system is ready to accept read-only connections
2024-09-11 09:26:05.977 UTC [2368315] LOG: started streaming WAL from primary at 0/5000000 on timeline 5
2024-09-11 09:26:15.593 UTC [2368308] LOG: received fast shutdown request
2024-09-11 09:26:15.595 UTC [2368308] LOG: aborting any active transactions
2024-09-11 09:26:15.596 UTC [2368315] FATAL: terminating walreceiver process due to administrator command
2024-09-11 09:26:15.598 UTC [2368312] LOG: shutting down
2024-09-11 09:26:15.632 UTC [2368308] LOG: database system is shut down

[postgres@xl8xxxxrpgs3 log]$ more postgresql-2024-09-11_092615.log
2024-09-11 09:26:15.883 UTC [2368390] LOG: starting PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit
2024-09-11 09:26:15.889 UTC [2368390] LOG: listening on IPv4 address "0.0.0.0", port 1521
2024-09-11 09:26:15.892 UTC [2368390] LOG: could not create IPv6 socket for address "::": Address family not supported by protocol
2024-09-11 09:26:15.895 UTC [2368390] LOG: listening on Unix socket "/tmp/.s.PGSQL.1521"
2024-09-11 09:26:15.904 UTC [2368396] LOG: database system was shut down in recovery at 2024-09-11 09:26:15 UTC
2024-09-11 09:26:15.904 UTC [2368396] LOG: entering standby mode
2024-09-11 09:26:15.934 UTC [2368396] LOG: redo starts at 0/542DD38
2024-09-11 09:26:15.934 UTC [2368396] LOG: consistent recovery state reached at 0/542DE20
2024-09-11 09:26:15.934 UTC [2368396] LOG: invalid record length at 0/542DE20: expected at least 24, got 0
2024-09-11 09:26:15.934 UTC [2368390] LOG: database system is ready to accept read-only connections
2024-09-11 09:26:15.957 UTC [2368397] LOG: started streaming WAL from primary at 0/5000000 on timeline 5
[postgres@xl8dt360rpgs3 log]$
`
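
To quantify the restart cycle from logs like these, here is a minimal Python sketch. It is not part of pg_auto_failover. The helper name `uptime_seconds` and the regular expression are my own assumptions, based only on the log line format shown above. It measures how long each PostgreSQL instance stayed up between its "starting PostgreSQL" line and the next "received fast shutdown request" line.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2024-09-11 09:26:05.905 UTC [2368308] LOG: starting PostgreSQL 16.4 ...
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) UTC \[(\d+)\] (\w+): (.*)$"
)


def uptime_seconds(log_lines):
    """Hypothetical helper: seconds between each server start and the
    following fast shutdown request, in the order they appear."""
    started = None
    uptimes = []
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip shell prompts and anything not in log format
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        message = match.group(4)
        if message.startswith("starting PostgreSQL"):
            started = ts
        elif message.startswith("received fast shutdown request") and started:
            uptimes.append((ts - started).total_seconds())
            started = None
    return uptimes


# Two lines copied from the first log file above.
sample = [
    "2024-09-11 09:26:05.905 UTC [2368308] LOG: starting PostgreSQL 16.4 on "
    "x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 "
    "(Red Hat 8.5.0-22), 64-bit",
    "2024-09-11 09:26:15.593 UTC [2368308] LOG: received fast shutdown request",
]

print(uptime_seconds(sample))
```

For the first log file above this gives roughly 9.7 seconds of uptime before the shutdown request arrives, which matches the pattern of a new log file appearing about every 10 seconds.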

Questions

Where does pg_auto_failover write its log file on a Linux server? Is there a parameter I can set to have it written to a specific location?
Although I have set log_rotation_age = 1d and log_rotation_size = 10MB in both postgresql.conf and postgresql-auto-failover.conf, why is a new log file created after only about 4 kB?
In which configuration file should I set these parameters: postgresql.conf or postgresql-auto-failover.conf?
Finally, how can I identify the culprit?
The following lines in the log file suggest that pg_auto_failover itself is issuing the shutdown, but why?

2024-09-11 09:26:15.596 UTC [2368315] FATAL: terminating walreceiver process due to administrator command
2024-09-11 09:26:15.598 UTC [2368312] LOG: shutting down
2024-09-11 09:26:15.632 UTC [2368308] LOG: database system is shut down

krishnakongara123 · 2024-09-12T05:25:20Z

@dimitri, could you please share your insight on this issue?
