diff --git a/docs/modules/zookeeper/pages/troubleshooting-guide/index.adoc b/docs/modules/zookeeper/pages/troubleshooting-guide/index.adoc
new file mode 100644
index 00000000..b19b36ab
--- /dev/null
+++ b/docs/modules/zookeeper/pages/troubleshooting-guide/index.adoc
@@ -0,0 +1,3 @@
+= Troubleshooting guide
+
+This section helps you when something isn't working as expected.
diff --git a/docs/modules/zookeeper/pages/troubleshooting-guide/zookeeper-cluster-unhealthy.adoc b/docs/modules/zookeeper/pages/troubleshooting-guide/zookeeper-cluster-unhealthy.adoc
new file mode 100644
index 00000000..360c5aad
--- /dev/null
+++ b/docs/modules/zookeeper/pages/troubleshooting-guide/zookeeper-cluster-unhealthy.adoc
@@ -0,0 +1,29 @@
+= ZooKeeper cluster unhealthy
+
+== Quorum hostname verification failing
+
+In the past we have noticed problems with mutual TLS in quorums, notably with hostname verification.
+We reported the problems upstream in https://issues.apache.org/jira/browse/ZOOKEEPER-4790[ZOOKEEPER-4790].
+
+The exception looks something like this:
+
+[source]
+----
+2024-01-23 07:01:46,432 [myid:] - INFO [ListenerHandler-zk-server-default-0.zka1-zk-server-default.default.svc.cluster.local/100.64.9.69:3888:o.a.z.s.q.QuorumCnxManager$Listener$ListenerHandler@1076] - Received connection request from /100.64.11.99:58368
+2024-01-23 07:01:46,446 [myid:] - ERROR [ListenerHandler-zk-server-default-0.zka1-zk-server-default.default.svc.cluster.local/100.64.9.69:3888:o.a.z.c.ZKTrustManager@161] - Failed to verify host address: 100.64.11.99
+javax.net.ssl.SSLPeerUnverifiedException: Certificate for <100.64.11.99> doesn't match any of the subject alternative names: [zk-server-default.default.svc.cluster.local, zk-server-default-1.zk-server-default.default.svc.cluster.local, 10.8.XXX.XXX, 10.8.XXX.XXX, 10.8.XXX.XXX, 10.XXX.XXX.XXX, 10.8.XXX.XXX, 10.8.XXX.XXX, 10.8.XXX.XXX, 10.XXX.XXX.XXX]
+	at org.apache.zookeeper.common.ZKHostnameVerifier.matchIPAddress(ZKHostnameVerifier.java:197)
+	at org.apache.zookeeper.common.ZKHostnameVerifier.verify(ZKHostnameVerifier.java:165)
+----
+
+If you run into issues with hostname verification, a workaround until the problem is fixed is to turn off hostname verification for the quorum:
+
+[source,yaml]
+----
+servers:
+  configOverrides:
+    zoo.cfg:
+      ssl.quorum.hostnameVerification: "false"
+----
+
+WARNING: This poses a security risk, so we don't disable the check by default. Anyone holding a certificate signed by the CA (even one issued for a totally different host) can impersonate a ZooKeeper server to another ZooKeeper server.
diff --git a/docs/modules/zookeeper/partials/nav.adoc b/docs/modules/zookeeper/partials/nav.adoc
index 5281680f..00bc7d59 100644
--- a/docs/modules/zookeeper/partials/nav.adoc
+++ b/docs/modules/zookeeper/partials/nav.adoc
@@ -18,6 +18,8 @@
 *** xref:zookeeper:usage_guide/operations/pod-placement.adoc[]
 *** xref:zookeeper:usage_guide/operations/pod-disruptions.adoc[]
 *** xref:zookeeper:usage_guide/operations/graceful-shutdown.adoc[]
+* xref:zookeeper:troubleshooting-guide/index.adoc[]
+** xref:zookeeper:troubleshooting-guide/zookeeper-cluster-unhealthy.adoc[]
 * xref:zookeeper:reference/index.adoc[]
 ** xref:zookeeper:reference/crds.adoc[]
 *** {crd-docs}/zookeeper.stackable.tech/zookeepercluster/v1alpha1/[ZookeeperCluster {external-link-icon}^]