Failure to access a ZooKeeper cluster via IP address with TLS #959

@NickLarsenNZ

Description

This is actually a problem caused by the ZooKeeper client (i.e. what zkCli.sh invokes).

Problem

Accessing a cluster whose certificate has a valid SAN entry:

openssl s_client -connect 172.18.0.2:30504 | openssl x509 -noout -text
...
            X509v3 Subject Alternative Name: critical
                IP Address:172.18.0.2

using the ZooKeeper client:

/stackable/zookeeper/bin/zkCli.sh -server 172.18.0.2:30504 ls /

results in a connection failure:

Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching 172-18-0-2.kubernetes.default.svc.cluster.local found.

Steps to reproduce

On a KinD cluster, deploy a ZookeeperCluster with TLS enabled and a listenerClass of external-unstable:

apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperCluster
metadata:
  name: test-zk
spec:
  clusterConfig:
    authentication:
    - authenticationClass: zk-client-auth-tls
    tls:
      quorumSecretClass: tls
      serverSecretClass: zk-client-secret
  image:
    productVersion: 3.9.3
  servers:
    config:
      resources:
        cpu:
          max: 500m
          min: 250m
        memory:
          limit: 512Mi
        storage:
          data:
            capacity: 1Gi
    roleConfig:
      # 👇 see here
      listenerClass: external-unstable
    roleGroups:
      primary:
        replicas: 3

Note

TODO: add a complete minimal example.

FWIW, I launched this with:

scripts/run-tests --test smoke_zookeeper-3.9.3_use-server-tls-true_use-client-auth-tls-true_openshift-false --parallel 1 --skip-delete

And then manually updated the listenerClass on the ZookeeperCluster.

Get the node hostname (in this case, IP) and node port:

kubectl -n kuttl-test-musical-stork get listener test-zk-server -o 'jsonpath={.status.ingressAddresses[0].address}:{.status.nodePorts.zk}'

Shell into the first replica, and run:

export CLIENT_STORE_SECRET="$(< /stackable/rwconfig/zoo.cfg grep "ssl.keyStore.password" | cut -d "=" -f2)"
export CLIENT_JVMFLAGS="
-Dzookeeper.authProvider.x509=org.apache.zookeeper.server.auth.X509AuthenticationProvider
-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
-Dzookeeper.client.secure=true
-Dzookeeper.ssl.keyStore.location=/stackable/server_tls/keystore.p12
-Dzookeeper.ssl.keyStore.password=${CLIENT_STORE_SECRET}
-Dzookeeper.ssl.trustStore.location=/stackable/server_tls/truststore.p12
-Dzookeeper.ssl.trustStore.password=${CLIENT_STORE_SECRET}"

and then

# replace the IP and port with what was returned in the earlier kubectl command
/stackable/zookeeper/bin/zkCli.sh -server 172.18.0.2:30504 ls /

The client fails to connect because the name it verifies (the reverse DNS name of the IP) does not match any SAN entry in the server certificate.

Explanation

The ZooKeeper client performs a reverse DNS lookup on the IP address given on the command line and then uses the resulting hostname to connect to ZooKeeper. That hostname is not in the certificate's SAN entries (this is expected), so hostname verification fails.

@nightkr: in this case it seems to come up because it's running on the same control plane node as the apiserver, but pretty sure any hostNetworking pod that uses a service in the same way would trigger the same bug.
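
The mismatch described above can be sketched in shell. This is only an illustration, not how the ZooKeeper client checks names: `san_contains` is a hypothetical helper that does an exact string comparison against SAN text as printed by `openssl x509 -noout -text`, and it ignores wildcards and the other rules of real hostname verification.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZooKeeper or OpenSSL):
# succeeds if the identity appears as an exact IP or DNS SAN entry.
san_contains() {
  san_text=$1   # e.g. "IP Address:172.18.0.2, DNS:example.internal"
  identity=$2   # the name the client tries to verify
  printf '%s\n' "$san_text" \
    | tr ',' '\n' \
    | sed 's/^ *//' \
    | grep -qxF -e "IP Address:$identity" -e "DNS:$identity"
}

# SAN entry taken from the certificate shown in this issue.
sans='IP Address:172.18.0.2'

# The IP passed to zkCli.sh versus the reverse DNS name from the error message.
for identity in 172.18.0.2 172-18-0-2.kubernetes.default.svc.cluster.local; do
  if san_contains "$sans" "$identity"; then
    echo "$identity: covered by a SAN entry"
  else
    echo "$identity: NOT covered by a SAN entry"
  fi
done
```

On hosts with glibc, `getent hosts 172.18.0.2` is one way to see which name a reverse lookup returns for the address, assuming a reverse record exists.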

Considerations:

  • Document that ZK cannot be exposed when the Listener reports back an IP address instead of hostname.
  • Fix the ZK Client upstream.
  • Add reverse DNS entries to the TLS certificate SANs. Not recommended: the reverse DNS record is not a reliable identifier to base trust on.
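
For the first consideration, a check along these lines could flag Listener endpoints that are IP literals before they are handed to the ZooKeeper client. It is a rough sketch: `is_ipv4_endpoint` is a hypothetical helper, it only recognizes dotted-quad IPv4 (no IPv6, no octet range validation), and it expects the `address:port` form produced by the kubectl command in the reproduction steps.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the host part of "host:port"
# looks like an IPv4 literal.
is_ipv4_endpoint() {
  host=${1%:*}   # strip the trailing ":port"
  printf '%s\n' "$host" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

endpoint='172.18.0.2:30504'   # value taken from this issue
if is_ipv4_endpoint "$endpoint"; then
  echo "warning: $endpoint is an IP address; the ZooKeeper client may verify a reverse DNS name instead"
fi
```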
