Description
Related problem
The current version of the KafkaMirrorMaker2 CRD requires a CR like the following example:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: my-mirror-maker-2
spec:
  version: 4.0.0
  replicas: 1
  connectCluster: "cluster-b"
  clusters:
    - alias: "cluster-a"
      bootstrapServers: cluster-a-kafka-bootstrap:9092
    - alias: "cluster-b"
      bootstrapServers: cluster-b-kafka-bootstrap:9092
      config:
        config.storage.replication.factor: -1
        offset.storage.replication.factor: -1
        status.storage.replication.factor: -1
  mirrors:
    - sourceCluster: "cluster-a"
      targetCluster: "cluster-b"
      sourceConnector:
        tasksMax: 1
        config:
          replication.factor: -1
          offset-syncs.topic.replication.factor: -1
          sync.topic.acls.enabled: "false"
          refresh.topics.interval.seconds: 600
      checkpointConnector:
        tasksMax: 1
        config:
          checkpoints.topic.replication.factor: -1
          sync.group.offsets.enabled: "false"
          refresh.groups.interval.seconds: 600
      topicsPattern: ".*"
      groupsPattern: ".*"
This will deploy a Kafka Connect cluster that stores its data in cluster-b, and deploy the MirrorSourceConnector and MirrorCheckpointConnector to mirror from cluster-a to cluster-b.
Currently the names for the config, offset and status topics are not required, but if the user does want to set them they must be set under the cluster that is listed as the targetCluster in each mirror. Strimzi also enforces that the connectCluster property be set to the same value as the targetCluster of each mirror. This is not strictly required by Kafka but is generally recommended for at least the MirrorSourceConnector and MirrorCheckpointConnector.
Although in theory the API allows the user to specify many different clusters and many different mirroring routes in the same file, in practice, since the target must match connectCluster, all routes must replicate to the same Kafka cluster.
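To make the restriction concrete, here is a sketch of a spec fragment that the shape of the current API appears to allow but that, per the rule described above, would not be accepted. The cluster-c alias and its bootstrap address are made up for illustration, and the exact validation behaviour is an assumption based on the description in this issue.

```yaml
# Fragment of a KafkaMirrorMaker2 spec (illustrative only)
spec:
  connectCluster: "cluster-b"
  clusters:
    - alias: "cluster-a"
      bootstrapServers: cluster-a-kafka-bootstrap:9092
    - alias: "cluster-b"
      bootstrapServers: cluster-b-kafka-bootstrap:9092
    - alias: "cluster-c"   # hypothetical third cluster
      bootstrapServers: cluster-c-kafka-bootstrap:9092
  mirrors:
    # Fine: targetCluster matches connectCluster
    - sourceCluster: "cluster-a"
      targetCluster: "cluster-b"
      topicsPattern: ".*"
    # Not allowed: targetCluster differs from connectCluster
    - sourceCluster: "cluster-a"
      targetCluster: "cluster-c"
      topicsPattern: ".*"
```

So although clusters and mirrors are both lists, every mirror ends up with the same target.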
Based on conversations with Strimzi users, the KafkaMirrorMaker2 CR is not easy to use. The introduction of the v1 API gives us the opportunity to change the KafkaMirrorMaker2 CRD to make it more intuitive.
Suggested solution
I propose the following CR:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: my-mirror-maker-2
spec:
  version: 4.0.0
  replicas: 1
  targetCluster:
    alias: "cluster-c"
    bootstrapServers: cluster-c-kafka-bootstrap:9092
    config:
      config.storage.topic: my-mirror-maker-config
      config.storage.replication.factor: -1
      offset.storage.topic: my-mirror-maker-offset
      offset.storage.replication.factor: -1
      status.storage.topic: my-mirror-maker-status
      status.storage.replication.factor: -1
  sourceClusters:
    - alias: "cluster-a"
      bootstrapServers: cluster-a-kafka-bootstrap:9092
    - alias: "cluster-b"
      bootstrapServers: cluster-b-kafka-bootstrap:9092
  mirrors:
    - sourceCluster: "cluster-a"
      sourceConnector:
        tasksMax: 1
        config:
          replication.factor: -1
          offset-syncs.topic.replication.factor: -1
          sync.topic.acls.enabled: "false"
          refresh.topics.interval.seconds: 600
      checkpointConnector:
        tasksMax: 1
        config:
          checkpoints.topic.replication.factor: -1
          sync.group.offsets.enabled: "false"
          refresh.groups.interval.seconds: 600
      topicsPattern: ".*"
      groupsPattern: ".*"
    - sourceCluster: "cluster-b"
      sourceConnector:
        tasksMax: 1
        config:
          replication.factor: -1
          offset-syncs.topic.replication.factor: -1
          sync.topic.acls.enabled: "false"
          refresh.topics.interval.seconds: 600
      checkpointConnector:
        tasksMax: 1
        config:
          checkpoints.topic.replication.factor: -1
          sync.group.offsets.enabled: "false"
          refresh.groups.interval.seconds: 600
      topicsPattern: ".*"
      groupsPattern: ".*"
So the key changes are:
- The user specifies a single targetCluster in their CR, and this cluster is used for configuring the storage of the underlying Connect cluster
- The user is required to specify the topic names for the targetCluster
- The user specifies a set of sourceClusters rather than generic clusters
- For each mirror the user only specifies the sourceCluster, since the targetCluster is set at the CR level
- As before the connector names are generated as <SOURCE_CLUSTER_ALIAS>-><TARGET_CLUSTER_ALIAS>, e.g. cluster-a->cluster-c
This API better guides users to deploy MirrorMaker2 in the recommended way. If a user wants to deploy a more complex or non-recommended topology they can always use the KafkaConnect and KafkaConnector CRs directly.
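As a rough, untested sketch of that fallback, a MirrorSourceConnector could be declared as a KafkaConnector against an existing KafkaConnect cluster. The resource names (my-connect-cluster, cluster-a-to-cluster-c-source) are hypothetical, and the Connect cluster is assumed to have the strimzi.io/use-connector-resources annotation enabled. The source.cluster.* and target.cluster.* keys are the MirrorMaker 2 connector properties for the aliases and connection details; the byte-array converters are set here on the assumption that records should be copied unchanged.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: cluster-a-to-cluster-c-source   # hypothetical name
  labels:
    # Name of an existing KafkaConnect resource (hypothetical)
    strimzi.io/cluster: my-connect-cluster
spec:
  class: org.apache.kafka.connect.mirror.MirrorSourceConnector
  tasksMax: 1
  config:
    source.cluster.alias: cluster-a
    target.cluster.alias: cluster-c
    source.cluster.bootstrap.servers: cluster-a-kafka-bootstrap:9092
    target.cluster.bootstrap.servers: cluster-c-kafka-bootstrap:9092
    # Copy keys and values as raw bytes (assumption: no conversion wanted)
    key.converter: org.apache.kafka.connect.converters.ByteArrayConverter
    value.converter: org.apache.kafka.connect.converters.ByteArrayConverter
    topics: ".*"
    replication.factor: -1
    offset-syncs.topic.replication.factor: -1
    sync.topic.acls.enabled: "false"
    refresh.topics.interval.seconds: 600
```

With this approach the user sets everything the KafkaMirrorMaker2 CR would otherwise generate, which is the trade-off for the extra flexibility.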
MirrorHeartbeatConnector
The MirrorHeartbeatConnector is confusing to configure using the existing CR. For a MirrorHeartbeatConnector that is related to a MirrorSourceConnector replicating from cluster-a to cluster-c, the underlying Connect cluster should be associated with the source cluster (cluster-a), not the target (cluster-c), since the source cluster is where it produces messages. However, the source.alias and target.alias must be set to cluster-a and cluster-c respectively, otherwise the contents of the messages in the heartbeat topic don't make sense (they include these aliases). With the existing CR this means that when using the MirrorHeartbeatConnector the user must specify connection details for the source cluster under the target cluster alias (see this comment for an example #11695 (comment)).
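The following is one reading of that workaround with the existing CR, written from the description above rather than copied from the linked comment. The resource name is hypothetical and the sketch is untested; the point is only that the cluster-c alias carries cluster-a's bootstrap address.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: my-heartbeat-mirror-maker-2   # hypothetical name
spec:
  version: 4.0.0
  replicas: 1
  connectCluster: "cluster-c"
  clusters:
    - alias: "cluster-a"
      bootstrapServers: cluster-a-kafka-bootstrap:9092
    # The alias says cluster-c, but the connection details point at cluster-a
    - alias: "cluster-c"
      bootstrapServers: cluster-a-kafka-bootstrap:9092
  mirrors:
    - sourceCluster: "cluster-a"
      targetCluster: "cluster-c"
      heartbeatConnector:
        tasksMax: 1
        config:
          heartbeats.topic.replication.factor: -1
```

The mismatch between the alias and the address is what makes this configuration hard to read and easy to get wrong.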
Since the MirrorHeartbeatConnector is not that commonly used and is confusing to configure, we have a few options to make the situation better:
1. Remove it entirely from the KafkaMirrorMaker2 CRD, and instead provide an example file for how to configure it using the standard KafkaConnect and KafkaConnector CRs
2. Provide a new KafkaMirrorMaker2Heartbeat CR that is tailored for the MirrorHeartbeatConnector
3. Update the heartbeatConnector section of the KafkaMirrorMaker2 CR to allow the user to specify the Connect topics and other Connect cluster configs directly there, and have Strimzi create a second Connect cluster connected to the source Kafka cluster if the heartbeatConnector is configured
4. Require the user to specify the Connect topics and other Connect cluster configs for the sourceCluster under the sourceClusters section when the heartbeatConnector is configured
Given that the MirrorHeartbeatConnector is not commonly used, I propose option 1.
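For discussion, the example file for option 1 might look roughly like the sketch below. It is untested, and the resource and topic names (heartbeat-connect-cluster-a, heartbeat-connect-config, and so on) are hypothetical. It assumes the connector uses the target.cluster.* connection settings to create the heartbeats topic, which is why they point at cluster-a while target.cluster.alias stays cluster-c.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: heartbeat-connect-cluster-a   # hypothetical name
  annotations:
    # Manage connectors through KafkaConnector resources
    strimzi.io/use-connector-resources: "true"
spec:
  version: 4.0.0
  replicas: 1
  # Connect cluster attached to the source cluster, where heartbeats are produced
  bootstrapServers: cluster-a-kafka-bootstrap:9092
  config:
    group.id: heartbeat-connect-cluster-a
    config.storage.topic: heartbeat-connect-config
    config.storage.replication.factor: -1
    offset.storage.topic: heartbeat-connect-offset
    offset.storage.replication.factor: -1
    status.storage.topic: heartbeat-connect-status
    status.storage.replication.factor: -1
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: cluster-a-heartbeat   # hypothetical name
  labels:
    # Must match the name of the KafkaConnect resource above
    strimzi.io/cluster: heartbeat-connect-cluster-a
spec:
  class: org.apache.kafka.connect.mirror.MirrorHeartbeatConnector
  tasksMax: 1
  config:
    # Aliases describe the mirroring route and appear in the heartbeat records
    source.cluster.alias: cluster-a
    target.cluster.alias: cluster-c
    # Assumption: used to create the heartbeats topic, so it points at cluster-a
    target.cluster.bootstrap.servers: cluster-a-kafka-bootstrap:9092
    heartbeats.topic.replication.factor: -1
```

Keeping the heartbeat setup in its own pair of resources makes it explicit that the Connect cluster belongs to the source side, which the KafkaMirrorMaker2 CR cannot express cleanly.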
Alternatives
No response
Additional context
No response