@mxm mxm commented Sep 18, 2025

When using the DynamicKafkaSource, topics can be spread across multiple clusters. This used to work fine, but a regression was introduced that treats partitions with the same topic and partition number on different clusters as identical. As a result, the partition count is underestimated, which limits how far the source operator can scale out.

Here is an example:

"1.Source__Kafka_Source_(testTopic).kafkaCluster.my-cluster-1.KafkaSourceReader.topic.testTopic.partition.0.currentOffset",
"1.Source__Kafka_Source_(testTopic).kafkaCluster.my-cluster-2.KafkaSourceReader.topic.testTopic.partition.0.currentOffset"

These would be treated as one partition, even though they are two partitions from separate Kafka clusters.
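
To illustrate the difference, here is a minimal Python sketch that counts distinct partitions from metric names like the two above. It is not the code changed in this PR; the regular expression, `count_partitions`, and the `include_cluster` flag are hypothetical names introduced only for this example, and the metric name layout is assumed to match the example exactly.

```python
import re

# Assumed layout of the per-partition offset metric names shown above.
# The ".kafkaCluster.<name>" segment is optional so that metric names
# without a cluster segment still parse (cluster is then None).
METRIC_RE = re.compile(
    r"(?:\.kafkaCluster\.(?P<cluster>.+?))?"
    r"\.KafkaSourceReader\.topic\.(?P<topic>.+)"
    r"\.partition\.(?P<partition>\d+)\.currentOffset$"
)


def count_partitions(metric_names, include_cluster=True):
    """Count distinct partitions found in the given metric names."""
    seen = set()
    for name in metric_names:
        match = METRIC_RE.search(name)
        if match is None:
            continue  # not a per-partition offset metric
        key = (match["topic"], int(match["partition"]))
        if include_cluster:
            # Prefix the key with the cluster so equal topic/partition
            # pairs on different clusters stay distinct.
            key = (match["cluster"],) + key
        seen.add(key)
    return len(seen)


metrics = [
    "1.Source__Kafka_Source_(testTopic).kafkaCluster.my-cluster-1.KafkaSourceReader.topic.testTopic.partition.0.currentOffset",
    "1.Source__Kafka_Source_(testTopic).kafkaCluster.my-cluster-2.KafkaSourceReader.topic.testTopic.partition.0.currentOffset",
]

print(count_partitions(metrics, include_cluster=False))  # keyed by topic and partition only: 1
print(count_partitions(metrics))  # keyed by cluster, topic, and partition: 2
```

Keying on topic and partition alone collapses the two metrics into one partition, while including the cluster name keeps them separate.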

@mxm mxm requested a review from gyfora September 18, 2025 13:51
@mxm mxm merged commit 003e594 into apache:main Sep 19, 2025
232 of 235 checks passed
