[Bug]: Kafka operator 0.46 not working on EKS on Kubernetes 1.32 and 1.33 #11610
mohamedsorour1998
started this conversation in
General
Replies: 1 comment 12 replies
This error indicates that it is taking too long to get a response from the Kubernetes API. That is usually caused by insufficient resources, by networking issues, or by problems with the Kubernetes API server.
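One rough way to check whether API responses are slow is to time a single call and compare it with the 2000 ms limit reported in the log. The sketch below is only an illustration: `timed` and `exceeds_limit` are hypothetical helper names, and `fake_api_call` is a stand-in for whatever request you actually issue against the API server.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed milliseconds)."""
    start = time.monotonic()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return result, elapsed_ms


def exceeds_limit(elapsed_ms, limit_ms=2000):
    """True if a call took longer than the limit (2000 ms in the log)."""
    return elapsed_ms > limit_ms


def fake_api_call():
    # Placeholder (assumption): replace with a real request to the API server.
    time.sleep(0.05)
    return "ok"


if __name__ == "__main__":
    result, elapsed_ms = timed(fake_api_call)
    print(f"result={result} elapsed={elapsed_ms:.0f} ms "
          f"over_limit={exceeds_limit(elapsed_ms)}")
```

Running the same measurement from a pod in the operator's namespace and from outside the cluster can help separate networking problems from API server load.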
Bug Description
My Kafka operator cannot communicate with the API server today. It used to communicate yesterday, and no changes happened on my part:
Main:187 - Cluster Operator verticle started in namespace kafka without label selector
2025-05-27 08:56:54 WARN KafkaAssemblyOperator:798 - KafkaNodePool dual-role in namespace kafka was ADDED, but the Kafka cluster my-cluster to which it belongs does not exist
2025-05-27 08:56:54 INFO AbstractOperator:520 - Reconciliation #1(watch) Kafka(kafka/my-cluster): Kafka my-cluster in namespace kafka was ADDED
2025-05-27 08:56:54 INFO AbstractOperator:266 - Reconciliation #1(watch) Kafka(kafka/my-cluster): Kafka my-cluster will be checked for creation or modification
2025-05-27 08:56:54 INFO AbstractOperator:520 - Reconciliation #2(watch) Kafka(kafka/my-cluster): Kafka my-cluster in namespace kafka was MODIFIED
2025-05-27 08:56:54 INFO CrdOperator:123 - Reconciliation #1(watch) Kafka(kafka/my-cluster): Status of Kafka my-cluster in namespace kafka has been updated
2025-05-27 08:56:55 INFO Ca:987 - Reconciliation #1(watch) Kafka(kafka/my-cluster): Generating CA with subject=Subject(organizationName='io.strimzi', commonName='cluster-ca v0', dnsNames=[], ipAddresses=[])
2025-05-27 08:56:57 INFO Ca:987 - Reconciliation #1(watch) Kafka(kafka/my-cluster): Generating CA with subject=Subject(organizationName='io.strimzi', commonName='clients-ca v0', dnsNames=[], ipAddresses=[])
2025-05-27 08:56:57 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 2718 ms, time limit is 2000 ms
2025-05-27 08:56:58 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 3718 ms, time limit is 2000 ms
2025-05-27 08:57:01 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 2611 ms, time limit is 2000 ms
2025-05-27 08:57:02 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 3613 ms, time limit is 2000 ms
2025-05-27 08:57:03 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 4613 ms, time limit is 2000 ms
2025-05-27 08:57:04 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 5613 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
at jdk.internal.misc.Unsafe.park(Native Method) ~[?:?]
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:211) ~[?:?]
at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1864) ~[?:?]
at java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3465) ~[?:?]
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3436) ~[?:?]
at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1898) ~[?:?]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2072) ~[?:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:491) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:524) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:340) ~[io.fab
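The `BlockedThreadChecker` warnings in the log above can be summarized with a short script to see how long the event loop was stalled. This is a minimal sketch: the function names are hypothetical, and the regular expression assumes the exact line format shown in this log.

```python
import re

# Matches lines such as:
# "... Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked
#  for 2718 ms, time limit is 2000 ms"
BLOCKED_RE = re.compile(
    r"Thread Thread\[(?P<thread>[^\]]+)\] has been blocked for "
    r"(?P<blocked>\d+) ms, time limit is (?P<limit>\d+) ms"
)


def blocked_events(log_text):
    """Return a list of (thread, blocked_ms, limit_ms) tuples."""
    events = []
    for line in log_text.splitlines():
        match = BLOCKED_RE.search(line)
        if match:
            events.append(
                (match["thread"], int(match["blocked"]), int(match["limit"]))
            )
    return events


def longest_block_ms(log_text):
    """Return the longest reported block in ms, or 0 if none was found."""
    return max((blocked for _, blocked, _ in blocked_events(log_text)), default=0)


SAMPLE = """\
2025-05-27 08:56:57 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 2718 ms, time limit is 2000 ms
2025-05-27 08:57:04 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 5613 ms, time limit is 2000 ms
"""

if __name__ == "__main__":
    print(longest_block_ms(SAMPLE))
```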
So I called AWS Support. They told me that many customers have faced this problem because of the large size of the objects being sent to the master node's API server.
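To sanity-check the object-size explanation, the serialized size of a resource can be measured before it is sent. The sketch below is an assumption-laden illustration: `serialized_size_bytes` is a hypothetical helper, compact JSON is used only as an approximation of the request body, and the dictionary is a hand-written copy of the KafkaNodePool from this report.

```python
import json


def serialized_size_bytes(obj):
    """Approximate request body size: compact JSON encoded as UTF-8."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


# Hand-written copy of the KafkaNodePool from this report (illustration only).
node_pool = {
    "apiVersion": "kafka.strimzi.io/v1beta2",
    "kind": "KafkaNodePool",
    "metadata": {
        "name": "kafka-pool",
        "labels": {"strimzi.io/cluster": "kafka"},
    },
    "spec": {
        "replicas": 1,
        "roles": ["controller", "broker"],
        "storage": {
            "type": "jbod",
            "volumes": [
                {
                    "id": 0,
                    "type": "persistent-claim",
                    "class": "auto-ebs-sc",
                    "size": "10Gi",
                    "deleteClaim": True,
                    "kraftMetadata": "shared",
                }
            ],
        },
    },
}

if __name__ == "__main__":
    print(f"KafkaNodePool is about {serialized_size_bytes(node_pool)} bytes")
```

The operator also creates other objects (the trace above is in a create call), so a small custom resource does not by itself rule out a large generated object.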
Steps to reproduce
1- Install the operator on EKS (I use EKS Auto)
2- Apply my KafkaNodePool file
Expected behavior
The operator should work seamlessly.
Strimzi version
0.46
Kubernetes version
1.32, 1.33
Installation method
Helm chart of the operator
Infrastructure
EKS Auto
Configuration files and logs
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: kafka-pool
  labels:
    strimzi.io/cluster: kafka
spec:
  replicas: 1
  roles:
    - controller
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        class: auto-ebs-sc
        size: 10Gi
        deleteClaim: true
        kraftMetadata: shared
---
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka
  annotations:
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafka:
    version: 3.9.0
    metadataVersion: 3.9-IV0
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      default.replication.factor: 1
      min.insync.replicas: 1
    template:
      pod:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: node-type
                      operator: In
                      values:
                        - stg
        tolerations:
          - key: node-type
            operator: Equal
            value: stg
            effect: NoSchedule
  entityOperator:
    topicOperator: {}
    userOperator: {}
    template:
      pod:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: node-type
                      operator: In
                      values:
                        - stg
        tolerations:
          - key: node-type
            operator: Equal
            value: stg
            effect: NoSchedule
Additional context
No response