Kafka streams: consumer poll timeout has expired #45151
-
Describe the bug
Our service uses the quarkus-kafka-streams@3.8.5 extension to handle Kafka messages. Everything was working fine until we added some new logic that takes more time per message. Since then the service no longer receives or consumes any Kafka messages. The service itself is not down; it just stopped consuming. Any help or suggestion is appreciated, thanks in advance.

Expected behavior
No response

Actual behavior
No response

How to Reproduce?
No response

Output of
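For context, this symptom matches per-record processing time growing past the consumer's max.poll.interval.ms: the consumer's heartbeat thread logs "consumer poll timeout has expired", leaves the group, and the application stops receiving records even though the process is still up. Below is a minimal sketch of the kind of topology involved, with hypothetical topic names, application id, and a deliberately slow step standing in for the new logic; it assumes plain Kafka Streams rather than the Quarkus-managed bootstrap.

```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class SlowTopologySketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-processor");   // hypothetical id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("orders-in", Consumed.with(Serdes.String(), Serdes.String()))
               // The "new logic" from the report: each record now takes noticeably longer.
               // If the total time between poll() calls exceeds max.poll.interval.ms,
               // the consumer logs "consumer poll timeout has expired" and leaves the group.
               .mapValues(SlowTopologySketch::slowEnrichment)
               .to("orders-out", Produced.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(() -> streams.close(Duration.ofSeconds(10))));
    }

    // Placeholder for the slower per-message logic described in the report.
    private static String slowEnrichment(String value) {
        try {
            Thread.sleep(2_000); // simulate expensive work per record
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return value.toUpperCase();
    }
}
```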
Replies: 5 comments
-
/cc @alesj (kafka,kafka-streams), @cescoffier (kafka), @gunnarmorling (kafka-streams), @ozangunalp (kafka,kafka-streams), @rquinio (kafka-streams)
-
/cc @alesj (kafka,kafka-streams), @cescoffier (kafka), @gunnarmorling (kafka-streams), @ozangunalp (kafka,kafka-streams), @rquinio (kafka-streams)
-
You can increase the max.poll.interval.ms consumer setting. In the case of a missed poll timeout, however, I am not sure how Kafka Streams behaves; it may go into the rebalancing state and stay there forever because its consumer is kicked out of the group. Hope this helps.
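A sketch of how this can be raised for the Streams consumer in plain Kafka Streams code follows; the values and application id are illustrative, and in a Quarkus app the same client-level keys would normally be supplied through configuration rather than built by hand.

```java
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class PollTimeoutConfigSketch {

    // Builds Streams properties that give each poll loop more headroom.
    // Tune the numbers to the real per-record processing time.
    static Properties streamsProperties() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-processor"); // hypothetical id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        // Allow up to 15 minutes between poll() calls before the consumer is
        // considered failed (the default is 5 minutes). consumerPrefix() maps the
        // key to "consumer.max.poll.interval.ms" so Streams forwards it to its consumers.
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG), 900_000);

        // Fetch fewer records per poll so a slow per-record step still finishes
        // the batch within the interval.
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.MAX_POLL_RECORDS_CONFIG), 100);
        return props;
    }
}
```

This also illustrates why a consumer-level key that is not declared in StreamsConfig can still be passed through: Kafka Streams forwards any property carrying the "consumer." prefix to its internal consumers.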
-
I've tried overriding the related configurations as follows, but they have no effect.
I guess the reason is that they are not defined/declared in StreamsConfig.class.
-
So actually the correct configuration is:
For our case, the consuming rate does not seem to be the root cause. For some reason the stream threads stopped consuming messages => missed a poll => got kicked out of the consumer group. During that period, node CPU utilization was really high, so we suspect that could have affected the consumer pod, but we're not sure.
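Since the threads went quiet without the service crashing, one way to make that visible (and to correlate it with the suspected CPU starvation) is a state listener on the KafkaStreams instance, so a transition out of RUNNING shows up in the logs instead of passing silently. This is a plain Kafka Streams sketch; in a Quarkus app the KafkaStreams instance is created and started by the extension, so wiring a listener in may differ there.

```java
import org.apache.kafka.streams.KafkaStreams;

public class StateListenerSketch {

    // Logs state transitions, so a stream thread that stops polling (e.g. starved
    // of CPU) appears as RUNNING -> REBALANCING rather than going quiet.
    // Must be registered before streams.start().
    static void watch(KafkaStreams streams) {
        streams.setStateListener((newState, oldState) -> {
            System.out.printf("Kafka Streams state changed: %s -> %s%n", oldState, newState);
            if (newState == KafkaStreams.State.ERROR) {
                // Hypothetical reaction: alert, or fail a health check so the pod restarts.
                System.err.println("Kafka Streams entered ERROR state");
            }
        });
    }
}
```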