Contributor

@piochelepiotr commented Oct 24, 2025

What does this PR do?

Adds Kafka cluster monitoring capabilities to the kafka_consumer integration (preview feature). When `enable_cluster_monitoring: true` is set, the integration collects:

  • Broker metrics: count, leader count, partition count, and configurations
  • Topic & partition metrics: sizes, offsets, replication status, message rates
  • Consumer group metrics: member count, state, and details
  • Schema Registry metrics: subjects, versions, and full schemas (if URL provided)
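To illustrate how the broker and partition counts listed above could be derived from Admin API metadata, here is a minimal sketch. It is not the code from this PR. It assumes a metadata snapshot shaped like confluent_kafka's `ClusterMetadata` (`.brokers`, `.topics`, and per-partition `.leader`, `.replicas`, `.isrs`); the function name `summarize_cluster` and the keys of the returned dict are hypothetical and are not the metric names used by the integration.

```python
from types import SimpleNamespace


def summarize_cluster(metadata):
    """Derive cluster-level counts from an Admin API metadata snapshot.

    Assumption: `metadata` exposes `.brokers` (dict keyed by broker id) and
    `.topics` (dict of topic name -> object with a `.partitions` dict), and
    each partition exposes `.leader`, `.replicas` and `.isrs`.
    """
    leader_count = {broker_id: 0 for broker_id in metadata.brokers}
    partition_count = 0
    under_replicated = 0
    offline = 0

    for topic in metadata.topics.values():
        for partition in topic.partitions.values():
            partition_count += 1
            if partition.leader in leader_count:
                leader_count[partition.leader] += 1
            else:
                # No known broker leads this partition (for example leader == -1).
                offline += 1
            if len(partition.isrs) < len(partition.replicas):
                # Fewer in-sync replicas than assigned replicas.
                under_replicated += 1

    return {
        "broker_count": len(metadata.brokers),
        "partition_count": partition_count,
        "leader_count": leader_count,
        "under_replicated_partitions": under_replicated,
        "offline_partitions": offline,
    }


def _partition(leader, replicas, isrs):
    # Helper for building fake partitions in the example below.
    return SimpleNamespace(leader=leader, replicas=replicas, isrs=isrs)


# Hand-built stand-in for a real metadata snapshot (hypothetical data).
example = SimpleNamespace(
    brokers={1: object(), 2: object()},
    topics={
        "orders": SimpleNamespace(partitions={
            0: _partition(1, [1, 2], [1, 2]),
            1: _partition(2, [1, 2], [2]),
        }),
        "payments": SimpleNamespace(partitions={
            0: _partition(-1, [1, 2], []),
        }),
    },
)

summary = summarize_cluster(example)
print(summary)
```

With a live cluster, `confluent_kafka.admin.AdminClient(conf).list_topics(timeout=10)` returns an object of this shape; whether the integration uses that exact call is not shown in this conversation.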

Motivation

While the existing kafka_consumer integration provides consumer lag monitoring, customers need deeper visibility into their Kafka clusters without relying solely on JMX-based monitoring. This feature enables:

  • Proactive capacity planning: Track topic/partition growth and broker load distribution
  • Configuration auditing: Monitor broker and topic configurations with automatic change detection
  • Schema management: Track schema evolution and usage across topics
  • Operational insights: Consumer group health, under-replicated partitions, and offline detection

This complements the existing JMX-based kafka integration by providing Admin API-based metadata collection.

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

codecov bot commented Oct 24, 2025

Codecov Report

❌ Patch coverage is 79.60644% with 114 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.00%. Comparing base (bcd706c) to head (03ccd9c).
⚠️ Report is 27 commits behind head on master.


- **Consumer group metadata**: Member details and group state
- **Schema registry**: Schema information (if schema_registry_url is provided)

All cluster monitoring metrics are tagged with `kafka_cluster_id` for easy filtering.

Member

Why not broker id as well? Cardinality? No info available? Something else?

Contributor Author

Most metrics are cluster-wide, so they are not specific to a single broker. I will make sure that broker-specific metrics are also tagged with the broker ID.
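As a rough sketch of the tagging rule discussed here (not the code from this PR): every metric carries `kafka_cluster_id`, and only broker-specific metrics add a broker tag. The helper name `build_tags` and the tag key `broker_id` are assumptions made for illustration.

```python
def build_tags(cluster_id, broker_id=None, extra=None):
    """Build the tag list for one metric submission.

    `kafka_cluster_id` is always present. The `broker_id` tag (hypothetical
    key name) is added only when the metric belongs to a single broker.
    """
    tags = [f"kafka_cluster_id:{cluster_id}"]
    if broker_id is not None:
        # Compare against None so that broker id 0 is still tagged.
        tags.append(f"broker_id:{broker_id}")
    tags.extend(extra or [])
    return tags


# Cluster-wide metric: only the cluster tag.
cluster_tags = build_tags("abc123")
# Broker-specific metric: cluster tag plus broker tag.
broker_tags = build_tags("abc123", broker_id=0)
print(cluster_tags, broker_tags)
```

Keeping the broker tag off cluster-wide metrics avoids multiplying their cardinality by the number of brokers, which is one of the concerns raised in the question above.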
