Module: github.com/grafana/kafka_exporter
Version: 0.0.0-20240409084445-5e3488ad9f9a
Repository: https://github.com/grafana/kafka_exporter.git
Documentation: pkg.go.dev

# README

kafka_exporter

Kafka exporter for Prometheus. For other metrics from Kafka, have a look at the JMX exporter.

Compatibility

Supports Apache Kafka version 0.10.1.0 and later.

Dependency

Download

Binaries can be downloaded from the Releases page.

Compile

Build Binary

make

Build Docker Image

make docker

Docker Hub Image

docker pull grafana/kafka-exporter:latest

It can be used directly instead of having to build the image yourself. (Docker Hub grafana/kafka-exporter)

Run

Run Binary

kafka_exporter --kafka.server=kafka:9092 [--kafka.server=another-server ...]

Run Docker Image

docker run -ti --rm -p 9308:9308 grafana/kafka-exporter --kafka.server=kafka:9092 [--kafka.server=another-server ...]

Run Docker Compose

Create a docker-compose.yml file:

services:
  kafka-exporter:
    image: danielqsj/kafka-exporter
    command: ["--kafka.server=kafka:9092"]  # add further "--kafka.server=another-server" items as needed
    ports:
      - 9308:9308

Then run it:

docker-compose up -d

Flags

The exporter is configured with the following flags:

| Flag name | Default | Description |
|---|---|---|
| kafka.server | kafka:9092 | Addresses (host:port) of Kafka server |
| kafka.version | 2.0.0 | Kafka broker version |
| sasl.enabled | false | Connect using SASL/PLAIN |
| sasl.handshake | true | Only set this to false if using a non-Kafka SASL proxy |
| sasl.username | | SASL user name |
| sasl.password | | SASL user password |
| sasl.mechanism | | SASL mechanism; can be plain, scram-sha512, or scram-sha256 |
| sasl.service-name | | Service name when using Kerberos auth |
| sasl.kerberos-config-path | | Kerberos config path |
| sasl.realm | | Kerberos realm |
| sasl.keytab-path | | Kerberos keytab file path |
| sasl.kerberos-auth-type | | Kerberos auth type. Either 'keytabAuth' or 'userAuth' |
| tls.enabled | false | Connect to Kafka using TLS |
| tls.server-name | | Used to verify the hostname on the returned certificates unless tls.insecure-skip-tls-verify is given. The Kafka server's name should be given |
| tls.ca-file | | The optional certificate authority file for Kafka TLS client authentication |
| tls.cert-file | | The optional certificate file for Kafka client authentication |
| tls.key-file | | The optional key file for Kafka client authentication |
| tls.insecure-skip-tls-verify | false | If true, the server's certificate will not be checked for validity |
| server.tls.enabled | false | Enable TLS for web server |
| server.tls.mutual-auth-enabled | false | Enable TLS client mutual authentication |
| server.tls.ca-file | | The certificate authority file for the web server |
| server.tls.cert-file | | The certificate file for the web server |
| server.tls.key-file | | The key file for the web server |
| topic.filter | .* | Regex that determines which topics to collect |
| topic.exclude | ^$ | Regex that determines which topics to exclude |
| group.filter | .* | Regex that determines which consumer groups to collect |
| group.exclude | ^$ | Regex that determines which consumer groups to exclude |
| web.listen-address | :9308 | Address to listen on for web interface and telemetry |
| web.telemetry-path | /metrics | Path under which to expose metrics |
| log.enable-sarama | false | Turn on Sarama logging |
| use.consumelag.zookeeper | false | Set to true if you need to use a consumer group from ZooKeeper |
| zookeeper.server | localhost:2181 | Address (hosts) of ZooKeeper server |
| kafka.labels | | Kafka cluster name |
| refresh.metadata | 30s | Metadata refresh interval |
| offset.show-all | true | Whether to show the offset/lag for all consumer groups; otherwise, only connected consumer groups are shown |
| concurrent.enable | false | If true, every scrape triggers Kafka operations; otherwise, scrapes share results. WARN: This should be disabled on large clusters |
| topic.workers | 100 | Number of topic workers |
| max.offsets | 1000 | Maximum number of offsets to store in the interpolation table for a partition |

Notes

Boolean values are uniquely managed by Kingpin. Each boolean flag will have a negative complement: --<name> and --no-<name>.

For example, to disable sasl.handshake, add the flag --no-sasl.handshake.
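
The --<name> / --no-<name> convention can be illustrated with a small standard-library sketch. This is not Kingpin's implementation: resolveBool is a hypothetical helper, and letting the last occurrence win is a choice made for this sketch only.

```go
package main

import "fmt"

// resolveBool is a hypothetical helper that mimics the --<name> / --no-<name>
// convention. It starts from the flag's default and applies each matching
// argument in order (an assumption of this sketch, not Kingpin behavior).
func resolveBool(name string, def bool, args []string) bool {
	value := def
	for _, arg := range args {
		switch arg {
		case "--" + name:
			value = true
		case "--no-" + name:
			value = false
		}
	}
	return value
}

func main() {
	args := []string{"--kafka.server=kafka:9092", "--no-sasl.handshake"}

	// sasl.handshake defaults to true and is switched off by --no-sasl.handshake.
	fmt.Println(resolveBool("sasl.handshake", true, args)) // false

	// tls.enabled is not mentioned, so it keeps its default of false.
	fmt.Println(resolveBool("tls.enabled", false, args)) // false
}
```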

Metrics

This section documents the exposed Prometheus metrics.

For details on the underlying metrics please see Apache Kafka.

Brokers

Metrics details

| Name | Exposed information |
|---|---|
| kafka_brokers | Number of Brokers in the Kafka Cluster |
| kafka_broker_info | Information about the Kafka Broker |

Metrics output example

# HELP kafka_brokers Number of Brokers in the Kafka Cluster.
# TYPE kafka_brokers gauge
kafka_brokers 3
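
A sample like the one above can be read programmatically with a few lines of Go. The sketch below handles only unlabelled samples in the text format shown; gaugeValue is a hypothetical helper, and a real client would more likely use a Prometheus parsing library.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// gaugeValue is a hypothetical helper that scans Prometheus text-format
// output and returns the value of the first unlabelled sample named metric.
func gaugeValue(exposition, metric string) (float64, bool) {
	scanner := bufio.NewScanner(strings.NewReader(exposition))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		// Skip blank lines and "# HELP" / "# TYPE" comment lines.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[0] == metric {
			v, err := strconv.ParseFloat(fields[1], 64)
			if err != nil {
				return 0, false
			}
			return v, true
		}
	}
	return 0, false
}

func main() {
	sample := `# HELP kafka_brokers Number of Brokers in the Kafka Cluster.
# TYPE kafka_brokers gauge
kafka_brokers 3
`
	v, ok := gaugeValue(sample, "kafka_brokers")
	fmt.Println(v, ok) // 3 true
}
```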

Topics

Metrics details

| Name | Exposed information |
|---|---|
| kafka_topic_partitions | Number of partitions for this Topic |
| kafka_topic_partition_current_offset | Current Offset of a Broker at Topic/Partition |
| kafka_topic_partition_oldest_offset | Oldest Offset of a Broker at Topic/Partition |
| kafka_topic_partition_in_sync_replica | Number of In-Sync Replicas for this Topic/Partition |
| kafka_topic_partition_leader | Leader Broker ID of this Topic/Partition |
| kafka_topic_partition_leader_is_preferred | 1 if Topic/Partition is using the Preferred Broker |
| kafka_topic_partition_replicas | Number of Replicas for this Topic/Partition |
| kafka_topic_partition_under_replicated_partition | 1 if Topic/Partition is under Replicated |

Metrics output example

# HELP kafka_topic_partitions Number of partitions for this Topic
# TYPE kafka_topic_partitions gauge
kafka_topic_partitions{topic="__consumer_offsets"} 50

# HELP kafka_topic_partition_current_offset Current Offset of a Broker at Topic/Partition
# TYPE kafka_topic_partition_current_offset gauge
kafka_topic_partition_current_offset{partition="0",topic="__consumer_offsets"} 0

# HELP kafka_topic_partition_oldest_offset Oldest Offset of a Broker at Topic/Partition
# TYPE kafka_topic_partition_oldest_offset gauge
kafka_topic_partition_oldest_offset{partition="0",topic="__consumer_offsets"} 0

# HELP kafka_topic_partition_in_sync_replica Number of In-Sync Replicas for this Topic/Partition
# TYPE kafka_topic_partition_in_sync_replica gauge
kafka_topic_partition_in_sync_replica{partition="0",topic="__consumer_offsets"} 3

# HELP kafka_topic_partition_leader Leader Broker ID of this Topic/Partition
# TYPE kafka_topic_partition_leader gauge
kafka_topic_partition_leader{partition="0",topic="__consumer_offsets"} 0

# HELP kafka_topic_partition_leader_is_preferred 1 if Topic/Partition is using the Preferred Broker
# TYPE kafka_topic_partition_leader_is_preferred gauge
kafka_topic_partition_leader_is_preferred{partition="0",topic="__consumer_offsets"} 1

# HELP kafka_topic_partition_replicas Number of Replicas for this Topic/Partition
# TYPE kafka_topic_partition_replicas gauge
kafka_topic_partition_replicas{partition="0",topic="__consumer_offsets"} 3

# HELP kafka_topic_partition_under_replicated_partition 1 if Topic/Partition is under Replicated
# TYPE kafka_topic_partition_under_replicated_partition gauge
kafka_topic_partition_under_replicated_partition{partition="0",topic="__consumer_offsets"} 0
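
The two 0/1 metrics above can be understood as simple derivations from a partition's replica lists. The sketch below assumes the usual Kafka conventions: the preferred leader is the first broker in the replica assignment, and a partition is under-replicated when it has fewer in-sync replicas than assigned replicas. The type and function names are hypothetical and are not taken from the exporter's source.

```go
package main

import "fmt"

// partitionState holds the broker IDs this sketch needs. The type and field
// names are hypothetical.
type partitionState struct {
	Leader   int32
	Replicas []int32 // assigned replicas, preferred leader first
	ISR      []int32 // in-sync replicas
}

// boolToGauge converts a condition to the 0/1 convention used by the metrics.
func boolToGauge(b bool) float64 {
	if b {
		return 1
	}
	return 0
}

// underReplicated is 1 when fewer replicas are in sync than are assigned.
func underReplicated(p partitionState) float64 {
	return boolToGauge(len(p.ISR) < len(p.Replicas))
}

// leaderIsPreferred is 1 when the current leader is the first assigned replica.
func leaderIsPreferred(p partitionState) float64 {
	return boolToGauge(len(p.Replicas) > 0 && p.Leader == p.Replicas[0])
}

func main() {
	// Mirrors the sample output: leader 0, 3 replicas, 3 in-sync replicas.
	healthy := partitionState{Leader: 0, Replicas: []int32{0, 1, 2}, ISR: []int32{0, 1, 2}}
	fmt.Println(underReplicated(healthy), leaderIsPreferred(healthy)) // 0 1

	// One replica has fallen out of sync and leadership moved to broker 1.
	degraded := partitionState{Leader: 1, Replicas: []int32{0, 1, 2}, ISR: []int32{1, 2}}
	fmt.Println(underReplicated(degraded), leaderIsPreferred(degraded)) // 1 0
}
```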

Consumer Groups

Metrics details

| Name | Exposed information |
|---|---|
| kafka_consumergroup_current_offset | Current Offset of a ConsumerGroup at Topic/Partition |
| kafka_consumergroup_current_offset_sum | Current Offset of a ConsumerGroup at Topic for all partitions |
| kafka_consumergroup_uncommitted_offsets | Current Approximate count of uncommitted offsets for a ConsumerGroup at Topic/Partition |
| kafka_consumergroup_uncommitted_offsets_sum | Current Approximate count of uncommitted offsets for a ConsumerGroup at Topic for all partitions |
| kafka_consumergroup_members | Number of members in a consumer group |
| kafka_consumergroupzookeeper_uncommitted_offsets_zookeeper | Current Approximate count of uncommitted offsets (zookeeper) for a ConsumerGroup at Topic/Partition |

Important Note

To collect the metric kafka_consumergroupzookeeper_uncommitted_offsets_zookeeper, you must set the following flags:

  • use.consumelag.zookeeper: enables collecting consumer lag from ZooKeeper
  • zookeeper.server: address used to connect to ZooKeeper

Metrics output example

# HELP kafka_consumergroup_current_offset Current Offset of a ConsumerGroup at Topic/Partition
# TYPE kafka_consumergroup_current_offset gauge
kafka_consumergroup_current_offset{consumergroup="KMOffsetCache-kafka-manager-3806276532-ml44w",partition="0",topic="__consumer_offsets"} -1

# HELP kafka_consumergroup_uncommitted_offsets Current Approximate count of uncommitted offsets for a ConsumerGroup at Topic/Partition
# TYPE kafka_consumergroup_uncommitted_offsets gauge
kafka_consumergroup_uncommitted_offsets{consumergroup="KMOffsetCache-kafka-manager-3806276532-ml44w",partition="0",topic="__consumer_offsets"} 1

Consumer Lag

Metric Details

| Name | Exposed information |
|---|---|
| kafka_consumer_lag_millis | Current approximation of consumer lag for a ConsumerGroup at Topic/Partition |
| kafka_consumer_lag_extrapolation | Indicates that a consumer group lag estimation used extrapolation |
| kafka_consumer_lag_interpolation | Indicates that a consumer group lag estimation used interpolation |

Metrics output example

# HELP kafka_consumer_lag_extrapolation Indicates that a consumer group lag estimation used extrapolation
# TYPE kafka_consumer_lag_extrapolation counter
kafka_consumer_lag_extrapolation{consumergroup="perf-consumer-74084",partition="0",topic="test"} 1

# HELP kafka_consumer_lag_interpolation Indicates that a consumer group lag estimation used interpolation
# TYPE kafka_consumer_lag_interpolation counter
kafka_consumer_lag_interpolation{consumergroup="perf-consumer-74084",partition="0",topic="test"} 1

# HELP kafka_consumer_lag_millis Current approximation of consumer lag for a ConsumerGroup at Topic/Partition
# TYPE kafka_consumer_lag_millis gauge
kafka_consumer_lag_millis{consumergroup="perf-consumer-74084",partition="0",topic="test"} 3.4457231197552e+10

Grafana Dashboard

Grafana Dashboard ID: 7589, name: Kafka Exporter Overview.

For details of the dashboard please see Kafka Exporter Overview.

Lag Estimation

The technique to estimate lag for a consumer group, topic, and partition is taken from the Lightbend Kafka Lag Exporter.

Once the exporter starts, it begins sampling the next offset to be produced. The interpolation table is built from these samples, and the current offset of each monitored consumer group is compared against the values in the table. If the table contains both an upper and a lower bound for a consumer group's current offset, interpolation is used. If the table contains only an upper bound, extrapolation is used.

For the lag computation, the number of offsets for each partition is trimmed down to max.offsets (default 1000), with the oldest offsets removed first.
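
Trimming with the oldest entries removed first can be sketched as a slice operation, assuming samples are stored oldest-first. The function name trimOldest and the slice representation are hypothetical; the exporter's internal data structure may differ.

```go
package main

import "fmt"

// trimOldest keeps at most max entries of a slice ordered oldest-first by
// dropping entries from the front. The name and representation are
// hypothetical.
func trimOldest(offsets []int64, max int) []int64 {
	if max < 0 {
		max = 0
	}
	if len(offsets) <= max {
		return offsets
	}
	// Copy the tail so the dropped entries can be garbage collected.
	return append([]int64(nil), offsets[len(offsets)-max:]...)
}

func main() {
	offsets := []int64{10, 20, 30, 40, 50}
	// With a limit of 3, the two oldest entries (10 and 20) are dropped.
	fmt.Println(trimOldest(offsets, 3)) // [30 40 50]
}
```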

Contribute

To contribute to the upstream project, please open a pull request.

To contribute to this fork, please open a pull request here.

Donation

To donate to the developer of the project this is forked from, please use the donation link below.

License

Code is licensed under the Apache License 2.0.
