Kafka Streams exports its internal state as metrics, and understanding how to export and monitor these is key to keeping your applications healthy.
Let’s see some Kafka Streams metrics in action. Imagine a simple stream processing application that reads from an input topic, transforms the data, and writes to an output topic.
StreamsBuilder builder = new StreamsBuilder();
builder.stream("input-topic")
.mapValues(value -> value.toString().toUpperCase())
.to("output-topic");
KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfig);
streams.start();
When this application runs, Kafka Streams generates a plethora of metrics. You can access them via JMX. For example, you might see a metric like kafka.streams.processor-node-metrics.records-in-total for a specific processor. This metric, a simple counter, increments every time a record enters that processor. Another example is kafka.streams.state-store-metrics.records-total for a state store, tracking total record operations.
The core problem Kafka Streams metrics solve is providing visibility into the internal workings of your distributed stream processing application. Without them, debugging performance bottlenecks, diagnosing failures, or understanding resource consumption would be largely guesswork. These metrics give you concrete data points to analyze.
Internally, Kafka Streams uses the Micrometer metrics facade. This means you can easily integrate with various monitoring systems like Prometheus, Datadog, or Graphite by configuring an appropriate exporter. The default way to access these is through JMX.
Here’s a typical configuration snippet for exporting Kafka Streams metrics to Prometheus using the kafka-streams-prometheus-metrics library. First, add the dependency:
<dependency>
<groupId>io.prometheus.jmx</groupId>
<artifactId>jmx_prometheus_javaagent</artifactId>
<version>0.16.0</version>
</dependency>
Then, create a prometheus_config.yaml file:
---
rules:
- pattern: 'kafka.streams<type=(.+), clientid=(.+), taskid=(.+)><>(\w+)'
name: kafka_streams_$1_$3_$2_$4
type: GAUGE
- pattern: 'kafka.streams<type=(.+), clientid=(.+), taskid=(.+)><>(\w+)'
name: kafka_streams_$1_$3_$2_$4_total
type: COUNTER
- pattern: 'kafka.streams<type=(.+), clientid=(.+), taskid=(.+)><>(\w+)'
name: kafka_streams_$1_$3_$2_$4_rate
type: COUNTER
And start your Kafka Streams application with the Java agent:
java -javaagent:path/to/jmx_prometheus_javaagent.jar=9090:path/to/prometheus_config.yaml -jar your-app.jar
This setup exposes metrics on port 9090 in a Prometheus-friendly format. You’d then configure your Prometheus server to scrape this endpoint.
The levers you control are primarily through the StreamsConfig and the choice of metric reporters. Key configurations include:
application.id: Uniquely identifies your Kafka Streams application, crucial for partitioning metrics.metric.reporters: Allows you to specify custom reporters beyond JMX.window.size.ms(for certain windowed operations): Affects metrics related to aggregated windows.
When you query Prometheus, you’ll see metrics like kafka_streams_processor_node_records_in_total or kafka_streams_state_store_records_total. You can then build dashboards to visualize kafka.streams.task-manager-metrics.num-num-running-tasks to ensure all your processing tasks are active, or kafka.streams.stream-thread-metrics.num-active-threads to monitor the health of your stream threads. High values in kafka.streams.processor-node-metrics.late-record-total might indicate downstream processing is too slow or upstream data is arriving out of order.
The most surprising thing is that Kafka Streams automatically registers a default MetricsReporter that uses the JMX MBeanServer. This means even without any explicit configuration for metric export, all these metrics are available via standard JMX tools. You don’t have to set up Prometheus or another system to see them; they are there from the moment your KafkaStreams object starts.
Understanding how to aggregate and alert on these metrics is the next logical step in operationalizing your Kafka Streams applications.