Kafka Streams’ dead-letter topic feature is less about "dead" messages and more about isolating problematic records that your application can’t process, preventing them from halting the entire stream.

Let’s see it in action. Imagine a simple Kafka Streams application that doubles every integer it receives.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class DoublerApp {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "doubler-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CONFIG, Serdes.Long().getClass().getName());

        // Configure the dead-letter topic
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, "exactly_once"); // DLQ often used with EOS
        props.put(StreamsConfig.consumerPrefix(StreamsConfig.PROCESSING_GUARANTEE_CONFIG), "exactly_once_v2"); // For EOS v2
        props.put(StreamsConfig.DLT_RETRY_ATTEMPTS_CONFIG, "3"); // How many times to retry processing
        props.put(StreamsConfig.DLT_RETRY_BACKOFF_MS_CONFIG, "1000"); // Wait 1 second between retries
        props.put(StreamsConfig.DLT_RETRY_ENABLE_CONFIG, "true"); // Enable retries

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, Long> source = builder.stream("input-topic");

        KStream<String, Long> processed = source.mapValues(value -> {
            if (value != null && value % 2 != 0) { // Simulate a "poison pill" - odd numbers
                throw new RuntimeException("Cannot process odd number: " + value);
            }
            return value * 2;
        });

        processed.to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Clean shutdown hook
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

In this DoublerApp, we’ve configured Kafka Streams to use a dead-letter topic (DLT). The DLT_RETRY_ENABLE_CONFIG, DLT_RETRY_ATTEMPTS_CONFIG, and DLT_RETRY_BACKOFF_MS_CONFIG are crucial. When mapValues encounters an odd number, it throws an exception. Instead of crashing the application or silently dropping the record, Kafka Streams will:

  1. Attempt to process the record up to DLT_RETRY_ATTEMPTS_CONFIG times, waiting DLT_RETRY_BACKOFF_MS_CONFIG between attempts.
  2. If processing still fails after all retries, the original record (key and value) will be sent to a Kafka topic named application-id-dlt (e.g., doubler-app-dlt).
  3. The original record is not committed to the output topic.

This mechanism is vital for maintaining data integrity and application availability. Without it, a single malformed or unprocessable message could stop your entire stream processing pipeline. The DLT acts as a quarantine zone, allowing you to inspect, fix, and reprocess problematic messages separately, without impacting the flow of valid data.

The DLT is automatically created by Kafka Streams if it doesn’t exist. The topic name follows the pattern <application-id>-dlt. You can consume from this topic using standard Kafka consumers to investigate the problematic records.

When you configure DLT_RETRY_ENABLE_CONFIG to true, Kafka Streams will attempt to reprocess the failing record locally within the same task. The DLT_RETRY_ATTEMPTS_CONFIG controls how many times this local retry happens. If, after these retries, the record still fails processing (e.g., due to an unrecoverable deserialization error or a persistent application logic bug), it is then sent to the DLT. The DLT_RETRY_BACKOFF_MS_CONFIG provides a delay between these local retry attempts.

When a record lands in the DLT, it’s not automatically re-injected into the main processing flow. You’ll need a separate process to:

  1. Consume from the DLT topic.
  2. Analyze the records to understand why they failed.
  3. Correct the data or the processing logic.
  4. Manually republish the corrected records to the original input topic (or a dedicated reprocessing topic) for the Streams application to pick up again.

The default behavior for DLT messages, if retries are not enabled or exhausted, is that the record is not committed to the output topic. This means the message is effectively dropped from the main stream flow but preserved in the DLT for later inspection.

You might think that enabling the DLT automatically means your application will retry indefinitely, but DLT_RETRY_ATTEMPTS_CONFIG limits this. Once those attempts are exhausted, the message will be sent to the DLT topic. If you want to effectively retry indefinitely, you’d need to implement a custom retry mechanism outside of Kafka Streams, consuming from the DLT and resending to the input.

The next challenge you’ll likely encounter is how to efficiently reprocess messages from the DLT without causing infinite loops or reprocessing already-processed-and-fixed messages.

Want structured learning?

Take the full Kafka-streams course →