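The Kafka broker rejected a producer’s request because it no longer has any state for the producer’s ID, which surfaces as an UNKNOWN_PRODUCER_ID error (UnknownProducerIdException in the Java client). This affects producers configured for idempotent or transactional (exactly-once) delivery. The broker’s state for a producer ID can disappear when the ID expires after a period of inactivity or when retention deletes every record that producer wrote to a partition; the error can also surface around broker restarts or network partitions.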

Common Causes and Fixes

1. Producer enable.idempotence is set to true but max.in.flight.requests.per.connection is greater than 5.

  • Diagnosis: Check your producer configuration.
    kafka-console-producer --bootstrap-server localhost:9092 --topic my-topic --producer-property enable.idempotence=true --producer-property max.in.flight.requests.per.connection=6
    
    (This command demonstrates the incorrect configuration, not a diagnostic tool. You’d check your application’s producer config files or code.)
  • Fix: Set max.in.flight.requests.per.connection to 5 or lower when enable.idempotence is true.
    kafka-console-producer --bootstrap-server localhost:9092 --topic my-topic --producer-property enable.idempotence=true --producer-property max.in.flight.requests.per.connection=5
    
  • Why it works: Idempotence relies on the broker tracking sequence numbers for each producer ID and partition. The broker retains sequence metadata for only the last five batches per producer, so within that window it can still detect duplicates and out-of-order retries even with several requests in flight. Beyond five, a retried batch may fall outside what the broker remembers. Setting the value to 1 is not required for idempotence and only reduces throughput. The Java client rejects a value above 5 with a ConfigException at startup when idempotence is explicitly enabled, so this cause is quick to rule out.
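
The sequence-number bookkeeping described above can be sketched as a toy model. This is not the broker's implementation: the class name SequenceTracker, its result values, and the rule that a producer with no state must start at sequence 0 are all assumptions made for illustration, and a real broker tracks sequences per partition and per batch rather than per record.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of per-producer sequence checking. Hypothetical; NOT the broker's real code. */
final class SequenceTracker {
    enum Result { ACCEPTED, DUPLICATE, OUT_OF_ORDER, UNKNOWN_PRODUCER }

    // Last accepted sequence number per producer ID (a real broker keeps this per partition).
    private final Map<Long, Integer> lastSequence = new HashMap<>();

    Result append(long producerId, int sequence) {
        Integer last = lastSequence.get(producerId);
        if (last == null) {
            // No state for this producer: this model only accepts a fresh start at sequence 0.
            if (sequence != 0) {
                return Result.UNKNOWN_PRODUCER;
            }
            lastSequence.put(producerId, 0);
            return Result.ACCEPTED;
        }
        if (sequence <= last) {
            return Result.DUPLICATE;    // a retry of something already written
        }
        if (sequence != last + 1) {
            return Result.OUT_OF_ORDER; // a gap means an earlier write is missing
        }
        lastSequence.put(producerId, sequence);
        return Result.ACCEPTED;
    }

    /** Simulates the broker dropping a producer's state (for example, expiry or retention). */
    void forget(long producerId) {
        lastSequence.remove(producerId);
    }

    public static void main(String[] args) {
        SequenceTracker broker = new SequenceTracker();
        System.out.println(broker.append(42L, 0)); // ACCEPTED
        System.out.println(broker.append(42L, 1)); // ACCEPTED
        System.out.println(broker.append(42L, 1)); // DUPLICATE
        broker.forget(42L);
        System.out.println(broker.append(42L, 2)); // UNKNOWN_PRODUCER
    }
}
```

The last call shows the shape of the error this article is about: the producer continues from its own sequence number, but the model no longer remembers the producer at all.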

2. Producer transactional.id is not configured or incorrectly configured for transactional producers.

  • Diagnosis: Verify that your producer is explicitly configured with a transactional.id if you intend to use transactions, and that this ID is unique for each producer instance that needs to participate in transactions.
    Properties props = new Properties();
    props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-transactional-id-123");
    // ... other producer configs
    KafkaProducer<String, String> producer = new KafkaProducer<>(props);
    producer.initTransactions();
    
  • Fix: Ensure a unique and persistent transactional.id is set for all producer instances that require transactional guarantees.
    Properties props = new Properties();
    props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-unique-transactional-id-for-this-instance");
    // ... other producer configs
    KafkaProducer<String, String> producer = new KafkaProducer<>(props);
    producer.initTransactions();
    
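  • Why it works: The transactional.id is the key that Kafka uses to manage transactional state across producer restarts. The transaction coordinator maps it to a producer ID and epoch, so a restarted instance with the same transactional.id resumes the same identity and fences off any older instance. If the ID changes on every restart, each start is treated as a brand-new producer and the old state is left to expire; if two live instances share one ID, one of them is fenced. If transactional.id is missing entirely, initTransactions() fails with an IllegalStateException rather than UnknownProducerId.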
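
One way to keep the transactional.id unique per instance yet stable across restarts is to derive it from values that do not change when the process restarts. The sketch below is only an illustration: the helper name TransactionalIds, the "-tx-" naming scheme, and the idea of a fixed instance index (for example, a StatefulSet ordinal) are assumptions, not a Kafka convention.

```java
import java.util.Properties;

/** Hypothetical helper that derives a stable transactional.id per application instance. */
final class TransactionalIds {
    /** Builds an ID such as "orders-service-tx-3"; the naming scheme is an assumption. */
    static String forInstance(String applicationName, int instanceIndex) {
        if (applicationName == null || applicationName.trim().isEmpty()) {
            throw new IllegalArgumentException("applicationName must not be empty");
        }
        if (instanceIndex < 0) {
            throw new IllegalArgumentException("instanceIndex must not be negative");
        }
        return applicationName.trim() + "-tx-" + instanceIndex;
    }

    /** Returns producer properties containing only transactional.id; other configs are left to the caller. */
    static Properties producerProps(String applicationName, int instanceIndex) {
        Properties props = new Properties();
        props.setProperty("transactional.id", forInstance(applicationName, instanceIndex));
        return props;
    }

    public static void main(String[] args) {
        // prints "orders-service-tx-3"
        System.out.println(producerProps("orders-service", 3).getProperty("transactional.id"));
    }
}
```

Random values such as a UUID generated at startup would defeat the purpose, because the ID would differ after every restart.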

3. Broker transaction.state.log.replication.factor is too low, causing transaction coordinator failure.

  • Diagnosis: Check the broker configuration for transaction.state.log.replication.factor.
    # On a broker
    grep transaction.state.log.replication.factor server.properties
    
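    The default is 3, but sample single-broker configurations often set it to 1. If the value is 1 and you have multiple brokers, this can be an issue. You can also inspect the live __transaction_state topic with kafka-topics --describe to see its actual replication factor.
  • Fix: Set transaction.state.log.replication.factor to at least 3 for production environments, assuming you have at least 3 brokers.
    # In server.properties on all brokers
    transaction.state.log.replication.factor=3
    # Optional: segment size for the transaction state log
    transaction.state.log.segment.bytes=1073741824
    
    Keep comments on their own lines: a trailing "# ..." after a value in a properties file becomes part of the value. Restart brokers after changing this setting. Note that the replication factor is applied only when the __transaction_state topic is first created; if the topic already exists, increase its replication with a partition reassignment (kafka-reassign-partitions) instead.
  • Why it works: The transaction state log (a Kafka topic itself) needs to be replicated reliably to ensure transaction atomicity and recovery. A low replication factor increases the risk of losing transaction state if a broker fails, which can lead to producers losing their ProducerId or transaction state.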
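
The checks above can be sketched as a small validator over the text of server.properties. This is an illustrative sketch, not a Kafka tool: the class name TxnLogConfigCheck and its messages are hypothetical, and the fallback values 3 and 2 are assumed to match the broker defaults for the replication factor and transaction.state.log.min.isr.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical sanity check for transaction state log settings in server.properties. */
final class TxnLogConfigCheck {
    static List<String> check(String serverProperties, int brokerCount) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(serverProperties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> problems = new ArrayList<>();
        // Fallbacks are assumed broker defaults (3 and 2).
        Integer rf = readInt(props, "transaction.state.log.replication.factor", "3", problems);
        Integer minIsr = readInt(props, "transaction.state.log.min.isr", "2", problems);
        if (rf != null && rf > brokerCount) {
            problems.add("replication factor " + rf + " exceeds broker count " + brokerCount);
        }
        if (rf != null && rf < 3 && brokerCount >= 3) {
            problems.add("replication factor " + rf + " is below 3 on a cluster of " + brokerCount);
        }
        if (rf != null && minIsr != null && minIsr > rf) {
            problems.add("min.isr " + minIsr + " is greater than replication factor " + rf);
        }
        return problems;
    }

    private static Integer readInt(Properties props, String key, String fallback, List<String> problems) {
        String raw = props.getProperty(key, fallback).trim();
        try {
            return Integer.valueOf(raw);
        } catch (NumberFormatException e) {
            // Typical cause: a trailing "# comment" on the same line as the value.
            problems.add(key + " is not an integer: '" + raw + "'");
            return null;
        }
    }

    public static void main(String[] args) {
        String config = "transaction.state.log.replication.factor=3 # inline comment\n";
        check(config, 3).forEach(System.out::println);
    }
}
```

The main method feeds in a line with an inline comment to show that java.util.Properties keeps the comment text as part of the value, which is why the check reports it as a non-integer.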

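4. Broker transaction.state.log.min.isr is too low.

  • Diagnosis: Kafka has no separate transactional.id.replication.factor setting. The mapping from transactional.id to producer ID is stored in the same __transaction_state topic as the rest of the transaction state, so its durability is governed by transaction.state.log.replication.factor (see above) and transaction.state.log.min.isr.
    # On a broker
    grep transaction.state.log.min.isr server.properties
    
  • Fix: Set transaction.state.log.min.isr to at least 2 (the default) for production environments, alongside a replication factor of 3.
    # In server.properties on all brokers
    transaction.state.log.min.isr=2
    
    Restart brokers after changing this setting. Like the replication factor, this value is applied when the topic is created, so for an existing __transaction_state topic you may need to change its min.insync.replicas topic config instead.
  • Why it works: With a minimum in-sync replica count of 1, a write to the transaction state log can be acknowledged while it exists on a single broker. If that broker fails, the coordinator that takes over may not have the mapping, causing the broker to not recognize a transactional.id it previously knew.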

5. Producer acks setting is not all when enable.idempotence is true.

  • Diagnosis: Check producer configuration for acks.
    kafka-console-producer --bootstrap-server localhost:9092 --topic my-topic --producer-property enable.idempotence=true --producer-property acks=1
    
  • Fix: Ensure acks is set to all when enable.idempotence is true.
    kafka-console-producer --bootstrap-server localhost:9092 --topic my-topic --producer-property enable.idempotence=true --producer-property acks=all
    
  • Why it works: For idempotence to work correctly, the producer needs confirmation that its messages have been durably written. With acks=all, the leader acknowledges a write only after all in-sync replicas have it, and the write fails if fewer than min.insync.replicas replicas are in sync. If acks is set to 0 or 1, an acknowledged write can be lost when the leader fails before replication, and the new leader will not have the sequence numbers the producer expects. The Java client rejects enable.idempotence=true combined with acks=0 or acks=1 with a ConfigException when idempotence is explicitly enabled.
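
The producer-side requirements for idempotence (acks=all, at most 5 in-flight requests, retries greater than 0) can be checked before the producer is created. The following is a minimal sketch using plain string keys and java.util.Properties; the class name IdempotenceConfigCheck is hypothetical, and the fallback values are assumed to match the defaults of Kafka 3.0+ Java clients.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-flight check for idempotent producer settings. */
final class IdempotenceConfigCheck {
    static List<String> check(Properties props) {
        List<String> problems = new ArrayList<>();
        // Only validate when idempotence is explicitly requested.
        if (!"true".equalsIgnoreCase(props.getProperty("enable.idempotence", "false").trim())) {
            return problems;
        }
        // Fallbacks below are assumed client defaults (Kafka 3.0+).
        String acks = props.getProperty("acks", "all").trim();
        if (!acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks must be 'all' (or -1) with idempotence, found: " + acks);
        }
        // Integer.parseInt throws NumberFormatException on malformed values.
        int inFlight = Integer.parseInt(
                props.getProperty("max.in.flight.requests.per.connection", "5").trim());
        if (inFlight > 5) {
            problems.add("max.in.flight.requests.per.connection must be at most 5, found: " + inFlight);
        }
        int retries = Integer.parseInt(
                props.getProperty("retries", String.valueOf(Integer.MAX_VALUE)).trim());
        if (retries <= 0) {
            problems.add("retries must be greater than 0, found: " + retries);
        }
        return problems;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "1");
        check(props).forEach(System.out::println);
    }
}
```

Running such a check in a unit test or at application startup gives a readable message instead of a ConfigException stack trace from the client.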

6. ZooKeeper session expiration or network issues between brokers.

  • Diagnosis: Check broker logs for ZooKeeper connection errors or network flapping. Look for messages like ZooKeeper connection lost or Connection refused between brokers.
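  • Fix: Ensure ZooKeeper is stable and accessible from all brokers. Verify network connectivity and firewall rules between brokers. For ZooKeeper, ensure tickTime, syncLimit, and initLimit are appropriately configured for your cluster size and network latency, and check that the brokers’ zookeeper.session.timeout.ms is not too aggressive for your network.
  • Why it works: In ZooKeeper-based clusters, Kafka brokers use ZooKeeper for controller election, metadata management, and coordination. Producer state itself is kept with the partition data on the brokers, not in ZooKeeper, but a broker that loses its ZooKeeper session or is partitioned from other brokers can trigger leader and coordinator changes. Each change means another broker has to rebuild producer state from its own copy of the log, and frequent or unclean changes increase the chance that this state is incomplete.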

7. Incorrectly configured transactional.id.expiration.ms on the broker.

  • Diagnosis: Check broker configuration for transactional.id.expiration.ms. (Broker settings use dots, not underscores; there is no producer_id_expiration_ms setting.)
    # On a broker
    grep transactional.id.expiration.ms server.properties
    
    If this value is very low, producers might be considered expired prematurely.
  • Fix: Increase transactional.id.expiration.ms to a sufficiently large value. The default is 7 days (604800000 ms). For most use cases, the default is fine, but if producers are expected to be idle for extended periods, you might need to increase it. Newer Kafka releases also have a separate producer.id.expiration.ms setting (default 1 day, 86400000 ms) for producer IDs in general; check which settings your broker version supports.
    # In server.properties on all brokers
    # 7 days (default)
    transactional.id.expiration.ms=604800000
    
    Restart brokers.
  • Why it works: Kafka brokers expire producer IDs after a period of inactivity to clean up state. If this expiration period is too short, a long-running but temporarily idle producer might have its ID expired, forcing it to re-establish its identity and potentially leading to UnknownProducerId on subsequent requests.
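
The expiry rule can be approximated with a few lines of arithmetic. This is a simplified sketch under stated assumptions: the class name ProducerIdExpiry is hypothetical, and a real broker evaluates expiry periodically in the background, so the exact moment state is dropped can be later than this calculation suggests.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical approximation of "has this producer been idle longer than the expiration?". */
final class ProducerIdExpiry {
    /** Returns true when the idle time since the last write reaches the expiration period. */
    static boolean isExpired(Instant lastWrite, Instant now, Duration expiration) {
        return Duration.between(lastWrite, now).compareTo(expiration) >= 0;
    }

    public static void main(String[] args) {
        Duration sevenDays = Duration.ofMillis(604_800_000L); // the 7-day default, in ms
        Instant now = Instant.now();
        System.out.println(sevenDays.toDays());                                            // 7
        System.out.println(isExpired(now.minus(Duration.ofDays(8)), now, sevenDays));      // true
        System.out.println(isExpired(now.minus(Duration.ofHours(12)), now, sevenDays));    // false
    }
}
```

Comparing the longest idle gap you expect from a producer against the configured expiration is usually enough to tell whether this cause applies.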

If you’ve addressed all these, the next error you’ll likely see is a NotControllerException if your controller is down, or potentially a LeaderNotAvailableException if a topic partition leader is unavailable.
