The Kafka broker rejected a producer’s request because it lost track of the producer’s identity, leading to UnknownProducerId errors. This happens when a producer is configured for exactly-once semantics and the broker’s internal state for that producer becomes inconsistent, typically during a broker restart or network partition.
Common Causes and Fixes
1. Producer `enable.idempotence` is set to `true` but `max.in.flight.requests.per.connection` is too high.
- Diagnosis: Check your producer configuration in your application’s config files or code. The idempotent producer supports at most 5 in-flight requests per connection; early idempotent clients required exactly 1. The command below demonstrates an incorrect configuration; it is not a diagnostic tool.
  ```bash
  kafka-console-producer --broker-list localhost:9092 --topic my-topic --producer-property enable.idempotence=true --producer-property max.in.flight.requests.per.connection=6
  ```
- Fix: Set `max.in.flight.requests.per.connection` to 5 or lower when `enable.idempotence` is `true`. A value of `1` is the most conservative choice.
  ```bash
  kafka-console-producer --broker-list localhost:9092 --topic my-topic --producer-property enable.idempotence=true --producer-property max.in.flight.requests.per.connection=1
  ```
- Why it works: Idempotence relies on the broker tracking the last acknowledged sequence number for each producer ID and partition, and on the producer’s batches arriving in sequence. The broker retains metadata for only the last five batches per producer, so it can detect duplicates and reordering among at most five in-flight requests. Beyond that limit, a retried request can arrive after later requests have already been written and the broker can no longer reconcile the sequence. Current Java clients refuse such a configuration at startup with a `ConfigException`, so on those clients the problem surfaces before any record is sent.
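The sequence-number bookkeeping described above can be modeled in a few lines. This is a deliberately simplified sketch, not Kafka’s broker code: `SequenceTracker` and its `Result` enum are hypothetical names, and the model assumes one record per request and ignores producer epochs and batch metadata.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of per-producer sequence checking for a single partition.
// Hypothetical class for illustration; real brokers also track epochs and batches.
class SequenceTracker {
    enum Result { ACCEPTED, DUPLICATE, OUT_OF_ORDER }

    // Last accepted sequence number per producer ID (-1 means nothing seen yet).
    private final Map<Long, Integer> lastSequence = new HashMap<>();

    Result append(long producerId, int sequence) {
        int last = lastSequence.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return Result.DUPLICATE;     // a retry of something already written
        }
        if (sequence != last + 1) {
            return Result.OUT_OF_ORDER;  // a gap: an earlier request has not arrived
        }
        lastSequence.put(producerId, sequence);
        return Result.ACCEPTED;
    }

    public static void main(String[] args) {
        SequenceTracker tracker = new SequenceTracker();
        System.out.println(tracker.append(42L, 0)); // ACCEPTED
        System.out.println(tracker.append(42L, 0)); // DUPLICATE
        System.out.println(tracker.append(42L, 2)); // OUT_OF_ORDER
    }
}
```

The third call shows why ordering matters: sequence 2 arriving before sequence 1 cannot be accepted without breaking the guarantee.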
2. Producer `transactional.id` is not configured or incorrectly configured for transactional producers.
- Diagnosis: Verify that your producer is explicitly configured with a `transactional.id` if you intend to use transactions, and that this ID is unique for each producer instance that needs to participate in transactions.
  ```java
  Properties props = new Properties();
  props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-transactional-id-123");
  // ... other producer configs
  KafkaProducer<String, String> producer = new KafkaProducer<>(props);
  producer.initTransactions();
  ```
- Fix: Ensure a unique and persistent `transactional.id` is set for all producer instances that require transactional guarantees.
  ```java
  Properties props = new Properties();
  props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-unique-transactional-id-for-this-instance");
  // ... other producer configs
  KafkaProducer<String, String> producer = new KafkaProducer<>(props);
  producer.initTransactions();
  ```
- Why it works: The `transactional.id` is the key that Kafka uses to manage transactional state across producer restarts. If it’s missing or inconsistent, the broker cannot correctly associate incoming requests with ongoing transactions, leading to `UnknownProducerId`.
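One way to keep the ID both unique per instance and stable across restarts is to derive it from values that do not change when the process restarts. The sketch below assumes a hypothetical naming scheme (application name plus a fixed instance index) and uses plain string keys so it runs without the Kafka client on the classpath; `TransactionalIdConfig` and its methods are illustrative names, not part of any Kafka API.

```java
import java.util.Properties;

// Builds producer properties with a transactional.id that is stable across
// restarts and distinct per instance. The naming scheme is an assumption.
class TransactionalIdConfig {
    static String transactionalIdFor(String appName, int instanceIndex) {
        if (appName == null || appName.trim().isEmpty()) {
            throw new IllegalArgumentException("appName must not be empty");
        }
        if (instanceIndex < 0) {
            throw new IllegalArgumentException("instanceIndex must be >= 0");
        }
        return appName + "-txn-" + instanceIndex;
    }

    static Properties producerProps(String bootstrapServers, String appName, int instanceIndex) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("transactional.id", transactionalIdFor(appName, instanceIndex));
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProps("localhost:9092", "orders", 0);
        System.out.println(props.getProperty("transactional.id")); // orders-txn-0
    }
}
```

A random UUID generated at startup would be unique but not persistent, so a restarted instance could not resume or fence its earlier transactional state.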
3. Broker `transaction.state.log.replication.factor` is too low, causing transaction coordinator failure.
- Diagnosis: Check the broker configuration for `transaction.state.log.replication.factor`.
  ```bash
  # On a broker
  grep transaction.state.log.replication.factor server.properties
  ```
  If the value is `1` and you have multiple brokers, this can be an issue.
- Fix: Set `transaction.state.log.replication.factor` to at least `3` for production environments, assuming you have at least 3 brokers.
  ```properties
  # In server.properties on all brokers
  transaction.state.log.replication.factor=3
  # Optional: segment size for the transaction state log
  transaction.state.log.segment.bytes=1073741824
  ```
  Restart brokers after changing this setting. The replication factor is applied only when the `__transaction_state` topic is first created; if the topic already exists, raise its replication with a partition reassignment instead.
- Why it works: The transaction state log (a Kafka topic itself) needs to be replicated reliably to ensure transaction atomicity and recovery. A low replication factor increases the risk of losing transaction state if a broker fails, which can lead to producers losing their `ProducerId` or transaction state.
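The diagnosis step can also be scripted. The following is a minimal sketch that parses the text of a `server.properties` file and flags a replication factor below the target of 3 mentioned above (capped at the broker count). `TxnLogReplicationCheck` and the `isAtRisk` rule are hypothetical; the fallback of 3 when the property is absent reflects Kafka’s documented default for this setting.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Flags a transaction state log replication factor that is too low for the
// cluster size. The threshold rule is an assumption for illustration.
class TxnLogReplicationCheck {
    // Kafka's documented default when the property is not set.
    static final int DEFAULT_REPLICATION_FACTOR = 3;

    static int replicationFactor(String serverPropertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(serverPropertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("transaction.state.log.replication.factor");
        return raw == null ? DEFAULT_REPLICATION_FACTOR : Integer.parseInt(raw.trim());
    }

    static boolean isAtRisk(String serverPropertiesText, int brokerCount) {
        int target = Math.min(3, brokerCount); // cannot replicate beyond broker count
        return brokerCount > 1 && replicationFactor(serverPropertiesText) < target;
    }

    public static void main(String[] args) {
        String config = "broker.id=0\ntransaction.state.log.replication.factor=1\n";
        System.out.println(isAtRisk(config, 3)); // true
    }
}
```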
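4. Broker `transactional.id.replication.factor` is too low (Kafka 2.8+).
- Diagnosis: Similar to the state log, check `transactional.id.replication.factor`.
  ```bash
  # On a broker
  grep transactional.id.replication.factor server.properties
  ```
  Apache Kafka’s documented broker settings do not include a property with this name. `transactional.id` mappings are stored in the `__transaction_state` topic, whose replication is governed by `transaction.state.log.replication.factor` and `transaction.state.log.min.isr`, so confirm that the property exists in your distribution before relying on it.
- Fix: Set `transactional.id.replication.factor` to at least `3` for production environments.
  ```properties
  # In server.properties on all brokers
  transactional.id.replication.factor=3
  ```
  Restart brokers after changing this setting.
- Why it works: This configuration controls the replication factor for the topic that stores `transactional.id` mappings. Insufficient replication can lead to inconsistencies in these mappings, causing the broker to not recognize a `transactional.id` it previously knew.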
5. Producer `acks` setting is not `all` when `enable.idempotence` is `true`.
- Diagnosis: Check producer configuration for `acks`.
  ```bash
  kafka-console-producer --broker-list localhost:9092 --topic my-topic --producer-property enable.idempotence=true --producer-property acks=1
  ```
- Fix: Ensure `acks` is set to `all` when `enable.idempotence` is `true`.
  ```bash
  kafka-console-producer --broker-list localhost:9092 --topic my-topic --producer-property enable.idempotence=true --producer-property acks=all
  ```
- Why it works: For idempotence to work correctly, the producer needs confirmation that its messages have been durably written. With `acks=all`, the leader waits for every in-sync replica to acknowledge the write, and `min.insync.replicas` sets the minimum number of replicas that must be in sync for the write to be accepted. If `acks` is set to `0` or `1`, the producer does not wait for that acknowledgment, and if a leader fails before replication, the new leader may have no record of the producer’s latest sequence number.
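A pre-flight check for this rule is easy to add to application startup. The sketch below uses plain string keys and only the JDK; `IdempotenceAcksCheck` is a hypothetical helper, and it assumes that an unset `acks` is acceptable because the Java client selects `all` on its own when idempotence is enabled and `acks` is not specified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Reports a conflict between enable.idempotence=true and an explicit acks value.
// Hypothetical helper; it inspects raw properties, not a live producer.
class IdempotenceAcksCheck {
    static List<String> problems(Properties props) {
        List<String> problems = new ArrayList<>();
        if (!"true".equalsIgnoreCase(props.getProperty("enable.idempotence"))) {
            return problems; // idempotence not explicitly enabled: nothing to check
        }
        String acks = props.getProperty("acks");
        // "-1" is the numeric alias for "all".
        if (acks != null && !acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks=" + acks + " conflicts with enable.idempotence=true; use acks=all");
        }
        return problems;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "1");
        System.out.println(problems(props));
        // [acks=1 conflicts with enable.idempotence=true; use acks=all]
    }
}
```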
6. ZooKeeper session expiration or network issues between brokers.
- Diagnosis: Check broker logs for ZooKeeper connection errors or network flapping. Look for messages like `ZooKeeper connection lost` or `Connection refused` between brokers.
- Fix: Ensure ZooKeeper is stable and accessible from all brokers. Verify network connectivity and firewall rules between brokers. For ZooKeeper, ensure `tickTime`, `syncLimit`, and `initLimit` are appropriately configured for your cluster size and network latency.
- Why it works: Kafka brokers use ZooKeeper for leader election, metadata management, and coordination. If a broker loses its ZooKeeper connection or experiences network partitions with other brokers, it can lose its view of the cluster state, including which producers are active and their associated IDs.
7. Incorrectly configured `transactional.id.expiration.ms` on the broker.
- Diagnosis: Check broker configuration for `transactional.id.expiration.ms`. Broker property names use dots, not underscores, so a key such as `producer_id_expiration_ms` is not a recognized setting.
  ```bash
  # On a broker
  grep transactional.id.expiration.ms server.properties
  ```
  If this value is very low, producers might be considered expired prematurely.
- Fix: Increase `transactional.id.expiration.ms` to a sufficiently large value. The default is 7 days (604800000 ms). For most use cases, the default is fine, but if producers are expected to be idle for extended periods, you might need to increase it. Newer brokers also have a separate `producer.id.expiration.ms` for idempotent producers, which defaults to 1 day (86400000 ms).
  ```properties
  # In server.properties on all brokers
  # 7 days (default)
  transactional.id.expiration.ms=604800000
  ```
  Restart brokers.
- Why it works: Kafka brokers expire producer IDs after a period of inactivity to clean up state. If this expiration period is too short, a long-running but temporarily idle producer might have its ID expired, forcing it to re-establish its identity and potentially leading to `UnknownProducerId` on subsequent requests.
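To judge whether an expiration window is long enough, compare it with the longest idle gap you expect from a producer. The arithmetic is sketched below; `ProducerIdleCheck` is a hypothetical helper, and the simple "idle time reaches the window" rule is an approximation of broker behavior, which depends on when its cleanup task runs.

```java
import java.time.Duration;

// Compares an expected idle period with a broker expiration window in milliseconds.
// Hypothetical helper; actual expiry timing depends on the broker's cleanup schedule.
class ProducerIdleCheck {
    static boolean wouldExpire(Duration idle, long expirationMs) {
        return idle.toMillis() >= expirationMs;
    }

    public static void main(String[] args) {
        long sevenDaysMs = Duration.ofDays(7).toMillis();
        System.out.println(sevenDaysMs);                                     // 604800000
        System.out.println(wouldExpire(Duration.ofDays(8), sevenDaysMs));    // true
        System.out.println(wouldExpire(Duration.ofHours(12), sevenDaysMs));  // false
    }
}
```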
If you’ve addressed all these, the next error you’ll likely see is a NotControllerException if your controller is down, or potentially a LeaderNotAvailableException if a topic partition leader is unavailable.