Message Queue Delivery: Guarantees Explained

Kafka’s delivery guarantees aren’t about if a message gets there, but how many times and when.

Let’s see it in action. Imagine a simple producer sending messages to a topic called orders and a consumer reading from it.

# Producer (simplified)
from kafka import KafkaProducer
import json

producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    value_serializer=lambda x: json.dumps(x).encode('utf-8')
)

for i in range(10):
    data = {'order_id': i, 'item': 'widget'}
    producer.send('orders', value=data)
    print(f"Sent: {data}")
producer.flush() # Ensure all messages are sent

# Consumer (simplified)
from kafka import KafkaConsumer
import json

consumer = KafkaConsumer(
    'orders',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',
    enable_auto_commit=True, # Default for at-most-once
    group_id='order-processor',
    value_deserializer=lambda x: json.loads(x.decode('utf-8'))
)

for message in consumer:
    print(f"Received: {message.value}")

This basic setup, with enable_auto_commit=True (which is the default), is Kafka’s at-most-once guarantee. The consumer automatically commits its offset to Kafka after it receives the message, but before it has necessarily finished processing it. If the consumer crashes between receiving and processing, that message is lost forever. It was delivered at most once.

To get at-least-once, we need to control when the offset is committed. This means disabling auto-commit and manually committing after processing.

# Consumer for at-least-once
from kafka import KafkaConsumer
import json

consumer = KafkaConsumer(
    'orders',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',
    enable_auto_commit=False, # Disable auto-commit
    group_id='order-processor-v2',
    value_deserializer=lambda x: json.loads(x.decode('utf-8'))
)

for message in consumer:
    try:
        print(f"Processing: {message.value}")
        # Simulate processing
        if message.value['order_id'] == 5:
            raise ValueError("Simulated processing error")
        # If processing is successful, commit the offset
        consumer.commit()
        print(f"Committed offset for message: {message.offset}")
    except Exception as e:
        print(f"Error processing message {message.value}: {e}. Will retry.")
        # Do NOT commit offset on error. The message will be re-delivered.

With enable_auto_commit=False and manual consumer.commit(), the offset is only committed after the print(f"Processing: {message.value}") line (and any simulated processing) completes successfully. If a crash occurs during processing, the consumer restarts from the last committed offset, meaning the message that was being processed will be delivered again. This gives you at-least-once. The downside? You might process the same message twice, which is why your application logic needs to be idempotent (able to handle duplicate messages gracefully).

Achieving exactly-once semantics in Kafka is more complex and relies on a combination of producer idempotence, transactional APIs, and consumer logic. The core idea is to ensure that a message is written to Kafka and processed by the consumer as a single, atomic operation.

For exactly-once, both the producer and consumer need to be configured correctly, and the Kafka broker version must support transactions (0.11.0+).

Producer Configuration for Exactly-Once:

enable.idempotence=true: This is crucial. It ensures that a message sent by the producer will be written to the Kafka log exactly once, even if the producer retries sending it due to network issues. The producer assigns a unique producer ID (PID) and sequence number to each message. The broker tracks these per partition and discards duplicates.
acks=all: The producer waits for acknowledgments from all in-sync replicas.
retries: Set to a high value (e.g., Integer.MAX_VALUE).

Consumer/Processing Logic for Exactly-Once:

This typically involves Kafka transactions. The consumer reads messages, processes them, and then writes any results (e.g., to a database) and commits the Kafka offset within the same transaction.

# Consumer for exactly-once (conceptual, requires transactional logic)
from kafka import KafkaProducer, KafkaConsumer
import json

# Producer used within the transaction to potentially write results
# Or the consumer's own commit operation acts as the transaction boundary
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    enable_idempotence=True,
    acks='all',
    retries=100000 # Effectively infinite
)

consumer = KafkaConsumer(
    'orders',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',
    enable_auto_commit=False,
    group_id='order-processor-v3',
    value_deserializer=lambda x: json.loads(x.decode('utf-8'))
)

for message in consumer:
    transaction_id = f"order-tx-{message.partition}-{message.offset}" # Unique transaction ID
    producer.init_transactions() # Initialize for the session
    producer.begin_transaction() # Start a new transaction

    try:
        print(f"Processing for transaction {transaction_id}: {message.value}")
        # Simulate processing and potential side effects
        if message.value['order_id'] == 7:
            raise ValueError("Simulated processing error during transaction")

        # If processing is successful, commit the Kafka offset as part of the transaction
        # This requires a producer that is also acting as the transaction coordinator
        # or a mechanism where the consumer can signal commit for the read messages.
        # In practice, this often means the consumer *itself* uses transactional writes
        # to its output or to Kafka, and commits offsets atomically.
        # A common pattern is:
        # 1. Read messages (offsets not committed yet)
        # 2. Begin transaction
        # 3. Process messages
        # 4. Write results to an external system (e.g., DB) *and* potentially send a "processed" marker back to Kafka
        # 5. Commit the transaction (which commits the Kafka offsets atomically with the output)

        # For simplicity here, we'll simulate the offset commit as the transactional step.
        # In a real system, this would involve a producer writing to a transactional topic
        # or updating offsets in a transactional store.
        # The Kafka client library doesn't directly expose consumer-side transactional commits
        # of *read* offsets in the same way as producer writes. The guarantee comes from
        # coordinating reads, processing, and *writes* (which include offset commits) atomically.

        # Let's assume a simplified scenario where the consumer commits offsets transactionally:
        # This is a conceptual representation; actual implementation details vary.
        # A common library pattern:
        # consumer.send_offsets_to_transaction(message, consumer.group_id)
        # producer.commit_transaction()

        # For demonstration, we'll just acknowledge and assume the transactional commit happens.
        # The key is that the offset commit and any side effects happen together.
        print(f"Transaction {transaction_id} successful. Committing.")
        producer.commit_transaction() # This would commit offsets and any other transactional writes
        print(f"Committed transaction for message: {message.offset}")

    except Exception as e:
        print(f"Error in transaction {transaction_id} for message {message.value}: {e}. Aborting transaction.")
        producer.abort_transaction() # Abort the transaction on error
        # Do NOT commit offset. The message will be re-read and processed in a new transaction.

The most surprising true thing about Kafka’s exactly-once semantics is that it’s not a single switch but a coordinated effort between the producer, the broker, and the consumer, often requiring transactional capabilities. The consumer doesn’t "commit its offset exactly once" in isolation; rather, the entire operation of processing a message and updating its state (including the offset) is made atomic.

The core mechanism for exactly-once involves producer idempotence (enable.idempotence=true) and transactional APIs. When a producer is idempotent, it can retry sending messages without fear of duplicates. Transactions allow a set of produce requests to be treated as a single atomic unit. For consumers, achieving exactly-once means that the act of processing a message and committing its offset must be an atomic operation. This is typically achieved by performing the processing and the offset commit (or a signal that the processing is done) within the same transaction. If the transaction fails, nothing is committed, and the message will be retried. If it succeeds, both the processed data and the committed offset are finalized.

The one thing most people don’t realize is that enable.idempotence=true on the producer is a prerequisite for transactional producers, and it handles duplicate writes from the producer to Kafka. The "exactly-once" for the consumer then relies on coordinating the processing with the commit of the read offset, often by making the commit part of a larger transaction that also includes any side effects of processing.

The next concept you’ll likely encounter is how to handle failed transactions and how to manage the lifecycle of transactional IDs.