MQTT’s Quality of Service (QoS) levels aren’t just about reliability; they’re a surprisingly nuanced dance between delivery guarantees and network overhead.
Let’s see this in action. Imagine a simple sensor publishing its temperature every second.
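```python
import time

import paho.mqtt.client as mqtt

broker_address = "localhost"
broker_port = 1883
topic = "sensor/temperature"
qos = 0  # set to 1 or 2 to try the other levels

# paho-mqtt 2.x requires the callback API version; on 1.x use mqtt.Client()
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.connect(broker_address, broker_port, 60)
# Background network loop; without it, QoS 1/2 acknowledgments are never processed
client.loop_start()

for i in range(10):
    temperature = 20 + i * 0.5
    message = f"Temperature: {temperature}°C"
    client.publish(topic, message, qos=qos)
    print(f"Published: {message} (QoS {qos})")
    time.sleep(1)

client.loop_stop()
client.disconnect()
```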
If we run this with qos=0, the client sends the message and immediately forgets about it. The broker might receive it, but there’s no confirmation.
Now, let’s look at the levels:
- QoS 0: At Most Once
  - What it is: Fire and forget. The message is sent once, and if it gets lost, it’s gone.
  - When to use it: For data that’s frequently updated and where occasional loss is acceptable, like real-time sensor readings where a slightly stale reading is fine. Think high-volume, low-criticality data.
  - Overhead: Minimal. No acknowledgments, no retransmissions.
- QoS 1: At Least Once
  - What it is: The receiver acknowledges each message with a PUBACK. Until the sender gets that PUBACK, it keeps a copy of the message and resends it with the DUP flag set. The resend typically happens when the session resumes after a reconnect; some MQTT 3.x clients also retry on a timer, while MQTT 5 only permits the resend on reconnect. This guarantees the message arrives at least once, but it can arrive more than once if the message was delivered and only the PUBACK was lost.
  - When to use it: When you need to ensure a message is delivered, but duplicates are manageable. For example, logging events where you need to know an event happened, even if you might log it twice.
  - How it works:
    - Publisher sends a PUBLISH packet and keeps a copy.
    - Broker receives the PUBLISH, stores it, and sends a PUBACK to the publisher.
    - Publisher receives the PUBACK and considers the message delivered.
    - If the publisher never receives the PUBACK, it resends the PUBLISH with the DUP flag set.
    - If the broker receives a resent PUBLISH after it has already acknowledged the original, it cannot recognize it as a duplicate. It processes the message again and sends another PUBACK, which is why subscribers can see the same message twice.
  - Overhead: Moderate. Requires a PUBACK for every message, plus possible retransmissions.
- QoS 2: Exactly Once
  - What it is: The strongest guarantee. The sender and receiver run a four-part handshake (PUBLISH, PUBREC, PUBREL, PUBCOMP) so that the message is delivered exactly once.
  - When to use it: For critical operations where duplicates or lost messages are unacceptable. Think financial transactions, commands that trigger state changes (like turning a critical machine on/off), or data that must be processed uniquely.
  - How it works:
    - Step 1 (Publish): Publisher sends a PUBLISH packet with QoS 2 and keeps a copy.
    - Step 2 (Receive): Broker receives the PUBLISH, stores the message (or at least its packet identifier), and sends a PUBREC (Received) back to the publisher.
    - Step 3 (Release): Publisher receives the PUBREC, discards its copy of the message, and sends a PUBREL (Release) to the broker. This tells the broker that the PUBLISH will not be resent, so it is safe to forget the packet identifier.
    - Step 4 (Complete): Broker receives the PUBREL, discards the state it kept for that packet identifier, and sends a PUBCOMP (Complete) back to the publisher.
    - If the publisher doesn’t receive PUBREC, it resends PUBLISH. Once it has PUBREC, it sends PUBREL, and if it doesn’t receive PUBCOMP, it resends PUBREL.
    - If the broker receives a duplicate PUBLISH for a packet identifier it has not yet released, it does not deliver the message again; it only resends PUBREC. If it receives a duplicate PUBREL after it has already sent PUBCOMP, it sends PUBCOMP again.
  - Overhead: Highest. Requires a four-part handshake, more state to manage on both ends, and more network traffic.
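The QoS 1 duplicate case can be sketched with a small in-memory simulation, with no broker or network involved. The names `Qos1Receiver`, `publish_qos1`, and the scripted `puback_lost` list are hypothetical, invented for this illustration; it models the protocol's logic only and is not how any particular broker is implemented.

```python
class Qos1Receiver:
    """Toy QoS 1 receiver: acknowledges every PUBLISH and never deduplicates."""

    def __init__(self):
        self.delivered = []  # what subscribers would see

    def on_publish(self, packet_id, payload):
        # QoS 1 keeps no per-message history, so a resend looks like a new message.
        self.delivered.append(payload)
        return ("PUBACK", packet_id)


def publish_qos1(receiver, packet_id, payload, puback_lost):
    """Send until a PUBACK gets through.

    puback_lost is a scripted list of booleans, one per attempt:
    True means the PUBACK for that attempt is dropped on the way back.
    Returns the number of PUBLISH packets sent.
    """
    attempts = 0
    for lost in puback_lost:
        attempts += 1
        ack = receiver.on_publish(packet_id, payload)
        if not lost and ack == ("PUBACK", packet_id):
            return attempts  # sender can now discard its stored copy
    raise RuntimeError("no PUBACK received in the scripted attempts")


receiver = Qos1Receiver()
# The first PUBACK is lost, the second one arrives.
sent = publish_qos1(receiver, packet_id=1, payload="door opened",
                    puback_lost=[True, False])
print(sent)                # 2
print(receiver.delivered)  # ['door opened', 'door opened']
```

The message was never lost, yet the subscriber sees it twice, because only the acknowledgment went missing. Consumers of QoS 1 topics therefore need to be idempotent or deduplicate on their own.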
The magic of QoS 2 lies in its ability to handle network glitches at each step of the handshake. If the PUBREC from the broker is lost, the publisher re-sends PUBLISH, and the broker recognizes the packet identifier and answers with PUBREC without delivering the message a second time. If the PUBREL or the PUBCOMP is lost, the publisher re-sends PUBREL until it sees PUBCOMP. This back-and-forth ensures that both sender and receiver agree on whether the message has been fully processed, preventing both loss and duplication.
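The same kind of toy simulation shows why the QoS 2 handshake survives lost acknowledgments. Again, `Qos2Receiver`, `publish_qos2`, and the scripted loss lists are hypothetical names for illustration. The sketch assumes the receiver delivers the message when it first sees the PUBLISH; a receiver may instead wait until PUBREL, and the guarantee is the same either way.

```python
class Qos2Receiver:
    """Toy QoS 2 receiver that remembers packet ids between PUBLISH and PUBREL."""

    def __init__(self):
        self.delivered = []
        self.pending_ids = set()  # ids seen in PUBLISH but not yet released

    def on_publish(self, packet_id, payload):
        if packet_id not in self.pending_ids:
            self.pending_ids.add(packet_id)
            self.delivered.append(payload)  # deliver on first sight only
        return ("PUBREC", packet_id)

    def on_pubrel(self, packet_id):
        self.pending_ids.discard(packet_id)  # safe to forget the id now
        return ("PUBCOMP", packet_id)        # also sent for a repeated PUBREL


def publish_qos2(receiver, packet_id, payload, pubrec_lost, pubcomp_lost):
    """Run the handshake with scripted losses; return packets sent by the sender."""
    sent = 0
    for lost in pubrec_lost:       # phase 1: PUBLISH until a PUBREC arrives
        sent += 1
        receiver.on_publish(packet_id, payload)
        if not lost:
            break
    else:
        raise RuntimeError("no PUBREC received in the scripted attempts")
    for lost in pubcomp_lost:      # phase 2: PUBREL until a PUBCOMP arrives
        sent += 1
        receiver.on_pubrel(packet_id)
        if not lost:
            return sent
    raise RuntimeError("no PUBCOMP received in the scripted attempts")


receiver = Qos2Receiver()
# Lose the first PUBREC and the first PUBCOMP.
sent = publish_qos2(receiver, 7, "valve=open",
                    pubrec_lost=[True, False], pubcomp_lost=[True, False])
print(sent)                # 4
print(receiver.delivered)  # ['valve=open']
```

Two PUBLISH packets and two PUBREL packets were sent, but the payload was delivered once. The packet identifier kept in `pending_ids` is what lets the receiver tell a resend from a new message.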
The primary trade-off is network bandwidth and latency. QoS 0 is the lightest, QoS 1 adds a single round trip, and QoS 2 adds two round trips. Choosing the right QoS level is about balancing the certainty of delivery against the cost of achieving it.
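As a rough back-of-the-envelope check, the packet counts can be tallied for the one-message-per-second sensor from earlier. This sketch assumes no retransmissions and counts only the publisher-to-broker hop; `PACKETS_PER_MESSAGE` and `packets_per_day` are names made up for the example.

```python
# Control packets exchanged between publisher and broker per message,
# assuming nothing is lost and nothing is resent.
PACKETS_PER_MESSAGE = {
    0: 1,  # PUBLISH
    1: 2,  # PUBLISH, PUBACK
    2: 4,  # PUBLISH, PUBREC, PUBREL, PUBCOMP
}

SECONDS_PER_DAY = 86_400


def packets_per_day(qos, messages_per_second=1):
    """Packets per day on one hop for a steady publish rate."""
    return PACKETS_PER_MESSAGE[qos] * messages_per_second * SECONDS_PER_DAY


for qos in (0, 1, 2):
    print(qos, packets_per_day(qos))
# 0 86400
# 1 172800
# 2 345600
```

The broker-to-subscriber hop runs its own exchange at the QoS negotiated for that subscription, so end-to-end traffic is higher than these single-hop figures.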
What many people overlook is how QoS levels impact the state managed by both the client and the broker. For QoS 1 and 2, both sides must maintain state about pending messages and acknowledgments, which can become a significant memory and processing burden for brokers with many concurrent connections or for clients with very high message throughput.
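One common way to bound that state is to cap the number of unacknowledged messages. The sketch below is a hypothetical sender-side store (`InflightWindow` is not a real library class) that refuses new QoS 1/2 sends once the window is full. For simplicity it never reuses packet identifiers, whereas real MQTT identifiers are 16-bit and get recycled.

```python
class InflightWindow:
    """Hypothetical sender-side store of unacknowledged QoS 1/2 messages."""

    def __init__(self, max_inflight):
        self.max_inflight = max_inflight
        self.pending = {}  # packet_id -> payload awaiting acknowledgment
        self.next_id = 1   # simplification: ids are never recycled here

    def try_send(self, payload):
        """Return a packet id, or None if the window is full."""
        if len(self.pending) >= self.max_inflight:
            return None
        packet_id = self.next_id
        self.next_id += 1
        self.pending[packet_id] = payload  # kept until acknowledged
        return packet_id

    def on_ack(self, packet_id):
        """Drop state once the final acknowledgment (PUBACK or PUBCOMP) arrives."""
        self.pending.pop(packet_id, None)


window = InflightWindow(max_inflight=2)
ids = [window.try_send(p) for p in ("a", "b", "c")]
print(ids)                     # [1, 2, None]
window.on_ack(1)               # frees one slot
print(window.try_send("c"))    # 3
print(sorted(window.pending))  # [2, 3]
```

Real clients usually expose a comparable limit; paho-mqtt, for instance, has `max_inflight_messages_set`. Memory use then scales with the window size rather than with the publish rate.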
The next step is understanding how these QoS levels interact with retained messages and last will and testament (LWT) messages.