A single MQTT broker often struggles with high-volume, high-velocity sensor telemetry, and the usual symptoms are dropped messages and rising latency.

Here’s how it actually looks when a bunch of sensors start chattering:

Imagine a single MQTT broker, say Mosquitto, running on a modest EC2 instance. We’ve got 10,000 sensors, each publishing a small JSON payload every second to a unique topic like sensors/room1/temperature.

# On the broker machine
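# Count established connections on the MQTT port
# (matching ':1883 ' avoids false hits on PIDs or other port numbers containing 1883)
netstat -ant | grep ':1883 ' | grep -c ESTABLISHED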

# Monitor network traffic (example for eth0)
nload eth0

The broker, by default, tries to keep all these connections alive, buffer outgoing messages for each subscriber, and manage the routing. When the message rate hits, say, 10,000 messages per second, the broker’s CPU can spike, its network buffer can fill up, and disk I/O might become a bottleneck if persistence is enabled. Subscriptions themselves consume memory. A single client subscribing to 5,000 topics can eat up a noticeable chunk of RAM.

The core problem is a mismatch in traffic shape. Traditional brokers are designed for general many-to-many messaging, but sensor telemetry is a high-rate fan-out: each sensor is a publisher, and potentially many applications (dashboards, analytics engines, control systems) subscribe to its data. The broker has to duplicate each message and send a copy to every interested subscriber, so outbound traffic grows with the number of subscribers.
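To put rough numbers on that duplication, here is a back-of-the-envelope sketch. The 100-byte payload and the five subscribers are assumptions chosen for illustration, and the bandwidth figure ignores MQTT and TCP header overhead, so real traffic will be somewhat higher.

```python
def outbound_rate(inbound_msgs_per_sec: float, matching_subscribers: int) -> float:
    """Messages per second the broker must write out, assuming every
    matching subscriber gets its own copy of each inbound message."""
    return inbound_msgs_per_sec * matching_subscribers


def outbound_mbit_per_sec(inbound_msgs_per_sec: float,
                          matching_subscribers: int,
                          payload_bytes: int) -> float:
    """Payload-only outbound bandwidth in Mbit/s (protocol overhead ignored)."""
    msgs_out = outbound_rate(inbound_msgs_per_sec, matching_subscribers)
    return msgs_out * payload_bytes * 8 / 1_000_000


# Assumed scenario: 10,000 msgs/sec in, 5 subscribers on sensors/#, 100-byte payloads
print(outbound_rate(10_000, 5))               # messages/sec leaving the broker
print(outbound_mbit_per_sec(10_000, 5, 100))  # payload bandwidth in Mbit/s
```

With those assumed numbers the broker receives 10,000 messages per second but has to send 50,000, which is why the outbound side tends to saturate first.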

Let’s look at a typical setup. A single Mosquitto instance might be configured like this:

# mosquitto.conf
listener 1883
max_connections 10000
# ... other settings

And a client publishing:

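import json
import time

import paho.mqtt.client as mqtt

# paho-mqtt 1.x constructor; on paho-mqtt 2.x use
# mqtt.Client(mqtt.CallbackAPIVersion.VERSION2) instead.
client = mqtt.Client()
client.connect("your_broker_ip", 1883, 60)  # host, port, keepalive in seconds
client.loop_start()  # run the network loop in a background thread

sensor_id = "sensor_abc"
for i in range(100000):
    payload = {"sensor_id": sensor_id, "timestamp": time.time(), "value": i % 100}
    client.publish(f"sensors/{sensor_id}/data", json.dumps(payload), qos=1)
    time.sleep(0.01)  # Simulate ~100 messages/sec per sensor

# Disconnect cleanly and stop the background network thread
client.disconnect()
client.loop_stop()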

If you have 100 such clients, you’re hitting 10,000 messages/sec. On Mosquitto, the relevant counters live under the $SYS topic tree: $SYS/broker/clients/connected will climb, and $SYS/broker/messages/received and $SYS/broker/messages/sent will surge. If the outbound message rate exceeds the broker’s ability to push data through its network interface or dispatch it to subscribers, you’ll see latency and dropped messages.
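Mosquitto publishes its counters as cumulative totals on $SYS topics, so seeing a rate means differencing two samples. The helper below is a minimal sketch of that step; the class name and the idea of feeding it from an MQTT client’s message callback are assumptions for illustration, not part of any library.

```python
class SysCounterRates:
    """Turns cumulative $SYS counter samples into per-second rates."""

    def __init__(self):
        self._last = {}  # topic -> (timestamp, cumulative count)

    def update(self, topic: str, payload: bytes, now: float):
        """Record a sample; return msgs/sec since the previous one, or None."""
        count = int(payload.decode("ascii").strip())
        prev = self._last.get(topic)
        self._last[topic] = (now, count)
        if prev is None:
            return None  # first sample, nothing to compare against
        prev_time, prev_count = prev
        elapsed = now - prev_time
        if elapsed <= 0 or count < prev_count:
            return None  # clock glitch, or the counter reset (e.g. broker restart)
        return (count - prev_count) / elapsed


rates = SysCounterRates()
rates.update("$SYS/broker/messages/sent", b"1000", now=0.0)
print(rates.update("$SYS/broker/messages/sent", b"51000", now=10.0))
```

Comparing the rate for messages/sent against the rate for messages/received gives a rough view of how much fan-out the broker is doing.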

The most surprising thing about MQTT telemetry is that the broker often becomes the bottleneck for data distribution, not just message routing. It’s not just about whether the broker can receive the message, but whether it can send it to potentially thousands of subscribers, each with their own network conditions and processing speeds.

Consider this scenario: a streaming analytics job needs to subscribe to sensors/#. The broker, for every single sensor message, has to check its internal subscription table, find the sensors/# subscription, and then enqueue that message for the analytics client. If there are many such wildcard subscriptions or many specific topic subscriptions, the overhead of matching and enqueuing becomes significant.
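The matching step can be sketched in a few lines. This is a deliberately naive linear scan, not how Mosquitto or any other broker implements it (real brokers typically use a tree keyed on topic levels); the function names and the sample subscriptions are hypothetical. It only shows the work that has to happen for every published message.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name.

    Assumes the filter is well-formed ('#' only as the last level).
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # Per the MQTT spec, a leading wildcard must not match '$'-prefixed
    # topics such as $SYS/...
    if topic.startswith("$") and filter_levels[0] in ("#", "+"):
        return False

    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard: matches the rest, including nothing
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal mismatch ('+' matches any single level)
    return len(filter_levels) == len(topic_levels)


def matching_clients(subscriptions: dict, topic: str) -> list:
    """Linear scan over every filter: cost grows with total subscriptions."""
    return [
        client_id
        for client_id, filters in subscriptions.items()
        if any(topic_matches(f, topic) for f in filters)
    ]


subscriptions = {
    "analytics": ["sensors/#"],
    "dashboard": ["sensors/+/data"],
    "room1-controller": ["sensors/room1/temperature"],
}
print(matching_clients(subscriptions, "sensors/sensor_abc/data"))
```

At 10,000 messages per second, whatever this lookup costs is paid 10,000 times per second, before a single byte is written to a subscriber.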

Here’s a look at the internal state of a busy broker (this is conceptual, as direct inspection is hard without custom tooling):

  • Connection Management: Each ESTABLISHED TCP connection has overhead. Maintaining TLS sessions for thousands of clients is CPU-intensive.
  • Subscription Table: A massive hash map or trie storing every client’s subscriptions. Wildcard matching (#, +) is computationally more expensive.
  • Message Queues: Each subscriber has a queue of outgoing messages. If a subscriber is slow, its queue grows and consumes memory until a configured limit is reached, at which point further messages for that subscriber are dropped.
  • Network I/O: The broker’s event loop is constantly polling sockets, reading incoming data, and writing outgoing data. This is often the primary CPU consumer.
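The message-queue behavior in the list above can be illustrated with a toy model. This is a sketch, not broker code: the class and its fields are hypothetical, and the limit plays a role comparable to Mosquitto’s max_queued_messages setting. The per-tick rates in the simulation are assumed numbers.

```python
from collections import deque


class SubscriberQueue:
    """Bounded outgoing queue for one subscriber (toy model)."""

    def __init__(self, max_queued: int):
        self.max_queued = max_queued
        self.pending = deque()
        self.dropped = 0

    def enqueue(self, message) -> bool:
        """Queue a message; drop it and return False if the queue is full."""
        if len(self.pending) >= self.max_queued:
            self.dropped += 1
            return False
        self.pending.append(message)
        return True

    def drain(self, n: int) -> list:
        """Simulate the subscriber consuming up to n messages."""
        sent = []
        while self.pending and len(sent) < n:
            sent.append(self.pending.popleft())
        return sent


# Assumed rates: 100 messages arrive per tick, the subscriber keeps up with 60
queue = SubscriberQueue(max_queued=1000)
for tick in range(50):
    for i in range(100):
        queue.enqueue((tick, i))
    queue.drain(60)

print(len(queue.pending), queue.dropped)
```

As long as the subscriber consumes more slowly than messages arrive, the queue sits at its limit and every tick drops the difference. Raising the limit only delays the drops and costs more memory.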

This is where specialized telemetry platforms shine. Instead of a single broker trying to be everything, you might have a multi-broker cluster, or even a system that bypasses the traditional broker for high-volume data. For instance, you could use a Kafka cluster as the actual backend for telemetry, with an MQTT broker acting as a lightweight gateway that only forwards messages to Kafka. The Kafka brokers are designed for high-throughput, durable streaming and can handle fan-out to consumers much more efficiently.

The pattern is often: IoT Devices -> MQTT Broker (Gateway) -> Kafka Cluster -> Consumers (Dashboards, Analytics). The MQTT broker’s job simplifies to just receiving and forwarding, offloading the heavy lifting of distribution and buffering to Kafka.

The critical insight for optimizing is to realize that a single MQTT broker is not a scalable data bus for thousands of high-frequency sensors. It’s a messaging router. For telemetry, you need a streaming platform.

In practice, the gateway is a single, well-tuned MQTT broker plus a bridge process (for example, something like mqtt-kafka-bridge) that republishes each incoming message to a Kafka topic. The MQTT broker still handles device connections, but it no longer has to manage thousands of individual message queues for subscribers; fan-out and buffering become Kafka’s job, which is what it is designed for.
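One detail any such bridge has to settle is how MQTT topics map onto Kafka topics and keys, since Kafka topic names only allow letters, digits, ".", "_" and "-". The sketch below shows one possible convention. The function name, the "telemetry" prefix, and the three-level sensors/<sensor_id>/<stream> layout are assumptions for illustration, not how mqtt-kafka-bridge or any particular tool works.

```python
import re

# Characters outside Kafka's legal topic-name set
_KAFKA_ILLEGAL = re.compile(r"[^A-Za-z0-9._-]")


def to_kafka_record(mqtt_topic: str, payload: bytes, prefix: str = "telemetry"):
    """Map one MQTT message to a (kafka_topic, key, value) tuple.

    Hypothetical convention: sensors/<sensor_id>/<stream> becomes
    topic '<prefix>.<stream>' with the sensor id as the record key.
    """
    levels = mqtt_topic.split("/")
    if len(levels) != 3 or levels[0] != "sensors" or not levels[1]:
        raise ValueError(f"unexpected topic layout: {mqtt_topic!r}")
    _, sensor_id, stream = levels
    kafka_topic = _KAFKA_ILLEGAL.sub("_", f"{prefix}.{stream}")
    # Keying by sensor id keeps each sensor's messages in one partition,
    # which preserves per-sensor ordering under Kafka's default partitioner.
    return kafka_topic, sensor_id.encode("utf-8"), payload


print(to_kafka_record("sensors/sensor_abc/data", b'{"value": 1}'))
```

The bridge would call a function like this from its MQTT message callback and hand the result to a Kafka producer. Collapsing thousands of per-sensor MQTT topics into a handful of Kafka topics is a design choice here: Kafka handles a few large, partitioned topics far better than one topic per device.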

The next step in dealing with IoT data streams is often implementing efficient data filtering and aggregation before it hits your primary analytics or storage systems.
