Scaling an MQTT broker to millions of devices isn’t just a matter of adding RAM or faster CPUs; it’s a matter of managing the state of all those connections efficiently.

Let’s see a basic MQTT flow with a few simulated clients connecting to a Mosquitto broker.

# On one terminal, start the Mosquitto broker
mosquitto -v

# On other terminals, simulate clients
# Client 1 (subscriber)
mosquitto_sub -t "sensors/+/temperature" -u alice -P password -h localhost -p 1883

# Client 2 (also subscribing)
mosquitto_sub -t "sensors/+/temperature" -u charlie -P password -h localhost -p 1883

# Client 3 (publisher; run it after the subscribers so both receive the message)
mosquitto_pub -t "sensors/kitchen/temperature" -m "21.5" -u bob -P password -h localhost -p 1883

When you scale this, the broker has to keep track of every single connection, its subscriptions, its Quality of Service (QoS) levels, and potentially, unsent messages for offline clients. This state management is the core challenge at scale.
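To make that state concrete, here is a minimal sketch of what a broker might track per client. The `Session` and `SessionStore` names are hypothetical and do not correspond to Mosquitto or EMQX internals; real brokers also track packet IDs, keep-alive timers, and in-flight windows.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Session:
    """Per-client state (hypothetical structure, for illustration only)."""
    client_id: str
    connected: bool = True
    # topic filter -> granted QoS level (0, 1, or 2)
    subscriptions: dict = field(default_factory=dict)
    # messages accepted for this client but not yet delivered
    pending: deque = field(default_factory=deque)


class SessionStore:
    """In-memory registry of sessions, keyed by client ID."""

    def __init__(self):
        self.sessions = {}

    def connect(self, client_id):
        session = self.sessions.setdefault(client_id, Session(client_id))
        session.connected = True
        return session

    def subscribe(self, client_id, topic_filter, qos):
        self.sessions[client_id].subscriptions[topic_filter] = qos

    def disconnect(self, client_id):
        # The session is kept so its subscriptions survive the disconnect.
        self.sessions[client_id].connected = False


store = SessionStore()
for name in ("alice", "charlie"):
    store.connect(name)
    store.subscribe(name, "sensors/+/temperature", qos=1)
store.disconnect("charlie")

online = [s.client_id for s in store.sessions.values() if s.connected]
print(online)  # ['alice']
```

Every entry in this registry costs memory whether or not the client is online, which is why the amount of state, rather than raw throughput, tends to dominate at scale.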

The problem is that a single broker process, even on a powerful machine, has limits. These limits aren’t just CPU or RAM, but how quickly it can iterate through its internal data structures representing connections and subscriptions. When you hit millions, the latency of these operations becomes a significant bottleneck.

Here’s how you architect for millions:

1. Horizontal Scaling with Clustering:

The most direct approach is to run multiple broker instances that form a cluster. This distributes the load and state. Tools like EMQX are built with clustering as a first-class citizen.

  • Diagnosis: If your single broker is maxing out CPU or memory, and you’re seeing connection errors like Connection refused or Resource temporarily unavailable during high connection churn, it’s time to scale horizontally.
  • Check: Count established connections on the broker machine, for example with netstat -an | grep ':1883' | grep -c ESTABLISHED. If the count is in the tens or hundreds of thousands and the CPU is pegged, you’re hitting the limits of a single node. For EMQX, emqx_ctl listeners and emqx_ctl broker stats (which reports connections.count) are your friends.
  • Fix: Deploy multiple broker nodes. For EMQX, you’d give every node the same list of seed nodes for static discovery, or use a discovery mechanism like etcd or Kubernetes. Example emqx.conf snippet for static discovery (EMQX 5.x syntax; the node names are placeholders, and each node would normally run on its own host):
    cluster {
      discovery_strategy = static
      static {
        seeds = ["emqx@192.168.1.10", "emqx@192.168.1.11"]
      }
      # ... other cluster settings
    }
    
  • Why it works: Each node handles a subset of the total connections and subscriptions, and they coordinate to route messages between nodes for topics that have subscribers on different instances.
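The inter-node routing described above can be sketched as a shared route table that maps each topic to the nodes holding subscribers for it. This is a simplified illustration, not EMQX's actual implementation: `ClusterRouter` is a hypothetical name, and wildcard filters are left out to keep the example short.

```python
from collections import defaultdict


class ClusterRouter:
    """Hypothetical cluster-wide route table: topic -> nodes with subscribers."""

    def __init__(self):
        self.routes = defaultdict(set)

    def add_route(self, topic, node):
        # Called when the first local subscriber for a topic appears on a node.
        self.routes[topic].add(node)

    def remove_route(self, topic, node):
        # Called when the last local subscriber for a topic leaves a node.
        self.routes[topic].discard(node)
        if not self.routes[topic]:
            del self.routes[topic]

    def forward_targets(self, topic, origin_node):
        """Nodes, other than the origin, that must receive a copy of the message."""
        nodes = self.routes.get(topic, set())
        return sorted(nodes - {origin_node})


router = ClusterRouter()
router.add_route("sensors/kitchen/temperature", "node1")
router.add_route("sensors/kitchen/temperature", "node2")
router.add_route("commands/lights/bedroom", "node2")

# A message published on node1 is forwarded only to node2.
print(router.forward_targets("sensors/kitchen/temperature", "node1"))  # ['node2']
# A message published on node2 with only local subscribers is not forwarded.
print(router.forward_targets("commands/lights/bedroom", "node2"))      # []
```

The point of the design is that a node forwards a message only to peers that actually have subscribers, so inter-node traffic grows with the subscriber spread rather than with the cluster size.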

2. Load Balancing and Connection Routing:

You need a way to distribute incoming connections across your broker cluster.

  • Diagnosis: Clients are experiencing intermittent connection failures, or some broker nodes are heavily loaded while others are idle.
  • Check: Monitor connection counts per broker instance. If uneven, your load balancer isn’t distributing effectively.
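  • Fix: Implement a TCP load balancer (e.g., HAProxy, AWS ELB/NLB, Nginx stream module) in front of your broker nodes. Configure it for sticky sessions if your broker implementation requires it (though most modern clustered brokers don’t). Because MQTT connections are long-lived and often idle between keep-alive pings, set the idle timeouts longer than your clients’ keep-alive interval. For HAProxy, a basic config:
    defaults
        mode tcp
        timeout connect 5s
        # Idle timeouts must exceed the clients' MQTT keep-alive interval
        timeout client 5m
        timeout server 5m

    frontend mqtt_in
        bind *:1883
        default_backend mqtt_servers
    
    backend mqtt_servers
        balance roundrobin
        server node1 192.168.1.10:1883 check
        server node2 192.168.1.11:1883 check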
    
  • Why it works: The load balancer acts as a single entry point, distributing new connections evenly across available broker nodes, preventing any single node from becoming a bottleneck.
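How evenly connections end up distributed depends on the balancing algorithm, because MQTT connections are long-lived. The small simulation below is an illustrative model rather than HAProxy's actual scheduler. It compares round-robin with least-connections assignment after one node has restarted and lost its connections; the node names and counts are made up.

```python
from itertools import cycle


def assign_round_robin(counts, new_connections):
    """Hand out new connections to nodes in turn, ignoring current load."""
    counts = dict(counts)
    nodes = cycle(sorted(counts))
    for _ in range(new_connections):
        counts[next(nodes)] += 1
    return counts


def assign_least_conn(counts, new_connections):
    """Send each new connection to the node with the fewest connections."""
    counts = dict(counts)
    for _ in range(new_connections):
        target = min(sorted(counts), key=counts.get)
        counts[target] += 1
    return counts


# node2 has just restarted, so it holds no connections yet.
start = {"node1": 1000, "node2": 0}

print(assign_round_robin(start, 1000))  # {'node1': 1500, 'node2': 500}
print(assign_least_conn(start, 1000))   # {'node1': 1000, 'node2': 1000}
```

In this model round-robin preserves the imbalance indefinitely, while least-connections fills the empty node first. HAProxy offers both strategies (`balance roundrobin` and `balance leastconn`), so which one suits a deployment depends on how often nodes restart and how long clients stay connected.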

3. Efficient Subscription Management:

When a topic has millions of subscribers, especially through wildcard subscriptions (# or +), the broker’s internal subscription tree can become enormous.

  • Diagnosis: High CPU usage related to subscription lookups or message routing, even with moderate connection counts. Slow publishing to widely subscribed topics.
  • Check: Observe broker metrics for "subscription count" and "message routing time." In EMQX, emqx_ctl subscriptions list can show the sheer volume.
  • Fix: Use brokers designed for efficient subscription indexing. EMQX, for example, uses a trie data structure optimized for wildcard matching. Limits such as mqtt.max_topic_levels and mqtt.max_subscriptions (EMQX 5.x names; check the configuration reference for your version) cap how deep and how large the index can grow, though the core data structure matters more. Avoid overly broad wildcards in your application design where possible.
  • Why it works: Optimized data structures allow the broker to find matching subscribers for a published message much faster, reducing the time spent on lookups.
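A topic trie of the kind mentioned above can be sketched in a few lines. This is a minimal illustration of the general technique, not EMQX's implementation: `TopicTrie` is a hypothetical name, and the MQTT rule that wildcards do not match topics starting with `$` is omitted.

```python
class TrieNode:
    def __init__(self):
        self.children = {}        # topic level -> TrieNode
        self.subscribers = set()  # client IDs subscribed at exactly this filter


class TopicTrie:
    """Subscription index keyed by topic levels, with + and # wildcard support."""

    def __init__(self):
        self.root = TrieNode()

    def subscribe(self, topic_filter, client_id):
        node = self.root
        for level in topic_filter.split("/"):
            node = node.children.setdefault(level, TrieNode())
        node.subscribers.add(client_id)

    def match(self, topic):
        """Return the set of client IDs whose filters match the topic."""
        found = set()
        self._match(self.root, topic.split("/"), 0, found)
        return found

    def _match(self, node, levels, i, found):
        # '#' matches the parent level and any number of levels below it.
        hash_node = node.children.get("#")
        if hash_node is not None:
            found |= hash_node.subscribers
        if i == len(levels):
            found |= node.subscribers
            return
        # Follow the exact level and the single-level wildcard '+'.
        for key in (levels[i], "+"):
            child = node.children.get(key)
            if child is not None:
                self._match(child, levels, i + 1, found)


trie = TopicTrie()
trie.subscribe("sensors/+/temperature", "alice")
trie.subscribe("sensors/#", "bob")
trie.subscribe("sensors/kitchen/humidity", "carol")

print(sorted(trie.match("sensors/kitchen/temperature")))  # ['alice', 'bob']
```

The lookup visits at most a few branches per topic level (the exact level, `+`, and `#`), so its cost depends on the depth of the topic and the number of wildcard branches rather than on the total number of subscriptions.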

4. Connection Handling and TLS Session Re-use:

MQTT clients hold one long-lived connection each, so there is nothing to pool on the broker side. What matters is how cheaply the broker can accept connections, and with TLS enabled the handshake is the expensive part.

  • Diagnosis: High CPU spikes during periods of high connection churn, especially with TLS enabled.
  • Check: Monitor CPU usage on the broker nodes. Look for spikes correlating with new connection attempts.
  • Fix: Enable TLS session resumption (session caching or session tickets) wherever TLS is terminated, and make sure your clients actually re-use sessions when they reconnect. Many deployments also terminate TLS at the load balancer so that the broker nodes never perform handshakes at all. This is a server-side configuration change rather than a client-side fix.
  • Why it works: Re-using existing TLS sessions avoids the expensive cryptographic operations of a full handshake for returning clients, significantly reducing CPU load.

5. Persistent Sessions and Message Queuing:

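Handling persistent sessions (Clean Session = false in MQTT 3.1.1, or Clean Start = 0 with a non-zero Session Expiry Interval in MQTT 5) and QoS 1/2 messages for offline devices requires significant state.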

  • Diagnosis: Brokers running out of memory, or messages being lost when clients disconnect and reconnect. High disk I/O if using disk-based persistence for messages.
  • Check: Monitor broker memory usage and disk I/O. Check broker logs for messages related to "out of memory" or "persistence queue full."
  • Fix:
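    • Tune persistence: Bound how much the broker queues for each offline session. In EMQX 5.x, for instance, mqtt.max_mqueue_len caps the queue length per session and mqtt.session_expiry_interval controls how long an offline session is retained. Where your EMQX version or edition supports pluggable backends for offline messages, choose between the built-in Mnesia database and an external store such as MySQL, PostgreSQL, or Redis, and set a queue depth.
    • External persistence: For very high message volumes or many offline clients, offloading message persistence to a dedicated, scalable database (like Redis or a relational DB) is often necessary.
    • Memory limits: If offline messages are held in the broker’s own database (Mnesia in EMQX), ensure the broker nodes have sufficient RAM. On Linux, raising vm.max_map_count (sudo sysctl -w vm.max_map_count=262144) can help if the broker process runs into memory-mapping limits.
    # Illustrative config for Redis-backed offline messages
    # (key names vary by EMQX version and edition; check the configuration reference)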
    persistence {
      shared_sub_offline_msg {
        backend = redis
        queue_depth = 10000 # Adjust based on expected offline messages
        redis {
          # ... redis connection details ...
        }
      }
    }
    
  • Why it works: Distributing the state of offline messages across multiple nodes or an external, scalable store prevents a single broker from exhausting its resources managing this data.
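The effect of a queue depth limit can be sketched with a bounded per-session queue. This is a minimal model under stated assumptions: `OfflineQueue` is a hypothetical name, the drop-oldest policy is one of several a broker may use, and skipping QoS 0 messages is a choice made here for brevity.

```python
from collections import deque


class OfflineQueue:
    """Bounded queue of messages for one offline session (drop-oldest policy)."""

    def __init__(self, max_depth):
        self.messages = deque(maxlen=max_depth)
        self.dropped = 0

    def enqueue(self, topic, payload, qos):
        if qos == 0:
            return False  # assumption: QoS 0 messages are not queued while offline
        if len(self.messages) == self.messages.maxlen:
            self.dropped += 1  # the deque discards the oldest entry on append
        self.messages.append((topic, payload, qos))
        return True

    def drain(self):
        """Yield queued messages in order when the client reconnects."""
        while self.messages:
            yield self.messages.popleft()


queue = OfflineQueue(max_depth=3)
for i in range(5):
    queue.enqueue("commands/lights/bedroom", f"m{i}", qos=1)

delivered = [payload for _, payload, _ in queue.drain()]
print(delivered)      # ['m2', 'm3', 'm4']
print(queue.dropped)  # 2
```

With a bound in place, memory per offline session is capped at the queue depth times the message size, at the cost of losing the oldest messages once a client stays offline too long.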

6. Network Optimization:

Even with efficient software, network saturation or misconfiguration can be a bottleneck.

  • Diagnosis: High network traffic on broker nodes, packet loss, or slow message delivery.
  • Check: Use tools like iftop, nload, or cloud provider network monitoring to check bandwidth utilization and packet loss on broker instances.
  • Fix:
    • Increase network bandwidth: Ensure your instances have sufficient network throughput.
    • Tune TCP/IP stack: For very high connection counts, increasing kernel parameters like net.core.somaxconn and net.ipv4.tcp_max_syn_backlog can help.
    # On the broker nodes (add the same settings to /etc/sysctl.conf to persist across reboots)
    sudo sysctl -w net.core.somaxconn=4096
    sudo sysctl -w net.ipv4.tcp_max_syn_backlog=4096
    
    • Use a performant load balancer: Ensure your load balancer can handle the connection volume.
  • Why it works: Prevents the underlying network infrastructure or OS kernel from becoming a choke point for the sheer volume of TCP connections and data.
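A quick way to verify these kernel settings across a fleet is to read them from /proc/sys and compare them against your targets. The sketch below assumes a Linux host. The thresholds simply mirror the values used above and are not universal recommendations, and the function names are illustrative.

```python
from pathlib import Path

# Target minimums; these mirror the sysctl values shown in this section.
RECOMMENDED = {
    "net.core.somaxconn": 4096,
    "net.ipv4.tcp_max_syn_backlog": 4096,
}


def read_sysctl(name, root=Path("/proc/sys")):
    """Read an integer sysctl value, or return None if it is unavailable."""
    path = root / name.replace(".", "/")
    try:
        return int(path.read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def find_low_settings(current, recommended=RECOMMENDED):
    """Return {name: (current, minimum)} for settings that are missing or too low."""
    low = {}
    for name, minimum in recommended.items():
        value = current.get(name)
        if value is None or value < minimum:
            low[name] = (value, minimum)
    return low


if __name__ == "__main__":
    current = {name: read_sysctl(name) for name in RECOMMENDED}
    for name, (value, minimum) in find_low_settings(current).items():
        print(f"{name} = {value}, expected at least {minimum}")
```

Run on each broker node, it prints one line per setting that is below target and nothing when the node is configured as intended.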

The next challenge you’ll face after successfully scaling to millions of devices is managing the security and authentication for that many unique identities, especially with dynamic credential rotation.
