NATS Horizontal Scaling: Add Servers for High Throughput (2026)

Adding more NATS servers to your cluster is the primary way to scale for higher throughput, but it’s not as simple as just spinning up more instances.

Here’s the starting point: a simple publisher and subscriber communicating through a single NATS server.

// publisher.go
package main

import (
	"log"
	"strconv"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to NATS
	nc, err := nats.Connect(nats.DefaultURL) // Assumes NATS is running on nats://localhost:4222
	if err != nil {
		log.Fatalf("Error connecting to NATS: %v", err)
	}
	defer nc.Close()

	log.Println("Publisher connected to NATS.")

	// Publish messages
	for i := 0; ; i++ {
		// strconv.Itoa gives the decimal form of i; string(rune(i)) would yield the Unicode character with code point i.
		message := []byte("Hello NATS #" + strconv.Itoa(i))
		err := nc.Publish("updates", message)
		if err != nil {
			log.Printf("Error publishing message: %v", err)
		} else {
			log.Printf("Published: %s", message)
		}
		time.Sleep(500 * time.Millisecond)
	}
}

// subscriber.go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to NATS
	nc, err := nats.Connect(nats.DefaultURL) // Assumes NATS is running on nats://localhost:4222
	if err != nil {
		log.Fatalf("Error connecting to NATS: %v", err)
	}
	defer nc.Close()

	log.Println("Subscriber connected to NATS.")

	// Subscribe to messages
	sub, err := nc.Subscribe("updates", func(msg *nats.Msg) {
		log.Printf("Received message: %s on subject %s", msg.Data, msg.Subject)
	})
	if err != nil {
		log.Fatalf("Error subscribing to subject: %v", err)
	}
	defer sub.Unsubscribe()

	// Keep the subscriber running
	select {}
}

When you run go run publisher.go and go run subscriber.go against a single NATS server, you’ll see messages flowing between them. This is the baseline.

To scale horizontally, you introduce more NATS servers and configure them to form a cluster. Each server in the cluster can handle connections from clients and route messages to other servers as needed. This distributes the load and increases the overall capacity of the NATS system.

The core mechanism for clustering is the route connection between servers. (Superclusters, which link several clusters together through gateways, are a separate feature.) When NATS servers are configured with a cluster.listen address and a cluster.routes array pointing to other servers, they establish persistent connections to each other. These connections let a server forward messages to subscribers that are connected to a different server than the publisher.

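A client connects to a single NATS server within the cluster. The NATS client library is cluster-aware. Servers share subscription interest with each other over their routes, so each server knows which peers have subscribers for a given subject. When a client publishes a message, its server delivers it to any local subscribers and forwards it to the peers that have registered interest in that subject; peers without interest are not sent the message. This interest-based routing is key to achieving high throughput, because publishers and subscribers can sit on different servers without flooding the whole cluster.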
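The routing idea above can be sketched as a toy model in plain Go. This is not how nats-server is implemented, and the Server type, Connect, Subscribe, and Publish below are hypothetical names made up for illustration. The sketch only shows the principle: a server forwards a message to the peers that reported interest in the subject and skips the rest.

```go
package main

import (
	"fmt"
	"sort"
)

// Server is a toy stand-in for one cluster member (hypothetical type).
type Server struct {
	Name     string
	local    map[string]int             // subject -> number of local subscribers
	remote   map[string]map[string]bool // subject -> names of peers with interest
	peers    map[string]*Server         // directly routed peers
	Received []string                   // messages delivered to local subscribers
}

func NewServer(name string) *Server {
	return &Server{
		Name:   name,
		local:  map[string]int{},
		remote: map[string]map[string]bool{},
		peers:  map[string]*Server{},
	}
}

// Connect links two servers with a bidirectional "route".
func Connect(a, b *Server) {
	a.peers[b.Name] = b
	b.peers[a.Name] = a
}

// Subscribe registers a local subscriber and tells every connected peer
// about the interest. In this toy, routes must exist before subscribing.
func (s *Server) Subscribe(subject string) {
	s.local[subject]++
	for _, p := range s.peers {
		if p.remote[subject] == nil {
			p.remote[subject] = map[string]bool{}
		}
		p.remote[subject][s.Name] = true
	}
}

// deliverLocal hands the message to each local subscriber of the subject.
func (s *Server) deliverLocal(subject, msg string) {
	for i := 0; i < s.local[subject]; i++ {
		s.Received = append(s.Received, msg)
	}
}

// Publish delivers locally, then forwards one hop to interested peers only.
// It returns the sorted names of the peers the message was forwarded to.
func (s *Server) Publish(subject, msg string) []string {
	s.deliverLocal(subject, msg)
	var forwarded []string
	for name := range s.remote[subject] {
		s.peers[name].deliverLocal(subject, msg)
		forwarded = append(forwarded, name)
	}
	sort.Strings(forwarded)
	return forwarded
}

func main() {
	a, b, c := NewServer("A"), NewServer("B"), NewServer("C")
	Connect(a, b)
	Connect(a, c)
	Connect(b, c)

	b.Subscribe("updates") // only B has a subscriber

	fmt.Println("forwarded to:", a.Publish("updates", "hello"))
	fmt.Println("B received:", b.Received)
	fmt.Println("C received:", c.Received)
}
```

In this model a publish on A reaches B and never touches C, since C reported no interest in the subject.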

The cluster.routes configuration is crucial. It defines the seed peers a server connects to at startup. Once connected, servers learn about the other members of the cluster over these routes, so the list does not have to name every peer.

For instance, the configuration file for the first server of a three-node cluster (nats-server-1.conf) might look like this:

# nats-server-1.conf
server_name: nats-server-1
listen: 0.0.0.0:4222
cluster {
  listen: 0.0.0.0:6222
  routes = [
    "nats://nats-server-2:6222",
    "nats://nats-server-3:6222"
  ]
}

And for nats-server-2:

# nats-server-2.conf
server_name: nats-server-2
listen: 0.0.0.0:4222
cluster {
  listen: 0.0.0.0:6222
  routes = [
    "nats://nats-server-1:6222",
    "nats://nats-server-3:6222"
  ]
}

Clients connect to the client listen address of any server, e.g., nats://nats-server-1:4222. After connecting, the NATS client library learns the addresses of the other cluster members from that server and can reconnect to one of them if its current server becomes unavailable.

The actual levers you control are the cluster.listen port, which is how servers talk to each other, and cluster.routes, which tells a server how to find its peers. The top-level listen port (4222 in the examples above) is where clients connect.

One thing that often surprises people is how NATS handles message distribution within a cluster. If you have multiple subscribers on different servers for the same subject (and they are not in a queue group), each subscriber will receive a copy of the message. This is NATS’s default behavior for fan-out and is highly efficient for broadcasting information. To have each message processed only once, you have to explicitly use NATS queue groups, where only one subscriber within a group receives a given message.

As you add more servers, how clients spread across them matters for both throughput and resilience. Giving each client several server URLs rather than a single one lets the client library fall back to another node when one is unavailable, and it avoids piling every connection onto the same server.

The next step in scaling beyond just adding more NATS servers is understanding NATS JetStream for persistence and advanced messaging patterns.