A NATS Supercluster can span multiple cloud providers, but it doesn’t magically make them behave like a single, unified network; you’re still dealing with the inherent latency and reliability differences of each cloud.

Let’s see what that looks like in practice. Imagine a simple publisher-subscriber scenario. We’ll have a publisher in AWS us-east-1 sending messages to a subscriber in GCP us-central1.

package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Publisher in AWS
	natsURLAWS := "nats://nats-aws.example.com:4222"
	ncAWS, err := nats.Connect(natsURLAWS)
	if err != nil {
		log.Fatalf("AWS NATS Connect Error: %v", err)
	}
	defer ncAWS.Close()
	log.Printf("Connected to AWS NATS at %s", natsURLAWS)

	// Subscriber in GCP
	natsURLGCP := "nats://nats-gcp.example.com:4222"
	ncGCP, err := nats.Connect(natsURLGCP)
	if err != nil {
		log.Fatalf("GCP NATS Connect Error: %v", err)
	}
	defer ncGCP.Close()
	log.Printf("Connected to GCP NATS at %s", natsURLGCP)

	// Subscribe to a subject
	subject := "global.events"
	_, err = ncGCP.Subscribe(subject, func(msg *nats.Msg) {
		log.Printf("GCP Subscriber received on '%s': %s", msg.Subject, string(msg.Data))
	})
	if err != nil {
		log.Fatalf("GCP Subscribe Error: %v", err)
	}
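	// Flush so the GCP server has processed the subscription before we publish.
	if err := ncGCP.Flush(); err != nil {
		log.Fatalf("GCP Flush Error: %v", err)
	}
	log.Printf("GCP Subscriber subscribed to '%s'", subject)

	// Give subscription interest a moment to propagate between clusters.
	time.Sleep(500 * time.Millisecond)

	// Publish a message from AWS
	message := "Hello from AWS to GCP!"
	err = ncAWS.Publish(subject, []byte(message))
	if err != nil {
		log.Fatalf("AWS Publish Error: %v", err)
	}
	// Flush so the message leaves the client buffer before we log success.
	if err := ncAWS.Flush(); err != nil {
		log.Fatalf("AWS Flush Error: %v", err)
	}
	log.Printf("AWS Publisher sent on '%s': %s", subject, message)

	// Keep the process alive long enough for the message to arrive
	time.Sleep(5 * time.Second)
	log.Println("Exiting.")
}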
}

In this setup, nats-aws.example.com and nats-gcp.example.com represent the endpoints of NATS servers in different cloud environments. Each nats.Connect call establishes a connection to one of those clusters. The subscriber in GCP listens on global.events, and the publisher in AWS sends messages to the same subject. The message only reaches the subscriber if the two clusters have been joined into a supercluster. When it travels from AWS to GCP, it traverses the internet, and the latency is dictated by the network path between those two regions.

The core problem NATS Supercluster solves here is distributed messaging across geographically disparate NATS deployments. A "supercluster" is essentially a federation of NATS clusters. Each individual cluster (cluster1, cluster2, etc.) can be a single server or a cluster of servers within a single data center or cloud region. These individual clusters then form a supercluster by establishing persistent gateway connections to each other. This allows messages published to a subject in one cluster to be routed to subscribers in other clusters, even if they are in entirely different cloud providers or continents.

The key configuration for this is the gateway block in the NATS server configuration file; the cluster block and its routes only join servers within a single cluster. For example, a server in the AWS cluster might use:

{
  "cluster": {
    "name": "aws-cluster",
    "listen": "0.0.0.0:6222",
    "no_advertise": false
  },
  "gateway": {
    "name": "aws-cluster",
    "listen": "0.0.0.0:7222",
    "connect_retries": 5,
    "gateways": [
      { "name": "gcp-cluster", "url": "nats://nats-gcp.example.com:7222" },
      { "name": "azure-cluster", "url": "nats://nats-azure.example.com:7222" }
    ]
  },
  "listen": 4222,
  "http": 8222,
  "ping_interval": "10s",
  "ping_max": 5
}

Here, nats-gcp.example.com:7222 and nats-azure.example.com:7222 are the gateway ports (7222 by convention, distinct from the cluster route port, conventionally 6222) of servers in the other clusters. The gateways array tells the AWS server where to find them. The names gcp-cluster and azure-cluster are examples; each must match the gateway name that the remote cluster configures for itself. The server will attempt to establish and maintain a connection to each of these addresses. Once connected, the clusters exchange information about subscription interest, which enables message routing.

When a message is published in the AWS cluster, the server there checks what it knows about interest in the other clusters. If the subject may have subscribers in another cluster (e.g., the GCP cluster), the message is forwarded over the gateway connection to a server in the GCP cluster, which then delivers it to its local subscribers. This process is symmetrical; messages published in GCP can be routed to AWS.

The "supercluster" itself is not a single entity but a logical grouping. Each participating server serves its local clients and also maintains gateway connections to the other clusters. The no_advertise setting in the cluster block is unrelated to supercluster routing: it only controls whether servers in a cluster advertise their client URLs to connected clients, and false is the default. How a server announces itself to other clusters is governed by the gateway block's own advertise option.

The ping_interval and ping_max settings are vital for maintaining the health of these connections. They are top-level server options, not cluster or gateway options. ping_interval (e.g., 10 seconds) is how often the server sends a keep-alive ping on an otherwise idle connection, and ping_max (e.g., 5) is the number of unanswered pings before the connection is considered dead. The gateway block's connect_retries (e.g., 5) limits how many times the server retries a connection to a gateway it discovered through gossip; it does not apply to explicitly configured gateways such as the two above. These parameters must be tuned based on the expected network latency and reliability between the cloud providers. High latency or intermittent connectivity might require longer intervals and more retries.

One aspect that often surprises people is how NATS tracks subject interest across a supercluster. It’s not a strict, predefined map; rather, it’s a dynamic, learned behavior. A gateway starts out optimistic: when a server in cluster A receives a message for subject foo.bar, it forwards the message to cluster B unless B has already said it has no interest in foo.bar. If B has no subscribers, it replies with a no-interest notice and A stops forwarding that subject. Conversely, if a client in cluster B later subscribes to foo.bar, B tells A to clear that notice. If A keeps sending subjects that B does not want, B can switch the connection to interest-only mode, after which A forwards only subjects for which B has advertised a subscription. This distributed mechanism allows the supercluster to route messages efficiently without a central orchestrator for subject mapping, but it also means there can be a slight delay (measured in milliseconds to seconds, depending on network conditions and configuration) between a subscription occurring in one cluster and the first message being routed to it from another.
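One way to picture this learned behavior is the optimistic scheme described in the NATS gateway documentation: forward first, stop when the remote side reports no interest, and resume when a subscription appears. The toy model below sketches that idea in plain Go. The link and remote types and their methods are invented for illustration; this is not how nats-server is implemented, and it ignores wildcards, accounts, queue groups, and interest-only mode.

```go
package main

import "fmt"

// link is a toy model of the sending side of one gateway connection.
// It remembers which subjects the remote cluster has declined.
type link struct {
	noInterest map[string]bool
}

// remote is a toy model of the receiving cluster.
type remote struct {
	subs map[string]bool // subjects with at least one local subscriber
}

// publish forwards a subject optimistically unless the remote has already
// reported no interest. It returns whether the message crossed the link and
// whether it reached a subscriber.
func (l *link) publish(r *remote, subject string) (forwarded, delivered bool) {
	if l.noInterest[subject] {
		return false, false // suppressed locally, nothing crosses the link
	}
	if !r.subs[subject] {
		// The remote answers "no interest"; the sender remembers it.
		l.noInterest[subject] = true
		return true, false
	}
	return true, true
}

// subscribe registers a subscriber on the remote and clears any earlier
// no-interest marker held by the sender.
func (l *link) subscribe(r *remote, subject string) {
	r.subs[subject] = true
	delete(l.noInterest, subject)
}

func main() {
	l := &link{noInterest: map[string]bool{}}
	r := &remote{subs: map[string]bool{}}

	fmt.Println(l.publish(r, "foo.bar")) // true false: sent, nobody listening
	fmt.Println(l.publish(r, "foo.bar")) // false false: suppressed at the sender
	l.subscribe(r, "foo.bar")
	fmt.Println(l.publish(r, "foo.bar")) // true true: sent and delivered
}
```

The second publish never crosses the link, which is the bandwidth saving; the gap between subscribe and the next publish is where the propagation delay mentioned above would show up in a real deployment.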

The next challenge you’ll likely encounter is managing security and authentication across these distinct cloud environments.
