NATS queue groups don’t actually load balance subscribers; they distribute messages to one subscriber per message within the group.

Let’s see this in action. Imagine we have a service that processes user signups. We want to ensure that each signup request is handled by only one instance of our signup processor, but we want to be able to scale our signup processors horizontally. We’ll use NATS queue groups for this.

First, start a NATS server. If you don’t have one, you can download it from nats.io or run it via Docker:

docker run -p 4222:4222 nats:latest

Now, let’s simulate some signup requests. We’ll publish messages to the users.signup subject.

package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to NATS
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatalf("Failed to connect to NATS: %v", err)
	}
	defer nc.Close()

	log.Println("Publishing signup requests...")

	for i := 1; i <= 10; i++ {
		subject := "users.signup"
		message := []byte("user_id_" + string(rune('0'+i)))
		if err := nc.Publish(subject, message); err != nil {
			log.Printf("Failed to publish message %d: %v", i, err)
		} else {
			log.Printf("Published: %s", message)
		}
		time.Sleep(500 * time.Millisecond)
	}

	nc.Flush()
	log.Println("Finished publishing.")
}

Next, we’ll create two (or more!) instances of our signup processor. Each processor will subscribe to users.signup but join the same queue group, let’s call it signup_processors.

package main

import (
	"log"
	"os"
	"os/signal"
	"runtime"
	"syscall"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to NATS
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatalf("Failed to connect to NATS: %v", err)
	}
	defer nc.Close()

	queueGroupName := "signup_processors"
	subject := "users.signup"

	// Subscribe to the subject with a queue group
	_, err = nc.QueueSubscribe(subject, queueGroupName, func(msg *nats.Msg) {
		log.Printf("Processor [%s] received message: %s", queueGroupName, string(msg.Data))
		// Simulate processing time
		time.Sleep(2 * time.Second)
		log.Printf("Processor [%s] finished processing: %s", queueGroupName, string(msg.Data))
	})
	if err != nil {
		log.Fatalf("Failed to subscribe to %s in queue group %s: %v", subject, queueGroupName, err)
	}

	log.Printf("Processor [%s] started, listening on subject '%s'", queueGroupName, subject)

	// Keep the subscriber running
	runtime.Goexit() // This will cause the main goroutine to exit, but other goroutines (like NATS client) will continue.
	// A more robust way would be to use a channel and wait for a signal.

	// Graceful shutdown
	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
	<-sigChan
	log.Println("Shutting down...")
	nc.Drain()
	log.Println("NATS connection drained.")
}

Save this as subscriber.go. Now, open two separate terminal windows and run:

Terminal 1:

go run subscriber.go

Terminal 2:

go run subscriber.go

And in a third terminal, run the publisher code (save it as publisher.go):

go run publisher.go

As the publisher sends messages, you’ll see that each message is printed by only one of the subscriber terminals. One processor will pick up user_id_1, another will pick up user_id_2, and so on. They will not process the same message. This is the core behavior of queue groups: message distribution, not parallel processing of the same message. The NATS server internally manages which subscriber in the queue group gets the next message.

The mental model here is that NATS treats all subscribers in the same queue group as a single logical consumer. When a message arrives for a subject with an associated queue group, the NATS server picks one subscriber from that group to deliver the message to. This is typically done in a round-robin fashion, but the exact mechanism is an internal detail you don’t usually need to worry about. The key is that it’s one-to-one delivery within the group.

If you wanted true parallel processing of each message, you would have each subscriber subscribe to the subject without a queue group. In that scenario, every subscriber would receive every message published to the subject. Queue groups are for distributing the workload, ensuring each message is handled exactly once by some member of the group.

The power of queue groups comes from their ability to provide both high availability and scalability. If one subscriber in the signup_processors queue group goes down, the NATS server will simply start delivering messages to the remaining subscribers. If you need to handle more signup requests, you just spin up more instances of your subscriber application, and they’ll automatically join the signup_processors queue group and start taking messages. The NATS server handles the distribution automatically.

A common misconception is that queue groups automatically "load balance" in the sense of sending multiple copies of a message to different subscribers for parallel processing. This is not the case. The primary purpose is to ensure that a message is processed by exactly one member of a group of consumers, thereby distributing the load of handling messages across multiple instances of a service.

The next concept you’ll likely explore is how to implement more sophisticated message routing and filtering within NATS, such as using subjects with wildcards or exploring NATS JetStream for durable message queues.

Want structured learning?

Take the full Nats course →