NATS doesn’t actually deduplicate messages; it prevents deduplication by ensuring each message has a unique identifier.

Let’s see this in action. Imagine you have a service publishing messages to a NATS subject.

package main

import (
	"fmt"
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to NATS
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatalf("Failed to connect to NATS: %v", err)
	}
	defer nc.Close()

	log.Println("Connected to NATS")

	// Publish a message with a unique ID
	msgID := fmt.Sprintf("msg-%d", time.Now().UnixNano())
	err = nc.Publish("my.subject", []byte("Hello, NATS!"))
	if err != nil {
		log.Fatalf("Failed to publish message: %v", err)
	}
	nc.Flush() // Ensure the message is sent

	log.Printf("Published message with ID: %s", msgID)

	// Simulate a scenario where the same message might be sent again
	// In a real-world scenario, this could be due to a publisher retry mechanism
	// or a bug. NATS will accept this as a *new* message because its ID is different.
	// However, if we were using a NATS JetStream context with explicit deduping,
	// this is where the configuration would matter.

	// For demonstration, let's just acknowledge the publish.
	// The core concept here is that NATS *itself* doesn't inherently
	// prevent duplicate *content*, but JetStream leverages message IDs
	// to provide deduplication guarantees for producers.
}

The real power for deduplication in NATS comes with JetStream, NATS’s persistence and streaming layer. JetStream’s producer deduplication feature ensures that if a client publishes the same message multiple times within a configured window, only the first successful publication is processed. This is crucial for building reliable, exactly-once processing systems.

The Msg-ID Dedup Window is a server-side configuration that defines how long JetStream will remember a message ID. When a producer publishes a message with a Msg-ID header, JetStream checks its internal cache. If it finds a matching Msg-ID within the configured dedup window, it will reject the subsequent publication, returning an error. If the Msg-ID is not found or is outside the window, the message is processed, and its Msg-ID is added to the cache.

Here’s how it works internally: JetStream maintains a cache of recently seen Msg-IDs. This cache is typically held in memory for performance. When a message arrives with a Msg-ID, JetStream hashes the Msg-ID and checks if that hash exists in the cache. If it does, it performs a full comparison. If the Msg-ID matches a previously seen one within the dedup window, the message is discarded. If it’s new or outside the window, it’s accepted, and its Msg-ID is added to the cache. The cache is a sliding window; older Msg-IDs eventually expire and are removed to prevent unbounded memory growth.

The Msg-ID Dedup Window is configured on the NATS server itself, typically in the nats-server.conf file. For example, to set the dedup window to 5 minutes for all JetStream streams:

jetstream {
  max_memory_stream_bytes: 67108864
  max_file_stream_bytes: 1073741824
  # Dedup window for producers, in seconds. Default is 120 seconds (2 minutes).
  # Setting it to 300 seconds (5 minutes)
  max_dedup_pending: 300
}

When a producer publishes a message using JetStream, it must include the Msg-ID header. This can be done programmatically:

package main

import (
	"fmt"
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to NATS
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatalf("Failed to connect to NATS: %v", err)
	}
	defer nc.Close()

	// Create a JetStream context
	js, err := nc.JetStream()
	if err != nil {
		log.Fatalf("Failed to create JetStream context: %v", err)
	}

	log.Println("Connected to NATS JetStream")

	// Publish a message with a unique Msg-ID
	msgID := fmt.Sprintf("my-unique-message-%d", time.Now().UnixNano())
	_, err = js.Publish("orders.new", []byte("New order details"), nats.MsgId(msgID))
	if err != nil {
		log.Fatalf("Failed to publish message with Msg-ID: %v", err)
	}
	log.Printf("Published message with Msg-ID: %s", msgID)

	// Simulate sending the same message again within the dedup window
	// This second publish *should* be deduplicated by JetStream if the
	// max_dedup_pending is configured and the server is running JetStream.
	time.Sleep(1 * time.Second) // Short delay to ensure it's within the window

	log.Printf("Attempting to publish the same message again with Msg-ID: %s", msgID)
	_, err = js.Publish("orders.new", []byte("New order details"), nats.MsgId(msgID))
	if err != nil {
		// This error is expected if deduplication works
		log.Printf("Second publish failed as expected due to deduplication: %v", err)
	} else {
		log.Println("Second publish succeeded unexpectedly - deduplication may not be configured or working.")
	}
}

The max_dedup_pending configuration directly controls the server’s memory usage for tracking these Msg-IDs. A larger value means JetStream can track more unique message IDs for longer, providing a more robust guarantee against accidental duplicate processing, but at the cost of increased memory consumption. If you find that duplicate messages are still being processed, it’s a strong indicator that either your max_dedup_pending is too small for your producer’s retry window, or the NATS server isn’t running with JetStream enabled.

The most common mistake is assuming that simply setting max_dedup_pending is enough without understanding that the producer must also actively use the Msg-ID header for JetStream to perform deduplication.

Want structured learning?

Take the full Nats course →