Auto-Scaling vs. Load Balancing: Which Controls Your App's Fate?

Nginx can dynamically discover and load balance across a fleet of backend servers that are constantly joining and leaving the pool, without requiring a restart or reload.

Let’s say you have a microservice running in a containerized environment like Kubernetes or Docker Swarm. As user load increases, you spin up more instances of that service. As load decreases, you scale them down. Your Nginx load balancer needs to keep up.

Here’s a simplified Nginx configuration that accomplishes this:

http {
    # 'resolve' needs a resolver; 10.96.0.10 is an example cluster DNS IP.
    resolver 10.96.0.10 valid=30s;

    upstream backend_services {
        # 'resolve' requires the group to be kept in shared memory.
        zone backend_services 64k;

        # This is the key line for dynamic discovery.
        # Nginx re-resolves the hostname in the background and updates
        # the group's servers when its DNS records change.
        # ':80' is the port Nginx connects to on each resolved address.
        server kube_backend_services:80 resolve;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            proxy_pass http://backend_services;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
}

In this example, upstream backend_services defines a group of servers. The key part is server kube_backend_services:80 resolve;. The resolve parameter belongs to ngx_http_upstream_module. It was limited to the commercial NGINX Plus build until version 1.27.3 and is available in open-source Nginx from that release on. It works together with the resolver directive from ngx_http_core_module and requires the upstream group to have a shared memory zone. On older open-source builds, third-party modules such as ngx_http_upstream_jdomain offer similar behavior. With resolve set, Nginx uses the configured DNS resolver to find the addresses for kube_backend_services.

When resolve is used, Nginx doesn’t just perform a one-time DNS lookup at startup. It re-resolves kube_backend_services in the background whenever the cached answer expires. If the DNS records change (because new instances of your backend service have registered or old ones have deregistered), Nginx updates its list of available upstream servers automatically, with no nginx -s reload needed.

The resolver directive, configured in the http block or in the upstream block itself, is crucial. It specifies the DNS servers Nginx should query. For Kubernetes, this is typically the ClusterIP of the cluster’s internal DNS service (CoreDNS or kube-dns), for example 10.96.0.10. The actual address varies by cluster.

http {
    # Specify the DNS resolver(s) to use.
    # For Kubernetes, this is usually the cluster's DNS service IP.
    resolver 10.96.0.10 valid=30s ipv6=off;

    upstream backend_services {
        # 'resolve' requires the group to be kept in shared memory.
        zone backend_services 64k;

        # 'my-app.default.svc.cluster.local' is the fully qualified domain name (FQDN)
        # for a Kubernetes service named 'my-app' in the 'default' namespace.
        # ':80' is the port Nginx connects to on each resolved address.
        # 'resolve' tells Nginx to dynamically discover backend IPs via DNS.
        server my-app.default.svc.cluster.local:80 resolve;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            proxy_pass http://backend_services;
            # ... (standard proxy headers)
        }
    }
}

Resolution does not happen per request. Nginx queries the resolver (e.g., 10.96.0.10) in the background for the A records of my-app.default.svc.cluster.local and keeps the upstream group in sync with the answer. When a request for example.com reaches the proxy_pass directive, Nginx selects one of the current addresses based on its load balancing algorithm (round-robin by default) and forwards the request. Note that a regular ClusterIP Service resolves to a single virtual IP, with the cluster doing the balancing behind it. DNS returns the individual pod IPs only for a headless Service (clusterIP: None). With a headless Service, a pod that dies drops out of the DNS answer and Nginx stops sending traffic to it after the next re-resolution. A new pod shows up in the answer and starts receiving traffic the same way.

The valid=30s in the resolver directive overrides the TTL carried in the DNS response, so Nginx caches each answer for 30 seconds and re-resolves my-app.default.svc.cluster.local once the cached entry expires. Without valid, Nginx honors the TTL of the record. Tune this interval to how quickly your backend instances come and go: a shorter value picks up changes sooner at the cost of more DNS queries.
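As a rough sketch of that tuning, a pool that churns quickly might use a shorter cache lifetime. The 5s value is an illustrative assumption, not a recommendation, and the resolver address is the same example IP used above.

```nginx
http {
    # Cache DNS answers for 5 seconds regardless of the record TTL (assumed value).
    # Dropping 'valid=' would make Nginx follow the TTL set by the DNS server instead.
    resolver 10.96.0.10 valid=5s ipv6=off;

    upstream backend_services {
        zone backend_services 64k;
        server my-app.default.svc.cluster.local:80 resolve;
    }
}
```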

This dynamic discovery mechanism is fundamental for cloud-native architectures. It decouples the load balancer from the ephemeral nature of containerized applications, allowing for true elasticity. Instead of manually updating Nginx config files or relying on external orchestration tools to signal Nginx, the system naturally adapts to changes in the backend pool.

The most surprising thing about this setup is that Nginx itself is performing the service discovery, leveraging standard DNS protocols rather than needing a proprietary agent or complex integration. It’s essentially treating your service discovery mechanism (like Kubernetes DNS) as a dynamic, changing DNS zone.

The resolve parameter on the upstream server line, rather than proxy_pass itself, is what keeps resolution ongoing. proxy_pass only refers to the group by name. With resolve, Nginx resolves the hostname to a set of IPs and updates that set as the DNS records for that name change, acting as a DNS client that applies the results directly to its upstream server list. The load balancing algorithm then operates on this dynamically updated list.
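On Nginx builds that do not support the resolve parameter (open-source releases before 1.27.3), a commonly used alternative is to put the hostname in a variable, which makes proxy_pass resolve it at run time through the resolver. The following is a sketch of that different technique, not the upstream-based setup described here. The variable name $backend_host is an illustrative choice.

```nginx
http {
    resolver 10.96.0.10 valid=30s ipv6=off;

    server {
        listen 80;
        server_name example.com;

        location / {
            # A variable in proxy_pass forces run-time resolution via 'resolver'
            # instead of a single lookup when the configuration is loaded.
            set $backend_host my-app.default.svc.cluster.local;
            proxy_pass http://$backend_host:80;
            proxy_set_header Host $host;
        }
    }
}
```

The trade-off is that this form bypasses the upstream block, so per-server parameters and other group-level settings do not apply.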

The specific DNS name Nginx resolves depends entirely on your service discovery system. For Kubernetes, it’s the Kubernetes Service FQDN. For other systems, it might be a specific SRV record or a custom DNS entry managed by your orchestration. The key is that Nginx is configured to watch that DNS name.
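For the SRV case, the upstream server line accepts a service= parameter alongside resolve. The sketch below assumes the Kubernetes Service exposes a port named http, so that cluster DNS publishes an SRV record at _http._tcp.my-app.default.svc.cluster.local. The port name is an assumption made for illustration.

```nginx
http {
    resolver 10.96.0.10 valid=30s ipv6=off;

    upstream backend_services {
        zone backend_services 64k;

        # No port on the hostname: with 'service=', ports come from the SRV records.
        # Looks up _http._tcp.my-app.default.svc.cluster.local (assumed port name 'http').
        server my-app.default.svc.cluster.local service=http resolve;
    }
}
```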

The next concept you’ll need to grapple with is how Nginx handles health checks in a dynamic environment. While DNS tells Nginx what IPs are available, it doesn’t tell Nginx if those IPs are actually healthy.
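As a starting point, open-source Nginx offers passive checks through the max_fails and fail_timeout parameters of the server line: a backend that fails repeatedly is skipped for a while. The thresholds below are illustrative assumptions. Active probing with the health_check directive is part of the commercial NGINX Plus build.

```nginx
http {
    resolver 10.96.0.10 valid=30s ipv6=off;

    upstream backend_services {
        zone backend_services 64k;

        # After 3 failed attempts within 10s, skip the backend for 10s (assumed values).
        server my-app.default.svc.cluster.local:80 resolve max_fails=3 fail_timeout=10s;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            proxy_pass http://backend_services;
            # Retry on another backend after a connection error or timeout
            # (this is also the default behavior, spelled out for clarity).
            proxy_next_upstream error timeout;
        }
    }
}
```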