Nginx Load Balancer: Configure Upstream Routing (2026)

Nginx can route requests to multiple backend servers, and it balances them by default: with no method specified, it rotates through the upstream group round-robin and skips a server only while that server is marked as failed.

Let’s see Nginx in action, routing traffic to a couple of simple backend Python Flask apps.

First, we need some backend servers. We’ll spin up two basic Flask apps that just report which server they are.

# app1.py
from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello():
    return "Hello from Server 1!"

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=5001)

# app2.py
from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello():
    return "Hello from Server 2!"

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=5002)

Save these as app1.py and app2.py. Run them in separate terminals:

python app1.py

python app2.py

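Now, let’s configure Nginx. We’ll create a configuration file, say loadbalancer.conf, in a directory like /etc/nginx/conf.d/. On a typical packaged install, the main nginx.conf already includes every .conf file from that directory inside its http block, so our file contains only the upstream and server blocks. Wrapping them in a second http block there makes Nginx reject the configuration.

# /etc/nginx/conf.d/loadbalancer.conf
# Included from the http block of the main nginx.conf, so no http { } wrapper here.

upstream backend_servers {
    server 127.0.0.1:5001;
    server 127.0.0.1:5002;
}

server {
    listen 80;
    server_name localhost;

    location / {
        proxy_pass http://backend_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}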

The upstream block is where the magic happens. We define a named group of servers, backend_servers. Inside this block, we list our actual backend servers using the server directive. Nginx will use these defined servers to fulfill requests directed to http://backend_servers. The proxy_pass directive in the location block tells Nginx to forward requests to this upstream group.

With this configuration, Nginx will listen on port 80. When a request comes in for localhost, it will hit the location / block. proxy_pass http://backend_servers; tells Nginx to send that request to one of the servers defined in the upstream backend_servers block.

To test this, make sure Nginx is running with this configuration. You can check the syntax with sudo nginx -t and then reload with sudo nginx -s reload. Then, open your web browser or use curl to repeatedly request http://localhost.

curl http://localhost
# Expected output: Hello from Server 1!

curl http://localhost
# Expected output: Hello from Server 2!

curl http://localhost
# Expected output: Hello from Server 1!

You’ll see the responses alternating between "Hello from Server 1!" and "Hello from Server 2!". By default, Nginx uses a round-robin algorithm, sending requests sequentially to each server in the upstream group.
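To make the rotation concrete, here is a minimal Python sketch of round-robin selection. It only simulates the idea and is not Nginx’s implementation; the BACKENDS list and the round_robin helper are hypothetical names that stand in for the two Flask apps.

```python
from itertools import cycle

# Hypothetical stand-ins for the two Flask backends in the upstream group.
BACKENDS = ["127.0.0.1:5001", "127.0.0.1:5002"]


def round_robin(backends):
    """Return an iterator that hands out backends in a fixed rotation."""
    return cycle(backends)


picker = round_robin(BACKENDS)

# Simulate four incoming requests: each one takes the next backend in turn.
first_four = [next(picker) for _ in range(4)]
print(first_four)
```

Each simulated request simply takes the next entry and wraps around at the end of the list, which is the pattern the curl output above reflects.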

The server directives within upstream can also take parameters to control how Nginx interacts with them. For instance, weight=N (default 1) assigns a relative weight to a server, so a server with weight=3 receives roughly three times as many requests as one with weight=1. max_fails=N (default 1) and fail_timeout=T (default 10 seconds) define how Nginx detects and handles unhealthy backend servers. If max_fails attempts to reach a server fail within the fail_timeout window, Nginx considers it unavailable for the same fail_timeout period before trying it again.
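The sketch below shows one way these parameters could interact, written as a small Python simulation. It is an illustration under assumptions, not Nginx’s code: the Peer class and pick function are hypothetical, the selection follows the smooth weighted round-robin idea, and the failure handling is simplified so that a server becomes eligible again as soon as fail_timeout has passed since its last failure.

```python
import time


class Peer:
    """One hypothetical `server` entry with weight, max_fails, fail_timeout."""

    def __init__(self, addr, weight=1, max_fails=1, fail_timeout=10.0):
        self.addr = addr
        self.weight = weight
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.current = 0      # running score used by the weighted selection
        self.fails = 0        # failures counted in the current window
        self.checked = 0.0    # timestamp of the most recent failure

    def available(self, now):
        # max_fails=0 turns failure accounting off entirely.
        if self.max_fails == 0 or self.fails < self.max_fails:
            return True
        # Simplification: eligible again once fail_timeout has elapsed.
        if now - self.checked >= self.fail_timeout:
            self.fails = 0
            return True
        return False

    def report_failure(self, now):
        # Start a fresh window if the previous failure is too old to count.
        if now - self.checked >= self.fail_timeout:
            self.fails = 0
        self.fails += 1
        self.checked = now


def pick(peers, now=None):
    """Choose the next peer among the available ones, honoring weights."""
    now = time.monotonic() if now is None else now
    live = [p for p in peers if p.available(now)]
    if not live:
        return None
    total = sum(p.weight for p in live)
    for p in live:
        p.current += p.weight          # every live peer gains its weight
    best = max(live, key=lambda p: p.current)
    best.current -= total              # the winner pays back the total
    return best


peers = [Peer("127.0.0.1:5001", weight=3), Peer("127.0.0.1:5002", weight=1)]
order = [pick(peers, now=0.0).addr for _ in range(4)]
print(order)
```

Passing now explicitly keeps the example deterministic. Over any four picks the weight=3 server is chosen three times, and the picks are interleaved rather than sent in one burst.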

The proxy_set_header directives are crucial for passing important information about the original client request to the backend servers. Without them, the backend servers would only see requests originating from Nginx itself, losing valuable context like the client’s IP address. X-Forwarded-For is particularly important as it builds a list of IP addresses the request has passed through, including the original client.
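As a rough illustration of how the X-Forwarded-For list grows, the Python function below mimics the documented behavior of the $proxy_add_x_forwarded_for variable: the incoming header value with the connecting address appended, or just the connecting address when the header is absent. The function name and the sample addresses are made up for this example.

```python
def proxy_add_x_forwarded_for(incoming_xff, remote_addr):
    """Return the X-Forwarded-For value a proxy hop would send upstream.

    incoming_xff: value of the client's X-Forwarded-For header, or None.
    remote_addr:  address of the peer that connected to this proxy.
    """
    if incoming_xff:
        # Header already present: append this hop's peer address.
        return f"{incoming_xff}, {remote_addr}"
    # No header yet: the list starts with the connecting address.
    return remote_addr


# A client connects directly to the proxy: the list holds only the client.
direct = proxy_add_x_forwarded_for(None, "203.0.113.9")

# The request then passes through a second proxy that sees 10.0.0.5 as its peer.
chained = proxy_add_x_forwarded_for(direct, "10.0.0.5")

print(direct)
print(chained)
```

Because a client can send its own X-Forwarded-For header, a backend should treat only the entries added by proxies it trusts as reliable.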

One aspect of upstream blocks that often surprises people is that the default round-robin method provides no session affinity. Nginx rotates through the group one request at a time across all clients, so a given client may land on the same backend several times in a row or on a different one each time, and neither outcome is guaranteed. Each worker process also keeps its own rotation unless the upstream block declares a shared memory zone, which is why the alternation can look uneven when several workers are running. If you need a client to keep reaching the same server, add the ip_hash directive to the upstream block: it selects the server from a hash of the client’s IP address (the first three octets for IPv4, the full address for IPv6), so requests from one address go to the same backend for as long as that backend is available.
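For comparison with round-robin, here is a small Python sketch of IP-based affinity in the spirit of ip_hash. It is only an approximation: the ip_hash_pick function is hypothetical, and CRC32 is a stand-in chosen for determinism, not the hash Nginx uses.

```python
import ipaddress
import zlib

# Hypothetical stand-ins for the two Flask backends in the upstream group.
BACKENDS = ["127.0.0.1:5001", "127.0.0.1:5002"]


def ip_hash_pick(client_ip, backends):
    """Map a client address to one backend, always the same one."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        key = addr.packed[:3]   # first three octets of an IPv4 address
    else:
        key = addr.packed       # the whole IPv6 address
    # CRC32 is an assumption for this sketch, not Nginx's hash function.
    return backends[zlib.crc32(key) % len(backends)]


# Two clients in the same /24 network share a key, so they share a backend.
first = ip_hash_pick("203.0.113.7", BACKENDS)
second = ip_hash_pick("203.0.113.200", BACKENDS)
print(first == second)
```

The trade-off is visible in the key: many clients behind one network or NAT gateway map to the same backend, so traffic may be spread less evenly than with round-robin.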

The next step is to explore other load balancing methods, such as least connections (the least_conn directive), and how they combine with server weights.