Tune HAProxy nbproc and nbthread for Multi-Core Performance (2026)

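HAProxy’s nbproc and nbthread settings do more than distribute work: they determine how many processes and threads accept and service connections, so tuning them can feel more like a system-level optimization than an application tweak. One caveat up front: nbproc was deprecated and then removed in HAProxy 2.5, so on current releases nbthread is the only one of the two that still applies.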

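Let’s see it in action. Imagine you have a multi-core server and a busy HAProxy instance. Without tuning, HAProxy runs as a single process (recent releases start several threads inside that process by default, typically one per available CPU).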

# ps aux | grep haproxy
haproxy  12345  0.0  0.1 123456 7890 ?        Ss   Oct26   0:15 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg

Now, let’s say we want to leverage our 8 cores. On an older release that still accepts nbproc alongside nbthread, we’d add these to the global section in haproxy.cfg:

global
    log /dev/log local0
    maxconn 4096
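    # Legacy directive: number of worker processes.
    nbproc 8
    # Threads per process: 8 processes x 4 threads = 32 threads in total.
    nbthread 4
    # cpu-map <process> <cpu-set>: processes 1-4 each get one pair of cores...
    cpu-map 1 0-1
    cpu-map 2 2-3
    cpu-map 3 4-5
    cpu-map 4 6-7
    # ...and processes 5-8 are pinned to those same pairs.
    cpu-map 5 0-1
    cpu-map 6 2-3
    cpu-map 7 4-5
    cpu-map 8 6-7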
    daemon
    stats socket /run/haproxy.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    pidfile /var/run/haproxy.pid

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

listen http_front
    bind *:80
    balance roundrobin
    server s1 192.168.1.10:80 check
    server s2 192.168.1.11:80 check

After reloading HAProxy, you’ll see multiple processes, each bound to specific CPU cores.

# ps aux | grep haproxy
haproxy  12345  0.0  0.1 123456 7890 ?        Ss   Oct26   0:15 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
haproxy  54321  1.2  0.5 123456 12345 ?       S    10:00   0:05 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
haproxy  54322  1.1  0.5 123456 12345 ?       S    10:00   0:04 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
haproxy  54323  1.0  0.4 123456 11111 ?       S    10:00   0:04 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
haproxy  54324  0.9  0.4 123456 11111 ?       S    10:00   0:03 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
haproxy  54325  0.8  0.4 123456 11111 ?       S    10:00   0:03 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
haproxy  54326  0.7  0.3 123456 10000 ?       S    10:00   0:02 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
haproxy  54327  0.6  0.3 123456 10000 ?       S    10:00   0:02 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg
haproxy  54328  0.5  0.3 123456 10000 ?       S    10:00   0:01 /usr/local/sbin/haproxy -f /etc/haproxy/haproxy.cfg

The problem this solves is that a single-threaded process can use only one core. HAProxy is event-driven and does not block on network I/O, but one event loop on one core can still process only so many events per second, however many cores the machine has. nbproc creates multiple independent HAProxy processes, each handling its own set of connections; because the processes share no state, each keeps its own stick tables, statistics, and health checks. nbthread instead runs multiple threads inside one process, each with its own event loop, and the threads share that state.

The nbproc setting determines the number of HAProxy processes. nbthread determines the number of threads within each process; every thread runs a full event loop rather than handling I/O only. The cpu-map directive assigns CPU cores to each process (identified by a sequential number starting from 1) and, in its process/thread form, to individual threads. Neither nbproc nor nbthread pins anything by itself: without cpu-map, placement is generally left to the operating system scheduler, so explicit mapping is usually better for predictable performance.

For sizing, keep the total number of threads (processes multiplied by threads per process) at or below the number of cores you want HAProxy to use. With nbproc, that usually means one process per core and nbthread left at 1 or 2; with threads alone, nbthread is typically set to the core count. The example above, with 8 processes of 4 threads each, runs 32 threads on 8 cores, which oversubscribes the CPUs and tends to add contention rather than throughput.
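
On HAProxy 2.5 and later, where nbproc is no longer available, the same eight cores would be covered by threads alone. The following is a minimal sketch, assuming an 8-core machine whose CPUs are numbered 0 through 7 and a build with thread support; adjust the ranges to your own hardware.

```
global
    # One process with eight threads; each thread runs its own event loop.
    nbthread 8
    # "auto:" pairs each thread with one CPU from the set, in order:
    # thread 1 -> CPU 0, thread 2 -> CPU 1, ... thread 8 -> CPU 7.
    cpu-map auto:1/1-8 0-7
```

Pinning one thread per core keeps each event loop on a fixed CPU, which is the thread-based counterpart of mapping one process per core.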

When you set nbproc 8 and nbthread 4, you’re telling HAProxy to launch 8 worker processes, each of which starts 4 threads. The cpu-map directives then pin these processes to your CPU cores. For instance, cpu-map 1 0-1 means the first worker process is restricted to CPU cores 0 and 1. Note that in this configuration processes 5 through 8 reuse the core pairs already assigned to processes 1 through 4, so every pair of cores is shared by two processes; with 8 cores and 8 processes, the more usual choice is one core per process. A master process exists only in master-worker mode (the master-worker keyword or the -W flag); it supervises the workers and handles no traffic itself.
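
To make the sizing concrete, here is a minimal sketch of the same legacy setup scaled so that processes multiplied by threads equals the 8 cores. It assumes a release that accepts nbproc together with nbthread and supports the process/thread form of cpu-map; the split into 4 processes of 2 threads each is illustrative, not a recommendation.

```
global
    # 4 processes x 2 threads = 8 threads, one per core.
    nbproc 4
    nbthread 2
    # <process>/all applies the CPU set to every thread of that process.
    cpu-map 1/all 0-1
    cpu-map 2/all 2-3
    cpu-map 3/all 4-5
    cpu-map 4/all 6-7
```

Here no core pair is shared between processes, so the threads of one process never compete with another process for the same CPUs.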

The most surprising outcome of a well-sized configuration is often not just increased throughput but also lower latency, especially under high load. Because connections are spread across several event loops, a single expensive connection or long-running internal operation delays only the connections on its own thread or process, which leads to more consistent response times.

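The next logical step after tuning nbproc and nbthread is to investigate how HAProxy’s connection-management settings, such as maxconn and the timeouts, interact with multi-process and multi-threaded configurations. For example, the global maxconn is a per-process limit, so the nbproc 8 configuration above allows up to 8 × 4096 connections in total.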