Running Gatling load tests on Kubernetes is more about orchestrating distributed execution than about Gatling itself.
Let’s see Gatling in action, not on a single machine, but across multiple pods.
Here’s a minimal gatling.conf for a distributed setup. The focus is not on HTTP or SSL tuning but on coordinating multiple Gatling instances.
gatling {
  core {
    directory {
      # Where simulation logs and reports are written
      results = "target/gatling"
    }
    simulationClass = "com.example.MySimulation"
  }
  ssl {
    useOpenSsl = false
  }
  charting {
    noReports = false
  }
  distribution {
    # This is the magic part for Kubernetes.
    # Note: this block is a convention of this setup, not a documented
    # open-source Gatling option; your simulation or launcher has to read it.
    # We'll dynamically set clusterSize and nodeIndex
    # via environment variables or command-line args.
    # clusterSize = ${?GATLING_CLUSTER_SIZE}
    # nodeIndex = ${?GATLING_NODE_INDEX}
  }
}
The core idea is that instead of one Gatling instance generating all the load, you have many instances, each running a part of the total load. Kubernetes is a good fit because it manages these instances as pods and can scale them up and down.
Imagine you want to simulate 10,000 users. Instead of cramming all 10,000 users onto one overloaded machine, you can spin up 100 pods, each simulating 100 users. Kubernetes schedules the pods; each pod only needs to know its own share of the load.
The key configuration in gatling.conf for this is the distribution block. clusterSize says how many instances (pods) take part in the test, and nodeIndex tells each instance which number it is (0 to clusterSize - 1). Treat these as values that your simulation or launcher reads, since open-source Gatling does not document a built-in distribution block.
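Because a wrong index silently skews the load split, it can help to check the two values before starting a run. The following is a minimal sketch in POSIX shell; `is_uint` and `validate_node` are hypothetical helper names, not part of Gatling.

```shell
# is_uint VALUE: succeeds only if VALUE is a non-empty string of digits.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# validate_node CLUSTER_SIZE NODE_INDEX
# Succeeds only if 0 <= NODE_INDEX <= CLUSTER_SIZE - 1.
# Hypothetical helper for this article's convention, not a Gatling feature.
validate_node() {
  is_uint "$1" && is_uint "$2" || {
    echo "cluster size and node index must be non-negative integers" >&2
    return 1
  }
  [ "$1" -ge 1 ] && [ "$2" -lt "$1" ] || {
    echo "node index $2 is outside 0..$(( $1 - 1 ))" >&2
    return 1
  }
}
```

A container entrypoint could call `validate_node "$GATLING_CLUSTER_SIZE" "$GATLING_NODE_INDEX" || exit 1` before launching Gatling.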
Here’s a simple Kubernetes Deployment manifest to get Gatling running. Notice how we inject the GATLING_CLUSTER_SIZE and GATLING_NODE_INDEX environment variables.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gatling-load-test
spec:
  replicas: 10 # This will be our clusterSize
  selector:
    matchLabels:
      app: gatling
  template:
    metadata:
      labels:
        app: gatling
    spec:
      containers:
        - name: gatling
          image: catalinm.azurecr.io/gatling:3.8.0 # Or your custom Gatling image
          env:
            - name: GATLING_CLUSTER_SIZE
              value: "10" # Matches replicas
            - name: GATLING_NODE_INDEX
              valueFrom:
                fieldRef:
                  # A Deployment does not set this label; a mutating webhook or similar must add it
                  fieldPath: metadata.labels['pod-index']
          # Managing nodeIndex directly in a plain Deployment is tricky.
          # You'd typically use a Job with completions and parallelism, or an operator.
          command: ["/bin/sh", "-c"]
          args:
            # Literal block (|) keeps each command on its own line
            - |
              echo "Starting Gatling node ${GATLING_NODE_INDEX} of ${GATLING_CLUSTER_SIZE}"
              # Stock gatling.sh has no node-index or cluster-size flags (-bf is the binaries folder),
              # so the values are passed as JVM system properties. The property names are this
              # example's convention; the simulation must read them itself. users = users per node.
              export JAVA_OPTS="-DnodeIndex=${GATLING_NODE_INDEX} -DclusterSize=${GATLING_CLUSTER_SIZE} -Dusers=100"
              /opt/gatling/bin/gatling.sh -rd "My Distributed Test" \
                -s com.example.MySimulation \
                -rf /opt/gatling/results
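GATLING_NODE_INDEX is the tricky part in a standard Deployment, because a Deployment gives its pods no stable ordinal (and it restarts the container once the test exits). For sequential indexing, a Kubernetes Job with completions and parallelism, run with completionMode: Indexed, is usually a better fit, or a dedicated Kubernetes operator for Gatling. An Indexed Job assigns each pod an index from 0 to completions - 1 and exposes it as the JOB_COMPLETION_INDEX environment variable. With a Deployment, you would need a mutating webhook or a sidecar to inject a unique index per pod.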
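As a rough illustration of index assignment, the sketch below derives the node index inside the container. It assumes the pod runs either in a Job with completionMode: Indexed (where Kubernetes sets JOB_COMPLETION_INDEX) or in a StatefulSet (where the pod name, visible as HOSTNAME, ends in an ordinal). The StatefulSet fallback and the function name `resolve_node_index` are assumptions added for this example.

```shell
# resolve_node_index: prints this pod's zero-based index, or fails if none can be derived.
# Hypothetical helper, not part of Gatling or Kubernetes.
resolve_node_index() {
  if [ -n "${JOB_COMPLETION_INDEX:-}" ]; then
    # Set by Kubernetes for pods of a Job with completionMode: Indexed.
    echo "$JOB_COMPLETION_INDEX"
    return 0
  fi
  # Fallback: StatefulSet pods are named <name>-<ordinal>.
  host=${HOSTNAME:-}
  ordinal=${host##*-}
  case "$ordinal" in
    ''|*[!0-9]*)
      echo "cannot derive a node index from HOSTNAME='$host'" >&2
      return 1
      ;;
    *)
      echo "$ordinal"
      ;;
  esac
}

# Usage in an entrypoint script:
#   GATLING_NODE_INDEX=$(resolve_node_index) || exit 1
```

Deployment pod names end in a random suffix rather than an ordinal, so the fallback normally fails there, which is the desired behavior for this sketch.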
When gatling.sh runs, the simulation can read GATLING_CLUSTER_SIZE and GATLING_NODE_INDEX. That information is used to:
- Distribute simulations: Each node runs only a fraction of the total simulated users. If clusterSize is 10 and nodeIndex is 3, and you want to simulate 10,000 users in total, this node simulates 1,000 of them.
- Aggregate results: Each instance writes its own results, usually to shared storage (an NFS volume or an S3 bucket) or to Gatling Enterprise’s reporting, so that they can be combined into a single, comprehensive report. Open-source instances do not talk to each other during the run, so the merge happens afterwards and needs one directory where all the results end up.
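The split in the first bullet is plain integer arithmetic. The sketch below shows one way to compute a node's share, including the case where the total does not divide evenly; `users_for_node` is a hypothetical helper, not something Gatling provides.

```shell
# users_for_node TOTAL_USERS CLUSTER_SIZE NODE_INDEX
# Prints how many virtual users this node should run.
# Hypothetical helper for illustration only.
users_for_node() {
  total=$1
  size=$2
  index=$3
  base=$((total / size))
  remainder=$((total % size))
  # The first `remainder` nodes take one extra user so the shares add up to TOTAL_USERS.
  if [ "$index" -lt "$remainder" ]; then
    echo $((base + 1))
  else
    echo "$base"
  fi
}

users_for_node 10000 10 3   # prints 1000
```

With 10 users across 3 nodes, node 0 gets 4 users and nodes 1 and 2 get 3 each.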
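Stock open-source gatling.sh has no command-line flags for a node index or a cluster size (-bf, for example, sets the binaries folder). Pass the values as JVM system properties through JAVA_OPTS instead and have the simulation read them. The property names below are this example’s convention, and users means users per node:
JAVA_OPTS="-DnodeIndex=$GATLING_NODE_INDEX -DclusterSize=$GATLING_CLUSTER_SIZE -Dusers=100" \
/opt/gatling/bin/gatling.sh -rd "My Distributed Test" \
  -s com.example.MySimulation \
  -rf /opt/gatling/results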
The crucial part that most people miss is how the results are aggregated. Open-source Gatling does not merge results across instances on its own: each pod writes its own simulation.log. The usual pattern is a shared filesystem (an NFS mount or a cloud storage bucket mounted as a volume) where every pod writes its partial results, followed by a final step that combines them, for example by generating one report from all the collected logs with gatling.sh -ro. Without shared storage, each pod generates its own independent (and thus incomplete) report.
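One way to sketch that final aggregation step is below. It assumes every pod wrote its run into its own subdirectory of a shared results folder, and that your Gatling version can still build a single report from several .log files in one folder via the reports-only option (-ro). Verify that against the documentation for your version. The paths and the function name `collect_simulation_logs` are assumptions.

```shell
# collect_simulation_logs SHARED_DIR MERGED_DIR
# Copies each run's simulation.log into MERGED_DIR under a unique name
# and prints how many logs were collected. Hypothetical helper.
collect_simulation_logs() {
  shared=$1
  merged=$2
  mkdir -p "$merged"
  n=0
  for log in "$shared"/*/simulation.log; do
    # Skip the unexpanded pattern when no run directories exist.
    [ -f "$log" ] || continue
    cp "$log" "$merged/simulation-$n.log"
    n=$((n + 1))
  done
  echo "$n"
}

# Usage (paths are assumptions):
#   collect_simulation_logs /opt/gatling/results /opt/gatling/results/merged
#   /opt/gatling/bin/gatling.sh -ro merged -rf /opt/gatling/results
```

Renaming the logs on copy avoids collisions, since every pod names its file simulation.log.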
If you’re using Gatling Enterprise, this aggregation is handled automatically by the controller. For open-source Gatling on Kubernetes, you’ll need to manage the shared storage for results yourself.
Once your Gatling pods are running and generating reports to a shared location, you’ll typically want to expose these reports. A Kubernetes Service of type LoadBalancer or NodePort pointing to a web server (like Nginx) that serves the target/gatling directory is a common pattern.
The next hurdle you’ll likely face is efficiently collecting and viewing these distributed reports, especially if you’re not using Gatling Enterprise.