A Jenkins agent disconnects during a build when the agent process crashes or its connection drops, leaving the Jenkins controller unable to communicate with it.

This usually happens because the agent process itself is running out of resources, the network connection between the controller and agent is unstable, or a misconfiguration on the agent is causing it to exit.

Cause 1: Agent JVM Out of Memory

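The Jenkins agent runs inside a Java Virtual Machine (JVM). If work running inside that JVM needs more heap than it is allowed to use, the JVM throws an OutOfMemoryError and the agent can crash. Build steps that launch their own processes (compilers, test runners) use memory outside the agent’s heap; that case is covered under Cause 3.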

Diagnosis: Check the Jenkins agent’s log files for OutOfMemoryError. These are typically found in the agent’s working directory, often under a logs subdirectory. Look for lines containing "java.lang.OutOfMemoryError".
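The log search described above can be sketched as a small shell function. The function name and the directory in the usage line are illustrative assumptions; point it at wherever your agent actually writes its logs.

```shell
# Print each file under a directory that mentions an OutOfMemoryError,
# in the form "path:number_of_matching_lines".
find_oom_errors() {
    dir="$1"
    # -r: recurse into subdirectories, -c: count matching lines per file.
    # The second grep drops files with zero matches.
    grep -rc "java.lang.OutOfMemoryError" "$dir" 2>/dev/null | grep -v ':0$'
}

# Usage (hypothetical path):
#   find_oom_errors /path/to/agent/workdir/logs
```

If the function prints nothing, no log under that directory recorded an OutOfMemoryError, and one of the other causes is more likely.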

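Fix: Increase the JVM heap space allocated to the agent. Edit the agent’s launch script (e.g., agent.sh or agent.bat) or the systemd service file if it’s running as a service. Add or modify the JAVA_OPTS environment variable to include -Xmx4g (a 4 GB maximum heap). JAVA_OPTS only takes effect if the launch script passes it to the java command; if the agent is started directly with java -jar agent.jar, put -Xmx4g on that command line instead. Make sure the machine has enough free RAM to back the larger heap.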

Example for agent.sh:

export JAVA_OPTS="-Xmx4g -XX:+HeapDumpOnOutOfMemoryError"
./agent.sh

This lets the JVM grow its heap up to 4 GB, giving it more room to operate and making a crash from memory exhaustion less likely. The HeapDumpOnOutOfMemoryError flag writes a heap dump when an OutOfMemoryError occurs, which is useful for post-mortem analysis if it still crashes.

Cause 2: Network Connectivity Issues

Intermittent network problems between the Jenkins controller and the agent can cause the connection to drop. This could be due to flaky Wi-Fi, a failing network cable, or temporary network congestion.

Diagnosis: Monitor network traffic between the controller and agent. Run tools like ping or mtr (My Traceroute) in both directions: from the controller to the agent and from the agent to the controller. Look for packet loss or high latency. Check network device logs (switches, routers) for errors.
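As a rough sketch of that check, the helper below pulls the packet-loss figure out of the summary line that ping prints. The function name and the hostname in the usage lines are placeholders, and the pattern assumes the usual "N% packet loss" wording used by Linux and macOS ping.

```shell
# Read ping output on stdin and print the number in front of "% packet loss".
packet_loss_percent() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Usage (agent.example.internal is a placeholder hostname):
#   ping -c 20 agent.example.internal | packet_loss_percent
#   mtr --report --report-cycles 50 agent.example.internal
```

Any sustained loss above zero on a wired LAN is worth investigating; the mtr report shows at which hop the loss starts.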

Fix: If network instability is confirmed, troubleshoot the network infrastructure. This might involve replacing faulty cables, reconfiguring network hardware, or ensuring sufficient bandwidth. For agents on unstable networks (like Wi-Fi), consider moving them to a wired connection.

Cause 3: Agent Process Exited Abruptly

The agent process might be terminated by the operating system due to resource limits (CPU, memory) or by an external process.

Diagnosis: Check the system logs on the agent machine (e.g., /var/log/syslog or journalctl on Linux, Event Viewer on Windows) for any messages indicating the agent process was killed or exited unexpectedly. Look for oom-killer messages on Linux if the system was under heavy load.
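On Linux, the search for OOM-killer activity might look like the following sketch. The filter function is an illustrative helper; the message patterns are the ones the kernel commonly logs, but exact wording varies between kernel versions.

```shell
# Filter kernel log text on stdin down to lines that indicate an OOM kill.
oom_kill_lines() {
    grep -Ei 'out of memory|oom-killer|oom_kill'
}

# Usage on a systemd-based host (kernel messages since yesterday):
#   journalctl -k --since yesterday | oom_kill_lines
# Or, without journald:
#   dmesg -T | oom_kill_lines
```

A matching line that names the agent’s java process (or one of its build child processes) confirms that the kernel, not Jenkins, ended it.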

Fix: If the OS is killing the process due to resource constraints, you need to either reduce the resource usage of the agent (e.g., by optimizing build steps, using smaller Docker images) or increase the resources available to the agent machine (more RAM, faster CPU). For Linux, you can also adjust the oom_score_adj for the agent process to make it less likely to be killed by the OOM killer, though this is a workaround, not a root cause fix.
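A minimal sketch of the oom_score_adj workaround follows. The function name is hypothetical, and the third parameter exists only so the function can be exercised against a scratch directory; on a real host it defaults to /proc. Lowering the score normally requires root.

```shell
# Write an OOM score adjustment for a process.
# Valid values range from -1000 (never kill) to 1000 (kill first).
set_oom_score_adj() {
    pid="$1"
    score="$2"
    proc_root="${3:-/proc}"   # overridable for testing only

    # Reject values outside the range the kernel accepts.
    if [ "$score" -lt -1000 ] || [ "$score" -gt 1000 ]; then
        echo "score must be between -1000 and 1000" >&2
        return 1
    fi
    printf '%s\n' "$score" > "$proc_root/$pid/oom_score_adj"
}

# Usage (as root; assumes the agent JVM has "agent.jar" on its command line):
#   set_oom_score_adj "$(pgrep -f agent.jar | head -n 1)" -500
```

The setting is lost when the process restarts, so for an agent managed by systemd the OOMScoreAdjust= directive in the unit file is the more durable place for it.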

Cause 4: Agent Configuration Errors

Incorrect configuration of the agent itself, such as invalid command-line arguments or incorrect workspace paths, can lead to startup failures or immediate exits.

Diagnosis: Review the agent’s configuration files and startup scripts. Ensure all paths are correct and accessible, and that any environment variables are properly set. Check the agent’s startup logs for specific error messages related to configuration.

Fix: Correct any typos or logical errors in the agent’s configuration files or startup scripts. For example, if the agent is configured to use a specific workspace directory that doesn’t exist or has incorrect permissions, fix the path or grant the necessary permissions.
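The workspace check in that example can be sketched as below. The function name and the path in the usage line are assumptions; run it as the same user account the agent runs under, since permissions are evaluated for the calling user.

```shell
# Report whether a workspace directory exists and is writable
# by the user running this check.
check_workspace() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    elif [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 1
    fi
    echo "ok: $dir"
}

# Usage (hypothetical path):
#   check_workspace /path/to/agent/workspace
```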

Cause 5: Disk Space Exhaustion on Agent

If the agent’s disk runs out of space, especially in its workspace directory or where temporary files are written, processes can fail or crash.

Diagnosis: Check the available disk space on the agent machine, particularly on the partition where the agent’s workspace and temporary files reside. Use df -h on Linux/macOS or check drive properties on Windows.
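For a scriptable version of the df check, something like the following could be used. The function name and the 90 percent threshold in the usage lines are illustrative, and the parsing assumes the filesystem name contains no spaces.

```shell
# Print the percentage of space used on the filesystem that holds a path.
disk_used_percent() {
    # -P forces the portable format: one line per filesystem,
    # with the usage percentage in the fifth column.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (hypothetical path and threshold):
#   used=$(disk_used_percent /path/to/agent/workspace)
#   [ "$used" -ge 90 ] && echo "workspace filesystem is ${used}% full"
```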

Fix: Free up disk space by deleting old build artifacts, logs, or unused files. If this is a recurring problem, consider increasing the disk size or configuring Jenkins to clean up workspaces more aggressively.
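Before deleting anything, it helps to list candidates first. This sketch only prints top-level workspace directories older than a given number of days; the function name and path are assumptions, and -mindepth/-maxdepth are GNU and BSD find extensions rather than strict POSIX.

```shell
# List top-level directories under a workspace root whose modification
# time is more than N days old. Nothing is deleted.
stale_workspaces() {
    root="$1"
    days="$2"
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days"
}

# Usage (hypothetical path; review the list before removing anything):
#   stale_workspaces /path/to/agent/workspace 14
```

A directory’s modification time changes only when entries directly inside it are added or removed, so treat the output as a list to review, not as proof that a workspace is unused.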

Cause 6: Agent Plugin/Dependency Issues

The Jenkins agent might rely on specific plugins or system tools that are either missing, incompatible, or corrupted on the agent machine.

Diagnosis: Plugins are installed on the controller, not on the agent, so verify that the plugins your jobs use are installed and up to date there, and that the agent’s version and Java runtime are compatible with the controller version. Check the agent’s logs for errors related to missing libraries or tools required by build steps.

Fix: Ensure the agent environment has all required dependencies installed and configured correctly. This might involve installing specific libraries, compilers, or runtime environments on the agent machine that your builds depend on.
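One quick way to spot missing tools is to check each required command against the agent user’s PATH. The function name and the tool list in the usage line are examples only; substitute whatever your builds actually call.

```shell
# Print the name of every listed command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage (example tool list):
#   missing_tools git java docker
```

Run it in the same environment the agent uses (same user, same shell profile), because a tool installed for one user may not be on another user’s PATH.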

Cause 7: Network Timeout or Firewall Rules

The connection between the Jenkins controller and the agent might be terminated due to network timeouts or aggressive firewall rules that close idle connections.

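Diagnosis: Examine firewall logs on both the controller and agent machines, as well as any intermediate network devices. Check for connection reset messages, and note whether disconnects happen after a consistent period of inactivity, which points to an idle timeout.

Fix: Configure firewalls to allow long-lived connections between the controller and agent and, where possible, raise their idle timeouts. If network latency is a factor, raise the agent connection timeout on the controller (e.g., to 60000 milliseconds, or 1 minute). Where this setting lives depends on the Jenkins version and the agent’s launch method, so check the node’s launch settings rather than relying on a fixed menu path. The controller’s ping behavior can also be tuned with system properties such as hudson.slaves.ChannelPinger.pingIntervalSeconds and hudson.slaves.ChannelPinger.pingTimeoutSeconds; confirm the names and defaults against the documentation for your version. You might also need to adjust TCP keep-alive settings on the agent’s operating system.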
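To reason about the TCP keep-alive settings, it helps to compute how long an idle connection survives before the OS gives up on it. The sketch below does that arithmetic for Linux; the function name is illustrative, and these kernel settings only apply to sockets that have keep-alive enabled.

```shell
# Seconds before Linux declares an idle TCP peer dead:
# idle time before the first probe + (probe interval * number of probes).
keepalive_detection_seconds() {
    idle="$1"
    interval="$2"
    probes="$3"
    echo $(( idle + interval * probes ))
}

# Usage with the Linux defaults (7200 s idle, 75 s interval, 9 probes):
#   keepalive_detection_seconds 7200 75 9    # prints 7875
# Inspect the current values on a host:
#   sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl net.ipv4.tcp_keepalive_probes
```

With the defaults, the first probe is sent only after two hours of silence, which is longer than many firewalls keep idle connections open. Setting net.ipv4.tcp_keepalive_time below the firewall’s idle timeout keeps the connection from looking idle.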

The next error you’ll likely encounter after resolving these issues is a build failure due to a specific tool missing or misconfigured on the agent, or the build itself failing due to an application error.
