Jenkins jobs get stuck in a "waiting for available executors" state when the Jenkins controller cannot find a free executor (a build slot on the built-in node or on a build agent) that is allowed to run them.
Here’s how to diagnose and fix this:
1. Check Jenkins Controller’s Executor Count
Diagnosis: The Jenkins controller itself (the built-in node) has a limited number of executors configured. If all of these are busy, jobs that are tied to the built-in node will queue up even if you have external build agents.
- Command/Check: Navigate to Manage Jenkins -> Manage Nodes and Clouds -> Built-In Node. Look at the "Number of executors" setting.
- Fix: Increase the number of executors on the built-in node. For example, if it’s set to 1, change it to 4.
- Why it works: This directly increases the number of concurrent builds the Jenkins controller can manage on its own, without relying on external agents.
2. Verify Agent Connection Status
Diagnosis: Your configured build agents (nodes) might be offline or disconnected from the Jenkins controller, rendering their executors unavailable.
- Command/Check: Go to Manage Jenkins -> Manage Nodes and Clouds. Look for your agent nodes. Their status should be "Online" (green arrow). If they are offline (red ball or "offline"), investigate the agent.
- Fix:
- For SSH agents: Ensure the SSH service is running on the agent machine and that Jenkins has the correct SSH credentials. Restart the agent process if necessary.
- For JNLP agents: Check the agent’s log file for connection errors. Ensure the agent.jar is running and can reach the Jenkins controller. Restart the agent process.
- Why it works: A disconnected agent cannot accept or run build jobs. Re-establishing the connection makes its executors available.
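The same check can be scripted instead of clicked through. The sketch below assumes the controller exposes its node list at /computer/api/json and that each entry carries displayName, offline, and offlineCauseReason fields; confirm this against your own instance (for example by opening /computer/api/json?pretty=true in a browser). The base URL, user, and API token passed to fetch_computer_json are placeholders, and the sample data is hand-written, not real output.

```python
import base64
import json
import urllib.request


def fetch_computer_json(base_url, user, api_token):
    """Fetch <base_url>/computer/api/json using HTTP basic auth.

    base_url, user and api_token are placeholders for your own values.
    """
    request = urllib.request.Request(base_url.rstrip("/") + "/computer/api/json")
    credentials = base64.b64encode(f"{user}:{api_token}".encode()).decode()
    request.add_header("Authorization", "Basic " + credentials)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def offline_nodes(computer_json):
    """Return (node name, reason) pairs for every node reported as offline."""
    result = []
    for node in computer_json.get("computer", []):
        if node.get("offline"):
            reason = node.get("offlineCauseReason") or "no reason reported"
            result.append((node.get("displayName", "?"), reason))
    return result


# Hand-written sample shaped like the API response (assumed field names).
sample = {
    "computer": [
        {"displayName": "Built-In Node", "offline": False, "offlineCauseReason": ""},
        {"displayName": "linux-agent-1", "offline": False, "offlineCauseReason": ""},
        {"displayName": "linux-agent-2", "offline": True,
         "offlineCauseReason": "Connection was broken"},
    ]
}

for name, reason in offline_nodes(sample):
    print(f"{name} is offline: {reason}")
```

Against a live controller you would call offline_nodes(fetch_computer_json(...)) with your own URL and credentials instead of the sample.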
3. Examine Agent Executor Configuration
Diagnosis: Individual agents might be configured with zero executors or have their executors set to offline, even if the agent itself is connected.
- Command/Check: Click on an agent node in Manage Jenkins -> Manage Nodes and Clouds. Then click Configure. Look at the "Number of executors" for that specific agent.
- Fix: Ensure the "Number of executors" is set to a value greater than 0 (e.g., 1 or 2). If the node has been marked "Temporarily offline," bring it back online from the node’s status page.
- Why it works: This ensures that the agent, when connected, is actually configured to accept and run jobs.
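A quick way to spot such nodes across a large installation is to scan the node list for the two conditions described in this step. This is a minimal sketch that assumes the parsed /computer/api/json response exposes numExecutors and temporarilyOffline per node; the function name and the sample data are illustrative only.

```python
def unusable_nodes(computer_json):
    """Map node name -> list of configuration issues that block scheduling."""
    problems = {}
    for node in computer_json.get("computer", []):
        issues = []
        # A node with zero executors can be online and still never run a build.
        if node.get("numExecutors", 0) == 0:
            issues.append("zero executors")
        # A node that was manually taken out of rotation.
        if node.get("temporarilyOffline"):
            issues.append("marked temporarily offline")
        if issues:
            problems[node.get("displayName", "?")] = issues
    return problems


# Hand-written sample data (assumed field names, not real output).
sample_nodes = {
    "computer": [
        {"displayName": "agent-a", "numExecutors": 2, "temporarilyOffline": False},
        {"displayName": "agent-b", "numExecutors": 0, "temporarilyOffline": False},
        {"displayName": "agent-c", "numExecutors": 2, "temporarilyOffline": True},
    ]
}

for name, issues in unusable_nodes(sample_nodes).items():
    print(f"{name}: {', '.join(issues)}")
```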
4. Check for Executor Leases/In-Use Status
Diagnosis: Even if executors are configured and agents are online, they might all be tied up running other jobs.
- Command/Check: On the Jenkins dashboard, look at the "Build Executor Status" section. It shows how many executors are busy and how many are idle across all nodes. If all are busy, you need more capacity or need to wait for jobs to finish.
- Fix:
- Add more agents: Configure and launch more build agents.
- Increase executors per agent: If agents have capacity, increase their "Number of executors" (as in point 3).
- Optimize existing jobs: Identify long-running jobs and try to optimize them or split them into smaller, parallelizable tasks.
- Why it works: This addresses the fundamental constraint: not enough capacity to run all pending jobs simultaneously.
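To put a number on "not enough capacity," you can compare busy executors against the total and look at how many items are queued. The sketch below assumes the top-level busyExecutors and totalExecutors fields of /computer/api/json and the items list of /queue/api/json (each item with a why message); treat those field names as assumptions to verify on your controller. The sample values are invented for illustration.

```python
def capacity_report(computer_json, queue_json):
    """Summarise executor usage and queue length from two parsed API responses."""
    total = computer_json.get("totalExecutors", 0)
    busy = computer_json.get("busyExecutors", 0)
    items = queue_json.get("items", [])
    idle = max(total - busy, 0)
    return {
        "total": total,
        "busy": busy,
        "idle": idle,
        "queued": len(items),
        # Saturated here means: work is waiting and no executor is idle.
        "saturated": idle == 0 and len(items) > 0,
        "reasons": [item.get("why") for item in items if item.get("why")],
    }


# Invented sample data shaped like the two API responses.
sample_computers = {"totalExecutors": 4, "busyExecutors": 4}
sample_queue = {
    "items": [
        {"task": {"name": "build-app"}, "why": "Waiting for next available executor"},
        {"task": {"name": "run-tests"}, "why": "Waiting for next available executor"},
    ]
}

report = capacity_report(sample_computers, sample_queue)
print(report)
```

If saturated stays true for long stretches, adding agents or executors is more likely to help than tuning individual jobs.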
5. Review Labels and Job Configuration
Diagnosis: Jobs are assigned to specific labels (e.g., linux, docker), and agents are also assigned labels. If a job’s required labels don’t match any available, online agents’ labels, the job will wait indefinitely.
- Command/Check:
- For a job: Go to the job’s configuration page, scroll down to "General" and look for "Restrict where this project can be run." Note the configured label expression.
- For agents: Go to Manage Jenkins -> Manage Nodes and Clouds. Click on an agent, then Configure. Look for "Labels" under the "General" section.
- Fix:
- Match job labels to agent labels: Ensure at least one online agent has a label that satisfies the job’s label expression. For example, if a job requires docker, ensure an agent is labeled docker.
- Remove restrictive labels: If a job doesn’t strictly need a specific environment, remove the restrictive label in its configuration.
- Add labels to agents: Assign appropriate labels to your agents.
- Why it works: Jenkins uses labels as a matching mechanism. If the job’s requirements (its labels) cannot be met by any available agent’s capabilities (its labels), the job cannot be scheduled.
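The matching rule can be illustrated with a deliberately simplified check. Real Jenkins label expressions also support operators such as ||, ! and parentheses; the sketch below handles only a single label or labels joined with && and rejects anything else. It assumes each node in the parsed /computer/api/json response lists its labels under assignedLabels as objects with a name field. The function name and sample data are illustrative.

```python
def agents_matching(label_expression, computer_json):
    """Return names of online nodes whose labels satisfy a simple expression.

    Only 'a' or 'a && b && c' is supported; this is not Jenkins' full grammar.
    """
    unsupported = ("||", "!", "(", ")", "->")
    if any(token in label_expression for token in unsupported):
        raise ValueError("only single labels or && conjunctions are supported")

    required = {part.strip() for part in label_expression.split("&&") if part.strip()}
    matches = []
    for node in computer_json.get("computer", []):
        if node.get("offline"):
            continue  # offline nodes cannot satisfy the job right now
        labels = {entry.get("name") for entry in node.get("assignedLabels", [])}
        if required <= labels:
            matches.append(node.get("displayName", "?"))
    return matches


# Hand-written sample data (assumed field names, not real output).
labelled_nodes = {
    "computer": [
        {"displayName": "agent-1", "offline": False,
         "assignedLabels": [{"name": "agent-1"}, {"name": "linux"}, {"name": "docker"}]},
        {"displayName": "agent-2", "offline": False,
         "assignedLabels": [{"name": "agent-2"}, {"name": "linux"}]},
        {"displayName": "agent-3", "offline": True,
         "assignedLabels": [{"name": "agent-3"}, {"name": "windows"}]},
    ]
}

print(agents_matching("linux && docker", labelled_nodes))
print(agents_matching("windows", labelled_nodes))
```

An empty result for a job’s label expression means the job will keep waiting until an agent with matching labels comes online.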
6. Investigate Agent Resource Constraints
Diagnosis: An agent might appear online and have executors configured, but the underlying machine might be out of resources (CPU, RAM, disk space), preventing Jenkins from launching new processes on it.
- Command/Check: SSH into the agent machine. Use commands like top, htop, free -m, and df -h to check CPU, memory, and disk usage.
- Fix:
- Free up resources: Stop unnecessary processes on the agent machine.
- Increase agent resources: Upgrade the hardware of the agent machine or allocate more resources if it’s a VM.
- Reduce concurrent jobs per agent: Lower the "Number of executors" on the agent if it’s consistently overloaded.
- Why it works: Even if Jenkins wants to start a build, the operating system cannot fulfill the request if resources are exhausted.
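If you would rather run one script on the agent than read several tools by eye, the checks can be wrapped up roughly as follows. This is a sketch using only the Python standard library; the thresholds (10% free disk, a 1-minute load average of 1.5 per CPU) are arbitrary example values rather than Jenkins recommendations, and os.getloadavg is only available on Unix-like systems.

```python
import os
import shutil


def evaluate(free_bytes, total_bytes, load1, cpus,
             min_free_fraction=0.10, max_load_per_cpu=1.5):
    """Return a list of warning strings for the given measurements."""
    warnings = []
    if total_bytes > 0 and free_bytes / total_bytes < min_free_fraction:
        warnings.append("low disk space")
    if cpus > 0 and load1 / cpus > max_load_per_cpu:
        warnings.append("high CPU load")
    return warnings


def check_local_machine(path="/"):
    """Measure the machine this script runs on and evaluate it."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (bytes)
    load1 = os.getloadavg()[0]        # 1-minute load average (Unix only)
    cpus = os.cpu_count() or 1
    return evaluate(usage.free, usage.total, load1, cpus)


# Example with made-up measurements: 2 GB free of 100 GB, load 9.0 on 4 CPUs.
print(evaluate(2 * 10**9, 100 * 10**9, 9.0, 4))
```

On the agent itself, call check_local_machine() with the path of the volume that holds the agent’s workspace directory.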
7. Check Jenkins Controller CPU/Memory Usage
Diagnosis: If the Jenkins controller itself is starved of CPU or memory, it can be slow to process the job queue and assign jobs to available executors, causing delays that look like "waiting for executors."
- Command/Check: Use system monitoring tools on the Jenkins controller machine (e.g., top, htop, vmstat). Check Jenkins’ own JVM memory usage via its process.
- Fix:
- Increase controller resources: Allocate more RAM or CPU to the Jenkins controller machine.
- Optimize Jenkins plugins: Disable or remove underutilized or resource-heavy plugins.
- Increase JVM heap size: If Jenkins is running out of memory, increase the -Xmx parameter in its startup configuration (e.g., JENKINS_JAVA_OPTIONS="-Xmx4096m"; the exact variable name depends on how Jenkins was installed).
- Why it works: A struggling controller cannot efficiently manage its workload, including job queuing and executor assignment.
Once executor availability is fixed, the next error you might encounter is a "disconnected agent" error, which appears if the agent machine itself becomes unreachable or crashes.