When GitLab CI runners fail to pick up jobs, those jobs sit in a pending state and the pipeline appears stuck, because the GitLab instance can’t find an available runner to execute them.
Runner Registration Token Mismatch or Expiration
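Diagnosis:
Check the token value under [[runners]] in the runner’s config.toml file. Then navigate to your GitLab project/group/instance settings under "CI/CD" -> "Runners" and confirm that the runner is still listed and that its token has not been reset or expired. On the runner host, sudo gitlab-runner verify reports whether GitLab still accepts the stored token.
Cause: The token in config.toml has expired, was reset, or belongs to a runner that has been removed from GitLab. Note that this value is the runner authentication token issued at registration, not the registration token itself. Without a valid token the runner cannot authenticate with GitLab.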
Fix:
- Generate a new token: In GitLab, go to your project/group/instance settings under "CI/CD" -> "Runners" and create or re-register the runner to get a fresh token. The button label depends on your GitLab version (for example, "Register new runner").
- Update config.toml: Edit the runner’s config.toml file (usually located at /etc/gitlab-runner/config.toml on Linux) and replace the token value with the new one. If GitLab gave you a legacy registration token rather than a runner authentication token, run sudo gitlab-runner register instead of editing the file, because registration exchanges that token for the one stored in config.toml.

      concurrent = 1
      check_interval = 10

      [[runners]]
        name = "my-gitlab-runner"
        url = "https://gitlab.example.com/"
        token = "YOUR_NEW_RUNNER_TOKEN"  # Replace with your new token
        executor = "docker"
        [runners.docker]
          tls_verify = false
          image = "ruby:2.7"
          privileged = true
          disable_cache = false
          volumes = ["/cache"]

- Restart the runner service:

      sudo gitlab-runner restart
Why it works: The token is the credential the runner presents each time it asks GitLab for jobs. When GitLab rejects it, those requests fail and the runner never receives work. A valid token restores that trust.
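To make the two token types easier to tell apart, here is a minimal shell sketch that classifies a token by its prefix. It assumes the prefixes used by recent GitLab versions (glrt- for runner authentication tokens, GR1348941 for legacy registration tokens); tokens issued by older versions may carry no prefix at all. The function name classify_runner_token is hypothetical.

```shell
# Classify a GitLab runner token by its prefix (hypothetical helper).
# Assumption: recent GitLab versions prefix runner authentication tokens
# with "glrt-" and legacy registration tokens with "GR1348941".
classify_runner_token() {
  case $1 in
    glrt-*)     echo "authentication" ;;  # the kind stored in config.toml
    GR1348941*) echo "registration" ;;    # needs `gitlab-runner register` first
    "")         echo "empty" ;;
    *)          echo "unknown" ;;         # older tokens may have no prefix
  esac
}

# Example with a placeholder value, not a real token.
classify_runner_token "glrt-EXAMPLEONLY"
```

Pass it the token value copied from config.toml. A result of "registration" suggests the file was edited by hand with the wrong kind of token.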
Runner Not Connected or Offline
Diagnosis: Check the runner’s status in GitLab’s UI; a runner that has not contacted GitLab recently is shown as offline. Also verify that the runner service is running on the host machine.
Cause: The GitLab Runner service is stopped or has crashed, or the machine it runs on cannot reach the GitLab instance.
Fix:
- Check service status: On the runner’s host machine, run:

      sudo gitlab-runner status

  If it’s not running, start it:

      sudo gitlab-runner start

- Verify network connectivity: Ensure the runner machine can reach the GitLab instance URL (https://gitlab.example.com/) via ping or curl.

      ping gitlab.example.com
      curl -v https://gitlab.example.com/

  If there are network issues, troubleshoot firewalls, DNS, or network routing.
Why it works: Runners pull work rather than having it pushed to them: the runner process polls the GitLab instance for jobs at a regular interval (check_interval in config.toml). If the process isn’t running or can’t reach GitLab over the network, it never asks for jobs, so none are assigned to it.
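The connectivity check can be wrapped in a small function so it is easy to run from the runner host or from a monitoring script. This is a minimal sketch, assuming curl is installed; the function name check_gitlab_reachable and the five-second timeout are illustrative choices, not part of GitLab Runner.

```shell
# Return 0 if the given URL answers with a 2xx or 3xx status, 1 otherwise.
# Hypothetical helper; assumes curl is available on the runner host.
check_gitlab_reachable() {
  gitlab_url=$1
  # -o /dev/null discards the body; -w prints only the HTTP status code.
  http_code=$(curl -sS -o /dev/null -w '%{http_code}' --max-time 5 \
    "$gitlab_url" 2>/dev/null) || return 1
  case $http_code in
    2??|3??) return 0 ;;  # a redirect to the sign-in page still proves reachability
    *)       return 1 ;;
  esac
}
```

For example, `check_gitlab_reachable https://gitlab.example.com/ && echo reachable` prints "reachable" only when the runner host can complete an HTTP request to the instance. A failure here points to DNS, firewall, proxy, or TLS problems rather than to GitLab Runner itself.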
Insufficient Runner Tags for Job Requirements
Diagnosis:
Examine the .gitlab-ci.yml file for the job that is stuck. Look for the tags keyword. Compare these tags with the tags assigned to your registered runners in the GitLab UI.
Cause: The job is configured with specific tags, but no registered runner has all of them. GitLab uses tags to decide which runners are eligible for a job.
Fix:
- Add tags to runners: In GitLab, navigate to your project/group/instance settings under "CI/CD" -> "Runners". Edit the relevant runner and add the required tags (e.g., docker, ruby, production).
- Or, update .gitlab-ci.yml: Modify the job definition in your .gitlab-ci.yml to match existing runner tags.

      my_job:
        stage: build
        script:
          - echo "Building..."
        tags:
          - docker  # Ensure at least one runner has this tag
Why it works: Tags act as filters. When a job lists tags, GitLab only hands it to a runner that has every one of those tags. Extra tags on the runner are fine, but a single missing tag excludes it.
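The matching rule can be sketched in a few lines of shell: a runner is eligible only if every tag on the job also appears on the runner. The function name runner_matches_job is hypothetical, and tags are passed as space-separated strings purely for illustration.

```shell
# Return 0 if the runner has every tag the job asks for, 1 otherwise.
# Hypothetical helper; both arguments are space-separated tag lists.
runner_matches_job() {
  runner_tags=$1
  job_tags=$2
  for job_tag in $job_tags; do
    # Pad with spaces so "docker" does not match "dockerx".
    case " $runner_tags " in
      *" $job_tag "*) ;;          # tag present, keep checking
      *)              return 1 ;; # one missing tag disqualifies the runner
    esac
  done
  # An empty job tag list passes here; in GitLab, whether a runner takes
  # untagged jobs is a separate runner setting.
  return 0
}
```

With this sketch, `runner_matches_job "docker ruby production" "docker ruby"` succeeds, while `runner_matches_job "docker" "docker ruby"` fails because the runner lacks the ruby tag.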
Runner Concurrent Job Limit Reached
Diagnosis:
Check the concurrent parameter at the top of the runner’s config.toml, along with any limit values set under [[runners]]. Also look at the jobs listed on the runner’s details page in GitLab to see how many are currently running.
Cause: The runner process is configured with a maximum number of concurrent jobs, and all of its slots are occupied by other running jobs.
Fix:
- Increase concurrent in config.toml: Edit the runner’s config.toml file and increase the concurrent value.

      concurrent = 4  # Increased from the default of 1
      check_interval = 10

      [[runners]]
        # ... other runner configurations

  Then restart the runner service:

      sudo gitlab-runner restart

- Add more runners: Register additional runners, especially if you have multiple distinct configurations or environments.
Why it works: The concurrent setting caps how many jobs one runner process can run at the same time, across every [[runners]] entry in its config.toml (each entry can also set its own limit). Once the cap is reached, the runner stops requesting new jobs until a running one completes.
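As a rough illustration of how the global concurrent value interacts with the optional per-runner limit setting in config.toml, the sketch below computes how many jobs one runner process can run at once. It assumes that a limit of 0 means "no per-runner limit", which is the documented default; the function name effective_job_slots is hypothetical.

```shell
# Print the number of jobs a runner process can run at the same time.
# Usage: effective_job_slots CONCURRENT LIMIT [LIMIT ...]
# Each LIMIT is the `limit` value of one [[runners]] entry (0 = unlimited).
effective_job_slots() {
  concurrent=$1
  shift
  total_limit=0
  for runner_limit in "$@"; do
    if [ "$runner_limit" -eq 0 ]; then
      # One unlimited entry means only the global cap applies.
      echo "$concurrent"
      return 0
    fi
    total_limit=$((total_limit + runner_limit))
  done
  # Otherwise the smaller of the global cap and the summed limits wins.
  if [ "$total_limit" -lt "$concurrent" ]; then
    echo "$total_limit"
  else
    echo "$concurrent"
  fi
}
```

For example, `effective_job_slots 4 1 2` prints 3: raising concurrent to 4 does not help when the two [[runners]] entries are limited to 1 and 2 jobs, so the limit values need raising as well.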
GitLab Instance Runner Queue Full or Unresponsive
Diagnosis: Examine the GitLab instance’s logs for errors related to runner communication or job queuing. Check the "CI/CD" -> "Jobs" section in GitLab for any visible errors or timeouts.
Cause: The GitLab instance itself might be under heavy load, experiencing performance issues, or have an internal problem that prevents it from properly dispatching jobs to available runners. This is less common but can happen on very busy instances.
Fix:
- Check GitLab instance health: Monitor your GitLab server’s CPU, memory, and disk I/O. Address any resource constraints.
- Review GitLab logs: Look for errors in /var/log/gitlab/gitlab-rails/production.log or /var/log/gitlab/sidekiq/current. These paths apply to Linux package (Omnibus) installations.
- Restart GitLab services:

      sudo gitlab-ctl restart
Why it works: A healthy GitLab instance is crucial for managing the CI/CD pipeline. If the instance is struggling, it cannot effectively communicate with or assign tasks to its runners.
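One quick way to gauge whether the instance is struggling is to count server-side errors in the Rails log. The sketch below assumes the plain-text Rails log format, in which each finished request is logged as "Completed <status> ...", and treats any 5xx status as a server error. The function name count_server_errors is hypothetical, and the count is only a rough health signal, not a runner-specific diagnosis.

```shell
# Print how many requests in a Rails-style log finished with a 5xx status.
# Hypothetical helper; assumes lines such as
# "Completed 500 Internal Server Error in 42ms".
count_server_errors() {
  log_file=$1
  # grep -c exits non-zero when the count is 0, so swallow that status.
  grep -cE 'Completed 5[0-9]{2} ' "$log_file" || true
}
```

On a Linux package installation it could be run as `count_server_errors /var/log/gitlab/gitlab-rails/production.log`. A count that keeps climbing while jobs stay pending suggests the problem is on the GitLab side rather than on the runners.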
Docker Executor Issues (Image Pull Failures, Permissions)
Diagnosis: If using the Docker executor, check the runner’s logs for specific Docker-related errors. Look for messages indicating failure to pull images or permission denied errors.
Cause: The runner’s Docker daemon cannot pull the required Docker image for the job (e.g., due to network issues, registry authentication problems, or image not existing) or lacks the necessary permissions to run containers.
Fix:
- Verify Docker image: Ensure the image specified in .gitlab-ci.yml exists and is accessible from the runner.
- Check Docker daemon logs: On the runner host, examine Docker logs for pull failures:

      sudo journalctl -u docker.service

- Authentication for private registries: If pulling from a private registry, ensure Docker credentials are configured correctly for the GitLab runner.
- Permissions: Ensure the user running the gitlab-runner process has sufficient permissions to interact with the Docker daemon (often by adding the user to the docker group).

      sudo usermod -aG docker gitlab-runner
      sudo gitlab-runner restart
Why it works: The Docker executor relies on the Docker daemon to create and manage job environments. Failures in image retrieval or container execution directly prevent jobs from starting.
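Before changing group membership, it can help to confirm whether the user is already in the docker group. This is a minimal sketch: the function name has_docker_group is hypothetical, and it takes the group list as a string so that it works with the output of id -nG for any user.

```shell
# Return 0 if the space-separated group list contains "docker".
# Hypothetical helper; feed it the output of `id -nG <user>`.
has_docker_group() {
  # Pad with spaces so only the exact group name matches.
  case " $1 " in
    *" docker "*) return 0 ;;
    *)            return 1 ;;
  esac
}
```

For example, `has_docker_group "$(id -nG gitlab-runner)" || echo "not in docker group"` prints a message only when the gitlab-runner user lacks the membership. Keep in mind that membership in the docker group effectively grants root-level access to the host.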
After fixing these issues, the next potential problem you might encounter is jobs failing due to insufficient disk space on the runner or within the Docker containers used for execution.