GitLab CI stops running jobs when the gitlab-runner service, which picks up and executes CI jobs, cannot connect to the GitLab API. While the connection is down, no new jobs are picked up, and jobs already in progress may time out.

Common Causes and Fixes

  1. Runner Not Registered or Authentication Token Expired

    • Diagnosis: Check the runner’s status in your GitLab project/group settings under "CI/CD" -> "Runners." If it shows as disconnected or has a red dot, the registration might be invalid. You can also check the runner’s configuration file (/etc/gitlab-runner/config.toml) for the token field.
    • Fix: Re-register the runner. On the runner machine, execute:
      sudo gitlab-runner register \
        --url "https://your.gitlab.instance.com/" \
        --registration-token "YOUR_REGISTRATION_TOKEN" \
        --description "my-new-runner" \
        --tag-list "docker,aws" \
        --executor "docker" \
        --docker-image "alpine:latest"
      
      Replace "https://your.gitlab.instance.com/" with your GitLab instance URL, "YOUR_REGISTRATION_TOKEN" with a fresh token obtained from your GitLab project/group’s CI/CD settings, and adjust description, tags, and executor as needed. This command re-establishes the secure link between the runner and GitLab using the latest credentials.
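    • Why it works: The registration token is used only during registration, to obtain a runner authentication token. That authentication token is what gets stored in the token field of config.toml and sent with every API request. If it is revoked, expires, or the runner is deleted in GitLab, the API rejects the runner's requests (typically with 403 Forbidden). Re-registering issues a new authentication token. Recent GitLab versions deprecate registration tokens: there you create the runner in the GitLab UI and pass the authentication token it shows to gitlab-runner register with --token instead of --registration-token.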
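Before re-registering, it can help to see which runner entries and tokens the machine currently holds. The following is a minimal sketch, not an official tool: the function name show_runner_entries is hypothetical, and it assumes config.toml stores one key = "value" pair per line. It prints only the first eight characters of each token.

```shell
#!/bin/sh
# Summarise each [[runners]] entry in a config.toml without leaking tokens.
# Assumption: keys are written one per line as key = "value".
show_runner_entries() {
  config="${1:-/etc/gitlab-runner/config.toml}"
  awk '
    /^[ \t]*\[\[runners\]\]/ { n++; in_sub = 0; next }   # start of a runner entry
    /^[ \t]*\[/              { in_sub = 1; next }         # nested table such as [runners.docker]
    n && !in_sub && /^[ \t]*(name|url|token)[ \t]*=/ {
      key = $1
      sub(/=.*/, "", key)          # tolerate key="value" written without spaces
      val = $0
      sub(/^[^"]*"/, "", val)      # drop everything up to the opening quote
      sub(/".*$/, "", val)         # drop the closing quote and anything after it
      if (key == "token") val = substr(val, 1, 8) "..."   # show only a prefix
      printf "runner %d: %s = %s\n", n, key, val
    }
  ' "$config"
}
```

Run it as root (the file is normally readable only by root), for example with sudo sh -c '. ./script.sh; show_runner_entries', where script.sh is whatever file you saved the function in. Entries left over from earlier registrations show up as extra "runner N" groups; sudo gitlab-runner verify reports which of them GitLab still accepts.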
  2. Network Connectivity Issues

    • Diagnosis: From the runner machine, try to curl your GitLab instance:
      curl -v https://your.gitlab.instance.com/api/v4/runners
      
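      Look for connection timeouts or SSL certificate errors. An HTTP 401 response is expected here because the request carries no token; it still confirms that DNS, routing, and TLS work. Also check firewall rules on the runner machine and any network firewalls between the runner and GitLab.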
    • Fix: If curl fails, ensure the runner machine can resolve and reach your GitLab instance’s hostname and port (usually 443 for HTTPS). If traffic must go through a corporate proxy, note that config.toml has no http_proxy or https_proxy keys. The runner process reads the standard HTTP_PROXY and HTTPS_PROXY environment variables, so on a systemd host set them in a drop-in file such as /etc/systemd/system/gitlab-runner.service.d/http-proxy.conf:
      [Service]
      Environment="HTTP_PROXY=http://your.proxy.server:8080"
      Environment="HTTPS_PROXY=http://your.proxy.server:8080"
      
      Jobs do not inherit those variables. If job containers also need the proxy, pass it through the environment key in config.toml:
      [[runners]]
        url = "https://your.gitlab.instance.com/"
        token = "..."
        executor = "docker"
        environment = ["http_proxy=http://your.proxy.server:8080", "https_proxy=http://your.proxy.server:8080"]
        [runners.docker]
          image = "alpine:latest"
      
      Run sudo systemctl daemon-reload, then restart the gitlab-runner service.
    • Why it works: The runner needs a working network path to the GitLab API to poll for jobs and report status. Where a proxy sits on that path, the proxy settings tell the runner to send its requests through it rather than attempting a direct connection that the network blocks.
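The curl check above can be wrapped so that its exit status is translated into a likely cause. This is a sketch under stated assumptions: the function names explain_curl_exit and check_gitlab are hypothetical, the exit codes are the ones documented in the EXIT CODES section of the curl man page, and the suggested causes are rules of thumb rather than certainties.

```shell
#!/bin/sh
# Translate a curl exit status into the most likely cause.
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable: DNS, TCP and TLS all worked" ;;
    5)  echo "proxy hostname could not be resolved" ;;
    6)  echo "GitLab hostname could not be resolved (DNS)" ;;
    7)  echo "connection failed (service down, wrong port, or firewall reject)" ;;
    28) echo "timed out (firewall drop or routing problem)" ;;
    35) echo "TLS handshake failed" ;;
    60) echo "server certificate not trusted" ;;
    *)  echo "curl exit $1: see EXIT CODES in the curl man page" ;;
  esac
}

# Probe the API endpoint used in the diagnosis step and explain the result.
# Without --fail, curl exits 0 on an HTTP 401, which is the expected response
# for an unauthenticated request.
check_gitlab() {
  status=0
  curl --silent --show-error --output /dev/null --max-time 10 \
    "${1%/}/api/v4/runners" || status=$?
  explain_curl_exit "$status"
  return "$status"
}
```

For example, check_gitlab "https://your.gitlab.instance.com/" run on the runner machine prints one line naming the layer that failed. Run it once with and once without the proxy variables exported to see whether the proxy is what makes the difference.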
  3. Incorrect GitLab Instance URL in config.toml

    • Diagnosis: Examine the config.toml file (/etc/gitlab-runner/config.toml) on the runner machine. Look for the url parameter under the [[runners]] section.
    • Fix: Ensure the url is exactly correct, including https:// and the correct domain. For example:
      [[runners]]
        url = "https://gitlab.example.com/"
        # ... other configurations
      
      After correcting the URL, restart the gitlab-runner service:
      sudo systemctl restart gitlab-runner
      
    • Why it works: A mistyped URL means the runner is attempting to connect to a non-existent or incorrect server, preventing any communication.
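A quick scan of every url value can catch the most common mistake, a missing scheme. The sketch below is illustrative only: check_runner_urls is a hypothetical helper, and it assumes each url is written on its own line as url = "...". It cannot tell whether the hostname itself is right, only whether the value is shaped like an HTTP(S) URL.

```shell
#!/bin/sh
# List every url value in a config.toml and flag values without a scheme.
check_runner_urls() {
  config="${1:-/etc/gitlab-runner/config.toml}"
  bad=0
  # Pull the quoted value out of every line of the form: url = "..."
  for url in $(sed -n 's/^[[:space:]]*url[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' "$config"); do
    case "$url" in
      http://*|https://*) echo "ok   $url" ;;
      *) echo "BAD  $url (no http:// or https:// scheme)"; bad=1 ;;
    esac
  done
  return "$bad"
}
```

The function returns a non-zero status when any entry is flagged, so it can gate a restart: check_runner_urls && sudo systemctl restart gitlab-runner.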
  4. gitlab-runner Service Not Running or Crashing

    • Diagnosis: Check the status of the gitlab-runner service:
      sudo systemctl status gitlab-runner
      
      Look for "active (running)" or "inactive (dead)". If it’s not running or has recently failed, check the logs for errors:
      sudo journalctl -u gitlab-runner -f
      
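    • Fix: Start the service if it is stopped, or restart it if it is running but unresponsive:
      sudo systemctl start gitlab-runner     # service is inactive
      sudo systemctl restart gitlab-runner   # service is running but misbehaving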
      
      If it keeps crashing, investigate the journalctl output for specific errors (e.g., disk full, out of memory, configuration parsing errors) and address those underlying issues.
    • Why it works: The gitlab-runner service is the daemon that continuously polls GitLab for jobs and manages job execution. If it’s not running, no jobs can be processed.
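When the log is long, a small filter can point to the relevant cause in this list. This is a rough sketch: triage_runner_log is a hypothetical helper, and the matched strings are assumptions based on error text commonly produced by Go programs and the operating system, so treat a match as a hint rather than a diagnosis.

```shell
#!/bin/sh
# Read log lines on stdin and print a counted hint for each recognised pattern.
triage_runner_log() {
  while IFS= read -r line; do
    case "$line" in
      *"certificate signed by unknown authority"*|*"certificate has expired"*)
        echo "TLS: see cause 6" ;;
      *"403 Forbidden"*)
        echo "auth: token rejected, see cause 1" ;;
      *"no such host"*|*"connection refused"*|*"i/o timeout"*)
        echo "network: see cause 2" ;;
      *"no space left on device"*)
        echo "disk full: see cause 5" ;;
    esac
  done | sort | uniq -c
}
```

A typical invocation is sudo journalctl -u gitlab-runner --since "1 hour ago" --no-pager | triage_runner_log, which prints one line per matched category with the number of log lines that hit it.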
  5. Resource Constraints on the Runner Machine

    • Diagnosis: Monitor the CPU, RAM, and disk space on the machine hosting the GitLab Runner. Use commands like top, htop, free -h, and df -h. If the runner machine is starved of resources, the gitlab-runner process might become unresponsive or crash.
    • Fix: Allocate more resources to the runner machine (e.g., increase RAM, CPU, or disk space). If using Docker executors, ensure the Docker daemon itself has sufficient resources and that child processes aren’t being OOM-killed. Free up disk space if it’s full.
    • Why it works: The runner process and any spawned job processes (like Docker containers) require system resources to operate. Insufficient resources lead to instability and failures.
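Disk exhaustion is the easiest of these to check automatically. The sketch below uses df -P, whose output format is fixed by POSIX; the helper names and the 90 percent default threshold are arbitrary choices for illustration, not values from GitLab.

```shell
#!/bin/sh
# Percentage of space used on the filesystem that holds PATH (default /).
disk_used_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn and return non-zero when usage reaches LIMIT percent (default 90).
warn_if_disk_full() {
  path="${1:-/}"
  limit="${2:-90}"
  used=$(disk_used_pct "$path")
  if [ "$used" -ge "$limit" ]; then
    echo "WARNING: $path is ${used}% full"
    return 1
  fi
  echo "ok: $path is ${used}% full"
}
```

With the Docker executor it is worth pointing the check at the Docker data directory as well as /, for example warn_if_disk_full /var/lib/docker 85, assuming Docker uses its default data location.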
  6. SSL Certificate Issues (Self-Signed or Expired)

    • Diagnosis: If your GitLab instance uses a self-signed SSL certificate or its certificate has expired, the runner might fail to connect securely. The curl command from step 2 will likely show SSL errors. Check the runner logs (journalctl -u gitlab-runner -f) for messages like "x509: certificate signed by unknown authority" or "certificate has expired".
    • Fix:
      • Option A (Recommended): Configure your GitLab instance with a valid, trusted SSL certificate (e.g., from Let’s Encrypt).
      • Option B (Less Secure): If you must use a self-signed certificate, you need to tell the runner to trust it. Copy the CA certificate (or the self-signed certificate itself if it’s acting as its own CA) to the runner machine, e.g., /etc/gitlab-runner/certs/gitlab.example.com.crt. Then, in /etc/gitlab-runner/config.toml, add or modify the tls-ca-file setting:
        [[runners]]
          url = "https://your.gitlab.instance.com/"
          token = "..."
          tls-ca-file = "/etc/gitlab-runner/certs/gitlab.example.com.crt"
          executor = "docker"
          # ... rest of config
        
        Restart the gitlab-runner service.
      • Option C (Not Recommended): Disabling TLS verification is sometimes suggested, but the tls_verify setting under [runners.docker] only controls how the runner connects to the Docker daemon. It has no effect on the connection to the GitLab API, so setting it to false does not resolve these errors and only weakens the Docker connection. Use Option A or B instead.
    • Why it works: The runner needs to establish a secure TLS connection to the GitLab API. If it cannot verify the server’s identity because the certificate is untrusted or expired, it refuses to connect. Installing a trusted certificate on GitLab, or giving the runner the correct CA certificate, lets verification succeed.
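Before pointing tls-ca-file at a certificate, it is worth confirming that the file parses as PEM and is not about to expire. The following sketch relies on the openssl x509 options -subject, -enddate, and -checkend; check_cert_file is a hypothetical helper name and the 30-day default is an arbitrary example.

```shell
#!/bin/sh
# Show subject and expiry of a PEM certificate; fail if it expires within DAYS.
# Returns 2 if the file cannot be parsed, 1 if it expires too soon, 0 otherwise.
check_cert_file() {
  cert="$1"
  days="${2:-30}"
  openssl x509 -in "$cert" -noout -subject -enddate || return 2
  if openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null; then
    echo "ok: valid for at least $days more days"
  else
    echo "WARNING: expires within $days days (or has already expired)"
    return 1
  fi
}
```

For instance, check_cert_file /etc/gitlab-runner/certs/gitlab.example.com.crt prints the certificate subject and its notAfter date, followed by a one-line verdict.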

Once the runner is connected again, any remaining failures are usually specific to individual jobs, such as insufficient disk space within a Docker container, missing dependencies, or permission errors in your CI scripts.
