This is happening because the GitHub Actions runner stopped before the job finished, usually because of a resource constraint, a lost connection, or a limit enforced by GitHub, so the job is marked as cancelled rather than completed.

Runner Timeout

Diagnosis: Check the job logs and the run's annotations for a timeout message. On a timeout, GitHub typically reports that the job "has exceeded the maximum execution time of 360 minutes", and the step that was running ends with "The operation was canceled." The default timeout for a job is 360 minutes (6 hours), for public and private repositories alike. GitHub-hosted runners cap each job at 6 hours; jobs on self-hosted runners can run for up to 5 days.

Fix: If your job legitimately needs more time and runs on a self-hosted runner, you can extend the timeout by adding a timeout-minutes key to your workflow file. On GitHub-hosted runners a value above 360 has no effect, so split long work across several jobs instead. For instance, to set a timeout of 8 hours on a self-hosted runner:

jobs:
  my_job:
    runs-on: self-hosted
    timeout-minutes: 480
    steps:
      # ... your steps here

This raises the maximum allowed runtime for the job, so it is no longer cancelled for exceeding the default limit.

Why it works: timeout-minutes sets how long the job may run before GitHub cancels it automatically.
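The same key can also be set on an individual step, which lets a hung step be stopped early instead of consuming the whole job budget. The following is a minimal sketch, not a recommended configuration: the step name, the limits, and the ./run-tests.sh script are hypothetical.

```yaml
jobs:
  my_job:
    runs-on: ubuntu-latest
    timeout-minutes: 60          # upper bound for the whole job
    steps:
      - uses: actions/checkout@v4
      - name: Run integration tests
        timeout-minutes: 20      # only this step is stopped after 20 minutes
        run: ./run-tests.sh      # hypothetical script, replace with your own command
```

A tight step-level limit makes it easier to see which step stalled, because the timeout is reported against that step rather than against the job.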

Runner Out of Disk Space

Diagnosis: Look for errors in the job logs related to disk I/O, such as "No space left on device" or specific command failures like git clone or docker build failing due to insufficient space.

Fix: For self-hosted runners, ensure the disk that holds the runner's work directory has ample free space; size it for your largest build plus caches and container images (50GB free is a common rule of thumb). For GitHub-hosted runners, this is less common but can occur if a job downloads an excessive amount of data, builds large container images, or creates large intermediate files. If you hit this on GitHub-hosted runners, check usage with df -h, delete files your job no longer needs (build output, unused container images, or preinstalled tools you do not use), or optimize your workflow to avoid generating large temporary artifacts.

Why it works: By freeing up disk space or ensuring sufficient capacity, the runner can successfully perform operations that require writing data to disk.
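As a rough sketch of that cleanup on a GitHub-hosted Ubuntu runner, the steps below print disk usage and then remove a few large items. The directories listed are assumptions about what is preinstalled on the image and what your job does not need; verify them against your own runner before deleting anything.

```yaml
steps:
  - name: Show free disk space
    run: df -h
  - name: Free disk space
    run: |
      # Assumed paths: preinstalled toolchains this job does not use.
      sudo rm -rf /usr/share/dotnet /opt/ghc
      # Remove unused Docker images, containers, and build cache.
      docker system prune --all --force
      df -h
```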

Runner Out of Memory (OOM)

Diagnosis: Job logs will often show errors like "Killed" or "Out of memory" messages from the operating system, or specific process failures (e.g., a Python script crashing with a memory error). This can also manifest as the runner process itself crashing.

Fix:

  • GitHub-hosted runners: The image label does not determine memory (for example, ubuntu-22.04 and ubuntu-24.04 run on the same standard machine size). If you’re hitting memory limits, move the job to a larger runner, which is available on some paid plans, or optimize your code to use less memory. For example, if you’re running memory-intensive tasks like large data processing or compiling large projects, consider breaking them down into smaller steps, lowering build parallelism, or using more efficient algorithms.
  • Self-hosted runners: Upgrade the physical or virtual machine running your self-hosted runner to have more RAM. Ensure the runner process itself isn’t being starved by other processes on the host.

Why it works: Providing the runner or the processes within it with sufficient RAM prevents the operating system from terminating them due to memory exhaustion.
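The steps below are one way to confirm and reduce memory pressure on a Linux runner, offered as a sketch. The make invocation stands in for whatever build command your project uses, and the dmesg pattern is an assumption about how the kernel words its OOM messages.

```yaml
steps:
  - name: Show available memory
    run: free -h
  - name: Build with limited parallelism
    run: make -j2    # hypothetical build command; fewer parallel jobs lowers peak memory
  - name: Look for OOM killer activity
    if: failure()    # runs only when an earlier step failed
    run: sudo dmesg | grep -i -E 'out of memory|killed process' || true
```

The trailing `|| true` keeps the diagnostic step from failing when grep finds no match.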

Network Connectivity Issues

Diagnosis: Look for errors indicating timeouts when trying to connect to external services, download dependencies, or push artifacts. This might appear as curl timeouts, npm install failures, or git fetch errors.

Fix:

  • GitHub-hosted runners: Ensure your workflow isn’t trying to access resources that are inaccessible from GitHub’s network (e.g., private services without proper network configuration). If you’re downloading large dependencies, consider caching them to speed up subsequent runs and reduce network load.
  • Self-hosted runners: Verify that the runner machine has stable network connectivity to GitHub and any other external services your workflow depends on. Check firewall rules, DNS resolution, and general network health.

Why it works: A stable network connection is crucial for the runner to communicate with GitHub and fetch/upload necessary data.
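For a Node.js project, caching and a simple retry loop could look like the sketch below. The choice of npm, the cache key layout, and the retry count and delay are illustrative assumptions; adapt them to your package manager.

```yaml
steps:
  - uses: actions/checkout@v4
  - name: Cache npm downloads
    uses: actions/cache@v4
    with:
      path: ~/.npm
      key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
      restore-keys: |
        npm-${{ runner.os }}-
  - name: Install dependencies with retries
    run: |
      for attempt in 1 2 3; do
        if npm ci; then
          exit 0
        fi
        echo "npm ci failed (attempt $attempt); retrying in 15s"
        sleep 15
      done
      exit 1    # all attempts failed, so fail the step
```

The explicit `exit 1` matters: without it the loop would end on a successful `sleep` and the step would pass even though every install attempt failed.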

Runner Software Crashes or Bugs

Diagnosis: The job log might abruptly end with no clear error message from your script, or you might see generic runner process exit codes (e.g., exit code 137 which often indicates a SIGKILL, frequently due to OOM killer). Sometimes, the runner’s own diagnostic logs (if you have access to them for self-hosted runners) will show internal errors.

Fix:

  • GitHub-hosted runners: GitHub updates the runner images regularly, and ubuntu-latest moves to a newer OS version over time. If the issue persists, it might be a temporary glitch with the runner infrastructure, and retrying the workflow might resolve it. To rule out an image change, you can also pin a specific image version (e.g., ubuntu-22.04 instead of ubuntu-latest).
  • Self-hosted runners: Ensure your self-hosted runner software is up to date. The runner application updates itself automatically by default; if automatic updates are disabled, install the latest release by following the official documentation. Restarting the runner service (on Linux, sudo ./svc.sh stop followed by sudo ./svc.sh start from the runner directory) can also sometimes resolve transient issues.

Why it works: Keeping the runner software updated ensures you have the latest bug fixes and performance improvements, reducing the chance of internal crashes.
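To check whether a runner image change is involved, a job can pin an image and record what it ran on. This is a sketch only: the job name is hypothetical, and the ImageOS and ImageVersion environment variables are assumed to be set, as they are on current GitHub-hosted images.

```yaml
jobs:
  build:
    # Pinned image, so a move of ubuntu-latest cannot change the environment.
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      - name: Record runner details
        run: |
          # Assumed variables; they print empty on runners that do not set them.
          echo "Image: $ImageOS $ImageVersion"
          uname -a
```

Comparing this output between a passing run and a cancelled run shows quickly whether the environment changed between them.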

Workflow Configuration Errors

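Diagnosis: A syntax error in a workflow file normally stops the run from starting at all, so it rarely shows up as a cancellation. Configuration that cancels jobs by design is more often the cause: a concurrency group with cancel-in-progress: true cancels the older run when a new one starts, and a matrix with the default fail-fast: true cancels the remaining matrix jobs as soon as one fails. Incorrect if conditions can also cause steps or jobs to be skipped unexpectedly.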

Fix: Carefully review your .github/workflows/your-workflow.yml file for any syntax errors, incorrect indentation, or logic flaws in your if conditions. A linter such as actionlint, a third-party tool, can check for syntax issues locally (actionlint .github/workflows/your-workflow.yml).

Why it works: A correctly formatted and logically sound workflow file ensures the runner executes steps as intended, without encountering parsing or execution ambiguities.
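Two workflow settings cancel jobs by design and are worth checking when cancellations look unexplained: concurrency with cancel-in-progress, and strategy.fail-fast in a matrix. The sketch below sets both so that nothing is cancelled automatically; the job name, matrix values, and echo step are placeholders.

```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: false    # true would cancel the older run when a new one starts

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false         # default is true: one failing matrix job cancels the rest
      matrix:
        node: [18, 20]
    steps:
      - uses: actions/checkout@v4
      - run: echo "Testing on Node ${{ matrix.node }}"
```

Whether to keep these protections off is a trade-off: cancelling superseded runs saves runner minutes, so many teams leave cancel-in-progress enabled for pull request branches only.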

The next error you’ll likely encounter after fixing these issues is related to artifact upload failures if your jobs produce large outputs and the runner times out during artifact upload, rather than the job itself.
