The GitLab CI job failed because the runner couldn’t extract the cache archive, which points to a problem with the cache’s integrity, the runner’s access to the cache storage, or the job’s cache configuration.

Common Causes and Fixes:

  1. Corrupted Cache Archive: The archive that GitLab Runner uses for caching (cache.zip by default) can become corrupted or truncated during upload or download.

    • Diagnosis: Look for messages in the job log such as Failed to extract cache, often accompanied by a zip error like zip: not a valid zip file. You can also download the cache file and test it manually, for example with unzip -t cache.zip.
    • Fix: The simplest fix is to clear the runner caches for the project. In your GitLab project, navigate to CI/CD > Pipelines (Build > Pipelines in newer GitLab versions) and click the "Clear runner caches" button. The next job starts with an empty cache and downloads its dependencies fresh.
    • Why it works: Clearing the caches changes the internal cache name, so jobs stop using the problematic archive and rebuild and re-upload a new one. The old archive is not deleted from cache storage; remove it manually if you need the space back.
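
The manual check described above can be sketched as a small shell function. This is a minimal sketch that assumes the runner's default zip cache format; the function name check_cache_archive and its status strings are hypothetical, not part of GitLab Runner.

```shell
# Hypothetical helper: report whether a downloaded cache archive looks intact.
# Assumes the runner's default zip cache format.
check_cache_archive() {
    archive="$1"
    # A missing or zero-byte file cannot be a usable cache.
    if [ ! -s "$archive" ]; then
        echo "missing-or-empty"
        return 1
    fi
    # A zip archive normally starts with the two bytes "PK".
    if [ "$(head -c 2 "$archive")" != "PK" ]; then
        echo "not-a-zip"
        return 1
    fi
    # Full integrity test, only if unzip is available on this machine.
    if command -v unzip >/dev/null 2>&1; then
        if ! unzip -tq "$archive" >/dev/null 2>&1; then
            echo "corrupt"
            return 1
        fi
    fi
    echo "ok"
}
```

Call it as check_cache_archive /path/to/cache.zip on a copy of the archive. The magic-byte check is only a quick filter; a truncated archive can still start with "PK", which is why the unzip -t pass matters when unzip is installed.
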
  2. Insufficient Disk Space on Runner: If the runner’s disk runs out of space during cache extraction, the process will fail.

    • Diagnosis: Check the runner’s available disk space. If you have shell access to the runner, use df -h. Look for partitions that are at or near 100% usage, especially where GitLab CI artifacts/caches are stored.
    • Fix: Free up disk space on the runner by deleting old logs, unused Docker images, or other temporary files. Alternatively, increase the disk size of the runner instance.
    • Why it works: Cache extraction involves writing files to disk. Without enough space, the operation cannot complete.
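
The df check can be turned into a quick threshold test. The sketch below is illustrative: the helper name disk_used_percent, the CACHE_DIR default of /tmp, and the 90% limit are all assumptions to adjust for your runner.

```shell
# Hypothetical helper: print the used-space percentage (0-100) of the
# filesystem that holds the given path.
disk_used_percent() {
    # -P forces the portable one-line-per-filesystem output format.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Assumed path and threshold; point CACHE_DIR at the runner's cache location.
CACHE_DIR="${CACHE_DIR:-/tmp}"
LIMIT=90
used="$(disk_used_percent "$CACHE_DIR")"
if [ "$used" -ge "$LIMIT" ]; then
    echo "WARNING: $CACHE_DIR is ${used}% full"
else
    echo "OK: $CACHE_DIR is ${used}% full"
fi
```

Checking the filesystem that holds the cache path, rather than the whole df listing, avoids false alarms from unrelated full partitions.
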
  3. Permissions Issues with Cache Storage: The user account running the GitLab CI runner process might not have the necessary read/write permissions for the cache directory on the runner’s filesystem or for the object storage bucket.

    • Diagnosis: If using a shared runner, you might not have direct access to diagnose this. For self-hosted runners, check the permissions of the runner’s local cache directory (for the shell executor this is typically /home/gitlab-runner/cache) as well as the builds directory (often /home/gitlab-runner/builds/<runner-token>/<project-id>). The gitlab-runner user needs read and write access to both. If using S3, verify the IAM role or access keys allow s3:GetObject, s3:PutObject, and s3:DeleteObject on the bucket; for GCS, check the equivalent storage object permissions.
    • Fix: Ensure the gitlab-runner user has appropriate permissions. For filesystem caches, sudo chown -R gitlab-runner:gitlab-runner /path/to/cache/directory might be necessary. For object storage, update the IAM policy or credentials.
    • Why it works: The runner process needs to read from and write to the cache location. Lack of permissions prevents these operations.
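
For a filesystem cache, the access check can be sketched as follows. The function name check_cache_dir_access is hypothetical, and the result only reflects the user who runs it, so run it as the gitlab-runner user.

```shell
# Hypothetical helper: check that the current user can list, read, and
# write a local cache directory.
check_cache_dir_access() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"
        return 1
    fi
    # -x on a directory means the user may enter it and reach its files.
    if [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "ok"
    else
        echo "denied"
        return 1
    fi
}
```

A result of denied is the case where the chown fix above applies; missing usually means the path in the runner configuration is wrong rather than a permissions problem.
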
  4. Incorrect Cache Key Configuration: The cache:key in your .gitlab-ci.yml might be too broad or too specific, leading to cache conflicts or preventing the correct cache from being found.

    • Diagnosis: Review your .gitlab-ci.yml for cache: directives. Pay close attention to the key: value. If it’s a static string, every job that uses that key shares one cache, which causes problems when the jobs cache different paths or dependency versions. If it uses variables that change on every pipeline, the cache is invalidated so often that it is rarely reused.
    • Fix: Derive the cache key from the files that determine the cache contents. For a Ruby project, key on the lockfile:

          cache:
            key:
              files:
                - Gemfile.lock

      For Node.js, add a per-branch prefix:

          cache:
            key:
              prefix: "$CI_COMMIT_REF_SLUG"
              files:
                - package-lock.json

    • Why it works: A well-defined cache key ensures that each job restores only the cache that matches its own dependencies and context, which prevents jobs from overwriting each other’s caches with mismatched contents.
  5. Network Connectivity Issues to Cache Storage: When a distributed cache is configured, the runner might be unable to connect to the cache storage (e.g., S3, GCS, or another object storage endpoint).

    • Diagnosis: Check the runner logs for network-related errors like connection refused, timeout, or DNS resolution failures. If using S3/GCS, try to curl the endpoint from the runner’s environment; ping is less conclusive, because storage endpoints may not answer ICMP.
    • Fix: Ensure the runner has network access to the cache endpoint. This might involve configuring firewall rules, VPC settings, or proxy settings on the runner. If using object storage, verify the endpoint URL is correct in the runner configuration (/etc/gitlab-runner/config.toml or environment variables).
    • Why it works: The runner needs a stable network connection to download (extract) and upload (save) cache archives to the remote storage.
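
The curl probe can be sketched so that its exit code maps to a likely cause. The function names explain_curl_exit and probe_endpoint are hypothetical; the exit codes themselves are curl's documented ones.

```shell
# Hypothetical helper: translate common curl exit codes into a likely cause.
explain_curl_exit() {
    case "$1" in
        0)     echo "reachable" ;;
        5)     echo "proxy could not be resolved" ;;
        6)     echo "DNS resolution failed" ;;
        7)     echo "connection refused or blocked" ;;
        28)    echo "timed out" ;;
        35|60) echo "TLS problem" ;;
        *)     echo "other curl error ($1)" ;;
    esac
}

# Probe an endpoint URL and print the interpretation of curl's exit code.
probe_endpoint() {
    rc=0
    # -s silent, -o discard body, --max-time caps the whole request.
    curl -s -o /dev/null --max-time 10 "$1" || rc=$?
    explain_curl_exit "$rc"
}
```

Run probe_endpoint with your cache endpoint URL from the same host or container the runner uses. An HTTP error response such as 403 still counts as reachable here, since it shows the network path works and points instead to credentials or permissions.
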
  6. Object Storage Configuration Errors: If using external object storage (like S3), misconfiguration of the storage provider details in the runner’s config.toml can lead to extraction failures.

    • Diagnosis: Double-check the cache settings in the runner’s config.toml. For S3 these live under [runners.cache] and [runners.cache.s3]: Type, ServerAddress, BucketName, BucketLocation, and either AccessKey and SecretKey or IAM-based authentication. GCS has an equivalent [runners.cache.gcs] section. If the runner was registered through environment variables, check the corresponding CACHE_* values used at registration.
    • Fix: Correct any typos or incorrect values in the object storage configuration. Ensure the credentials provided are valid and have the necessary permissions. Restart the runner process if the changes are not picked up.
    • Why it works: Incorrect credentials or region settings mean the runner cannot authenticate with or locate the correct storage bucket, preventing cache access.
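
A quick way to spot an incomplete S3 cache section is to scan config.toml for the expected keys. This is a minimal sketch: the function name missing_s3_cache_keys is hypothetical, the key list is an assumption limited to settings that an S3 cache typically needs, and it does not validate the values themselves.

```shell
# Hypothetical helper: list S3 cache settings that are absent from a
# runner config.toml. Assumes the key names used by [runners.cache]
# and [runners.cache.s3].
missing_s3_cache_keys() {
    config="$1"
    for key in Type BucketName BucketLocation; do
        # Match "Key = ..." at the start of a line, ignoring indentation.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$config"; then
            echo "$key"
        fi
    done
}
```

Run it as missing_s3_cache_keys /etc/gitlab-runner/config.toml; empty output means all three keys are present. Credentials are left out of the list on purpose, because a runner that authenticates through an IAM role may not set AccessKey and SecretKey at all.
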

The next error you might encounter once these are resolved is a "job exceeded time limit" error. Cache restoration runs before the actual job steps, so a job that starts with an empty or freshly cleared cache has to rebuild its dependencies from scratch, and a slow rebuild can push it past the job timeout.
