The GitLab CI Container Scanning job is failing because Trivy, the scanner that analyzes your container images, hits an error while pulling or scanning the image. Most often the scanner can’t authenticate with your container registry, but a wrong image reference, network or disk problems on the runner, a bad cache, or a malformed image can produce the same failure.
Here are the most common reasons this happens and how to fix them:
1. Incorrect Registry Credentials
The most frequent culprit is that Trivy can’t log into your container registry to pull the image. This could be due to an expired token, incorrect username/password, or a misconfigured registry URL.
Diagnosis:
First, try to manually pull the image from your CI runner’s environment using docker login and docker pull. If that fails, the credentials are the issue. You can also check the Trivy job logs for specific authentication errors. Look for messages like "authentication required" or "unauthorized."
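If you can’t get a shell on the runner, the same check can run as a throwaway pipeline job. This is a minimal sketch, not a definitive setup: the job name registry_auth_check and the docker:24 image tags are assumptions, and the Docker-in-Docker service requires a runner that allows privileged containers.

```yaml
# Hypothetical debug job: reproduces "docker login" + "docker pull" on the runner.
registry_auth_check:
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    # --password-stdin keeps the secret out of the command line
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    # Assumes the image was pushed under the commit SHA; adjust to your tagging scheme
    - docker pull "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```

If the login step fails here, the credentials are at fault; if only the pull fails, look at the image reference instead.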
Fix:
Ensure the CI_REGISTRY_USER, CI_REGISTRY_PASSWORD, and CI_REGISTRY variables hold valid values for the registry you are scanning from. GitLab predefines all three for its built-in registry, so you normally only store credentials yourself (under your project’s Settings > CI/CD > Variables) when the image lives in an external registry.
CI_REGISTRY_USER: For GitLab’s built-in registry, this is usually gitlab-ci-token. For other registries, it’s your registry username.
CI_REGISTRY_PASSWORD: For GitLab’s built-in registry, this should be the CI_JOB_TOKEN. For other registries, it’s your registry password or an access token.
CI_REGISTRY: For GitLab’s built-in registry, this is the address of your instance’s registry, for example registry.gitlab.com on GitLab.com or something like gitlab-registry.example.com on a self-managed instance (replace example.com with your GitLab instance’s domain). For other registries, it’s the registry host (e.g., registry.hub.docker.com).
Example for GitLab Container Registry:
The values available inside each job are:
CI_REGISTRY_USER: gitlab-ci-token
CI_REGISTRY_PASSWORD: $CI_JOB_TOKEN (the predefined job token)
CI_REGISTRY: the address of your instance’s registry (predefined)
Why it works: Trivy uses these credentials to authenticate with the container registry, just like docker login would, allowing it to fetch the image for scanning.
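When Trivy is invoked directly from the job script, one way to hand it the credentials is through its TRIVY_USERNAME and TRIVY_PASSWORD environment variables. The following is a sketch under assumptions: the aquasec/trivy image and the commit-SHA tag are illustrative choices, so adapt them to how your pipeline actually builds and tags images.

```yaml
container_scanning:
  image:
    name: aquasec/trivy:latest   # assumed scanner image
    entrypoint: [""]             # let the runner start its own shell
  variables:
    # Trivy reads these when it pulls from a private registry
    TRIVY_USERNAME: "$CI_REGISTRY_USER"
    TRIVY_PASSWORD: "$CI_REGISTRY_PASSWORD"
  script:
    - trivy image --cache-dir .cache/trivy "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```

For an external registry, point the two variables at your own masked CI/CD variables instead of the predefined ones.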
2. Image Tag Mismatch or Non-existent Image
You might be trying to scan an image tag that doesn’t exist in the registry, or the image name itself is misspelled.
Diagnosis:
Manually run docker pull <your-image-name>:<your-image-tag> on a machine that has access to your registry. If docker pull fails with a "manifest unknown" or "repository not found" error, the tag or image name is wrong. Check the output of your Container Scanning job for similar errors.
Fix:
Verify the image name and tag being passed to the Container Scanning job. This is often defined by CI/CD variables like CI_COMMIT_REF_SLUG for tags or branch names, and CI_PROJECT_PATH for the repository. Ensure these variables are correctly populated and that the resulting image name/tag combination actually exists in your registry at the time of the scan.
Example:
If your .gitlab-ci.yml uses:
variables:
  IMAGE_TAG: $CI_COMMIT_SHA
  IMAGE_NAME: $CI_REGISTRY_IMAGE
container_scanning:
  script:
    - trivy image --cache-dir .cache/trivy "$IMAGE_NAME:$IMAGE_TAG"
Ensure that $CI_COMMIT_SHA is a valid, existing tag for the image $CI_REGISTRY_IMAGE.
Why it works: Trivy needs a valid image reference to pull and scan. Correcting the image name and tag ensures it’s pointing to an actual artifact in the registry.
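A common reason the tag does not exist at scan time is job ordering: the scan starts before the image has been pushed. The sketch below shows one way to make the dependency explicit. The build_image job, its stage layout, and the docker:24 image tags are assumptions for illustration, not part of the original pipeline.

```yaml
stages:
  - build
  - test

# Hypothetical build job that pushes the tag the scanner will look for
build_image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"

container_scanning:
  stage: test
  needs: ["build_image"]   # scan only after the push has succeeded
  script:
    - trivy image --cache-dir .cache/trivy "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```

Because both jobs build the reference from the same two variables, the scanned tag cannot drift from the pushed one.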
3. Network Connectivity Issues from the Runner
The GitLab CI runner might not have direct network access to your container registry. This is common if you’re using a private registry or a registry in a different network segment.
Diagnosis:
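From the CI runner’s environment (e.g., by SSHing into the runner or executing commands within a runner-provided Docker container), try to reach the registry’s hostname with curl, for example against https://<registry-host>/v2/, or with ping. Many networks block ICMP, so a failed ping alone is not conclusive; a curl request that cannot connect or times out points to a network path problem, while an HTTP 401 response means the registry is reachable and only authentication is missing.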
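The connectivity check can also run as a throwaway job on the affected runner. This is a minimal sketch: the job name and the curlimages/curl image are assumptions, and it presumes the registry speaks HTTPS at the address held in CI_REGISTRY.

```yaml
# Hypothetical debug job: prints the HTTP status returned by the registry API root
registry_network_check:
  image:
    name: curlimages/curl:latest
    entrypoint: [""]
  script:
    # 200 or 401 means the registry is reachable; a timeout or connection error does not
    - curl --silent --show-error --max-time 15 --output /dev/null --write-out "%{http_code}\n" "https://$CI_REGISTRY/v2/"
```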
Fix:
Ensure that the network configuration of your CI runners allows outbound connections to your container registry’s FQDN and port (usually 443 for HTTPS). This might involve configuring firewall rules, security groups, or network routing. If using self-hosted runners, check their network settings. For GitLab.com runners, you generally don’t have direct control over their network, so ensure your registry is publicly accessible or consider using a self-hosted runner.
Why it works: A successful network connection is fundamental for the runner to reach the registry and download the image for Trivy.
4. Insufficient Disk Space on the Runner
Trivy needs to download the container image to scan it. If the CI runner has insufficient disk space, the image download will fail, leading to job failure.
Diagnosis:
Check the disk space available on the CI runner. You can often do this by running df -h on the runner’s host or within the runner’s execution environment. Look for partitions that are full, especially / or /var/lib/docker.
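Without host access, a small job can report what the execution environment sees. This is a sketch with a hypothetical job name; with the Docker executor the figures describe the container’s view of the filesystem, which may differ from what df -h shows on the host itself.

```yaml
# Hypothetical debug job: reports free space as seen from inside the job
runner_disk_check:
  script:
    - df -h
    # Size of the Trivy cache, if one has been restored into the project directory
    - du -sh .cache/trivy || true
```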
Fix:
Free up disk space on the CI runner by removing old Docker images, build artifacts, or other unnecessary files. Alternatively, increase the disk size allocated to the runner. For Docker-based runners, ensure the Docker daemon’s storage driver is configured to use a partition with ample space.
Why it works: Downloading the image requires temporary disk space. Providing enough space allows the download to complete successfully.
5. Trivy Cache Issues
While usually beneficial, a corrupted or outdated Trivy cache can sometimes cause unexpected failures.
Diagnosis:
Check the Trivy job logs for any unusual errors related to cache operations or file system permissions within the cache directory (.cache/trivy).
Fix:
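Clear the Trivy cache by deleting the .cache/trivy directory before Trivy runs. If the logs point at the vulnerability database update rather than at the cache contents, you can instead set a CI/CD variable that tells Trivy to skip the update and reuse the database it already has (TRIVY_SKIP_UPDATE=true in older releases; newer releases call it TRIVY_SKIP_DB_UPDATE). Don’t combine that with a freshly cleared cache, because Trivy then has no database to scan against. The simplest approach is to remove the cache directory in your .gitlab-ci.yml before the scan runs: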
container_scanning:
  before_script:
    - rm -rf .cache/trivy
  script:
    - trivy image --cache-dir .cache/trivy "$IMAGE_NAME:$IMAGE_TAG"
Why it works: A clean cache ensures Trivy starts with a fresh state, avoiding any potential corruption or stale data that might interfere with its operation.
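If the directory is persisted between pipelines through GitLab’s cache keyword, deleting it on every run defeats the purpose of caching. An alternative sketch, assuming you cache .cache/trivy this way, is to version the cache key and bump it only when you suspect corruption. The key name trivy-cache-v1 is a hypothetical value.

```yaml
container_scanning:
  cache:
    key: trivy-cache-v1   # change to trivy-cache-v2 to start from an empty cache
    paths:
      - .cache/trivy
  script:
    - trivy image --cache-dir .cache/trivy "$IMAGE_NAME:$IMAGE_TAG"
```

Changing the key leaves the old cache archive untouched but makes new jobs ignore it, so you can switch back if the cache turns out not to be the cause.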
6. Malformed or Corrupted Image
In rare cases, the container image itself is corrupted or malformed in a way that Trivy cannot parse, even if docker pull succeeds.
Diagnosis:
Try scanning the image locally using the trivy command-line tool outside of GitLab CI. If it fails locally with similar errors, the image is likely the problem. You can also try scanning a different, known-good image to see if the scanner itself is working.
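The known-good comparison can also run inside the pipeline, which confirms that the scanner works on the same runner. This is a sketch under assumptions: the job name, the aquasec/trivy image, and alpine:3.19 as the reference image are illustrative choices.

```yaml
# Hypothetical sanity check: scan a small public image with the same scanner
scanner_sanity_check:
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    - trivy image --cache-dir .cache/trivy alpine:3.19
```

If this job passes while the scan of your own image fails with a parsing error, the image is the likely culprit.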
Fix:
Rebuild the container image. Ensure your Dockerfile is correct and that the build process completes without errors. If you’re pulling an image from a third-party source, try pulling a different tag or a different image entirely to rule out issues with the source.
Why it works: Rebuilding the image from scratch can resolve any underlying corruption or inconsistencies that prevent Trivy from analyzing it.
After addressing these, the next error you might encounter is a failure in a subsequent security job, like dependency scanning, if the initial image scanning revealed vulnerabilities that block the pipeline.