GitLab CI caching is surprisingly fragile, and often doesn’t cache what you think it does.
Let’s see how it actually works. Imagine you have a simple CI job that installs dependencies and then runs tests.

    stages:
      - build
      - test

    install_deps:
      stage: build
      script:
        - echo "Installing dependencies..."
        - apt-get update -yqq
        - apt-get install -yqq nodejs
        - npm install
      cache:
        key: "$CI_COMMIT_REF_SLUG"
        paths:
          - node_modules/

    run_tests:
      stage: test
      script:
        - echo "Running tests..."
        - npm test
      # A cache is restored only in jobs that declare it, so repeat the key and paths here
      cache:
        key: "$CI_COMMIT_REF_SLUG"
        paths:
          - node_modules/
      dependencies:
        - install_deps

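When install_deps runs, npm install creates a node_modules/ directory. The cache directive tells GitLab to save this directory under a key derived from the branch name ($CI_COMMIT_REF_SLUG). The next time a job that declares a cache with the same key runs, GitLab downloads and extracts the archive before the script section starts. That means run_tests can use the installed dependencies only if it carries a matching cache declaration. The dependencies keyword does not help here: it controls which artifacts a job downloads, not which caches.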
This seems straightforward, but the devil is in the details of how the cache key and paths work.
The Cache Key is Everything
The key is how GitLab identifies a specific cache. If the key changes, GitLab creates a new cache. If it stays the same, GitLab attempts to use the existing cache.
- Common Cause: Using a key that changes too often, like $CI_COMMIT_SHA. Each commit then gets its own cache, which effectively disables caching.
- Diagnosis: Look at the job logs for the "Restoring cache" and "Creating cache" steps. If every run creates a cache but never manages to restore one for dependencies, your key is likely too dynamic.
- Fix: Use a stable key for dependencies that don’t change often. For npm or yarn dependencies, $CI_COMMIT_REF_SLUG (the branch name) is a good starting point. If you have a lock file (like package-lock.json or yarn.lock), use it. Putting the file’s path into the key string does not work: "$CI_COMMIT_REF_SLUG-$CI_PROJECT_DIR/package-lock.json" expands to a fixed string that never reflects the file’s contents, and a cache key cannot contain the / character. Use cache:key:files instead, which derives the key from the listed files, with prefix to keep one cache per branch. The cache is then invalidated only when the lock file changes.

      cache:
        key:
          files:
            - package-lock.json
          prefix: "$CI_COMMIT_REF_SLUG"
        paths:
          - node_modules/

- Why it works: The cache key is a unique identifier. When the key matches an existing cache, GitLab downloads it. By deriving the key from the lock file, you ensure that if your dependencies actually change (because the lock file was updated), a new cache is created. Otherwise, you reuse the existing cache.
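
The key strategies discussed in this section can be compared side by side. The sketch below uses hidden template jobs whose names (.cache_per_commit, .cache_per_branch, .cache_per_lockfile) are made up for illustration; only the cache keywords and the predefined variables come from GitLab.

```yaml
# Hypothetical templates; a job picks one with `extends:`.

.cache_per_commit:          # a fresh cache for every commit
  cache:
    key: "$CI_COMMIT_SHA"
    paths:
      - node_modules/

.cache_per_branch:          # one cache per branch, reused across commits
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - node_modules/

.cache_per_lockfile:        # new cache only when package-lock.json changes
  cache:
    key:
      files:
        - package-lock.json
      prefix: "$CI_COMMIT_REF_SLUG"   # still separated per branch
    paths:
      - node_modules/

install_deps:
  extends: .cache_per_lockfile
  stage: build
  script:
    - npm install
```

Swapping the extends line between the three templates and comparing the cache steps in the job log is a quick way to see how often each key actually produces a hit.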
What’s Actually Cached?
The paths directive tells GitLab which files and directories to include in the cache.
- Common Cause: Caching entire directories that contain temporary or build-specific files that don’t actually speed up subsequent runs. Or, missing crucial files that do speed things up.
- Diagnosis: Examine the node_modules/ directory (or equivalent) after a job runs. Does it contain exactly what npm install or yarn install produces? Are there any stray files? Are there subdirectories that are large but rarely used by tests?
- Fix: Be precise with your paths. For npm, node_modules/ is usually correct. For build artifacts, cache only the final output, not intermediate compilation steps. If you’re compiling C++, cache the compiled objects (*.o) if your build system can reuse them, or the final executable.

      cache:
        key: "$CI_COMMIT_REF_SLUG"
        paths:
          - build/   # Cache the final build output
          - target/  # Maven output directory (Gradle writes to build/)

- Why it works: GitLab archives and unarchives only the specified paths. Being specific reduces the size of the cache, making uploads and downloads faster, and ensures you’re only restoring what’s necessary and beneficial.
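
When a job needs more than one kind of cached data, giving each its own key and a narrow path list keeps the archives small and lets them be invalidated independently. A minimal sketch, assuming a Node.js project whose build script writes to a hypothetical dist/ directory, and relying on GitLab accepting cache as a list of entries:

```yaml
build_app:
  stage: build
  script:
    - npm install
    - npm run build            # assumed to write its output to dist/
  cache:
    # Installed packages: a new cache only when the lock file changes
    - key:
        files:
          - package-lock.json
      paths:
        - node_modules/
    # Build output: one cache per branch
    - key: "build-$CI_COMMIT_REF_SLUG"
      paths:
        - dist/
```

With this split, a change to the source code does not force the dependency archive to be rebuilt and uploaded again.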
Cache Scope: Per-Job vs. Per-Pipeline
A cache is declared per job with the cache keyword, and its policy setting controls whether the job downloads the cache, uploads it, or both: pull-push (the default), pull, or push.
- Common Cause: Jobs unintentionally overwriting each other’s caches, or jobs that should share a cache but don’t.
- Diagnosis: Check the job logs for the "Restoring cache" and "Creating cache" steps. If a job that should only consume a cache also creates one, it is running with the default pull-push policy and may overwrite the cache that another job pushed under the same key. If a job never restores anything, check that its key matches the key of the job that pushed the cache.
- Fix: Understand the policy directive. By default, it’s pull-push, meaning a job tries to restore a cache at the start and uploads the cached paths again when it finishes. If you have a dependency installation job, it should be pull-push. If you have a subsequent job that only uses those dependencies and doesn’t modify them, you might consider policy: pull. However, for most cases, pull-push is fine. Explicitly defining it can avoid confusion.

      install_deps:
        stage: build
        script:
          - npm install
        cache:
          key: "$CI_COMMIT_REF_SLUG"
          paths:
            - node_modules/
          policy: pull-push  # Explicitly state default behavior

      run_tests:
        stage: test
        script:
          - npm test
        dependencies:
          - install_deps
        cache:
          key: "$CI_COMMIT_REF_SLUG"  # Must match the installing job for effective caching
          paths:
            - node_modules/
          policy: pull  # This job only needs to read the cache, not update it

- Why it works: policy: pull tells the job to only attempt to download the cache and not to upload anything, preventing it from overwriting a cache created by another job. This is useful for jobs that consume the cache but don’t produce cacheable outputs themselves.
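
One way to keep the key and paths identical across jobs is to define them once and override only the policy. A sketch, assuming a hidden job named .node_cache (the name is arbitrary); extends merges the cache mapping key by key, so each job inherits key and paths from the template.

```yaml
.node_cache:
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - node_modules/

install_deps:
  extends: .node_cache
  stage: build
  script:
    - npm install
  cache:
    policy: pull-push   # restore, then upload the refreshed cache

run_tests:
  extends: .node_cache
  stage: test
  script:
    - npm test
  cache:
    policy: pull        # restore only; never upload
```

Because the key lives in a single place, renaming it later cannot leave one job reading a cache that no other job writes.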
Dependency Management Tools and Caching
Tools like npm, yarn, pip, bundler, and composer install packages into directories such as node_modules/ or vendor/, and they also keep their own download caches outside those directories.
- Common Cause: GitLab caching the entire node_modules/ directory when the dependency manager could have handled partial updates or rebuilds more efficiently. Or, GitLab caching the dependency manager’s own cache directory (like ~/.npm or ~/.cache/pip), which can get corrupted.
- Diagnosis: Observe the output of your dependency installation command. Does it always reinstall everything? Or does it indicate that it’s using a local cache?
- Fix: If your dependency manager has a dedicated cache directory (by default ~/.npm for npm, ~/.cache/yarn for yarn, ~/.cache/pip for pip), it’s often better to cache that directory instead of node_modules/. This allows the manager to intelligently reuse downloaded packages. GitLab only caches paths inside the project directory, so a path like ~/.npm is not saved as written; point the tool’s cache at a directory inside the project and cache that.

      cache:
        key: "$CI_COMMIT_REF_SLUG"
        paths:
          - .npm/           # For npm
          # - .yarn-cache/  # For yarn
          # - .cache/pip/   # For pip

  Then, in your script, run npm ci (clean install) or yarn install --frozen-lockfile and tell it where the cache lives. Note that npm ci deletes any existing node_modules/ before installing, so a cached node_modules/ gives it nothing to reuse.

      script:
        - npm ci --cache .npm --prefer-offline  # Reuses packages from the cached .npm/ directory

- Why it works: Dependency managers are optimized to download individual packages and their dependencies. Caching their internal package stores allows them to retrieve already-downloaded packages without going to the network, which speeds up npm ci or pip install.
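
The same idea carries over to pip, which the list above mentions. A sketch, assuming a Python project with a requirements.txt at the repository root and the python:3.12 image (both are assumptions, not from the original); PIP_CACHE_DIR is pip’s own environment variable for relocating its cache.

```yaml
install_python_deps:
  image: python:3.12
  stage: build
  variables:
    # Move pip's download cache inside the project so GitLab can archive it
    PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
  script:
    - pip install -r requirements.txt
  cache:
    key:
      files:
        - requirements.txt     # new cache only when the requirements change
    paths:
      - .cache/pip/
```

On a warm cache, pip still resolves and installs every package, but it takes the files from .cache/pip/ instead of downloading them.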
Parallelism: The Other Half of the Speed Equation
Caching speeds up individual jobs by reducing work. Parallelism speeds up the entire pipeline by running multiple jobs or stages concurrently.
- Common Cause: Having many sequential jobs that could run at the same time.
- Diagnosis: Look at your .gitlab-ci.yml file. Are independent jobs spread across separate stages, so that they run one after another? Or are there many stages where jobs in later stages depend only on a small subset of jobs in earlier stages?
- Fix: Use multiple jobs within the same stage. If you have 100 tests, split them into 10 jobs, each running 10 tests.

      test_suite_1:
        stage: test
        script:
          - npm run test:suite1
        parallel: 10  # This job will be duplicated 10 times

      test_suite_2:
        stage: test
        script:
          - npm run test:suite2
        parallel: 10
      # ... and so on for all 100 tests, perhaps with a template

  Keep in mind that parallel: 10 only clones the job. Every copy runs the same script unless the script uses $CI_NODE_INDEX and $CI_NODE_TOTAL to select its share of the tests. You can also use parallel:matrix for more complex scenarios.
- Why it works: GitLab runners can execute jobs in parallel. By defining multiple jobs that can run independently, you saturate your available runner capacity, executing more work in the same amount of wall-clock time.
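
To make the parallel copies divide the work, each copy can read the predefined variables CI_NODE_INDEX (which starts at 1) and CI_NODE_TOTAL. The sketch below assumes the tests run under Jest 28 or later, whose --shard option takes an index/total pair; the suite names in the matrix job are hypothetical npm scripts.

```yaml
sharded_tests:
  stage: test
  parallel: 10
  script:
    # Each of the 10 copies runs only its own slice of the test files
    - npx jest --shard="$CI_NODE_INDEX/$CI_NODE_TOTAL"

suite_tests:
  stage: test
  parallel:
    matrix:
      - SUITE: [unit, integration, e2e]   # one job per value
  script:
    - npm run "test:$SUITE"               # assumes scripts named test:unit, test:integration, test:e2e
```

The first job suits a single large suite that the test runner can split on its own; the matrix form fits suites that are already separate commands.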
The next thing you’ll likely encounter is cache contention when multiple jobs try to push to the same cache key simultaneously, leading to race conditions and corrupted caches.