Caching npm dependencies in GitHub Actions can substantially speed up your CI builds: later runs reuse previously downloaded packages instead of fetching everything from the registry again.
Here’s a GitHub Actions workflow that enables dependency caching through actions/setup-node:
```yaml
name: CI
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Use Node.js 18.x
        uses: actions/setup-node@v3
        with:
          node-version: 18.x
          cache: 'npm' # This is the magic!
      - name: Install dependencies
        run: npm ci # Use 'npm ci' for faster, reproducible installs
      - name: Run tests
        run: npm test
```
Let’s break down how this works and the mental model behind it.
The Core Idea: Statefulness in Stateless CI
GitHub Actions runners are generally stateless. Each job execution starts with a fresh environment. This means that every time your CI pipeline runs, it has to download and install all your project’s dependencies from scratch. For projects with many dependencies, npm install can take a significant amount of time, slowing down your feedback loop.
Caching allows you to store specific files or directories between job runs. There are two common targets. The first is npm’s download cache (usually ~/.npm on Linux and macOS), which is what actions/setup-node caches: restoring it lets npm ci install packages from local disk instead of the network. The second is the node_modules directory itself, which you can cache with actions/cache: when it is restored on an exact key match, the install step can be skipped entirely.
How actions/setup-node and cache Work Together
The actions/setup-node action is a convenience wrapper that does a few things:
- Installs Node.js: It ensures the specified Node.js version is available on the runner.
- Configures npm/yarn/pnpm: It sets up the package manager environment.
- Implements Caching: When you provide cache: 'npm', 'yarn', or 'pnpm', this action automatically configures the caching mechanism for you.
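As a sketch of how the cache input can be tuned, the step below assumes a hypothetical repository whose lock file lives in a frontend/ subdirectory rather than at the repository root. The cache-dependency-path input tells the action which lock file to hash; the frontend/ path is purely an assumption for illustration.

```yaml
- name: Use Node.js 18.x
  uses: actions/setup-node@v3
  with:
    node-version: 18.x
    cache: 'npm'   # 'yarn' and 'pnpm' are the other accepted values
    # Hypothetical location: point this at wherever your lock file actually lives.
    cache-dependency-path: frontend/package-lock.json
```

If the lock file sits at the repository root, cache-dependency-path can be left out.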
When cache: 'npm' is specified, actions/setup-node does the following:
- Determines the Cache Key: It generates a key for the cache entry. The key is typically based on the runner’s operating system, the package manager, and, most importantly, a hash of the package-lock.json (or npm-shrinkwrap.json) file. If your package-lock.json changes, the cache key changes, and the next run starts without a matching cache.
- Checks for Existing Cache: Before any subsequent steps run, it checks whether a cache entry matching the generated key already exists.
- Restores Cache: If a matching cache is found, it downloads the cached files and places them in the expected location on the runner. Note that what it restores is npm’s download cache (the directory reported by npm config get cache, usually ~/.npm), not your node_modules directory.
- Saves Cache: If no matching cache was found, the action saves the npm cache directory under the generated key in a post-job step, after the rest of the job has completed successfully.
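The steps above can be approximated with actions/cache directly. This is a sketch, not the action’s actual implementation: the step id npm-cache-dir and the npm-cache key prefix are illustrative names, and the real key format used by actions/setup-node may differ. It assumes the directory being cached is the one reported by npm config get cache.

```yaml
- name: Locate npm's cache directory
  id: npm-cache-dir   # illustrative id, referenced by the next step
  shell: bash
  run: echo "dir=$(npm config get cache)" >> "$GITHUB_OUTPUT"

- name: Restore the npm cache (saved again after the job if there was no exact hit)
  uses: actions/cache@v3
  with:
    path: ${{ steps.npm-cache-dir.outputs.dir }}
    # The key changes whenever the OS or the lock file contents change.
    key: ${{ runner.os }}-npm-cache-${{ hashFiles('**/package-lock.json') }}
```

Because only the download cache is restored here, an install step such as npm ci still has to follow.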
The npm ci Command: A Crucial Detail
Notice the use of npm ci instead of npm install. This is not accidental.
- npm install: This command installs dependencies based on your package.json and package-lock.json. It can also update package-lock.json if needed.
- npm ci: This command is designed for Continuous Integration environments. It removes any existing node_modules directory and performs a clean install directly from your package-lock.json. It’s generally faster than npm install because it skips dependency resolution and installs exactly what’s in the lock file. Crucially, npm ci will fail if package-lock.json is missing or out of sync with package.json, which makes it ideal for reproducible builds.
actions/setup-node does not run the install for you, so a later step still has to run npm ci (or the equivalent for other package managers). With the download cache restored, npm ci can take packages from local disk instead of the registry. The cache is written in a post-job step that runs only when the job succeeds, so a run in which npm ci fails does not save a new cache entry.
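To check whether a given run actually restored a cache, actions/setup-node exposes a cache-hit output. A minimal sketch, in which the step id setup-node is an arbitrary name chosen here:

```yaml
- name: Use Node.js 18.x
  id: setup-node   # arbitrary id so later steps can read this step's outputs
  uses: actions/setup-node@v3
  with:
    node-version: 18.x
    cache: 'npm'

- name: Report cache status
  run: echo "Dependency cache hit: ${{ steps.setup-node.outputs.cache-hit }}"

- name: Install dependencies
  run: npm ci
```

Logging the value makes it easier to confirm from the job output that later runs are reusing the cache.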
Configuring the Cache Manually (for deeper understanding)
While actions/setup-node handles caching conveniently, understanding the underlying actions/cache action is beneficial. The workflow below uses it to cache node_modules directly, which makes it possible to skip the install step on an exact cache hit:
```yaml
name: CI - Manual Cache
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Use Node.js 18.x
        uses: actions/setup-node@v3
        with:
          node-version: 18.x
      - name: Cache node_modules
        uses: actions/cache@v3
        id: cache-nodemodules # Give it an ID to reference later
        with:
          path: node_modules
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
          restore-keys: | # Fallback keys if the primary key doesn't match
            ${{ runner.os }}-node-
      - name: Install dependencies
        # Only run npm ci if the cache was not restored
        if: steps.cache-nodemodules.outputs.cache-hit != 'true'
        run: npm ci
      - name: Run tests
        run: npm test
```
In this manual setup:
- actions/cache@v3 is used directly.
- path: node_modules specifies what to cache.
- key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }} is the primary cache key. It’s composed of the OS, a prefix, and a hash of your package-lock.json. Any change to package-lock.json results in a different hash and thus a new cache key.
- restore-keys provides fallback keys. If the exact key isn’t found, GitHub Actions restores the most recent cache entry whose key starts with one of the restore-keys prefixes, for example an entry created from an earlier version of the lock file. The cache-hit output is 'true' only for an exact match, so after a fallback restore npm ci still runs. Because npm ci removes node_modules before installing, the fallback saves little in this particular workflow; it is more useful when caching a download cache such as ~/.npm.
- The if: steps.cache-nodemodules.outputs.cache-hit != 'true' condition on the install step ensures that npm ci runs only when the cache was not restored from an exact key match. If it was, node_modules is already present, and running npm ci would only delete and reinstall it.
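The actions/setup-node action with cache: 'npm' relies on the same caching mechanism and manages the keys and paths for you. The difference is what gets cached: it stores npm’s download cache rather than node_modules, so the npm ci step always runs but has little left to download.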
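One possible refinement of the manual configuration, offered as a sketch under assumptions rather than a required change: node_modules can contain native addons compiled against a specific Node.js release, so it may be worth including the Node.js version in the key. The example relies on the node-version output of actions/setup-node; the step id setup-node is a name chosen here.

```yaml
- name: Use Node.js 18.x
  id: setup-node   # arbitrary id so the resolved version can be read below
  uses: actions/setup-node@v3
  with:
    node-version: 18.x

- name: Cache node_modules
  id: cache-nodemodules
  uses: actions/cache@v3
  with:
    path: node_modules
    # OS + resolved Node.js version + lock file hash
    key: ${{ runner.os }}-node-${{ steps.setup-node.outputs.node-version }}-${{ hashFiles('**/package-lock.json') }}

- name: Install dependencies
  if: steps.cache-nodemodules.outputs.cache-hit != 'true'
  run: npm ci
```

With this key, a Node.js upgrade produces a cache miss and a fresh install instead of reusing modules built for the previous version.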
The "Gotcha" with package-lock.json
The most common reason caching might not work as expected is a mismatch between the package-lock.json file used when the cache was created and the package-lock.json file present during the current job run.
If you commit a change to package.json but forget to commit the updated package-lock.json, the hash of the lock file, and therefore the cache key, stays the same. The cache is still found, but npm ci fails because the two files are out of sync; in the manual setup, where the install step is skipped on a cache hit, the job instead runs against a stale node_modules. Divergent branches with different lock files are a separate case: each produces its own key, and a branch with no matching entry misses the cache, so npm ci downloads everything.
Similarly, if you run npm install (which can update package-lock.json) on your local machine and then commit only package.json, your CI pipeline still hashes the old lock file, with the same outcome: an unchanged key and an install that fails or reuses outdated packages. Always commit both package.json and package-lock.json together when making dependency changes.
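One way to catch a forgotten lock file early is a dedicated check step. The sketch below is one possible approach and not part of the original workflow: it regenerates the lock file without installing anything and fails if the result differs from what was committed. It assumes the npm version in CI matches the one used locally, since different npm versions can write slightly different lock files.

```yaml
- name: Verify package-lock.json matches package.json
  run: |
    # Update the lock file only; node_modules is not touched.
    npm install --package-lock-only
    # Exits non-zero (failing the step) if the regenerated lock file differs.
    git diff --exit-code -- package-lock.json
```

Placed before the cache and install steps, this surfaces the mismatch with a clear diff.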
The Next Step: Dependency Review
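Once your dependency caching is solid, the next area to optimize or secure is often dependency review. GitHub provides a dependency review action (actions/dependency-review-action) that flags known-vulnerable dependencies introduced by a pull request, and npm audit can be run as a workflow step to scan the installed dependency tree.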