Caching build dependencies in GitHub Actions is a game-changer for workflow speed, but it’s not just about slapping a cache step in and calling it a day. The real magic happens when you understand how the cache works, why it invalidates, and how to precisely control it.
Let’s see it in action. Imagine a Python project with pip dependencies. A typical workflow might look like this:
```yaml
name: Python Build with Cache

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      # Restore pip's download cache if a cache with this exact key exists.
      - name: Cache pip dependencies
        uses: actions/cache@v4
        id: cache-pip
        with:
          path: ~/.cache/pip
          key: pip-${{ hashFiles('**/requirements.txt') }}

      # Always install: the cache holds downloaded wheels, not installed packages.
      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Run tests
        run: pytest
```

Here’s what’s happening:

- actions/checkout@v4: This pulls your code into the runner. Standard stuff.
- actions/setup-python@v5: This installs the requested Python version and puts it on the PATH. It doesn’t move pip’s cache; on Linux runners pip keeps it in ~/.cache/pip by default.
- actions/cache@v4: This is the core of our caching strategy.
  - path: ~/.cache/pip: This tells the action what to cache. It’s the directory where pip stores downloaded wheels and wheels it has built locally.
  - key: pip-${{ hashFiles('**/requirements.txt') }}: This is the critical part for cache invalidation. The key is a unique identifier for a specific cache. If a cache with this exact key already exists, it’s restored. If not, nothing is restored, and the action saves the contents of path under this key in a post-job step, provided the job succeeds. hashFiles('**/requirements.txt') generates a single hash from the contents of all requirements.txt files in your repository. If any of them changes, the hash changes, and a new cache is created.
- Install dependencies: This step runs on every job. On a cache hit, pip finds the wheels it needs in ~/.cache/pip and skips downloading and rebuilding them, but it still has to install them into the fresh runner’s Python environment.
- Run tests: Your actual build or test step.

The mental model here is simple: instead of re-downloading (and, for packages without prebuilt wheels, rebuilding) dependencies every single time, we want to save them once and restore them on subsequent runs. The actions/cache action manages this saving and restoring. The key is the gatekeeper; it determines whether an existing cache can be reused.

The most surprising thing about actions/cache is how it handles cache misses and hits. When the cache key doesn’t match an existing cache, the step still succeeds. It simply restores nothing, and its cache-hit output is not 'true'. It’s tempting to use that output to skip the install with if: steps.cache-pip.outputs.cache-hit != 'true', but with this path that breaks the build. ~/.cache/pip holds downloaded wheels, not installed packages, and every job starts on a fresh GitHub-hosted runner, so skipping pip install on a hit leaves pytest and everything else missing. That condition only makes sense when the cached path contains the installed environment itself, such as a virtualenv directory. Here the savings come from pip: the cache is restored before the Install dependencies step runs, so pip checks ~/.cache/pip, finds most of what it needs locally, and finishes much faster than on a cold run.
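To see which case a given run landed in, it can help to print the cache-hit output right after the cache step. The following is a minimal sketch, assuming the cache step keeps the id cache-pip used above; the "Report cache status" step is a hypothetical diagnostic, not part of the original workflow.

```yaml
# Fragment of a job's steps list; the report step must come after the cache step.
- name: Cache pip dependencies
  uses: actions/cache@v4
  id: cache-pip
  with:
    path: ~/.cache/pip
    key: pip-${{ hashFiles('**/requirements.txt') }}

# Hypothetical diagnostic step: prints 'true' only when the exact key matched.
- name: Report cache status
  run: echo "Exact cache hit: ${{ steps.cache-pip.outputs.cache-hit }}"
```

Checking this line in the job log after editing requirements.txt is a quick way to confirm that the key really does change when the dependencies do.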
Beyond pip, this pattern is applicable to many build systems. For Node.js, you’d cache ~/.npm. For Rust, ~/.cargo/registry and ~/.cargo/git. The principle remains: identify the dependency cache directory, and create a cache key that accurately reflects changes in your project’s dependencies.
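As a rough sketch of how those directories might be cached, the fragment below assumes the project commits package-lock.json (Node.js) or Cargo.lock (Rust). The key prefixes and the runner.os segment are illustrative choices, not something the pattern requires.

```yaml
# Fragments of a job's steps list, one per ecosystem.

# Node.js: cache npm's download cache, keyed on the lockfile (assumed to be package-lock.json).
- name: Cache npm downloads
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}

# Rust: cache Cargo's registry and git checkouts, keyed on Cargo.lock (assumed to be committed).
- name: Cache Cargo downloads
  uses: actions/cache@v4
  with:
    path: |
      ~/.cargo/registry
      ~/.cargo/git
    key: cargo-${{ runner.os }}-${{ hashFiles('**/Cargo.lock') }}
```

As with pip, these directories hold downloads rather than a finished install, so the install or build command (npm ci, cargo build) would still run on every job.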
The next common hurdle is dealing with multiple dependency files or complex build configurations that affect dependencies, requiring more sophisticated cache keys.
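One way such a key can look, offered as a sketch rather than a recommendation: hashFiles accepts several patterns, and restore-keys lets the action fall back to an older cache when no exact match exists. The file names (requirements*.txt, constraints.txt) and the py3.11 segment are assumptions for illustration; substitute whatever actually determines your dependencies.

```yaml
# Fragment of a job's steps list.
- name: Cache pip dependencies
  uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    # Hypothetical file names: adjust the patterns to the files your project uses.
    key: pip-${{ runner.os }}-py3.11-${{ hashFiles('**/requirements*.txt', 'constraints.txt') }}
    # On a miss, restore the most recently created cache whose key starts with this prefix.
    restore-keys: |
      pip-${{ runner.os }}-py3.11-
```

With a fallback like this, a change to one requirements file still starts from a mostly warm cache, and pip only downloads what is new.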