When you’re building in a monorepo, running the entire CI pipeline for every commit, even when only a single file in one service changed, wastes time and compute. A workflow with a plain on: push trigger runs on every push, but you can (and should) configure it to be smarter.
Here’s how you can make your GitHub Actions workflow only run for the services that were actually affected by a commit.
We’ll start with GitHub’s built-in paths filter, then move to a script that dynamically determines which services have changed.
The Core Idea: Path Filtering
GitHub Actions has a built-in paths filter for on: push and on: pull_request events. This allows you to specify which files or directories, if changed, should trigger a workflow run.
on:
  push:
    branches:
      - main
    paths:
      - 'services/user/**'
      - 'services/auth/**'
This is great if you know exactly which services might change. But in a large monorepo, manually listing every service’s path can become unmanageable. What if you add a new service? What if you refactor and move files? You need a dynamic approach.
Dynamic Path Filtering with a Script
A more flexible approach is a script that inspects the diff of your commit to determine which service directories were modified. Its result can then gate your CI steps.
Let’s break down the process:
- Get the Changed Files: We’ll use git diff --name-only HEAD~1 HEAD to get a list of all files that changed between the previous commit and the current one.
- Identify Changed Services: We need a convention for how services are organized in your monorepo. A common pattern is services/<service-name>/.... Your script parses the list of changed files and maps each one back to its service.
- Conditional Execution: Based on the identified changed services, you either proceed with the CI steps or skip them.
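The three steps above can be sketched as a small shell function. This is a minimal sketch rather than the workflow's exact code: the function name changed_services is hypothetical, and it assumes the services/<service-name>/ layout and that changed paths arrive one per line on standard input.

```shell
#!/usr/bin/env bash
# Sketch: map changed file paths (one per line on stdin) to unique service names.
# Assumes the services/<service-name>/... layout; changed_services is a
# hypothetical helper name.
changed_services() {
  # Keep only paths under services/<name>/, print <name>, then de-duplicate.
  sed -nE 's#^services/([^/]+)/.*#\1#p' | sort -u
}

# Example with a hard-coded file list; in CI the input would come from
# `git diff --name-only HEAD~1 HEAD`.
printf '%s\n' \
  'services/user/src/index.js' \
  'services/user/package.json' \
  'services/auth/README.md' \
  'docs/intro.md' | changed_services
```

An empty result means no service directory was touched, which is the signal to skip the remaining CI steps.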
Implementing the Solution
Here’s a practical example using a shell script within your GitHub Actions workflow.
1. Workflow File (.github/workflows/ci.yml)
name: CI for Monorepo Services
on:
  push:
    branches:
      - main # Or your deployment branch
jobs:
  build_and_test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Fetch all history to compare commits
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Install dependencies
        run: npm ci
      - name: Determine Changed Services
        id: changed_services
        run: |
          # Get list of changed files since the last commit
          CHANGED_FILES=$(git diff --name-only HEAD~1 HEAD)
          printf 'Changed files:\n%s\n' "$CHANGED_FILES"
          # Define your service directory pattern
          SERVICE_DIR_PATTERN="^services/([^/]+)/"
          # Extract unique service names from changed files, joined by spaces
          # so the list fits on a single line of $GITHUB_ENV
          CHANGED_SERVICES=$(echo "$CHANGED_FILES" | grep -oP "$SERVICE_DIR_PATTERN" | sed "s/services\/\(.*\)\/.*/\1/" | sort -u | tr '\n' ' ')
          if [ -z "$CHANGED_SERVICES" ]; then
            echo "No service files changed. Skipping CI."
            echo "run_ci=false" >> "$GITHUB_OUTPUT"
          else
            echo "Changed services: $CHANGED_SERVICES"
            # Set a step output for later steps to check
            # (the older ::set-output command is deprecated)
            echo "run_ci=true" >> "$GITHUB_OUTPUT"
            # Also, set an environment variable for later steps
            echo "CHANGED_SERVICES_LIST=$CHANGED_SERVICES" >> "$GITHUB_ENV"
          fi
      - name: Run CI for Changed Services
        if: steps.changed_services.outputs.run_ci == 'true'
        run: |
          echo "Running CI for services: $CHANGED_SERVICES_LIST"
          # Example: Loop through changed services and run their specific tests
          for SERVICE in $CHANGED_SERVICES_LIST; do
            echo "Running tests for service: $SERVICE"
            (cd "services/$SERVICE" && npm test) # Assuming each service has its own test script
          done
      - name: Deploy Changed Services
        if: github.ref == 'refs/heads/main' && steps.changed_services.outputs.run_ci == 'true'
        run: |
          echo "Deploying changed services: $CHANGED_SERVICES_LIST"
          # Your deployment logic here. You might loop through services again.
          # Example: for SERVICE in $CHANGED_SERVICES_LIST; do (cd "services/$SERVICE" && npm run deploy); done
Explanation:
- fetch-depth: 0: This makes actions/checkout fetch the entire Git history. By default the action fetches only the latest commit, in which case HEAD~1 is not available and git diff fails. A depth of 2 is enough for the HEAD~1 comparison used here; 0 is the safe choice if you later compare against older commits.
- Determine Changed Services step:
  - git diff --name-only HEAD~1 HEAD: Lists the paths of files that changed in the latest commit compared to the previous one.
  - SERVICE_DIR_PATTERN="^services/([^/]+)/": A regular expression that captures the service name from paths like services/users/src/index.js. It matches services/ followed by one or more characters that are not a slash ([^/]+) and captures those characters as the service name.
  - grep -oP "$SERVICE_DIR_PATTERN": Keeps only the part of each line that matches the pattern (e.g., services/users/) and drops lines that do not match.
  - sed "s/services\/\(.*\)\/.*/\1/": Takes the output from grep (e.g., services/users/) and reduces it to the service name (users).
  - sort -u: Produces a unique, sorted list of service names.
  - The run_ci output: The step sets a step output named run_ci, true when at least one service changed and false otherwise. Later steps read it as steps.changed_services.outputs.run_ci.
  - CHANGED_SERVICES_LIST: The list of services is written to $GITHUB_ENV, which makes it an environment variable in later steps of the same job.
- Run CI for Changed Services step:
  - if: steps.changed_services.outputs.run_ci == 'true': This condition ensures the step only runs if the run_ci output from the previous step was true.
  - The run command then iterates through the changed services and executes each one's test command. You’ll need to adapt npm test to your actual build/test commands.
- Deploy Changed Services step:
  - This step is guarded by both the branch check (github.ref == 'refs/heads/main') and the run_ci condition, so deployments only happen on the main branch and only when relevant code has changed.
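Step outputs on current runners are written by appending name=value lines to the file named by $GITHUB_OUTPUT; the older ::set-output workflow command is deprecated. The following is a minimal sketch of that mechanism, not a required structure: write_ci_outputs is a hypothetical helper, the services output name is an assumption, and the temp-file fallback exists only so the snippet can be tried outside CI.

```shell
#!/usr/bin/env bash
# Sketch: record the result of change detection as step outputs.
# GITHUB_OUTPUT is provided by the runner; outside CI we fall back to a
# temp file (an assumption, for local experimentation only).
: "${GITHUB_OUTPUT:=$(mktemp)}"

# write_ci_outputs is a hypothetical helper; its arguments are service names.
write_ci_outputs() {
  if [ "$#" -eq 0 ]; then
    echo "run_ci=false" >> "$GITHUB_OUTPUT"
  else
    echo "run_ci=true" >> "$GITHUB_OUTPUT"
    # "$*" joins the names with spaces so the value stays on one line.
    echo "services=$*" >> "$GITHUB_OUTPUT"
  fi
}

write_ci_outputs auth user
cat "$GITHUB_OUTPUT"
```

A later step in the same job could then test steps.<step-id>.outputs.run_ci in its if: condition, where <step-id> is the id given to the step that ran the helper.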
Considerations and Enhancements
- Monorepo Structure: This script assumes a services/<service-name>/ directory structure. Adjust the SERVICE_DIR_PATTERN and the sed command if your structure differs. For example, if services are in packages/<service-name>/, you’d change the pattern to ^packages/([^/]+)/.
- File Changes vs. Service Changes: This script assumes that any file change within a service directory means the service itself has changed and needs to be tested. For very large services, you might want a more granular check (e.g., checking specific subdirectories like src/ or tests/).
- Shared Libraries: If you have shared libraries that multiple services depend on, changes to those libraries won’t be caught by this script unless they live in a services/ directory. You’ll need to adjust your SERVICE_DIR_PATTERN or add explicit handling for shared modules (e.g., shared/**).
- git diff Behavior: HEAD~1 exists for every commit except the very first commit of the repository. For that root commit, git diff --name-only HEAD~1 HEAD exits with an error, so the step fails rather than skipping CI. Since you usually want CI to run for the initial commit anyway, you could check whether git rev-list --count HEAD is 1 and treat every service as changed in that case. Also note that a single push can contain several commits, and HEAD~1 only covers the last one; comparing against the commit in github.event.before covers the whole push.
- Performance: For extremely large monorepos with thousands of files, git diff might take a noticeable amount of time. It is still far cheaper than running the full CI suite.
- Tooling: For more complex monorepos, tools like Nx, Lerna, or Turborepo have built-in caching and affected-project detection that can be integrated with CI pipelines for even more sophisticated optimizations.
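The shared-library consideration can be sketched as follows. This is one possible approach under stated assumptions, not the only way to handle it: shared code is assumed to live under shared/, affected_services is a hypothetical helper, and the caller is assumed to pass the full list of service names as arguments.

```shell
#!/usr/bin/env bash
# Sketch: treat any change under shared/ as affecting every service.
# Assumptions: shared code lives in shared/, services in services/<name>/,
# and the full list of service names is passed as arguments.
affected_services() {
  local files
  files="$(cat)"   # changed paths, one per line, read from stdin
  if printf '%s\n' "$files" | grep -q '^shared/'; then
    # A shared module changed: report every known service.
    printf '%s\n' "$@" | sort -u
  else
    # Otherwise report only the services whose own files changed.
    printf '%s\n' "$files" | sed -nE 's#^services/([^/]+)/.*#\1#p' | sort -u
  fi
}

printf '%s\n' 'shared/logger/index.js' 'services/auth/app.js' \
  | affected_services auth billing user
```

In a workflow, the list of all services could come from listing the services/ directory. Rebuilding everything on a shared change is deliberately conservative; a dependency graph from a monorepo tool would narrow it down.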
By implementing this dynamic path filtering, you significantly reduce the execution time and cost of your CI/CD pipeline, making your development workflow much more efficient. The next logical step would be integrating a monorepo management tool to handle dependency graphing and task orchestration more intelligently.