GitLab CI pipelines in large repositories often become a bottleneck, not because of the jobs themselves, but because the Git fetch at the start of each job takes so long.
Let’s see a pipeline in action. Imagine you have a monorepo with hundreds of microservices. Your .gitlab-ci.yml looks something like this:
    stages:
      - build
      - test
      - deploy

    build_service_a:
      stage: build
      script:
        - echo "Building service A..."
        - ./build.sh service_a
      rules:
        - if: '$CI_COMMIT_BRANCH == "main"'

    test_service_a:
      stage: test
      script:
        - echo "Testing service A..."
        - ./test.sh service_a
      rules:
        - if: '$CI_COMMIT_BRANCH == "main"'

    build_service_b:
      stage: build
      script:
        - echo "Building service B..."
        - ./build.sh service_b
      rules:
        - if: '$CI_COMMIT_BRANCH == "main"'

    test_service_b:
      stage: test
      script:
        - echo "Testing service B..."
        - ./test.sh service_b
      rules:
        - if: '$CI_COMMIT_BRANCH == "main"'
When this pipeline runs, every job, from build_service_a to test_service_b, first fetches the repository (a shallow git fetch, by default). In a massive repository, this single step can take minutes per job and can easily double or triple your pipeline’s total execution time. The actual build or test commands are often trivial in comparison.
The core problem is that GitLab Runner obtains the repository again for every single job: it either clones it from scratch or fetches into an existing working copy, and then checks out the whole tree. When your repository has hundreds of thousands of commits and millions of files, this becomes very inefficient. Jobs rarely need the entire history or the entire tree; they usually need only the files relevant to their specific task or service.
Here’s how to tackle this:
1. Optimize Git Fetch Depth:
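GitLab fetches a shallow history by default; the depth is 20 for newly created projects in recent GitLab versions (50 in older ones). For very large repos, this can still be too much.
- Diagnosis: Look at the job logs. The "Fetching changes with git depth set to..." line shows the depth in use.
- Fix: Set GIT_DEPTH to a small number, like 1 or 5:

    variables:
      GIT_DEPTH: 5  # or 1

- Why it works: It reduces the amount of history Git needs to download, which speeds up the initial fetch. Be aware that a very small depth can break jobs that need history (git describe, for example), and that with a depth of 1, queued or retried jobs may fail once newer commits have been pushed.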
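A low global depth can be combined with a per-job override for the few jobs that do need history. The sketch below assumes a hypothetical changelog job that runs git describe; the job name and its script are illustrative only, and it relies on GIT_DEPTH: 0 disabling the shallow fetch, which is worth confirming against your GitLab version.

```yaml
variables:
  GIT_DEPTH: 5            # shallow fetch for every job by default

build_service_a:
  stage: build
  script:
    - ./build.sh service_a

# Hypothetical job that needs tags and history, so it opts out of the shallow fetch.
changelog:
  stage: build
  variables:
    GIT_DEPTH: 0          # assumption: 0 disables shallow fetching for this job
  script:
    - git describe --tags
```

Keeping the override at the job level means only that one job pays for the full history.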
2. Choose GIT_STRATEGY deliberately, and set GIT_SUBMODULE_STRATEGY: none (if you don’t use submodules):
GIT_STRATEGY controls how the runner obtains the repository. clone starts from scratch in every job and is the slowest option; fetch reuses the working copy left by a previous job on the same runner and downloads only what is new, falling back to a clone when no working copy exists.
- Diagnosis: Check whether the "Getting source from Git repository" section of your job logs is the longest part of the execution.
- Fix:

    variables:
      GIT_STRATEGY: fetch
      GIT_SUBMODULE_STRATEGY: none

- Why it works: fetch avoids re-downloading objects the runner already has, and explicitly disabling submodules avoids unnecessary submodule initialization if they aren’t used. Reserve clone for jobs that truly need a pristine working copy.
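3. Selective Fetching with GIT_FETCH_EXTRA_FLAGS:
The runner already fetches only the ref the pipeline runs for, but you can pass extra flags to git fetch to trim what comes along with it, such as tags.
- Diagnosis: Your jobs only care about the commit being built, yet the fetch also pulls in a large number of tags.
- Fix:

    variables:
      GIT_FETCH_EXTRA_FLAGS: --prune --no-tags

- Why it works: --no-tags tells Git not to download tags, which can reduce the download size noticeably in repositories with many tags. Setting this variable replaces the runner’s default extra flags, so keep --prune if you still want stale refs removed.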
4. Use a Git Server Mirror:
If you have a dedicated Git server, you can configure GitLab Runner to use a local mirror.
- Diagnosis: Your runners are far from your main Git server, or you have extremely high network latency.
- Fix: Have your runners fetch from a copy of the repository that is close to them, for example a GitLab Geo secondary site or a mirror on the runners’ local network. This is a more advanced setup; consult GitLab’s documentation for the details.
- Why it works: Reduces network latency by fetching from a local source instead of a remote one.
5. Artifacts for Dependencies (if applicable):
If your services depend on build artifacts from other services, use GitLab’s artifact system.
- Diagnosis: Job A builds something, and Job B needs that output. Instead of rebuilding it, Job B downloads the artifact.
- Fix:

    build_service_a:
      stage: build
      script:
        - ./build.sh service_a
      artifacts:
        paths:
          - build/service_a/
        expire_in: 1 week

    test_service_b:
      stage: test
      script:
        - ./test.sh service_b --dependency build/service_a/
      dependencies:
        - build_service_a

- Why it works: The dependent job downloads only the built artifacts it needs instead of rebuilding them. Artifacts alone do not skip the repository fetch; that requires GIT_STRATEGY: none, which only suits jobs that need nothing from the repository itself.
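Downloading artifacts does not by itself skip the repository fetch. For a job that needs only the artifacts, GIT_STRATEGY: none tells the runner not to touch the repository at all. A minimal sketch, assuming a hypothetical publish_service_a job whose commands come from the job image rather than from the repo:

```yaml
build_service_a:
  stage: build
  script:
    - ./build.sh service_a
  artifacts:
    paths:
      - build/service_a/
    expire_in: 1 week

# Hypothetical job: it uses only the artifact, so it skips Git entirely.
publish_service_a:
  stage: deploy
  variables:
    GIT_STRATEGY: none         # no clone, no fetch, no checkout
  needs:
    - build_service_a          # download this job's artifacts and start once it finishes
  script:
    - ls -R build/service_a/   # placeholder for a real publish command
```

With GIT_STRATEGY: none the build directory may still contain files left over from earlier jobs on the same runner, so the job should rely only on the artifacts it downloads.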
6. Sparse Checkout:
For truly massive monorepos, sparse-checkout can be a lifesaver. It tells Git to only check out specific directories within the repository.
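- Diagnosis: You only need files for service_a and service_b, but the repo has 1000 services.
- Fix:

    script:
      - git config core.sparseCheckout true
      - echo "service_a/" >> .git/info/sparse-checkout
      - echo "service_b/" >> .git/info/sparse-checkout
      - git read-tree -mu HEAD  # re-apply the checkout using the sparse patterns
      # your build/test commands follow

  You might need to adjust the GIT_DEPTH and GIT_STRATEGY variables in conjunction with this. Keep in mind that the runner has already checked out the full tree by the time script runs, so these commands trim the working directory after the fact; to save time during the checkout itself, the sparse patterns must be in place before the runner performs it.
- Why it works: Git only populates the working directory with the paths you explicitly specify, drastically reducing its size. Sparse checkout does not reduce what is downloaded; the fetch still transfers the objects.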
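On Git 2.25 or newer, the git sparse-checkout command wraps the same steps. The sketch below is one possible shape for such a job, assuming the runner image ships a recent enough Git and that service_a and service_b are top-level directories; the job name is hypothetical.

```yaml
# Hypothetical job that narrows the working directory to two services.
test_services_ab:
  stage: test
  script:
    - git --version                                 # needs Git 2.25+ for sparse-checkout
    - git sparse-checkout init --cone               # enable sparse checkout in cone mode
    - git sparse-checkout set service_a service_b   # keep only these directories (plus root files)
    - ./test.sh service_a
    - ./test.sh service_b
```

In cone mode, files in the repository root stay checked out, which is why ./test.sh remains available here.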
7. Runner Configuration (config.toml):
Globally set Git strategies and depths for your runners.
- Diagnosis: You want these optimizations applied to all jobs on a specific set of runners.
- Fix: In your config.toml file on the runner, set the variables through the environment setting:

    [[runners]]
      environment = ["GIT_DEPTH=5", "GIT_STRATEGY=fetch", "GIT_SUBMODULE_STRATEGY=none"]

- Why it works: It applies these settings at the runner level, reducing the need to specify them in every .gitlab-ci.yml file.
The next hurdle you’ll likely encounter is dealing with large Docker image builds, which have their own set of optimization strategies like layer caching and multi-stage builds.