GitLab’s CI/CD pipelines are powerful, but they can become a bottleneck if not optimized. The needs keyword, which turns a pipeline into a Directed Acyclic Graph (DAG), is your secret weapon for speeding things up: a job can start as soon as the specific jobs it depends on have finished, without waiting for unrelated jobs or entire stages.

Let’s see this in action. Imagine a common scenario: you have a set of tests that depend on a build artifact. Without needs, you’d typically have a build stage followed by a test stage, where no test can start until every job in the build stage has finished.

Here’s a simplified .gitlab-ci.yml without needs:

stages:
  - build
  - test
  - deploy

build_app:
  stage: build
  script:
    - echo "Building the application..."
    - sleep 60 # Simulate build time
    - echo "APP_VERSION=$(date +%s)" > build.env
  artifacts:
    reports:
      dotenv: build.env

run_unit_tests:
  stage: test
  # No needs keyword, so this job waits for every job in the 'build' stage to finish
  script:
    - echo "Running unit tests..."
    - sleep 30 # Simulate test time
    - echo "Unit tests passed!"

run_integration_tests:
  stage: test
  # No needs keyword here either, so this job also waits for the 'build' stage to finish
  script:
    - echo "Running integration tests..."
    - sleep 45 # Simulate test time
    - echo "Integration tests passed!"

deploy_to_staging:
  stage: deploy
  script:
    - echo "Deploying to staging..."
    - sleep 20
    - echo "Deployment to staging successful."

In this setup, run_unit_tests and run_integration_tests are in the test stage. Without a needs keyword, neither can start until every job in the build stage has finished, and deploy_to_staging cannot start until every job in the test stage has finished. Jobs within the same stage do run in parallel when enough runners are available, so the total time here is roughly build_time + max(unit_test_time, integration_test_time) + deploy_time. The limitation is the stage barrier itself: a job waits for the whole preceding stage, including jobs it has nothing to do with.

Now, let’s introduce needs to create a DAG. The needs keyword allows you to specify which specific jobs a job depends on, not just entire stages. This enables jobs to start as soon as their dependencies are met, even if other jobs in a preceding stage are still running or haven’t started yet.

Here’s the same pipeline using needs to define dependencies and enable parallelism:

stages:
  - build
  - test
  - deploy

build_app:
  stage: build
  script:
    - echo "Building the application..."
    - sleep 60 # Simulate build time
    - echo "APP_VERSION=$(date +%s)" > build.env
  artifacts:
    reports:
      dotenv: build.env

run_unit_tests:
  stage: test
  needs: ["build_app"] # This job depends on build_app
  script:
    - echo "Running unit tests..."
    - sleep 30 # Simulate test time
    - echo "Unit tests passed!"

run_integration_tests:
  stage: test
  needs: ["build_app"] # This job also depends on build_app
  script:
    - echo "Running integration tests..."
    - sleep 45 # Simulate test time
    - echo "Integration tests passed!"

deploy_to_staging:
  stage: deploy
  needs: ["run_unit_tests", "run_integration_tests"] # This job depends on both tests completing
  script:
    - echo "Deploying to staging..."
    - sleep 20
    - echo "Deployment to staging successful."

In this DAG-enabled pipeline:

  1. build_app starts immediately.
  2. As soon as build_app finishes and its artifacts are available, both run_unit_tests and run_integration_tests can start in parallel because they both list build_app in their needs. They wait only for build_app, not for any other job that might be added to the build stage.
  3. deploy_to_staging will start only after both run_unit_tests and run_integration_tests have successfully completed.

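With a single build job, the critical path is unchanged: build_time + max(unit_test_time, integration_test_time) + deploy_time, or 60 + 45 + 20 = 125 seconds with the simulated durations above. The saving appears as soon as a stage contains jobs that a downstream job does not need, because those jobs no longer hold it back.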
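
As a sketch of where the saving comes from, suppose the build stage also contains a slow job that the tests never use. The build_docs job and its 180-second duration below are hypothetical and only serve to illustrate the effect:

```yaml
stages:
  - build
  - test

build_app:
  stage: build
  script:
    - echo "Building the application..."
    - sleep 60 # Simulate build time

# Hypothetical slow job that the tests do not depend on
build_docs:
  stage: build
  script:
    - echo "Building documentation..."
    - sleep 180 # Simulate a long documentation build

run_unit_tests:
  stage: test
  needs: ["build_app"] # Waits only for build_app, not for build_docs
  script:
    - echo "Running unit tests..."
    - sleep 30 # Simulate test time
```

Without the needs line, run_unit_tests would wait for both build jobs and start after roughly 180 seconds. With it, the job can start after roughly 60 seconds, assuming a runner is free, so test feedback arrives well before build_docs finishes.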

The needs keyword also lets a job bypass entire stages. If a job in a later stage doesn’t depend on any job in a specific earlier stage, simply leave those jobs out of its needs. The earlier stage still runs, but the job no longer waits for it.

Consider a pipeline where you might have linting, building, testing, and deploying. If your deployment job doesn’t need the linting results, you can define its needs to point only to the jobs it actually requires.

stages:
  - lint
  - build
  - test
  - deploy

lint_code:
  stage: lint
  script:
    - echo "Linting code..."
    - sleep 15

build_app:
  stage: build
  needs: [] # Start immediately; without this, build_app would wait for the lint stage
  script:
    - echo "Building the application..."
    - sleep 60
    - echo "APP_VERSION=$(date +%s)" > build.env
  artifacts:
    reports:
      dotenv: build.env

run_tests:
  stage: test
  needs: ["build_app"]
  script:
    - echo "Running tests..."
    - sleep 45

deploy_to_production:
  stage: deploy
  needs: ["run_tests"] # Only depends on tests completing, not linting
  script:
    - echo "Deploying to production..."
    - sleep 30

In this example, deploy_to_production lists only run_tests in its needs, so it waits for the chain build_app, then run_tests, and nothing else. If build_app or run_tests fails, deploy_to_production won’t run. Whether a lint_code failure blocks the deployment depends on the rest of the chain: a job with no needs keyword still waits for every earlier stage, so build_app must also opt out of stage ordering (for example with needs: []) before the deployment path is truly independent of linting. Once it is, deploy_to_production remains eligible to run even if lint_code fails, as long as run_tests succeeds. This creates a more flexible and efficient pipeline flow.

The needs keyword takes a list of job names. An empty list, needs: [], declares that a job has no dependencies at all, so it starts as soon as the pipeline is created, regardless of which stage it belongs to. Listing specific job names in needs is the key to breaking free from rigid stage-by-stage execution and unlocking true parallelism.
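
A minimal sketch of the needs: [] case, using a hypothetical check_licenses job that is not part of the pipelines above. It sits in the test stage but does not wait for the build stage:

```yaml
stages:
  - build
  - test

build_app:
  stage: build
  script:
    - echo "Building the application..."
    - sleep 60 # Simulate build time

# Hypothetical job that needs nothing from the build
check_licenses:
  stage: test
  needs: [] # Empty list: starts when the pipeline starts, alongside build_app
  script:
    - echo "Checking licenses..."
    - sleep 10
```

The job keeps its place in the test stage for display purposes, but its start time is no longer tied to that stage.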

When you use needs, GitLab builds a Directed Acyclic Graph (DAG) of your jobs. A DAG is a graph whose edges have a direction and which contains no directed cycles. In GitLab CI, this means the needs relationships must not loop back on themselves: a job cannot depend, directly or indirectly, on itself. The scheduler then uses these dependencies to decide which jobs can run concurrently, rather than waiting for entire stages to complete.

A subtle but powerful aspect of needs is how it interacts with job failures. If a job listed in needs fails, the dependent job is skipped by default. You can change this by setting allow_failure: true on the needed job, in which case its failure is reported as a warning and dependent jobs still run. For instance, if you have a linting job that you want to run for reporting but that should not block deployments when it fails, mark it allow_failure: true and make sure your downstream jobs list only their truly essential dependencies in needs.
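
The following sketch shows one way to combine the two ideas. The job names echo the earlier examples, the report_quality job is hypothetical, and the scripts are placeholders:

```yaml
stages:
  - lint
  - build
  - deploy

lint_code:
  stage: lint
  allow_failure: true # A failure here shows as a warning and does not block dependents
  script:
    - echo "Linting code..."

# Hypothetical reporting job that depends on the lint job
report_quality:
  stage: build
  needs: ["lint_code"] # Still runs if lint_code fails, because lint_code allows failure
  script:
    - echo "Publishing lint summary..."

build_app:
  stage: build
  needs: [] # Does not wait for the lint stage
  script:
    - echo "Building the application..."

deploy_to_staging:
  stage: deploy
  needs: ["build_app"] # Lists only the essential dependency
  script:
    - echo "Deploying to staging..."
```

Here the deployment path never touches lint_code, while report_quality still runs after a lint failure because the needed job is allowed to fail.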

The next logical step is to explore how to use needs with trigger jobs to orchestrate multi-project pipelines, creating even more complex and efficient CI/CD workflows.
