Jenkins marks a build "unstable" rather than "failed" when the build itself completes but tests fail. The distinction matters because "unstable" signals a recoverable quality problem, not a broken build process.

Let’s see this in action. Imagine a Java project using Maven and Surefire for tests.

<project>
  ...
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>3.0.0-M5</version>
        <configuration>
          <includes>
            <include>**/*Test.java</include>
          </includes>
        </configuration>
      </plugin>
    </plugins>
  </build>
  ...
</project>

If a test like this fails:

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

class MyMathTest {
    @Test
    void additionTest() {
        assertEquals(5, 2 + 2, "2 + 2 should equal 5"); // This assertion will fail
    }
}

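When Jenkins runs mvn clean install, Surefire reports the test failure. In a Maven project job, Jenkins’s Maven integration passes -Dmaven.test.failure.ignore=true by default, so Maven finishes the build and Jenkins marks it "unstable" based on the recorded test results. In a Pipeline or freestyle job that calls mvn directly, a failing test makes Maven exit with a non-zero status, and the build is marked "failed" unless you pass the same property and publish the reports with the junit step. The reasoning behind the "unstable" status is that test failures, while undesirable, don’t necessarily mean the build process itself broke (e.g., compilation failed, dependencies couldn’t be resolved). The build artifacts were likely still produced.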

The core problem is that the build process itself completed, but the quality gate (tests) was not met. Jenkins’s default behavior differentiates between build failures (e.g., compilation errors, SCM checkout issues) and test failures. This allows for different downstream actions: a "failed" build might trigger an immediate rollback, while an "unstable" build might just notify developers to fix the tests without halting further pipeline stages that don’t depend on test results.
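The distinction described above can be sketched as a small decision function. This is only an illustration of the reasoning, not Jenkins's actual implementation: `BuildOutcomeSketch`, `Result`, and `classify` are hypothetical names, and the two inputs are assumed to come from the build steps and the published test reports respectively.

```java
// Illustrative only: these names are hypothetical and are not Jenkins classes.
class BuildOutcomeSketch {
    enum Result { SUCCESS, UNSTABLE, FAILURE }

    /**
     * @param buildStepsSucceeded false if checkout, compilation, or packaging broke
     * @param failedTests         number of failed tests found in the published reports
     */
    static Result classify(boolean buildStepsSucceeded, int failedTests) {
        if (!buildStepsSucceeded) {
            return Result.FAILURE;   // the build process itself broke
        }
        if (failedTests > 0) {
            return Result.UNSTABLE;  // artifacts exist, but the quality gate was missed
        }
        return Result.SUCCESS;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, 1));   // prints UNSTABLE
        System.out.println(classify(false, 0));  // prints FAILURE
        System.out.println(classify(true, 0));   // prints SUCCESS
    }
}
```

A broken build step takes precedence over test results: if compilation fails, no tests run, so the failed-test count alone cannot be trusted to describe the build.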

Common Causes and Fixes for Test Failures Leading to Unstable Builds:

  1. Actual Test Logic Errors: The most common reason is a legitimate bug in the code being tested, or an incorrect assertion in the test itself.

    • Diagnosis: Review the Jenkins console output for the specific test that failed. Look for stack traces and assertion messages. Run the tests locally with mvn clean test or mvn clean verify to reproduce the failure and debug.
    • Fix: Correct the code or the test assertion. For the example above, change assertEquals(5, 2 + 2, ...) to assertEquals(4, 2 + 2, ...) and update the failure message to match.
    • Why it works: This directly addresses the root cause of the test failing, ensuring the code behaves as expected.
  2. Environment-Specific Test Failures: Tests that rely on external services (databases, APIs) or specific system configurations can fail in the Jenkins environment but pass locally.

    • Diagnosis: Check if the failing tests interact with external dependencies. Verify that these dependencies are running and accessible from the Jenkins agent. Look for network errors or connection timeouts in the console output.
    • Fix: Ensure all required services are running and configured correctly on the Jenkins agent or are otherwise accessible. For a database test, this might mean starting a database service or ensuring connection strings are correct (e.g., jdbc:mysql://localhost:3306/testdb).
    • Why it works: By making the test environment consistent with what the test expects, the test can execute successfully.
  3. Flaky Tests (Intermittent Failures): Tests that sometimes pass and sometimes fail, often due to race conditions, reliance on system time, or non-deterministic operations.

    • Diagnosis: Observe if the same test fails intermittently across multiple builds. Look for patterns related to test execution order or timing.
    • Fix: Refactor the test to be deterministic. Avoid Thread.sleep() if possible, ensure thread safety if tests run in parallel, and use proper synchronization mechanisms. For example, instead of Thread.sleep(1000);, use a polling mechanism that waits for a specific condition to be met.
    • Why it works: Removing non-determinism makes the test reliable and predictable, ensuring it passes when the code is correct.
  4. Resource Leaks or Exhaustion: Tests might fail due to running out of memory, file handles, or other system resources, especially in long-running builds or on agents with limited resources.

    • Diagnosis: Check the Jenkins agent’s system logs for out-of-memory errors (OOM killer) or resource exhaustion messages. Monitor agent resource usage during the build.
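    • Fix: Optimize test code to release resources promptly and ensure tests clean up after themselves. If the tests genuinely need more memory, raise the heap of the JVM that runs them, for example by setting -Xmx2g in Surefire’s argLine configuration. MAVEN_OPTS only sizes the Maven JVM, while Surefire runs tests in a forked JVM by default.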
    • Why it works: Providing sufficient resources or preventing leaks allows tests to complete without being terminated by the operating system or runtime.
  5. Incorrect Test Data or Setup: The data used to set up a test is incorrect, leading to unexpected outcomes.

    • Diagnosis: Examine the test setup code and the data it uses. Compare it with the expected state for the test to pass.
    • Fix: Ensure test data is correctly generated or loaded. For example, if a test expects a user record with ID 123, ensure that record exists in the test database before the test runs.
    • Why it works: Valid test data ensures the conditions under which the code is tested are correct, allowing assertions to pass if the code logic is sound.
  6. Outdated Dependencies or Plugin Versions: Incompatibility between test frameworks, build tools, or project dependencies can cause tests to misbehave.

    • Diagnosis: Check the versions of maven-surefire-plugin, JUnit, Mockito, and other testing-related dependencies in your pom.xml. Consult their release notes for known issues or compatibility problems.
    • Fix: Update relevant dependencies to their latest stable versions. For instance, move Surefire from the 3.0.0-M5 milestone used above to a stable 3.x release.
    • Why it works: Newer versions often contain bug fixes and improved compatibility, resolving underlying issues that might manifest as test failures.
  7. Jenkins Agent Configuration Issues: Incorrect Java versions, missing environment variables, or file permission problems on the Jenkins agent can cause tests to fail in unexpected ways.

    • Diagnosis: Verify the JAVA_HOME environment variable on the agent matches the expected JDK version. Check file permissions for the workspace directory. Ensure any required tools (like git or specific SDKs) are present and in the PATH.
    • Fix: Correct the agent’s environment variables or system configuration. For example, set JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64 on the agent.
    • Why it works: A properly configured agent provides the consistent environment necessary for builds and tests to run as intended.
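
For the flaky-test fix in item 3, a polling wait can be sketched roughly as follows. `PollingSketch` and `waitUntil` are hypothetical names, not part of JUnit, Surefire, or Jenkins, and the timeout and interval values are placeholders. The helper still sleeps briefly between checks, but it returns as soon as the condition holds and gives up after a bounded timeout instead of guessing a fixed delay.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper for tests: wait for a condition instead of sleeping a fixed time.
class PollingSketch {

    /**
     * Re-checks the condition until it is true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(Math.max(1, interval.toMillis()));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for a real condition such as "the background job has finished".
        boolean ready = waitUntil(
                () -> System.currentTimeMillis() - start >= 200,
                Duration.ofSeconds(2),
                Duration.ofMillis(20));
        System.out.println("condition met: " + ready);
    }
}
```

In a test, the returned boolean would typically feed an assertion, so a timeout shows up as a clear test failure rather than as a later, unrelated assertion error.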

Once all tests pass, the next problem you’re likely to hit is a "failed" build caused by a stage that isn’t test-related, such as a deployment step.
