Your Kubernetes pods are reporting Readiness probe failed errors, meaning the kubelet on the node decided your pod wasn’t ready to receive traffic. Kubernetes then removes the pod from the endpoints of every Service that selects it, so no new requests are routed to it. This isn’t a crash; the container is still running, but Kubernetes considers it not ready and won’t send it new work.

Here’s why your readiness probes are failing:

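1. Application Not Listening on the Correct Port

The most common culprit is the application inside the pod not actually listening on the port specified in your readinessProbe definition. The kubelet connects to the pod’s IP on this port to check health, so the application must also bind to an address reachable from outside the container (0.0.0.0 or the pod IP, not only 127.0.0.1).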

  • Diagnosis: Exec into the pod and use netstat -tulnp or ss -tulnp to see which ports and addresses your application is listening on. Minimal container images may not include either tool.
    kubectl exec -it <your-pod-name> -- netstat -tulnp
    
  • Fix: Update your readinessProbe in the Deployment/StatefulSet YAML to match the actual listening port. For example, if your app listens on 8080 but the probe is set to 80:
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080 # Corrected port
      initialDelaySeconds: 5
      periodSeconds: 10
    
  • Why it works: kubelet can now successfully establish a TCP connection to the port your application is bound to.
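
The connection check described above can be approximated in a few lines. This is a minimal sketch for illustration, not the kubelet’s implementation: port_is_open is a hypothetical helper that attempts a TCP connection, which is roughly what a tcpSocket probe does. It assumes you run it from somewhere that can reach the pod IP, such as the node or another pod.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection completes the TCP handshake; the socket is then closed.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Calling port_is_open with the pod IP and the probe port returns False when nothing accepts connections on that address and port, which includes the case where the application listens only on 127.0.0.1.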

2. Application Not Responding to Health Check Endpoint

Even if listening on the right port, your application might not be serving a valid HTTP response (or any response) on the specified path.

  • Diagnosis: Use curl from within the pod to hit the health check endpoint directly. The -i flag prints the status line and response headers.
    kubectl exec -it <your-pod-name> -- curl -i http://localhost:<probe-port>/<probe-path>
    
    For example:
    kubectl exec -it <your-pod-name> -- curl -i http://localhost:8080/healthz
    
    Check for a status code outside the 200-399 range, a hung connection, or no output at all.
  • Fix: Ensure your application’s health check endpoint returns an HTTP status code of at least 200 and below 400 (e.g., 200 OK) when it’s healthy. If it returns 4xx, 5xx, or no response, the probe will fail.
    readinessProbe:
      httpGet:
        path: /status # Ensure this path is correctly implemented
        port: 8080
      # ... other settings
    
  • Why it works: kubelet receives a successful HTTP status code, indicating the application is ready.
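
For reference, here is a minimal sketch of a health endpoint that behaves the way an httpGet probe expects, using only the Python standard library. The READY flag, the HealthHandler class, and the make_server helper are hypothetical names chosen for this example; a real application would flip the flag once its own initialization finishes.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical readiness flag; set it to True once startup work is done.
READY = {"value": False}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz" and READY["value"]:
            status, body = 200, b"ok\n"          # Probe succeeds.
        elif self.path == "/healthz":
            status, body = 503, b"not ready\n"   # Probe fails until ready.
        else:
            status, body = 404, b"not found\n"   # A wrong probe path fails too.
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Keep periodic probe requests out of the log output.


def make_server(port: int = 8080) -> HTTPServer:
    """Bind to all interfaces so requests to the pod IP are accepted."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

Calling make_server().serve_forever() starts the endpoint on port 8080. Returning 503 while the application is still initializing makes the probe fail for the right reason instead of timing out.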

3. Application is Slow to Start Up

Your application takes longer to become ready than the initialDelaySeconds configured in the probe. The kubelet starts probing once initialDelaySeconds has elapsed after the container starts (the default is 0, meaning immediately), but the app might still be initializing database connections, loading caches, or performing other startup tasks.

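  • Diagnosis: Compare the application’s startup logs (kubectl logs <your-pod-name>) with the probe failure events shown by kubectl describe pod <your-pod-name>. You’ll see that the container starts, but the readiness probe fails repeatedly until the application finishes initializing much later.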
  • Fix: Increase initialDelaySeconds to give your application ample time to initialize.
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 60 # Increased from 5 to 60 seconds
      periodSeconds: 10
    
  • Why it works: kubelet waits longer before its first probe attempt, ensuring the application has completed its essential startup routines.
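
The effect of initialDelaySeconds can be estimated with some simple arithmetic. The sketch below uses a deliberately simplified model (probes fire at exactly initialDelaySeconds, then every periodSeconds, with no jitter), and first_successful_probe is a hypothetical helper, not a Kubernetes API.

```python
import math


def first_successful_probe(startup_seconds, initial_delay, period) -> tuple:
    """Return (time of first passing probe, number of failed probes before it).

    Simplified model: probes fire at initial_delay, initial_delay + period, ...
    and pass once startup_seconds have elapsed since the container started.
    """
    if startup_seconds <= initial_delay:
        return initial_delay, 0
    # Every probe that fires before the app is up counts as a failure.
    failed = math.ceil((startup_seconds - initial_delay) / period)
    return initial_delay + failed * period, failed
```

For an application that needs an assumed 42 seconds to start, first_successful_probe(42, 5, 10) gives (45, 4): four failed probes are reported before the pod becomes ready. With the longer delay, first_successful_probe(42, 60, 10) gives (60, 0), so the failures disappear at the cost of the pod becoming ready later.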

4. Network Policy Blocking Probe Traffic

If you have NetworkPolicies in place, they might be preventing the kubelet from reaching the pod’s health check port. The kubelet is a host process on the node, not a pod, so its probe traffic originates from the node’s IP address. The NetworkPolicy API states that traffic from a pod’s local node is allowed, so a conformant network plugin should not block probes, but behavior can vary between implementations.

  • Diagnosis: Check your NetworkPolicy definitions for any rules that restrict ingress to the pod on the probe port, and confirm whether your network plugin permits traffic from the node’s IP.
  • Fix: Add a NetworkPolicy rule that explicitly allows ingress from your nodes’ IP range on the probe port. A namespaceSelector for kube-system does not help here, because the kubelet does not run as a pod in any namespace.
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-health-checks
      namespace: default # Your pod's namespace
    spec:
      podSelector:
        matchLabels:
          app: your-app # Label matching your pod
      policyTypes:
      - Ingress
      ingress:
      - from:
        - ipBlock:
            cidr: 10.0.0.0/16 # Example value only: replace with your node CIDR
        ports:
        - protocol: TCP
          port: 8080 # Your probe port
    
  • Why it works: The policy admits traffic whose source address is a node IP, which is where the kubelet’s probe requests come from.
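
Because probe traffic comes from the node’s IP, it can help to verify that a node address actually falls inside the range a policy allows. The following is a small sketch of that check; source_allowed is a hypothetical helper and the addresses in the usage note are made-up examples.

```python
import ipaddress


def source_allowed(source_ip: str, allowed_cidrs: list) -> bool:
    """Return True if source_ip falls inside any of the allowed CIDR blocks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
```

With an allowed range of 10.0.0.0/16, source_allowed("10.0.3.7", ["10.0.0.0/16"]) is True, while a node at 192.168.1.5 would not match and its probes could be dropped by a plugin that enforces the policy against node traffic.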

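5. Resource Constraints (CPU Throttling or Memory Pressure)

The pod is starved for resources, preventing the application from responding to the probe within the configured timeoutSeconds.

  • Diagnosis: Check pod resource usage with kubectl top pod <your-pod-name> (this requires metrics-server) and look for CPU or memory consumption close to the container’s limits. Examine pod events (kubectl get events --field-selector involvedObject.name=<your-pod-name>) for probe timeouts, and run kubectl describe pod <your-pod-name> to see whether a container’s Last State shows OOMKilled. CPU throttling does not produce events; it is recorded in the nr_throttled counter of the container’s cgroup cpu.stat file.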
  • Fix: Increase the CPU and memory requests and limits for your container.
    resources:
      requests:
        memory: "256Mi"
        cpu: "200m"
      limits:
        memory: "512Mi"
        cpu: "500m"
    
  • Why it works: Adequate resources allow the application process to execute its health check logic and respond to kubelet’s requests promptly.
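
CPU throttling can be quantified from the counters the kernel exposes in a cgroup’s cpu.stat file. The sketch below parses that file’s text and reports the share of scheduling periods in which the container was throttled. throttled_fraction is a hypothetical helper, and the file location is an assumption that depends on the node: /sys/fs/cgroup/cpu.stat inside the container is typical for cgroup v2, and the path differs on cgroup v1.

```python
def throttled_fraction(cpu_stat_text: str) -> float:
    """Return the share of CFS periods in which the cgroup was throttled."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # No CPU limit in effect, or no periods recorded yet.
    return stats.get("nr_throttled", 0) / periods
```

A value that stays high while probes time out suggests the CPU limit is too low for the application to answer within timeoutSeconds.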

6. Incorrect timeoutSeconds or periodSeconds

The probe’s timeoutSeconds (default 1) is too short for your application to respond, so slow responses count as failures. After failureThreshold consecutive failures (default 3), the kubelet marks the pod as not ready; periodSeconds controls how often probes run and therefore how quickly those failures accumulate.

  • Diagnosis: Observe the timing of probe failures. If your application sometimes responds but it takes longer than the timeout, this is the issue.
  • Fix: Adjust timeoutSeconds and periodSeconds to be more forgiving. A reasonable starting point is a timeoutSeconds slightly longer than your slowest expected healthy response, and a periodSeconds at least twice that.
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 15 # Increased from 10
      timeoutSeconds: 5 # Increased from 1
    
  • Why it works: A longer timeout gives the application more time to process the request and send a response, while adjusting the period ensures probes are sent at a reasonable interval without overwhelming the application or missing transient issues.
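
How the timeout interacts with the consecutive-result thresholds can be seen in a small simulation. This is a simplified sketch of the behavior, not kubelet code: readiness_timeline is a hypothetical helper, and its defaults mirror the Kubernetes defaults of timeoutSeconds 1, failureThreshold 3, and successThreshold 1.

```python
def readiness_timeline(response_times, timeout_seconds=1.0,
                       failure_threshold=3, success_threshold=1):
    """Return the Ready state after each probe.

    response_times holds one response time in seconds per probe; None means
    the application did not answer at all.
    """
    ready = False       # A container starts out not ready.
    last_result = None
    run_length = 0
    timeline = []
    for elapsed in response_times:
        success = elapsed is not None and elapsed <= timeout_seconds
        # Count how many times in a row the same result has been seen.
        if success == last_result:
            run_length += 1
        else:
            last_result, run_length = success, 1
        # The state flips only once the matching threshold is reached.
        threshold = success_threshold if success else failure_threshold
        if run_length >= threshold:
            ready = success
        timeline.append(ready)
    return timeline
```

With responses of [0.2, 2.0, 2.0, 0.3, 2.0, 2.0, 2.0] seconds and the default 1 second timeout, the pod stays ready through the first two slow responses because a fast one resets the count, then becomes not ready on the third consecutive slow response. Passing timeout_seconds=5 keeps it ready throughout.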

7. Liveness Probe Is Incorrectly Configured

A liveness probe never overrides a readiness probe, but a misconfigured one (especially one that fails more aggressively) causes the kubelet to restart the container. Each restart resets readiness: the container is not ready again until its readiness probe passes, so repeated restarts look like recurring readiness failures. This is less common than the causes above.

  • Diagnosis: Review both your livenessProbe and readinessProbe definitions carefully, and check the RESTARTS column of kubectl get pod <your-pod-name>. A rising restart count points at the liveness probe rather than the readiness probe.
  • Fix: Ensure both probes are correctly configured, have appropriate initialDelaySeconds, periodSeconds, timeoutSeconds, and failureThreshold.
  • Why it works: Correctly configured probes accurately reflect the application’s state without causing unnecessary restarts or misinterpretations of readiness.

Once readiness probes pass, the next error you may encounter is CrashLoopBackOff, which appears when the underlying application issue is more severe than slow startup.
