The Horizontal Pod Autoscaler (HPA) usually fails to scale because it cannot get resource utilization data for the target pods, most often because the Kubernetes metrics server is not collecting or reporting that data.

Common Causes and Fixes

1. Metrics Server Not Running or Not Ready

  • Diagnosis: Check the metrics-server pod status:
    kubectl get pods -n kube-system | grep metrics-server
    
    If it’s not running or in a CrashLoopBackOff state, check its logs:
    kubectl logs -n kube-system <metrics-server-pod-name>
    
  • Fix: If the metrics server is not deployed, install it. For many clusters, this is as simple as:
    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
    
    If it’s already deployed but unhealthy, restarting it or reapplying the manifest often clears transient issues. Ensure the metrics server has the necessary RBAC permissions and is not blocked by network policies. It needs to reach the kubelet on every node, which is where it scrapes pod and node metrics from.
  • Why it works: The HPA relies entirely on the metrics server to get CPU and memory usage data. If the server isn’t running or healthy, it can’t provide this data, and the HPA has nothing to base its scaling decisions on.
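A running pod does not guarantee that the metrics API is being served. One further check is the Available condition on the v1beta1.metrics.k8s.io APIService that metrics-server registers. The following is a minimal Python sketch, assuming the JSON comes from `kubectl get apiservice v1beta1.metrics.k8s.io -o json`; the function name and the trimmed sample document are illustrative and not taken from a real cluster.

```python
import json


def metrics_api_available(apiservice: dict) -> bool:
    """Return True if the APIService reports an Available=True condition."""
    conditions = apiservice.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Available" and c.get("status") == "True"
        for c in conditions
    )


# Hypothetical, trimmed output of:
#   kubectl get apiservice v1beta1.metrics.k8s.io -o json
sample = json.loads("""
{
  "metadata": {"name": "v1beta1.metrics.k8s.io"},
  "status": {
    "conditions": [
      {"type": "Available", "status": "False", "reason": "MissingEndpoints"}
    ]
  }
}
""")

print(metrics_api_available(sample))  # False
```

If this reports False while the pod looks healthy, the reason and message fields on the same condition are usually the next place to look.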

2. HPA Targeting Non-existent Metrics

  • Diagnosis: Inspect the HPA definition to ensure the target metric is correctly specified and that a value for it is actually available for the pods.
    kubectl get hpa <hpa-name> -o yaml
    
    Look at the spec.metrics section. If you’re using custom metrics, verify that your metrics adapter is serving them. For CPU/memory, ensure the pods have resource requests defined.
  • Fix: Ensure every container in your pods has resources.requests defined for the resource the HPA scales on (CPU, memory, or both). The HPA uses these requests to calculate the utilization percentage.
    spec:
      containers:
      - name: my-app
        image: my-image
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
    
    If using custom or external metrics, ensure your metrics adapter (e.g., Prometheus Adapter, custom adapter) is correctly configured and collecting the specified metrics.
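  • Why it works: The HPA calculates utilization as current_usage / requested_resources. If resources.requests is missing for the scaled resource, there is no denominator, so the HPA cannot compute a utilization value and takes no scaling action for that metric; kubectl describe hpa typically reports this as a missing request for the resource.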
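To make the role of requests concrete, here is a simplified Python sketch of the utilization calculation. It is an illustration under stated assumptions, not the controller's actual code: pods are modelled as hypothetical (usage, request) pairs in millicores, and the function returns None whenever a request is missing, mirroring the case where the HPA has nothing to divide by.

```python
from typing import Optional, Sequence, Tuple


def average_utilization(
    pods: Sequence[Tuple[int, Optional[int]]]
) -> Optional[int]:
    """Utilization in percent: total usage over total requests.

    Each pod is a (usage_millicores, request_millicores) pair.
    Returns None if any pod has no request or the list is empty.
    """
    total_usage = 0
    total_request = 0
    for usage, request in pods:
        if not request:  # missing or zero request: utilization is undefined
            return None
        total_usage += usage
        total_request += request
    if total_request == 0:  # no pods at all
        return None
    return (total_usage * 100) // total_request


print(average_utilization([(80, 100), (120, 100)]))   # 100
print(average_utilization([(80, 100), (120, None)]))  # None
```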

3. Incorrect Resource Requests/Limits

  • Diagnosis: Check the resource requests and limits for the pods managed by the HPA.
    kubectl get pods -l app=<your-app-label> -o jsonpath='{.items[*].spec.containers[*].resources}'
    
    Compare these requests to the actual observed resource usage (for example, with kubectl top pods). If requests are set too high, the HPA sees low utilization and may scale down or never scale up. If they are set too low, utilization looks permanently high and the HPA scales out more than needed, often sitting at maxReplicas.
  • Fix: Adjust resources.requests to be realistic based on typical load. The HPA scales based on the percentage of the request that is used, so an oversized request means a lot of actual usage is needed before the HPA considers scaling up, and an undersized one triggers scale-ups under normal load.
    resources:
      requests:
        cpu: "200m" # Example: raised from 100m after observing sustained usage above the old request
        memory: "256Mi"
    
  • Why it works: The HPA’s core logic for CPU/memory scaling is (current_cpu_usage / cpu_request) * 100%. If cpu_request is too high, the percentage will be low even if usage is substantial, leading to scaling down or no scaling up.
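The effect of the request on scaling can be worked through numerically. The sketch below uses the replica formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentValue / targetValue), with the default 10% tolerance; the function name and the sample numbers are illustrative assumptions, and real controller behavior also depends on stabilization windows and pod readiness, which are left out here.

```python
import math


def desired_replicas(
    current_replicas: int,
    usage_millicores: float,
    request_millicores: float,
    target_percent: float,
    tolerance: float = 0.1,
) -> int:
    """Replica count suggested by the documented HPA formula."""
    utilization = 100.0 * usage_millicores / request_millicores
    ratio = utilization / target_percent
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    return math.ceil(current_replicas * ratio)


# Same usage (150m per pod), same 50% target, different requests.
print(desired_replicas(4, 150, 200, 50))   # 6: 75% utilization, scale up
print(desired_replicas(4, 150, 1000, 50))  # 2: 15% utilization, scale down
```

With identical real usage, the oversized 1000m request makes the pods look idle, which is the scale-down behavior described above.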

4. HPA Targeting the Wrong Deployment/StatefulSet

  • Diagnosis: Verify the spec.scaleTargetRef in the HPA definition points to the correct workload resource (Deployment, StatefulSet, ReplicaSet).
    kubectl get hpa <hpa-name> -o yaml
    
    Check spec.scaleTargetRef.kind and spec.scaleTargetRef.name. Then, check if that named resource actually exists and has pods associated with it.
  • Fix: Correct the scaleTargetRef in the HPA to point to the intended workload.
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-application-deployment # Ensure this is the correct deployment name
    
  • Why it works: The HPA needs to know which set of pods to manage. If the scaleTargetRef is wrong, it will be trying to autoscale a resource that doesn’t exist or isn’t the one you intended, meaning it will never see or affect the pods you care about.

5. Network Policies Blocking Metrics Server Access

  • Diagnosis: If your cluster uses Network Policies, check whether they block the metrics-server’s traffic. The metrics server typically runs in the kube-system namespace and scrapes the kubelet on each node rather than the application pods themselves; the API server, in turn, must be able to reach the metrics-server pod.
    kubectl get networkpolicy --all-namespaces
    
    Look for policies that restrict egress from kube-system, in particular from the metrics-server pod to the nodes on port 10250 (the default kubelet port).
  • Fix: Allow egress from the metrics-server pod to the kubelet port (usually TCP 10250). Network policies are additive, so this only helps when an existing policy already restricts the pod’s egress; if none does, adding it would limit the pod to this port alone and cut it off from the API server and DNS.
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-metrics
      namespace: kube-system
    spec:
      podSelector:
        matchLabels:
          k8s-app: metrics-server # Label used by the upstream manifest; adjust if yours differs
      policyTypes:
      - Egress
      egress:
      - ports: # No "to" clause: any destination is allowed on this port
        - protocol: TCP
          port: 10250
    
  • Why it works: The metrics server fetches pod and node metrics by connecting to the kubelet on each node (port 10250 by default). Network policies that block this traffic, or that stop the API server from reaching the metrics server, make the metrics unavailable to the HPA.

6. HPA Not Enabled or Misconfigured for CPU/Memory

  • Diagnosis: Review the HPA configuration. For CPU/memory scaling, ensure the type is Resource and that resource is cpu or memory.
    kubectl get hpa <hpa-name> -o yaml
    
  • Fix: If you intended CPU/memory scaling but configured it differently (e.g., for custom metrics without setting them up), correct the spec.metrics section.
    spec:
      scaleTargetRef:
        # ...
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50
    
  • Why it works: The HPA needs to know what to scale on. If the metrics definition is malformed or points to a metric type that isn’t being collected, it cannot operate.
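A malformed spec.metrics entry can be caught before applying the manifest. The following Python sketch checks Resource-type entries in the autoscaling/v2 shape shown above, assuming the HPA spec has already been loaded into a dict (for example from kubectl get hpa <hpa-name> -o json). The function name and its rules are illustrative; it follows this article's cpu/memory guidance and is not a full schema validator.

```python
from typing import List

VALID_RESOURCES = ("cpu", "memory")

# Field that must accompany each supported target type.
TARGET_FIELDS = {
    "Utilization": "averageUtilization",
    "AverageValue": "averageValue",
}


def check_resource_metrics(hpa_spec: dict) -> List[str]:
    """Return a list of problems found in Resource-type metric entries."""
    problems = []
    for i, metric in enumerate(hpa_spec.get("metrics") or []):
        if metric.get("type") != "Resource":
            continue  # other metric types are out of scope for this check
        resource = metric.get("resource") or {}
        if resource.get("name") not in VALID_RESOURCES:
            problems.append(f"metrics[{i}]: resource.name must be cpu or memory")
        target = resource.get("target") or {}
        expected_field = TARGET_FIELDS.get(target.get("type"))
        if expected_field is None:
            problems.append(
                f"metrics[{i}]: target.type must be Utilization or AverageValue"
            )
        elif expected_field not in target:
            problems.append(f"metrics[{i}]: target.{expected_field} is missing")
    return problems


good = {"metrics": [{"type": "Resource", "resource": {
    "name": "cpu", "target": {"type": "Utilization", "averageUtilization": 50}}}]}
bad = {"metrics": [{"type": "Resource", "resource": {
    "name": "cpu", "target": {"type": "Utilization"}}}]}

print(check_resource_metrics(good))  # []
print(check_resource_metrics(bad))   # ['metrics[0]: target.averageUtilization is missing']
```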

After fixing these issues, you might encounter a new error if your cluster is very small or has very low load: "the HPA controller has not been able to calculate a target utilization for any of the pods." This typically means that no pods are currently running, or they are running with 0 CPU/memory requests, or the metrics server is still catching up.
