The Horizontal Pod Autoscaler (HPA) is failing because the Kubernetes metrics server is not collecting or reporting the necessary resource utilization data for the pods.
Common Causes and Fixes
1. Metrics Server Not Running or Not Ready
- Diagnosis: Check the `metrics-server` pod status:

      kubectl get pods -n kube-system | grep metrics-server

  If it’s not running or is in a `CrashLoopBackOff` state, check its logs:

      kubectl logs -n kube-system <metrics-server-pod-name>

- Fix: If the metrics server is not deployed, install it. For many clusters, this is as simple as:

      kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

  If it’s already deployed but unhealthy, restarting it or redeploying the manifest often fixes transient issues. Ensure the metrics server has the necessary RBAC permissions and is not blocked by network policies. It needs to be able to scrape resource metrics from the kubelet on every node.
- Why it works: The HPA relies entirely on the metrics server to get CPU and memory usage data. If the server isn’t running or healthy, it can’t provide this data, and the HPA has nothing to base its scaling decisions on.
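For scripted checks, the same diagnosis can be run against the JSON form of the pod list (`kubectl get pods -n kube-system -o json`). The sketch below is illustrative only: the function name `unhealthy_metrics_server_pods` and the sample data are hypothetical, and it matches pods by name, much as the `grep` does.

```python
def unhealthy_metrics_server_pods(pod_list: dict) -> list:
    """Return (pod name, reason) pairs for metrics-server pods that are not Ready."""
    problems = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        if "metrics-server" not in name:  # name filter, similar to `grep metrics-server`
            continue
        status = pod.get("status", {})
        ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )
        if ready:
            continue
        # Fall back to the pod phase, but prefer a container-level
        # waiting reason such as CrashLoopBackOff when one is present.
        reason = "NotReady (phase: {})".format(status.get("phase", "Unknown"))
        for cs in status.get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting")
            if waiting:
                reason = waiting.get("reason", reason)
                break
        problems.append((name, reason))
    return problems


# Hand-written sample shaped like `kubectl get pods -n kube-system -o json`.
sample = {
    "items": [
        {
            "metadata": {"name": "metrics-server-6d94bc8694-abcde"},
            "status": {
                "phase": "Running",
                "conditions": [{"type": "Ready", "status": "False"}],
                "containerStatuses": [
                    {"state": {"waiting": {"reason": "CrashLoopBackOff"}}}
                ],
            },
        },
        {
            "metadata": {"name": "coredns-5d78c9869d-xyz12"},
            "status": {
                "phase": "Running",
                "conditions": [{"type": "Ready", "status": "True"}],
            },
        },
    ]
}

for name, reason in unhealthy_metrics_server_pods(sample):
    print(f"{name}: {reason}")
```

An empty result means every pod whose name contains `metrics-server` reports Ready; it does not prove that the metrics API itself is answering queries.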
2. HPA Targeting Non-existent Metrics
- Diagnosis: Inspect the HPA definition to ensure the `target` metric is correctly specified and that the pods actually expose that metric.

      kubectl get hpa <hpa-name> -o yaml

  Look at the `spec.metrics` section. If you’re using custom metrics, verify they are being scraped by your metrics adapter. For CPU/memory, ensure the pods have resource requests defined.
- Fix: Ensure your pods have `resources.requests` defined for CPU and memory. The HPA uses these requests to calculate the utilization percentage.

      spec:
        containers:
          - name: my-app
            image: my-image
            resources:
              requests:
                cpu: "100m"
                memory: "128Mi"

  If using custom or external metrics, ensure your metrics adapter (e.g., Prometheus Adapter, custom adapter) is correctly configured and collecting the specified metrics.
- Why it works: The HPA calculates utilization as `current_usage / requested_resources`. If `resources.requests` are missing or zero, there is no valid denominator, so the HPA cannot compute a utilization percentage (it typically reports an error such as `missing request for cpu`) and does not scale.
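The dependency on requests can be seen in a few lines of arithmetic. This is a minimal sketch of the `current_usage / requested_resources` ratio described above, not the controller's actual code; the function name `utilization_percent` is hypothetical and values are in millicores.

```python
from typing import Optional


def utilization_percent(usage_millicores: int,
                        request_millicores: Optional[int]) -> Optional[float]:
    """Usage as a percentage of the request, or None when no request is set."""
    if not request_millicores:  # None or 0: there is no valid denominator
        return None
    return usage_millicores * 100 / request_millicores


print(utilization_percent(50, 100))   # 50.0 (request set)
print(utilization_percent(50, None))  # None (request missing)
```

Returning `None` rather than raising mirrors the situation in the cluster: the usage figure exists, but without a request there is nothing to compare it against.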
3. Incorrect Resource Requests/Limits
- Diagnosis: Check the resource `requests` and `limits` for the pods managed by the HPA.

      kubectl get pods -l app=<your-app-label> -o jsonpath='{.items[*].spec.containers[*].resources}'

  Compare these requests to the actual observed resource usage. If requests are set too high, utilization looks low, so the HPA may scale down or never scale up. If they are set too low, utilization looks inflated and the HPA may scale up more than the load warrants.
- Fix: Adjust `resources.requests` to be realistic based on typical load. The HPA scales based on the percentage of the request that is used. Setting requests too high means the HPA will require a lot of actual usage before it considers scaling up.

      resources:
        requests:
          cpu: "200m" # Increased from 100m if actual usage is often above 50%
          memory: "256Mi"

- Why it works: The HPA’s core logic for CPU/memory scaling is `(current_cpu_usage / cpu_request) * 100%`. If `cpu_request` is too high, the percentage will be low even if usage is substantial, leading to scaling down or no scaling up.
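To make the effect of the request size concrete, the sketch below combines the utilization formula above with the replica formula from the Kubernetes HPA documentation, `desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)`. It is a simplification under stated assumptions: it ignores the controller's tolerance band and the `minReplicas`/`maxReplicas` bounds, and the function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(current_replicas: int, usage_m: int,
                     request_m: int, target_percent: int) -> int:
    """Replica count suggested by the basic HPA formula (no tolerance, no bounds)."""
    utilization = usage_m * 100 / request_m  # percent of the request in use
    return math.ceil(current_replicas * utilization / target_percent)


# Same load in both cases: 3 replicas, 150m average usage, 50% target.
print(desired_replicas(3, 150, 100, 50))  # 9: request of 100m, utilization 150%
print(desired_replicas(3, 150, 500, 50))  # 2: request of 500m, utilization 30%
```

With identical real usage, a 100m request drives a scale-up to 9 replicas while a 500m request leads to a scale-down to 2, which is why the request value needs to track typical load.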
4. HPA Targeting the Wrong Deployment/StatefulSet
- Diagnosis: Verify that `spec.scaleTargetRef` in the HPA definition points to the correct workload resource (Deployment, StatefulSet, ReplicaSet).

      kubectl get hpa <hpa-name> -o yaml

  Check `spec.scaleTargetRef.kind` and `spec.scaleTargetRef.name`. Then, check if that named resource actually exists and has pods associated with it.
- Fix: Correct the `scaleTargetRef` in the HPA to point to the intended workload.

      spec:
        scaleTargetRef:
          apiVersion: apps/v1
          kind: Deployment
          name: my-application-deployment # Ensure this is the correct deployment name

- Why it works: The HPA needs to know which set of pods to manage. If the `scaleTargetRef` is wrong, it will be trying to autoscale a resource that doesn’t exist or isn’t the one you intended, meaning it will never see or affect the pods you care about.
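The existence check can also be expressed as a simple lookup once the HPA and the workload list are loaded as dictionaries (for example from `kubectl ... -o json`). This is only a sketch: `scale_target_exists` and the sample objects are hypothetical, and it compares `kind` and `name` within a single namespace.

```python
def scale_target_exists(hpa: dict, workloads: set) -> bool:
    """True if the HPA's scaleTargetRef matches a known (kind, name) pair."""
    ref = hpa["spec"]["scaleTargetRef"]
    return (ref["kind"], ref["name"]) in workloads


# Workloads that actually exist in the namespace (hand-written sample).
existing = {("Deployment", "my-application-deployment")}

correct_hpa = {"spec": {"scaleTargetRef": {
    "apiVersion": "apps/v1", "kind": "Deployment",
    "name": "my-application-deployment"}}}
typo_hpa = {"spec": {"scaleTargetRef": {
    "apiVersion": "apps/v1", "kind": "Deployment",
    "name": "my-application"}}}

print(scale_target_exists(correct_hpa, existing))  # True
print(scale_target_exists(typo_hpa, existing))     # False
```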
5. Network Policies Blocking Metrics Server Access
- Diagnosis: If your cluster uses Network Policies, check whether they prevent `metrics-server` from collecting metrics. The metrics server typically runs in the `kube-system` namespace and scrapes the kubelet on each node over TCP 10250 (the default kubelet port); it does not connect to application pods directly.

      kubectl get networkpolicy --all-namespaces

  Look for policies that restrict egress from `kube-system` to the nodes on port 10250, or that block the API server from reaching the `metrics-server` pods.
- Fix: Add an egress rule to the NetworkPolicy that selects the `metrics-server` pods so they can reach the kubelets on the metrics port (usually TCP 10250). Network policies are additive, so add this alongside the policy that currently restricts the metrics server; applied on its own to otherwise unrestricted pods, it would limit them to this single port.

      apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      metadata:
        name: allow-metrics
        namespace: kube-system
      spec:
        podSelector:
          matchLabels:
            k8s-app: metrics-server # Label used by the upstream manifest; adjust if yours differs
        policyTypes:
          - Egress
        egress:
          - ports:
              - protocol: TCP
                port: 10250

- Why it works: The metrics server collects pod and node metrics from the kubelet on each node (port 10250), and the API server queries the metrics server on behalf of the HPA. Network policies that block either path make the metrics unavailable to the HPA.
6. HPA Not Enabled or Misconfigured for CPU/Memory
- Diagnosis: Review the HPA configuration. For CPU/memory scaling, ensure the metric `type` is `Resource` and that `resource.name` is `cpu` or `memory`.

      kubectl get hpa <hpa-name> -o yaml

- Fix: If you intended CPU/memory scaling but configured it differently (e.g., for custom metrics without setting them up), correct the `spec.metrics` section.

      spec:
        scaleTargetRef:
          # ...
        minReplicas: 1
        maxReplicas: 10
        metrics:
          - type: Resource
            resource:
              name: cpu
              target:
                type: Utilization
                averageUtilization: 50

- Why it works: The HPA needs to know what to scale on. If the metrics definition is malformed or points to a metric type that isn’t being collected, it cannot operate.
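A quick structural check of `spec.metrics` can catch the most common mistakes before the HPA is applied. The sketch below assumes the `autoscaling/v2` field layout shown in the example above; the function name `resource_metric_problems` and the sample objects are hypothetical, and it deliberately skips custom and external metrics, which depend on an adapter.

```python
def resource_metric_problems(hpa: dict) -> list:
    """List problems found in Resource-type entries of spec.metrics."""
    problems = []
    for i, metric in enumerate(hpa.get("spec", {}).get("metrics", [])):
        if metric.get("type") != "Resource":
            continue  # custom/external metrics are out of scope here
        resource = metric.get("resource", {})
        if resource.get("name") not in ("cpu", "memory"):
            problems.append(f"metrics[{i}]: resource.name must be cpu or memory")
        target = resource.get("target", {})
        if target.get("type") == "Utilization" and "averageUtilization" not in target:
            problems.append(f"metrics[{i}]: Utilization target needs averageUtilization")
    return problems


good_hpa = {"spec": {"metrics": [{
    "type": "Resource",
    "resource": {"name": "cpu",
                 "target": {"type": "Utilization", "averageUtilization": 50}},
}]}}
bad_hpa = {"spec": {"metrics": [{
    "type": "Resource",
    "resource": {"name": "cpu",
                 "target": {"type": "Utilization", "averageValue": "500m"}},
}]}}

print(resource_metric_problems(good_hpa))
print(resource_metric_problems(bad_hpa))
```

The first call reports no problems; the second flags the target that declares `Utilization` but supplies `averageValue` instead of `averageUtilization`.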
After fixing these issues, you might encounter a new error if your cluster is very small or has very low load: "the HPA controller has not been able to calculate a target utilization for any of the pods." This typically means that no pods are currently running, or they are running with 0 CPU/memory requests, or the metrics server is still catching up.