Node auto-provisioning lets GKE create new node pools for you when your cluster needs more resources.

Let’s see it in action. Imagine you have a GKE cluster running a deployment that needs more pods.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app-container
        image: nginx:latest
        resources:
          requests:
            cpu: "200m"
            memory: "256Mi"

When you apply this, Kubernetes tries to schedule 5 pods. If your existing nodes don’t have enough capacity, these pods will remain Pending.
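To make the arithmetic concrete, here is a minimal Python sketch of how many of the 5 replicas fit on one node. The allocatable figures (940m CPU, 2800Mi memory) are hypothetical values chosen for illustration, and the calculation ignores system pods and anything else already running on the node, which the real scheduler does account for.

```python
def pods_that_fit(alloc_cpu_m: int, alloc_mem_mi: int,
                  req_cpu_m: int, req_mem_mi: int) -> int:
    """Number of identical pods one node can hold, limited by the scarcer resource."""
    return min(alloc_cpu_m // req_cpu_m, alloc_mem_mi // req_mem_mi)


REPLICAS = 5
REQ_CPU_M, REQ_MEM_MI = 200, 256  # requests from the Deployment above

# Hypothetical allocatable capacity of a single existing node (an assumption).
ALLOC_CPU_M, ALLOC_MEM_MI = 940, 2800

fit = pods_that_fit(ALLOC_CPU_M, ALLOC_MEM_MI, REQ_CPU_M, REQ_MEM_MI)
pending = max(0, REPLICAS - fit)
print(f"{fit} pods fit, {pending} remain Pending")
```

With these assumed numbers, CPU is the limiting resource: four pods fit and the fifth stays Pending until more capacity appears.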

Here’s what happens under the hood with node auto-provisioning enabled. GKE’s cluster autoscaler notices the pending pods and first checks whether scaling up an existing node pool would make room for them. If no existing node pool fits, its auto-provisioning component takes over: it checks the cluster-wide resource limits you configured, then determines a machine type and configuration that satisfy the pending pods’ resource requests and scheduling constraints, such as node selectors and tolerations. It then creates a new node pool with these specifications, which adds new nodes to your cluster. Once the new nodes are ready, Kubernetes schedules the pending pods onto them.

The magic behind this is the cluster’s autoscaling configuration, which you typically set when you create or update your GKE cluster. You define cluster-wide resource limits (such as total CPU and memory) and default node settings that auto-provisioning uses to decide what kind of node pools to create.

Here’s a snippet of what that might look like conceptually (in practice you set this via gcloud or the Google Cloud console, not by editing a Kubernetes resource like this):

# This is a conceptual representation, not direct YAML you edit for GKE
# Auto-provisioning is configured via gcloud commands or the GCP Console.
# Example configuration parameters:

# Enable auto-provisioning for the cluster
autoProvisioning:
  enabled: true

  # Define profile templates for node pool creation
  template:
    spec:
      # Define machine types to consider
      machineTypes:
      - "e2-medium"
      - "n1-standard-1"

      # Define disk sizes and types
      diskSizeGb: 100
      diskType: "pd-standard"

      # Define operating system and container runtime
      osImageType: "COS" # Container-Optimized OS
      containerRuntime: "containerd"

      # Define scaling limits for auto-provisioned node pools
      autoscaling:
        minNodeCount: 1
        maxNodeCount: 5

      # Define taints and labels for the nodes
      labels:
        # Illustrative label; GKE sets cloud.google.com/gke-nodepool itself
        "example.com/provisioned-by": "auto-provisioning"
      taints:
      - key: "example.com/taint"
        value: "true"
        effect: "NoSchedule"

When pending pods appear, auto-provisioning considers these settings. It might decide e2-medium is a good fit for your 200m CPU and 256Mi memory requests and create a new e2-medium node pool. The scaling limits in this conceptual example keep the node pool from growing uncontrollably; in an actual GKE cluster, that cap comes from the cluster-wide CPU and memory limits you set for auto-provisioning.

The most surprising thing about node auto-provisioning is that it doesn’t just blindly create the largest possible machine type to satisfy requests. It performs a cost-benefit analysis, considering your specified machineTypes and the actual resource requests of the pending pods. It aims to find the most cost-efficient machine type that can accommodate those requests, often picking smaller, more granular instances rather than over-provisioning with large machines. This intelligent selection is key to managing your cloud spend.
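The idea of choosing the cheapest machine type that fits can be sketched in a few lines of Python. This is an illustration of the general approach, not GKE’s actual algorithm: the machine names, allocatable capacities, and hourly prices below are all made up, and the real autoscaler weighs many more factors.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MachineType:
    name: str           # hypothetical name, not a real GKE machine type
    cpu_m: int          # assumed allocatable CPU in millicores
    mem_mi: int         # assumed allocatable memory in MiB
    hourly_cost: float  # made-up relative price, not a real quote


def nodes_needed(mt: MachineType, pods: int,
                 req_cpu_m: int, req_mem_mi: int) -> Optional[int]:
    """Nodes of this type required for `pods` identical pods, or None if a pod cannot fit."""
    per_node = min(mt.cpu_m // req_cpu_m, mt.mem_mi // req_mem_mi)
    if per_node == 0:
        return None
    return math.ceil(pods / per_node)


def cheapest_fit(candidates: List[MachineType], pods: int, req_cpu_m: int,
                 req_mem_mi: int) -> Optional[Tuple[MachineType, int, float]]:
    """Return (machine type, node count, total hourly cost) with the lowest cost."""
    best = None
    for mt in candidates:
        count = nodes_needed(mt, pods, req_cpu_m, req_mem_mi)
        if count is None:
            continue
        cost = count * mt.hourly_cost
        if best is None or cost < best[2]:
            best = (mt, count, cost)
    return best


candidates = [
    MachineType("small-example", cpu_m=940, mem_mi=2800, hourly_cost=1.0),
    MachineType("large-example", cpu_m=7900, mem_mi=29000, hourly_cost=8.0),
]

# Three pending pods, each requesting 200m CPU and 256Mi memory.
choice = cheapest_fit(candidates, pods=3, req_cpu_m=200, req_mem_mi=256)
print(choice[0].name, choice[1], choice[2])
```

Both hypothetical machine types could hold the three pending pods on a single node, so the sketch picks the smaller one because its total cost is lower.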

To enable the cluster autoscaler on an existing node pool, you’d use:

gcloud container clusters update <cluster-name> --zone <zone> --node-pool <pool-name> --enable-autoscaling --min-nodes=1 --max-nodes=5

For auto-provisioning specifically, you’d use:

gcloud container clusters update <cluster-name> --zone <zone> --enable-autoprovisioning --max-cpu <max-cpu> --max-memory <max-memory-gb> --autoprovisioning-locations=<zone-a>,<zone-b> --autoprovisioning-max-surge-upgrade=1 --autoprovisioning-max-unavailable-upgrade=0

The --max-cpu and --max-memory flags set the cluster-wide resource limits that auto-provisioning must stay within. The --autoprovisioning-locations flag is crucial, as it specifies which zones auto-provisioning can create nodes in.
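If you script cluster changes, it can help to assemble the command as an argument list and review it before running anything. The Python sketch below only builds and prints the command; it does not call gcloud. The cluster name, zones, and limit values are placeholders, and the inclusion of --max-cpu and --max-memory reflects my understanding that resource limits are expected when enabling auto-provisioning by flags, so check gcloud container clusters update --help for your gcloud version.

```python
import shlex
from typing import List


def enable_nap_command(cluster: str, zone: str, locations: List[str],
                       max_cpu: int, max_memory_gb: int) -> List[str]:
    """Build (but do not run) a gcloud command that enables node auto-provisioning."""
    return [
        "gcloud", "container", "clusters", "update", cluster,
        "--zone", zone,
        "--enable-autoprovisioning",
        "--max-cpu", str(max_cpu),           # cluster-wide CPU cap (cores)
        "--max-memory", str(max_memory_gb),  # cluster-wide memory cap (GB)
        "--autoprovisioning-locations", ",".join(locations),
    ]


# Placeholder values for illustration only.
cmd = enable_nap_command(
    cluster="my-cluster",
    zone="us-central1-a",
    locations=["us-central1-a", "us-central1-b"],
    max_cpu=32,
    max_memory_gb=128,
)
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to hand to subprocess.run later without worrying about shell quoting.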

You can also specify --autoprovisioning-min-cpu-platform to ensure nodes meet a minimum CPU platform, such as "Intel Ice Lake". This is useful if your workloads depend on specific CPU features.

Once your cluster has auto-provisioning enabled and your workloads start requesting resources that can’t be met by existing nodes, you’ll see new node pools appearing in your GKE cluster console or via gcloud container node-pools list --cluster <cluster-name>. Their names typically begin with nap-, followed by the machine type and a random suffix.
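To pick out the auto-provisioned pools programmatically, one option is to parse the JSON form of that listing. The sketch below assumes the output of gcloud container node-pools list --cluster <cluster-name> --format=json and relies on the autoscaling.autoprovisioned field from the GKE API’s node pool autoscaling settings; the sample data is made up for illustration.

```python
import json
from typing import List


def auto_provisioned_pools(node_pools_json: str) -> List[str]:
    """Return names of node pools whose autoscaling block marks them as auto-provisioned."""
    pools = json.loads(node_pools_json)
    return [
        pool["name"]
        for pool in pools
        if pool.get("autoscaling", {}).get("autoprovisioned", False)
    ]


# Made-up sample in the shape of the JSON listing (an assumption).
sample = """
[
  {"name": "default-pool", "autoscaling": {"enabled": true}},
  {"name": "example-auto-pool",
   "autoscaling": {"enabled": true, "autoprovisioned": true}},
  {"name": "manual-pool"}
]
"""

print(auto_provisioned_pools(sample))
```

Filtering on the field rather than on the pool name avoids depending on a naming pattern that could change.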

The next concept you’ll likely encounter is managing the lifecycle of these auto-provisioned node pools, including setting up rules for their deletion when they are no longer needed.
