Cut GKE Costs with Spot VMs, Autoscaling, and Right-Sizing (2026)

Spot VMs offer the same machine types as regular (on-demand) VMs, with one major difference: Google Cloud can preempt them at any time to reclaim capacity.

Let’s see this in action. Imagine you have an application that can tolerate interruptions, like a batch processing job or a stateless web service.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  replicas: 3
  selector:
    matchLabels:
      app: batch-processor
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      containers:
      - name: processor
        image: your-batch-image:latest
        resources:
          requests:
            cpu: "1"
            memory: "2Gi"
          limits:
            cpu: "2"
            memory: "4Gi"
      nodeSelector:
        cloud.google.com/gke-spot: "true"

This Deployment targets nodes that carry the Spot VM label. Because nodeSelector is a hard requirement, when a Spot VM is preempted Kubernetes reschedules the pod only onto another Spot node; if none is available, the pod stays Pending until the node pool provides one. To let pods fall back to regular VMs, use a preferred nodeAffinity rule instead of nodeSelector. Either way, your application must be able to handle this sudden eviction and restart gracefully.
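To give the container a chance to wind down when its node is reclaimed, you can shorten the termination grace period and add a preStop hook. The fragment below is a minimal sketch of the pod template, not a complete manifest. The 15-second value and the /app/checkpoint.sh script are assumptions for illustration; check the GKE documentation for how much time pods actually get during a Spot preemption.

```yaml
# Fragment to merge into the Deployment's spec.template.spec.
spec:
  # Assumption: keep shutdown short, since a preempted Spot node
  # gives pods only a brief window before the VM is reclaimed.
  terminationGracePeriodSeconds: 15
  containers:
  - name: processor
    image: your-batch-image:latest
    lifecycle:
      preStop:
        exec:
          # Hypothetical script that checkpoints in-flight work
          # before the container receives SIGTERM.
          command: ["/bin/sh", "-c", "/app/checkpoint.sh"]
```

The hook runs before the container is signaled, so any work it does counts against the grace period.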

The core problem this solves is the high cost of persistent, on-demand compute for workloads that don’t need 24/7 guaranteed uptime. Spot VMs offer up to a 91% discount compared to on-demand prices, making them incredibly attractive for cost optimization.

Here’s how it works internally: Google Cloud has a pool of spare compute capacity, and Spot VMs run on that capacity at a discounted price. There is no bidding involved; you simply pay the current Spot price. If Google Cloud needs the capacity back, for example because demand for regular VMs increases, it reclaims those Spot VMs, hence the "preemption." GKE integrates with this by allowing you to provision node pools that exclusively use Spot VMs. When a node in such a pool is preempted, GKE detects the node loss and tries to replace it so the cluster returns to its desired state, although replacement depends on Spot capacity being available.

The levers you control are:

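Node Pool Configuration: You create node pools backed by Spot VMs, for example with the --spot flag of gcloud container node-pools create, or by setting spot: true in the node pool’s node config. (provisioningModel: SPOT is the equivalent setting on Compute Engine instances, not a GKE node pool field.) This tells GKE to create nodes using Spot VM instances.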
Pod Scheduling: You can use nodeSelector or nodeAffinity in your pod specifications to direct workloads to Spot VM nodes. This is crucial for ensuring only interruptible workloads land on these cost-effective but ephemeral resources.
Autoscaling: Combining Spot VMs with cluster autoscaling is where the real magic happens. The cluster autoscaler will automatically scale up the number of Spot VMs in a node pool when your application’s resource demands increase, and scale them down when demand subsides, further optimizing costs.
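For the pod scheduling lever, a soft preference is sometimes a better fit than a hard nodeSelector. The sketch below is a variant of the earlier Deployment that prefers Spot nodes but can still be scheduled on regular nodes when no Spot capacity exists. The name batch-processor-flexible is hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor-flexible   # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: batch-processor-flexible
  template:
    metadata:
      labels:
        app: batch-processor-flexible
    spec:
      affinity:
        nodeAffinity:
          # Soft rule: the scheduler favors Spot nodes but may
          # place pods elsewhere if none can fit them.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: cloud.google.com/gke-spot
                operator: In
                values:
                - "true"
      containers:
      - name: processor
        image: your-batch-image:latest
        resources:
          requests:
            cpu: "1"
            memory: "2Gi"
```

The trade-off is cost predictability: pods that land on regular nodes are billed at on-demand rates until they are rescheduled.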

Autoscaling is configured at the node pool level. For a node pool using Spot VMs, you’d set autoscaling.enabled: true and specify minNodeCount and maxNodeCount. For example, to allow a Spot VM node pool to scale between 1 and 10 nodes:

# Illustrative fragment of a GKE API cluster definition, shown as YAML.
# It mirrors the API's field names; it is not a manifest for kubectl apply.
name: my-gke-cluster
nodePools:
- name: spot-pool
  autoscaling:
    enabled: true
    minNodeCount: 1
    maxNodeCount: 10
  config:
    machineType: n1-standard-2
    spot: true # This is the key for Spot VMs
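
If you prefer to manage node pools declaratively from inside a cluster, roughly the same configuration could be written as a Config Connector resource. This is a sketch under assumptions: Config Connector must be installed, the location value is hypothetical, and the field names should be verified against the ContainerNodePool CRD version you run.

```yaml
apiVersion: container.cnrm.cloud.google.com/v1beta1
kind: ContainerNodePool
metadata:
  name: spot-pool
spec:
  location: us-central1        # hypothetical region
  clusterRef:
    name: my-gke-cluster       # assumes a matching ContainerCluster resource
  autoscaling:
    minNodeCount: 1
    maxNodeCount: 10
  nodeConfig:
    machineType: n1-standard-2
    spot: true                 # request Spot VMs for this pool
```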

Right-sizing your nodes is also critical. Don’t provision a Spot VM node pool with n1-highmem-32 if your workloads only need n1-standard-2. Use GKE’s node auto-provisioning or manual configuration to select the smallest machine type that satisfies your application’s requests. Tools like the GKE cost allocation report and workload resource usage metrics are your best friends here. Regularly review which nodes are underutilized and consider adjusting their machine types or consolidating workloads.
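
Node right-sizing starts with accurate pod requests. One way to gather evidence is a VerticalPodAutoscaler in recommendation-only mode, sketched below for the batch-processor Deployment. It assumes vertical Pod autoscaling is enabled on the cluster; the object name is hypothetical.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: batch-processor-vpa   # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: batch-processor
  updatePolicy:
    # "Off" only publishes recommendations; it never evicts
    # or resizes running pods.
    updateMode: "Off"
```

After the workload has run for a while, the recommended CPU and memory values appear in the object's status and can be compared with the requests in the Deployment before you choose a machine type.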

The most surprising aspect of Spot VMs is how well they hold up in practice for many workloads. There is no uptime guarantee, and preemption rates vary by region, machine type, and time, but they are often low enough that you get the cost savings without significant operational impact, provided your application is designed for resilience. The key is understanding that "preemptible" doesn’t always mean "constantly interrupted": Spot VMs are reclaimed only when Google Cloud needs that capacity elsewhere.

The next step in cost optimization is often exploring Spot VMs with custom machine types.