Route Traffic Across Multiple GKE Clusters with Multi-Cluster Ingress (2026)

Multi-Cluster Ingress lets you distribute traffic across multiple GKE clusters, and the most surprising thing is how it achieves this: a single global load balancer sends traffic straight to the pods in every cluster, with no additional load balancer inside any of them.

Imagine you have two GKE clusters, cluster-us-east1 in us-east1 and cluster-europe-west1 in europe-west1, both running the same frontend deployment. You want traffic hitting a single, global IP address to be routed to the closest healthy cluster.

Here’s the setup:

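First, ensure your GKE clusters meet the prerequisites. Multi-Cluster Ingress relies on container-native load balancing through Network Endpoint Groups (NEGs). That is not an add-on you switch on; it requires VPC-native clusters. Both clusters must also be registered to the same fleet, and the Multi-Cluster Ingress feature must be enabled with one of them designated as the config cluster (cluster-us-east1 here). Depending on your gcloud version, you may need to pass --location or the full membership resource name to the last command.

gcloud container fleet memberships register cluster-us-east1 \
    --gke-cluster us-east1-b/cluster-us-east1 \
    --enable-workload-identity

gcloud container fleet memberships register cluster-europe-west1 \
    --gke-cluster europe-west1-d/cluster-europe-west1 \
    --enable-workload-identity

gcloud container fleet ingress enable \
    --config-membership=cluster-us-east1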

Next, deploy your application to both clusters. For this example, we’ll use a simple frontend deployment and a Service of type LoadBalancer.

Deployment and Service YAML (deployment.yaml, common to both clusters):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: nginxdemos/hello:plain-text
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc
spec:
  selector:
    app: frontend
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: LoadBalancer

Apply this to cluster-us-east1:

kubectl apply -f deployment.yaml --context=gke_your-project_us-east1-b_cluster-us-east1

And to cluster-europe-west1:

kubectl apply -f deployment.yaml --context=gke_your-project_europe-west1-d_cluster-europe-west1

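Once the Services are created, you’ll have two separate LoadBalancer IPs, one for each cluster, with nothing tying them together. This is where Multi-Cluster Ingress comes in. It does not build on those per-cluster Services. Instead, you create a MultiClusterIngress resource whose backend names a MultiClusterService, a fleet-wide counterpart of a Service that selects the app: frontend pods in every cluster. In this example that MultiClusterService is called frontend-mcs, exposes port 80, and has to live in the same namespace as the MultiClusterIngress.

MultiClusterIngress YAML (mci.yaml):

apiVersion: networking.gke.io/v1
kind: MultiClusterIngress
metadata:
  name: multi-cluster-frontend
spec:
  template:
    spec:
      backend:
        serviceName: frontend-mcs
        servicePort: 80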
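
A MultiClusterIngress backend has to name a MultiClusterService rather than a plain per-cluster Service, so mci.yaml needs one alongside the MultiClusterIngress. A minimal sketch, assuming the name frontend-mcs (an illustrative name that the serviceName above has to match) and the same selector and port as the frontend Deployment:

```yaml
# Sketch of a MultiClusterService for the frontend pods.
# The name frontend-mcs is an assumption made for this example.
apiVersion: networking.gke.io/v1
kind: MultiClusterService
metadata:
  name: frontend-mcs
spec:
  template:
    spec:
      # Same label the Deployment puts on its pods.
      selector:
        app: frontend
      ports:
      - name: web
        protocol: TCP
        port: 80        # port referenced by servicePort in the MultiClusterIngress
        targetPort: 80  # containerPort of the frontend container
```

It belongs in the same namespace as the MultiClusterIngress and is applied to the same cluster.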

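Apply this resource to the config cluster: the one fleet member cluster that you designate, when enabling the Multi-Cluster Ingress feature, as the place where the controller reads MultiClusterIngress and MultiClusterService resources. Applying it to any other cluster has no effect. This example uses cluster-us-east1 as the config cluster.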

kubectl apply -f mci.yaml --context=gke_your-project_us-east1-b_cluster-us-east1

What happens now is that the Multi-Cluster Ingress controller, which Google Cloud runs for you, sees this MultiClusterIngress resource and understands that you want to expose the frontend pods across the clusters in your fleet. It provisions a single Global External HTTP(S) Load Balancer. This load balancer doesn’t point to the per-cluster LoadBalancer IPs. Instead, it uses Network Endpoint Groups (NEGs), which list pod IP addresses and ports, as its backends.

In each GKE cluster that runs the frontend pods, the pod endpoints are automatically registered in zonal NEGs managed by Google Cloud (NEGs are zonal resources, so a cluster gets one per zone its nodes run in). The Multi-Cluster Ingress controller then attaches these NEGs to the Global External HTTP(S) Load Balancer, whose health checks decide which endpoints actually receive traffic.
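
By default a MultiClusterService targets every cluster in the fleet. If only some clusters should serve a backend, its clusters field can list them. A sketch, assuming a MultiClusterService named frontend-mcs (an illustrative name) that should only receive traffic in the European cluster; each link value takes the form LOCATION/CLUSTER_NAME:

```yaml
# Sketch: restrict a MultiClusterService to one cluster.
# frontend-mcs is an assumed name used for illustration.
apiVersion: networking.gke.io/v1
kind: MultiClusterService
metadata:
  name: frontend-mcs
spec:
  template:
    spec:
      selector:
        app: frontend
      ports:
      - name: web
        protocol: TCP
        port: 80
        targetPort: 80
  # Only the clusters listed here back this service.
  clusters:
  - link: "europe-west1-d/cluster-europe-west1"
```

Leaving the clusters field out restores the default of using every member cluster.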

When a user requests multi-cluster-frontend.example.com (you’d set up a DNS record for this pointing to the Global External HTTP(S) Load Balancer’s IP, which is reported as the VIP in the MultiClusterIngress status), the request first hits the Global External HTTP(S) Load Balancer. This load balancer has built-in global routing logic. It sends the request to a healthy pod in the cluster closest to the client that still has capacity. The NEGs only tell the load balancer where the pods are; the request goes straight to the pod’s IP address.
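
A DNS record is only useful if the address behind it stays put, and the IP assigned to the load balancer is ephemeral unless you reserve one. A sketch of pinning it with the networking.gke.io/static-ip annotation, assuming you have already reserved a global static IP. The address 203.0.113.10 is a placeholder from the documentation range, and frontend-mcs is an assumed MultiClusterService name:

```yaml
# Sketch: MultiClusterIngress pinned to a reserved global static IP.
apiVersion: networking.gke.io/v1
kind: MultiClusterIngress
metadata:
  name: multi-cluster-frontend
  annotations:
    # Placeholder address; replace with your reserved global static IP.
    networking.gke.io/static-ip: "203.0.113.10"
spec:
  template:
    spec:
      backend:
        serviceName: frontend-mcs
        servicePort: 80
```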

The key here is that the Global External HTTP(S) Load Balancer is the single entry point, but its backends are dynamically managed NEGs that point to your actual GKE workloads. No individual cluster is a single point of failure, which gives you geo-distribution and failover: if the pods in one cluster fail their health checks, traffic shifts to the other cluster.
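
Failover depends on the load balancer’s health checks, and on GKE those can be tuned with a BackendConfig referenced from the MultiClusterService by annotation. A sketch, assuming a BackendConfig named frontend-health and a MultiClusterService named frontend-mcs (both illustrative names), with thresholds chosen only as an example:

```yaml
# Sketch: custom health check for the frontend backends.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: frontend-health   # assumed name
spec:
  healthCheck:
    type: HTTP
    requestPath: /
    port: 80
    checkIntervalSec: 10
    timeoutSec: 5          # must not exceed checkIntervalSec
    healthyThreshold: 2
    unhealthyThreshold: 3
---
apiVersion: networking.gke.io/v1
kind: MultiClusterService
metadata:
  name: frontend-mcs      # assumed name
  annotations:
    # Use frontend-health for all ports of this service.
    cloud.google.com/backend-config: '{"default": "frontend-health"}'
spec:
  template:
    spec:
      selector:
        app: frontend
      ports:
      - name: web
        protocol: TCP
        port: 80
        targetPort: 80
```

Both resources would be applied to the config cluster, in the same namespace as the MultiClusterIngress.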

The one thing most people don’t realize is that Multi-Cluster Ingress doesn’t create new load balancers in each cluster, and it doesn’t use a Service of type LoadBalancer in each cluster either. For every MultiClusterService, the controller creates a derived headless Service in each member cluster, and that derived Service registers the pod endpoints with Google Cloud’s global load balancing infrastructure via NEGs. The MultiClusterIngress resource acts as a declaration to the controller, which orchestrates the creation and management of the global load balancer and its backend NEGs.

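The next thing you’ll likely want to configure is host-based or path-based routing, which you express as rules in the MultiClusterIngress, each pointing at its own MultiClusterService resource. Weighted routing is not part of the MultiClusterIngress API; for that, look at multi-cluster Gateway.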
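
Path-based routing, for instance, goes into the rules of the MultiClusterIngress template, with each path naming a MultiClusterService backend. A sketch, assuming two MultiClusterService resources that are not defined in this article: frontend-mcs on port 80 as the default backend and a hypothetical api-mcs on port 8080:

```yaml
# Sketch: send /api/* to a second backend, everything else to the frontend.
apiVersion: networking.gke.io/v1
kind: MultiClusterIngress
metadata:
  name: multi-cluster-frontend
spec:
  template:
    spec:
      # Default backend for requests that match no rule.
      backend:
        serviceName: frontend-mcs
        servicePort: 80
      rules:
      - http:
          paths:
          - path: /api/*
            backend:
              serviceName: api-mcs   # hypothetical second MultiClusterService
              servicePort: 8080
```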