Connecting multiple Kubernetes clusters with Istio multicluster isn’t about magically making them talk; it’s about building a unified, secure, and observable network fabric across distributed environments, treating them as a single logical unit.
Let’s see this in action. Imagine we have two clusters, cluster-a and cluster-b, and we want a service deployed in cluster-a to be accessible from cluster-b as if it were local.
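First, we need to establish trust. Both clusters must issue workload certificates that chain to the same root Certificate Authority (CA). The usual way to get there is to generate one root CA, use it to sign a separate intermediate CA certificate and key for each cluster, and install that material in each cluster’s istio-system namespace as a Secret named cacerts before installing Istio. Only certificates are shared between clusters; the root CA’s private key stays offline.
In our example, let’s assume we have a root CA certificate root-cert.pem.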
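As a minimal sketch of that installation step, the helper below creates the cacerts secret in one cluster. It assumes each cluster’s files live in a hypothetical directory such as certs/cluster-a and use the file names Istio reads from the secret (ca-cert.pem, ca-key.pem, root-cert.pem, cert-chain.pem). The KUBECTL variable is only an illustration convenience so the commands can be previewed with a stand-in.

```shell
#!/bin/sh
# Sketch only: directory layout and function name are assumptions.
KUBECTL="${KUBECTL:-kubectl}"

# install_cacerts <kube-context> <directory-with-pem-files>
install_cacerts() {
  local ctx="$1" dir="$2"
  # The namespace may already exist; ignore that error.
  $KUBECTL --context "$ctx" create namespace istio-system || true
  # Per-cluster intermediate CA (cert + key), the shared root, and the chain.
  $KUBECTL --context "$ctx" create secret generic cacerts -n istio-system \
    --from-file="$dir/ca-cert.pem" \
    --from-file="$dir/ca-key.pem" \
    --from-file="$dir/root-cert.pem" \
    --from-file="$dir/cert-chain.pem"
}
```

Running install_cacerts cluster-a certs/cluster-a and then install_cacerts cluster-b certs/cluster-b before installing Istio gives both control planes an intermediate CA under the same root.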
Cluster A Setup:
We’ll install Istio on cluster-a with a configuration that designates it as a primary cluster, names the cluster and its network, and lets remote clusters join.

    apiVersion: install.istio.io/v1alpha1
    kind: IstioOperator
    spec:
      profile: default
      meshConfig:
        trustDomain: cluster.local # Your chosen trust domain
        accessLogFile: /dev/stdout
        defaultConfig:
          proxyMetadata:
            # Enable the sidecar DNS proxy so workloads can resolve
            # services that exist only in the other cluster
            ISTIO_META_DNS_CAPTURE: "true"
            ISTIO_META_DNS_INSECURE: "true" # For simplicity in demo, use proper CA in prod
      values:
        global:
          meshID: mesh1 # Same value on every cluster in the mesh
          multiCluster:
            clusterName: cluster-a
          network: network1 # Logical network this cluster belongs to
        pilot:
          env:
            # Let this istiod also serve as the control plane for remote clusters
            EXTERNAL_ISTIOD: "true"

Apply this with istioctl install -f cluster-a-istio.yaml --context cluster-a.
Then, we create an east-west gateway for traffic to and from other clusters.

    apiVersion: install.istio.io/v1alpha1
    kind: IstioOperator
    spec:
      # This configuration is applied on top of the base Istio installation,
      # so it installs nothing but the gateway
      profile: empty
      components:
        ingressGateways:
          - name: istio-eastwestgateway
            label:
              istio: eastwestgateway # Matched by the Gateway selector
              topology.istio.io/network: network1
            enabled: true
            k8s:
              env:
                # Route cross-cluster traffic by SNI
                - name: ISTIO_META_ROUTER_MODE
                  value: "sni-dnat"
                # Only forward to endpoints inside this network
                - name: ISTIO_META_REQUESTED_NETWORK_VIEW
                  value: network1
              service:
                ports:
                  - name: tls
                    port: 15443
                    targetPort: 15443
      values:
        global:
          network: network1

The gateway needs no certificate mounts of its own: like any other proxy, it gets its workload certificate from istiod, which signs it with the CA held in the cacerts secret.
Apply this with istioctl install -f cluster-a-eastwest.yaml --context cluster-a.
We also need to expose the mesh’s services through this gateway.

    apiVersion: networking.istio.io/v1beta1
    kind: Gateway
    metadata:
      name: cross-network-gateway
      namespace: istio-system
    spec:
      selector:
        istio: eastwestgateway
      servers:
        - port:
            number: 15443
            name: tls
            protocol: TLS
          tls:
            mode: AUTO_PASSTHROUGH
          hosts:
            - "*.local" # Or your specific domains
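
Once the gateway’s Service has been given an external address, other clusters send their traffic there. A small sketch for reading that address back, assuming the Service is named istio-eastwestgateway and is of type LoadBalancer (the function name and the KUBECTL override are illustrative):

```shell
#!/bin/sh
# Sketch only: prints the external IP of the east-west gateway Service.
KUBECTL="${KUBECTL:-kubectl}"

# eastwest_address <kube-context>
eastwest_address() {
  local ctx="$1"
  # Some providers publish a hostname instead of an IP; in that case
  # query .status.loadBalancer.ingress[0].hostname instead.
  $KUBECTL --context "$ctx" -n istio-system get svc istio-eastwestgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

Calling eastwest_address cluster-a should print an address that is routable from cluster-b; until the load balancer is provisioned, the query returns nothing useful.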
Cluster B Setup:
On cluster-b, we do a similar Istio installation, with its own cluster name and network, so that it can join the mesh alongside cluster-a.

    apiVersion: install.istio.io/v1alpha1
    kind: IstioOperator
    spec:
      profile: default
      meshConfig:
        trustDomain: cluster.local # Must match cluster-a
        accessLogFile: /dev/stdout
        defaultConfig:
          proxyMetadata:
            ISTIO_META_DNS_CAPTURE: "true"
            ISTIO_META_DNS_INSECURE: "true" # Use proper CA in prod
      values:
        global:
          meshID: mesh1 # Must match cluster-a
          multiCluster:
            clusterName: cluster-b
          # A separate network, since traffic to cluster-a goes through
          # the east-west gateway
          network: network2
        pilot:
          env:
            # Same setting as cluster-a
            EXTERNAL_ISTIOD: "true"

Apply with istioctl install -f cluster-b-istio.yaml --context cluster-b.
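
Before wiring the clusters together, it helps to confirm that both control planes actually came up. A minimal sketch, assuming the default deployment name istiod in istio-system (the function name, the timeout, and the KUBECTL override are illustrative choices):

```shell
#!/bin/sh
# Sketch only: waits for the istiod rollout in each given kube context.
KUBECTL="${KUBECTL:-kubectl}"

# wait_for_istiod <kube-context>...
wait_for_istiod() {
  local ctx
  for ctx in "$@"; do
    echo "waiting for istiod in $ctx"
    # Stop at the first cluster whose control plane is not ready in time.
    $KUBECTL --context "$ctx" -n istio-system \
      rollout status deployment/istiod --timeout=120s || return 1
  done
}
```

For the two clusters in this walkthrough that would be wait_for_istiod cluster-a cluster-b.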
Now, we configure cluster-b to discover services in cluster-a. This is done by creating a remote secret: a Secret in cluster-b’s istio-system namespace that holds a kubeconfig for cluster-a. istioctl generates it from cluster-a, and we apply it to cluster-b:

    istioctl create-remote-secret --context cluster-a --name cluster-a | \
      kubectl apply -f - --context cluster-b

Two pieces of the generated kubeconfig are crucial. The first is the address of cluster-a’s Kubernetes API server (e.g., https://1.2.3.4:6443), which must be reachable from cluster-b; if the address in your local kubeconfig is not, override it with the --server flag. The second is a Service Account token from cluster-a that cluster-b’s istiod uses to authenticate with cluster-a’s API server for discovery. By default, istioctl takes the token of the istio-reader-service-account that the Istio installation creates, so you do not have to create a Service Account and token Secret by hand.
The logical network name (e.g., network1) is not part of this secret; it is set at install time through values.global.network.
The cacerts secret on cluster-b must be issued from the same root CA as the one on cluster-a, because that shared root is what lets workloads and gateways in the two clusters verify each other. If cluster-a should also discover services in cluster-b, repeat the step in the other direction.
What’s Happening:
- Trust Establishment: Both clusters share the same root CA. This is the foundation for secure communication.
- Istiod Discovery: Istio’s control plane (istiod) in cluster-b uses the API server address and Service Account token it was given to connect to cluster-a’s Kubernetes API server. It watches for resources in cluster-a.
- Network Registration: Each cluster is assigned to a logical network (e.g., "network1"). This allows Istio to tell apart endpoints that are directly reachable from those that sit on another network.
- East-West Gateway: The istio-eastwestgateway in cluster-a is configured to accept TLS traffic on port 15443. This gateway acts as the entry point for traffic from cluster-b into cluster-a.
- Service Exposure: When you deploy a service in cluster-a (e.g., myservice.default.svc.cluster.local), Istio on cluster-b discovers this service through the API server. It then configures the Envoy proxies in cluster-b to route traffic for myservice.default.svc.cluster.local to the istio-eastwestgateway in cluster-a, which in turn routes it to the actual service pod in cluster-a.
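
The flow above can be checked end to end by calling the service from a pod in cluster-b. A rough sketch, assuming a hypothetical client deployment (for example one named sleep) whose container has curl, and a hypothetical port for myservice; neither is defined in this walkthrough:

```shell
#!/bin/sh
# Sketch only: sends one request from a client pod and prints the HTTP status.
KUBECTL="${KUBECTL:-kubectl}"

# call_from <kube-context> <namespace> <client-deployment> <url>
call_from() {
  local ctx="$1" ns="$2" client="$3" url="$4"
  $KUBECTL --context "$ctx" -n "$ns" exec "deploy/$client" -- \
    curl -sS -o /dev/null -w '%{http_code}\n' "$url"
}
```

For example, call_from cluster-b default sleep http://myservice.default.svc.cluster.local:8080 printing a 2xx status suggests the request crossed the east-west gateway and reached a pod in cluster-a.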
Common Pitfalls & Fixes:
- Trust Domain Mismatch: If meshConfig.trustDomain differs between clusters, mTLS will fail. Ensure it’s identical.
  - Diagnosis: Check the istio-proxy logs on any pod. Look for tls: bad certificate or permission denied during the mTLS handshake.
  - Fix: Update the trustDomain in the IstioOperator configuration for the affected clusters, then re-install Istio or apply the MeshConfig changes.
- Network Reachability: cluster-b’s istiod must be able to reach cluster-a’s Kubernetes API server, and the istio-eastwestgateway in cluster-a must be reachable from cluster-b.
  - Diagnosis: Use kubectl exec to get into a pod in cluster-b and run curl -k https://<cluster_a_api_server_address>:6443/version; any HTTP response shows that the network path works. From the same pod, curl -v telnet://<cluster_a_eastwest_gateway_ip>:15443 checks the gateway port.
  - Fix: Ensure firewalls allow the traffic, or configure meshNetworks in Istio’s MeshConfig if you are using distinct networks. For the east-west gateway, ensure its Service is of type LoadBalancer or NodePort and that the address is reachable from cluster-b.
- Incorrect Service Account Token/API Server Address: The token that cluster-b uses to authenticate to cluster-a must be valid and have sufficient permissions. The API server address configured for cluster-a must be correct and reachable from cluster-b.
  - Diagnosis: Check the istiod logs on cluster-b for API authentication errors or connection refused messages.
  - Fix: Generate a new Service Account token in cluster-a and update cluster-b with it. Ensure the API server address is correct and accessible from cluster-b. Grant the Service Account in cluster-a appropriate RBAC permissions: a restricted, read-only role covering resources such as pods, services, and endpoints is preferable to cluster-admin.
- Missing or Incorrect CA Certificates: The cacerts secret on both clusters (especially on the cluster that joins later) must be issued from the same root CA.
  - Diagnosis: The istio-proxy logs will show tls: bad certificate errors when a proxy attempts to connect to the other cluster’s istiod or east-west gateway.
  - Fix: Create a Kubernetes Secret named cacerts in the istio-system namespace of the affected cluster containing ca-cert.pem and ca-key.pem (that cluster’s intermediate CA certificate and key), root-cert.pem (the shared root CA), and cert-chain.pem (the chain from the intermediate to the root). Then restart istiod and the gateway and workload pods so that they pick up certificates issued by the corrected CA.
- East-West Gateway Configuration: The istio-eastwestgateway needs to be properly configured for SNI-based passthrough, and its Service must be exposed.
  - Diagnosis: Services in cluster-a are unreachable from cluster-b. Check the logs of the istio-eastwestgateway pod.
  - Fix: Ensure the Gateway resource for the east-west gateway is correctly applied, the selector matches the labels on the gateway deployment, and the port and protocol are set for TLS on 15443. The associated Kubernetes Service must expose that port.
- Remote Cluster Network: If you have multiple logical networks, ensure each cluster is registered under the correct network name.
  - Diagnosis: Services can be reached, but performance is poor, or specific cross-network policies don’t work.
  - Fix: Define meshNetworks entries in Istio’s MeshConfig to map logical network names to their endpoints and gateways if needed, and ensure each cluster uses the correct logical network name.
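
The first and fourth pitfalls can be checked mechanically. The sketch below compares the trust domain and the root CA of two clusters. It assumes the mesh configuration is stored under the mesh key of the istio ConfigMap with trustDomain as a top-level, unquoted entry, and that both clusters use a plugged-in cacerts secret; the function names and the KUBECTL override are illustrative.

```shell
#!/bin/sh
# Sketch only: reports differences in trust domain and root CA.
KUBECTL="${KUBECTL:-kubectl}"

# Prints the trustDomain value from the mesh config of one cluster.
trust_domain() {
  $KUBECTL --context "$1" -n istio-system get configmap istio \
    -o jsonpath='{.data.mesh}' | awk '$1 == "trustDomain:" {print $2}'
}

# Prints the base64-encoded root certificate stored in cacerts.
root_ca() {
  $KUBECTL --context "$1" -n istio-system get secret cacerts \
    -o jsonpath='{.data.root-cert\.pem}'
}

# compare_clusters <context-a> <context-b>; returns non-zero on any mismatch.
compare_clusters() {
  local a="$1" b="$2" status=0
  if [ "$(trust_domain "$a")" != "$(trust_domain "$b")" ]; then
    echo "trust domain differs between $a and $b"
    status=1
  fi
  if [ "$(root_ca "$a")" != "$(root_ca "$b")" ]; then
    echo "root CA differs between $a and $b"
    status=1
  fi
  return "$status"
}
```

Running compare_clusters cluster-a cluster-b prints nothing and returns zero when both settings agree, which rules out the two most common causes of cross-cluster handshake failures before you dig into proxy logs.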
After successfully connecting your clusters, the next hurdle you’ll likely encounter is managing traffic policies and ensuring consistent security configurations across them.