GKE Dataplane V2, powered by Cilium, replaces kube-proxy with an eBPF-based data plane, offering significant performance gains and advanced networking features.
Let’s see it in action. Imagine you have a GKE cluster and you’ve just enabled Dataplane V2. You deploy a simple nginx deployment with a ClusterIP service:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP
```
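Before Dataplane V2, a request to nginx-service on port 80 was handled by rules that kube-proxy installed on each node. In its default iptables mode, kube-proxy does not sit in the data path itself: it watches Services and Endpoints and writes netfilter rules, and the kernel then applies DNAT to select one of the nginx-deployment pods. The cost is that these rule chains grow with the number of Services and backends, so both per-packet matching and rule updates slow down as the cluster grows.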
With Dataplane V2, this changes dramatically. The Cilium agent, running as a DaemonSet on each node, loads eBPF programs into the kernel’s network stack. When a packet destined for nginx-service reaches a node, an eBPF program handles it at an early kernel hook, before it would traverse any iptables chains. The program looks up the service in eBPF maps, which the agent keeps in sync with Kubernetes Service and endpoint information, and forwards the packet directly to one of the nginx-deployment pods. kube-proxy is not needed on this path, leading to:
- Higher Throughput: Hash-map lookups instead of long rule chains mean more packets per second, especially in clusters with many Services.
- Lower Latency: Service translation happens in a single in-kernel step, so each packet spends less time in processing.
- Enhanced Visibility: eBPF allows for granular network observability, including per-pod traffic, security events, and L7 protocol awareness.
- Advanced Network Policies: Cilium’s CNI capabilities enable rich, identity-aware network policies that go beyond standard Kubernetes NetworkPolicies.
The core problem Dataplane V2 solves is the performance bottleneck and feature limitations of kube-proxy for large-scale, high-performance Kubernetes networking. It leverages eBPF to replace long netfilter rule chains with compact in-kernel programs and map lookups, which scale better as Services and endpoints are added.
Here’s a simplified view of what happens under the hood:
- Agent Deployment: The Cilium agent is deployed as a DaemonSet (on GKE it runs as the anetd DaemonSet in the kube-system namespace). This agent communicates with the Kubernetes API server to watch for changes in Services, Endpoints, Pods, and NetworkPolicies.
- eBPF Program Loading: The Cilium agent loads eBPF programs into the kernel on each node. These programs are attached to specific network hooks, such as the cls_bpf classifier in the tc (traffic control) layer or the XDP (eXpress Data Path) hook.
- Datapath Programming: Based on the Kubernetes objects it observes, the agent programs the eBPF datapath. This includes:
  - Service Translation: Mapping ClusterIPs and ports to backend pod IPs and ports.
  - Load Balancing: Selecting a backend for each new connection (upstream Cilium picks one at random by default and can optionally use Maglev consistent hashing).
  - Network Policy Enforcement: Applying L3/L4 and L7 network policies based on pod labels and identities.
  - Node-to-Node Connectivity: Routing traffic between pods across different nodes.
- Packet Processing: When a packet arrives at a node:
  - If it’s destined for a local pod, the eBPF program routes it directly.
  - If it’s destined for a remote pod, the eBPF program consults routing tables (also managed by Cilium) to send it to the correct egress interface.
  - If it’s a service request, the eBPF program performs the service translation and load balancing.
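To make the policy enforcement step concrete, here is a minimal sketch of a standard Kubernetes NetworkPolicy for the nginx pods from the earlier manifest. Dataplane V2 enforces this kind of policy in the eBPF datapath. The policy name and the role: frontend label are assumptions invented for this example; substitute labels from your own workloads.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: nginx-allow-frontend   # hypothetical name
spec:
  # Applies to the pods created by nginx-deployment
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Only pods carrying this (hypothetical) label may connect
        - podSelector:
            matchLabels:
              role: frontend
      ports:
        - protocol: TCP
          port: 80
```

Once this is applied with kubectl apply -f, pods in the same namespace without the role: frontend label should no longer be able to reach the nginx pods on port 80, whether they connect to a pod IP or go through nginx-service.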
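The levers you control are primarily your GKE cluster configuration and your Kubernetes manifests. Dataplane V2 is selected as a cluster networking option, typically at creation time (for example with the --enable-dataplane-v2 flag of gcloud container clusters create); check the GKE documentation for whether an existing cluster can be migrated. Beyond that, your control comes from: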
- Kubernetes Services and Deployments: Standard Kubernetes objects define your application’s network endpoints and how they are exposed.
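- CiliumNetworkPolicy: For advanced policy enforcement beyond what Kubernetes NetworkPolicies offer, you can use Cilium’s custom resources. These allow for richer policy definitions based on workload identity, L7 protocols, and more. Which of these resources and fields are available depends on your GKE version, so check the GKE documentation before relying on them.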
- CiliumClusterwideNetworkPolicy: For policies that apply across the entire cluster, regardless of namespace.
- Cilium Agent Configuration: While GKE manages the core Cilium agent, advanced users might interact with specific Cilium configuration options if they are exposed via GKE’s managed add-ons.
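As an illustration of a cluster-scoped policy, the sketch below follows the upstream Cilium schema for CiliumClusterwideNetworkPolicy. It allows traffic to any pod labeled app: nginx, in any namespace, only from pods running in a namespace called frontend. The policy name and the frontend namespace are assumptions for this example, and whether your cluster accepts this resource depends on the GKE version.

```yaml
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: nginx-from-frontend-namespace   # hypothetical name; no namespace, the resource is cluster-scoped
spec:
  # Selects nginx pods in every namespace
  endpointSelector:
    matchLabels:
      app: nginx
  ingress:
    - fromEndpoints:
        # Cilium exposes the pod's namespace as a label on its identity
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: frontend   # hypothetical namespace
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
```

Because the selector matches on identity labels rather than IP addresses, the policy keeps working as pods are rescheduled and their IPs change.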
The surprising part is how much logic eBPF lets you run inside the kernel’s packet processing path, and how quickly that logic can be updated. It’s not just about routing: security policy enforcement and service load balancing run in the same in-kernel programs, driven by state that the agent updates at runtime.
One thing that often trips people up is understanding how Pod-to-Pod communication works when pods are on different nodes. Upstream Cilium supports two modes: an overlay that encapsulates traffic in VXLAN or Geneve, and native routing, where the underlying network already knows how to reach pod IPs. On GKE, clusters are typically VPC-native, so pod IPs are routable within the VPC and no extra encapsulation layer is needed. In both modes the eBPF programs handle forwarding on each node, and from the application’s point of view all pods appear to be on a flat network.
The next concept you’ll likely encounter is leveraging Cilium’s advanced L7 network policies for granular ingress and egress control of HTTP/gRPC traffic.
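As a preview, an L7 rule in the upstream Cilium schema looks roughly like the sketch below, which only lets pods with a given label issue HTTP GET requests to the nginx pods. The policy name and the role: frontend label are assumptions for this example. GKE’s managed Dataplane V2 may not accept L7 rules in your version, so treat this as an illustration of the upstream format and confirm support in the GKE documentation.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: nginx-allow-get   # hypothetical name
spec:
  endpointSelector:
    matchLabels:
      app: nginx
  ingress:
    - fromEndpoints:
        - matchLabels:
            role: frontend   # hypothetical label
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              # Only GET requests are allowed; path is a regular expression
              - method: "GET"
                path: "/.*"
```

Compared with an L3/L4 policy, the extra rules.http section narrows what an already-permitted peer may do on the connection, rather than only deciding whether the connection is allowed at all.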