Service Mesh: Beyond Proxies

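A service mesh doesn’t change anything in your application code. Instead, it moves networking concerns such as routing, retries, encryption, and observability out of your services and into the infrastructure layer, in exchange for some extra latency and operational complexity.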

Let’s see what this looks like in practice. Imagine we have two microservices: frontend and backend.

First, the frontend service:

apiVersion: v1
kind: Service
metadata:
  name: frontend
  labels:
    app: frontend
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: frontend

And the backend service:

apiVersion: v1
kind: Service
metadata:
  name: backend
  labels:
    app: backend
spec:
  ports:
  - port: 8080
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: backend

Without a service mesh, if frontend wants to talk to backend, it resolves the Kubernetes Service DNS name (e.g., backend.your-namespace.svc.cluster.local) and opens a TCP connection to it. Any smarter routing, retries, or observability has to be built into the frontend application itself.

Now, let’s introduce a service mesh, say Istio. When sidecar injection is enabled, Istio adds a proxy container (named istio-proxy, running the proxyv2 image) to each of your application pods. This proxy intercepts all incoming and outgoing network traffic for the pod.
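
For reference, automatic injection is usually switched on per namespace with a label. A minimal sketch, assuming a hypothetical namespace called demo:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: demo                    # hypothetical namespace name
  labels:
    istio-injection: enabled    # tells Istio's injector to add the sidecar to new pods
```

Only pods created after the label is applied get the sidecar, so existing workloads need a restart to pick it up.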

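Here’s a simplified frontend Deployment with the Istio sidecar written out explicitly. In practice the sidecar is added to the pod automatically when it is created, so you never write this part by hand: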

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-deployment
spec:
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: your-frontend-image:latest
        ports:
        - containerPort: 8080
      # The Istio sidecar (normally injected automatically, not written by hand)
      - name: istio-proxy
        image: docker.io/istio/proxyv2:1.18.2
        ports:
        - containerPort: 15001 # Envoy's outbound traffic listener (the admin port is 15000)
        # ... other istio-proxy configuration

When frontend now makes a request to backend, the request no longer goes there directly. Instead, it takes this path:

frontend app -> istio-proxy (in frontend pod) -> Kubernetes network -> istio-proxy (in backend pod) -> backend app.

The istio-proxy containers are configured by Istio’s control plane. They know about all the services in the cluster, their network addresses, and can apply policies. This means you can define sophisticated traffic management rules outside your application code.

For example, to send 10% of traffic to a new version of backend (let’s call it backend-v2):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: backend
spec:
  hosts:
  - backend
  http:
  - route:
    - destination:
        host: backend
        subset: v1
      weight: 90
    - destination:
        host: backend
        subset: v2
      weight: 10
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: backend
spec:
  host: backend
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2

Here, VirtualService defines how traffic is routed, and DestinationRule defines how to group pods by labels (like version: v1 or version: v2) to create these subsets. The istio-proxy on the frontend pod, when it sees a request for backend, consults this VirtualService and sends 90% of traffic to backend pods with label version: v1 and 10% to pods with label version: v2.
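
The subsets only match if the pods actually carry those labels. A minimal sketch of what the v2 Deployment might look like, assuming a hypothetical deployment name (backend-v2) and a placeholder image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-v2                     # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
      version: v2
  template:
    metadata:
      labels:
        app: backend                   # matched by the backend Service selector
        version: v2                    # matched by the v2 subset in the DestinationRule
    spec:
      containers:
      - name: backend
        image: your-backend-image:v2   # placeholder image
        ports:
        - containerPort: 8080
```

Because both versions share the app: backend label, the backend Service still selects all of them, and the version label is what lets the mesh tell them apart.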

This is the core problem a service mesh solves: abstracting network concerns (routing, resilience, observability) from application code. Instead of building retry logic, circuit breakers, and tracing into every single microservice, you configure these behaviors once in the service mesh.
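
As a rough illustration of what "configure once" can look like in Istio, the sketch below adds retries and a timeout through a VirtualService, and basic circuit breaking through a DestinationRule. The resource name backend-resilience and all numeric values are assumptions chosen for the example, and the resources are shown standalone for brevity. In a real cluster these settings would normally be merged into the existing backend VirtualService and DestinationRule rather than applied next to them.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: backend-resilience        # hypothetical name
spec:
  hosts:
  - backend
  http:
  - route:
    - destination:
        host: backend
    timeout: 10s                  # overall deadline for the request, retries included
    retries:
      attempts: 3                 # retry up to 3 times
      perTryTimeout: 2s           # deadline for each individual attempt
      retryOn: 5xx,connect-failure
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: backend-resilience        # hypothetical name
spec:
  host: backend
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 100   # cap on queued requests per destination
    outlierDetection:
      consecutive5xxErrors: 5     # eject a pod after 5 consecutive 5xx responses
      interval: 30s               # how often pods are evaluated
      baseEjectionTime: 30s       # minimum time an ejected pod stays out
      maxEjectionPercent: 50      # never eject more than half of the pods
```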

Let’s compare the popular options: Istio, Linkerd, and Cilium (when used with its eBPF-based service mesh capabilities).

Istio: Istio is the most feature-rich. It’s built on Envoy proxies, which are incredibly powerful and configurable.

Strengths: Extensive traffic management (fine-grained routing, fault injection, retries, timeouts), strong security features (mTLS, authorization policies), rich observability (distributed tracing, metrics, access logs).
Weaknesses: Can be complex to install and manage, higher resource overhead due to Envoy proxies.
Core Component: Envoy proxy (data plane), Istiod (control plane).
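
To make the security features a little more concrete, here is a minimal sketch that enforces mTLS for a namespace and then only lets the frontend workload call backend. The namespace demo, the policy name, and the frontend service account are assumptions for illustration.

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: demo                 # hypothetical namespace
spec:
  mtls:
    mode: STRICT                  # reject plaintext traffic to workloads in this namespace
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: backend-allow-frontend    # hypothetical name
  namespace: demo
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
  - from:
    - source:
        # assumes frontend runs under a service account named "frontend"
        principals: ["cluster.local/ns/demo/sa/frontend"]
```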

Linkerd: Linkerd is known for its simplicity and performance. It uses a lightweight Rust-based proxy.

Strengths: Very low resource overhead, fast, easy to install and operate, excellent out-of-the-box observability (latency, success rates), automatic mTLS.
Weaknesses: Less advanced traffic management features compared to Istio, fewer customization options for the proxy.
Core Components: linkerd2-proxy (data plane); destination, identity, and proxy-injector services (control plane).

Cilium Service Mesh: Cilium leverages eBPF (extended Berkeley Packet Filter) directly in the Linux kernel. Instead of injecting sidecar proxies, it can handle service mesh logic at the kernel level for compatible protocols.

Strengths: High performance and low latency by avoiding per-pod sidecar overhead, security features integrated with network policies, L7 features handled by a shared per-node Envoy proxy when needed.
Weaknesses: Newer to the service mesh space, relies heavily on eBPF capabilities (requires newer kernels), traffic management features are still evolving.
Core Components: eBPF programs in the kernel plus a per-node Envoy proxy for L7 (data plane), Cilium agent and operator (control plane).
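
As an example of how Cilium ties mesh-style controls to network policy, the sketch below restricts backend to HTTP GET requests coming from frontend. The policy name and the /api/.* path pattern are assumptions for illustration.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-l7                # hypothetical name
spec:
  endpointSelector:
    matchLabels:
      app: backend                # the pods this policy protects
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend             # only frontend pods may connect
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: GET
          path: "/api/.*"         # hypothetical path pattern (regex)
```

The L3/L4 part of this rule is enforced by eBPF, while the http section causes matching traffic to be inspected at L7.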

The most surprising thing about service meshes is how much they can break your existing assumptions about how network traffic flows. You think your app is talking to backend.your-namespace.svc.cluster.local directly, but every connection is transparently intercepted by a proxy sitting in the same pod, and that proxy, not your app, decides where the request actually goes: a different version, a different zone, or even a different cluster.

The exact levers you control are typically exposed as custom resources, defined through Kubernetes Custom Resource Definitions (CRDs). For Istio, these are things like VirtualService, DestinationRule, Gateway, and ServiceEntry. For Linkerd, it’s ServiceProfile and TrafficSplit (and, in more recent releases, Gateway API resources such as HTTPRoute). Cilium uses its own CRDs, such as CiliumNetworkPolicy, alongside Kubernetes NetworkPolicies. You declare the desired network behavior in these resources, and the service mesh control plane translates them into configuration for the data plane proxies or eBPF programs.
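
For comparison with the Istio resources shown earlier, here is a rough sketch of a Linkerd ServiceProfile that marks one route as retryable and gives it a timeout. The namespace demo, the route path, and the numeric values are assumptions for illustration.

```yaml
apiVersion: linkerd.io/v1beta1
kind: ServiceProfile
metadata:
  # ServiceProfiles are named after the fully qualified service DNS name
  name: backend.demo.svc.cluster.local
  namespace: demo                 # hypothetical namespace
spec:
  routes:
  - name: GET /api/items          # hypothetical route
    condition:
      method: GET
      pathRegex: /api/items
    isRetryable: true             # safe to retry because the request is idempotent
    timeout: 300ms
  retryBudget:
    retryRatio: 0.2               # retries may add at most 20% extra load
    minRetriesPerSecond: 10
    ttl: 10s
```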

One thing most people don’t know is that the service mesh data plane doesn’t automatically understand your application’s protocol. At the base level it treats traffic as opaque TCP, optionally wrapped in TLS. Protocol-specific features, such as routing on HTTP headers, per-request metrics, or retries, only work once the proxy has identified and parsed the protocol. Envoy (Istio) handles HTTP/1.1, HTTP/2, and gRPC, and falls back to plain TCP proxying for everything else. Linkerd’s proxy detects HTTP/1.1, HTTP/2, and gRPC as well. Cilium’s eBPF programs operate at L3/L4; for L7 protocols such as HTTP, Cilium hands traffic to its Envoy proxy.
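
In Istio you can state the protocol explicitly on the Service, either through the port name prefix or the appProtocol field, instead of relying on detection. A minimal sketch, assuming a hypothetical second gRPC port (9090) on backend:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  selector:
    app: backend
  ports:
  - name: http-api          # the "http" prefix tells Istio to treat this port as HTTP
    port: 8080
    targetPort: 8080
  - name: internal          # hypothetical extra port
    port: 9090
    targetPort: 9090
    appProtocol: grpc       # explicit protocol declaration via appProtocol
```

This is also why the Services earlier in this article name their port http: the name doubles as a protocol hint.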

The next concept you’ll likely run into is how to manage ingress and egress traffic with a service mesh.