Use HTTP/2 as the Transport for Internal Microservice Communication (2026)

Using HTTP/2 as the transport for internal microservice communication can, surprisingly, turn out to be a net negative for latency in common microservice patterns unless its connections are managed carefully.

Let’s see it in action. Imagine a simple request flow: a frontend service (frontend-svc) calls a backend service (backend-svc) which in turn calls a database service (db-svc).

# Request 1: frontend-svc -> backend-svc
POST /process_order HTTP/2
Host: backend-svc:8080
Content-Type: application/json
...

# Response 1: backend-svc -> frontend-svc
HTTP/2 200
Content-Type: application/json
...

# Request 2: backend-svc -> db-svc
POST /save_order HTTP/2
Host: db-svc:9000
Content-Type: application/json
...

# Response 2: db-svc -> backend-svc
HTTP/2 200
Content-Type: application/json
...
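The exchanges above are written in the familiar HTTP/1.1 text style for readability. On the wire, HTTP/2 is a binary protocol: the request line and Host header are replaced by pseudo-headers (:method, :scheme, :authority, :path), and regular header names are lowercase. A minimal sketch of that mapping for Request 1 follows; the helper name to_h2_headers is hypothetical, and the "http" scheme assumes cleartext h2c.

```python
def to_h2_headers(method, path, authority, scheme="http", extra=None):
    """Build the header list an HTTP/2 HEADERS frame would carry for a request.

    Pseudo-headers (names starting with ':') come before regular headers,
    and regular header names are lowercase in HTTP/2.
    The default scheme "http" is an assumption (cleartext h2c).
    """
    headers = [
        (":method", method),
        (":scheme", scheme),
        (":authority", authority),  # replaces the HTTP/1.1 Host header
        (":path", path),
    ]
    for name, value in (extra or {}).items():
        headers.append((name.lower(), value))
    return headers


# Request 1 from the flow above: frontend-svc -> backend-svc
request_1 = to_h2_headers(
    "POST",
    "/process_order",
    "backend-svc:8080",
    extra={"Content-Type": "application/json"},
)

for name, value in request_1:
    print(f"{name}: {value}")
```

The response side is similar: the status travels in a :status pseudo-header, with no reason phrase.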

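With HTTP/1.1, each connection carries one request at a time, so a client under concurrent load either opens new TCP connections or draws from a connection pool sized to its concurrency. With HTTP/2, a client opens one long-lived TCP connection to each upstream it talks to, and concurrent requests to that upstream are multiplexed over it as independent streams. Note that frontend-svc -> backend-svc and backend-svc -> db-svc are still separate connections; if a sidecar or other proxy sits in the path, the application shares one connection to the proxy, and the proxy maintains its own connections to each upstream. This removes most per-request connection setup and shrinks the number of connections each service has to manage.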

The core problem HTTP/2 solves is the inefficiency of HTTP/1.1. HTTP/1.1 allows only one request in flight per connection at a time (pipelining exists but is rarely usable in practice), which led to:

Head-of-Line Blocking: If a request on a connection is slow, it blocks all subsequent requests on that same connection.
Connection Overhead: Establishing TCP connections is expensive (three-way handshake), and TLS handshakes are even more so. Many short-lived connections chew up resources.
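To put rough numbers on the connection overhead, here is a back-of-the-envelope sketch. It assumes full handshakes with no session resumption (one round trip for TCP, plus one for TLS 1.3 or two for TLS 1.2) and ignores CPU cost; the function names and the 0.5 ms round-trip time are illustrative assumptions, not measurements.

```python
def setup_round_trips(tls=None):
    """Round trips spent before the first request byte can leave on a new connection.

    tls is None (plain TCP), "1.3", or "1.2". Assumes full handshakes, no resumption.
    """
    tcp = 1  # SYN / SYN-ACK; the request can follow the final ACK
    tls_round_trips = {None: 0, "1.3": 1, "1.2": 2}[tls]
    return tcp + tls_round_trips


def setup_cost_ms(calls, rtt_ms, tls, reuse):
    """Total time spent on connection setup for a number of calls."""
    connections = 1 if reuse else calls
    return connections * setup_round_trips(tls) * rtt_ms


# 1,000 calls over a 0.5 ms in-cluster round trip with TLS 1.3
fresh = setup_cost_ms(calls=1000, rtt_ms=0.5, tls="1.3", reuse=False)
reused = setup_cost_ms(calls=1000, rtt_ms=0.5, tls="1.3", reuse=True)
print(f"new connection per call: {fresh} ms, one reused connection: {reused} ms")
```

Under these assumptions, the savings come from reusing the connection at all, which an HTTP/1.1 keep-alive pool also achieves; what HTTP/2 adds is that a single connection can serve many concurrent requests.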

HTTP/2 addresses these with:

Multiplexing: Multiple requests and responses can be in flight simultaneously over a single TCP connection, each split into frames that are interleaved on the wire. At the HTTP layer, a slow response no longer blocks the others.
Header Compression (HPACK): Reduces the size of HTTP headers, which in microservice communication are verbose and largely identical from one request to the next.
Server Push: Allows the server to send resources to the client that it anticipates the client will need, before the client explicitly requests them. It was designed for web assets, is rarely relevant to RPC, and has fallen out of favor, with Chrome removing support for it.
Flow Control: Prevents a fast sender from overwhelming a slow receiver, using credit-based windows on each stream and on the connection as a whole.
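The multiplexing item can be made concrete with a toy model. The sketch below round-robins DATA frames from two responses onto one connection, using the protocol's default maximum frame size of 16,384 bytes. Real implementations schedule frames with their own policies, so the strict round-robin order and the function names here are assumptions for illustration only.

```python
from collections import deque


def interleave(responses, frame_size=16_384):
    """Round-robin DATA frames from several responses onto one connection.

    responses: dict of stream_id -> response size in bytes.
    Returns a list of (stream_id, frame_bytes) in the order frames are sent.
    """
    queue = deque(responses.items())
    wire = []
    while queue:
        stream_id, remaining = queue.popleft()
        sent = min(frame_size, remaining)
        wire.append((stream_id, sent))
        if remaining - sent > 0:
            queue.append((stream_id, remaining - sent))
    return wire


def completion_order(wire):
    """Index of the frame on which each stream's response finishes."""
    last = {}
    for index, (stream_id, _) in enumerate(wire):
        last[stream_id] = index
    return last


# Stream 1 carries a 100,000-byte response, stream 3 a 2,000-byte response.
wire = interleave({1: 100_000, 3: 2_000})
for stream_id, size in wire:
    print(f"stream {stream_id}: {size} bytes")
print(completion_order(wire))
```

In this model the small response on stream 3 completes on the second frame instead of waiting behind all seven frames of the large response, which is the behavior HTTP/1.1 cannot offer on a single connection.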

The real levers you control are the HTTP/2 settings of your clients, your servers, and any proxies that sit between them.

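Client-side settings: Depending on the library, you can configure things like MaxConcurrentStreams (how many requests can be outstanding on a single connection; the effective ceiling is the limit the server advertises in SETTINGS_MAX_CONCURRENT_STREAMS) and InitialWindowSize (how many bytes the peer may send on a stream before it must wait for a WINDOW_UPDATE; the protocol default is 65,535 bytes).
Server-side settings: Similar configurations apply, dictating how many concurrent streams the server will accept per connection and how large a flow-control window it grants for incoming data.
TLS Configuration: While HTTP/2 can run over plain TCP (h2c), it’s usually deployed over TLS (h2), negotiated via ALPN. HTTP/2 requires TLS 1.2 or later, so proper cipher suite selection and certificate management are crucial.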
Proxy Configuration: If you’re using a service mesh like Istio or Linkerd, or even a simple reverse proxy like Nginx or Envoy, their configurations for HTTP/2 settings will heavily influence behavior.
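To see why the window setting matters, consider a simplified bound: if a sender may have at most one window of unacknowledged data in flight per round trip, a stream cannot exceed window / RTT regardless of link speed. The sketch below works through that arithmetic. The function names are hypothetical, and the model ignores the connection-level window, TCP's own congestion window, and receivers that send WINDOW_UPDATE frames early.

```python
import math

DEFAULT_INITIAL_WINDOW = 65_535  # bytes; protocol default for the initial stream window


def max_stream_throughput(window_bytes, rtt_ms):
    """Upper bound in bytes/second for one stream limited only by its window."""
    return window_bytes * 1000.0 / rtt_ms


def window_for(target_bytes_per_s, rtt_ms):
    """Smallest window (bytes) that does not cap a stream below the target rate."""
    return math.ceil(target_bytes_per_s * rtt_ms / 1000.0)


# Default window at two assumed round-trip times
for rtt_ms in (1, 10):
    bound = max_stream_throughput(DEFAULT_INITIAL_WINDOW, rtt_ms)
    print(f"RTT {rtt_ms} ms: at most {bound / 1e6:.2f} MB/s per stream")

# Window needed to sustain 1 Gbit/s (125,000,000 bytes/s) at a 2 ms round trip
print(window_for(125_000_000, 2), "bytes")
```

Under this model, the default window is rarely the bottleneck for small RPC payloads on a sub-millisecond network, but it can cap large transfers or cross-zone calls with higher round-trip times.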

The most surprising thing about HTTP/2 multiplexing for internal microservices is how easily it can increase latency if not managed carefully. Multiplexing removes head-of-line blocking at the HTTP layer, but every stream still shares one ordered TCP byte stream: a single lost or delayed packet stalls all streams on the connection until it is retransmitted. Streams also share the connection-level flow-control window and the peer’s concurrent-stream limit, so one large or slowly consumed response can starve unrelated requests. Head-of-line blocking has not disappeared; it has moved from the HTTP layer down to the TCP connection. A single congested connection or overloaded peer can therefore throttle every request multiplexed over that connection, a problem often masked by the perceived "efficiency" of HTTP/2.
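One way a shared connection delays unrelated requests is packet loss at the TCP layer, since TCP hands bytes to the application strictly in order. The toy simulation below compares two streams sharing one connection with the same two streams on separate connections, when one segment belonging to stream A is lost. The 1 ms per segment and 200 ms retransmission delay are assumed values chosen for illustration, and the function names are hypothetical.

```python
def delivery_times(segments, lost_index=None, segment_ms=1.0, retransmit_ms=200.0):
    """Time at which each segment reaches the application on ONE TCP connection.

    segments: list of stream ids, one per segment, in send order.
    lost_index: position of the single segment that is lost and retransmitted.
    TCP releases bytes in order, so every segment behind the lost one waits.
    """
    times = []
    blocked_until = 0.0
    for index, stream in enumerate(segments):
        arrival = (index + 1) * segment_ms
        if index == lost_index:
            arrival += retransmit_ms  # assumed retransmission delay
        blocked_until = max(blocked_until, arrival)  # in-order delivery
        times.append((stream, blocked_until))
    return times


def finish_times(times):
    """Time at which the last segment of each stream is delivered."""
    done = {}
    for stream, t in times:
        done[stream] = max(done.get(stream, 0.0), t)
    return done


# One shared connection: A's first segment is lost, B waits behind it.
shared = finish_times(delivery_times(["A", "B", "A", "B"], lost_index=0))

# Separate connections: the loss on A's connection does not touch B.
separate_a = finish_times(delivery_times(["A", "A"], lost_index=0))
separate_b = finish_times(delivery_times(["B", "B"]))

print("shared:  ", shared)
print("separate:", {**separate_a, **separate_b})
```

In this model stream B finishes at 201 ms on the shared connection but at 2 ms on its own connection, even though none of B's segments were lost.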

The next concept you’ll likely grapple with is how to effectively monitor and debug performance issues within these multiplexed HTTP/2 streams.