Using HTTP/2 as the transport for internal microservice communication can, surprisingly, be a net negative for latency in many common microservice patterns.

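Let’s see it in action. Imagine a simple request flow: a frontend service (frontend-svc) calls a backend service (backend-svc), which in turn calls a database service (db-svc). The exchanges are shown in HTTP/1.1-style text for readability; on the wire, HTTP/2 is a binary, framed protocol.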

# Request 1: frontend-svc -> backend-svc
POST /process_order HTTP/2
Host: backend-svc:8080
Content-Type: application/json
...

# Response 1: backend-svc -> frontend-svc
HTTP/2 200
Content-Type: application/json
...

# Request 2: backend-svc -> db-svc
POST /save_order HTTP/2
Host: db-svc:9000
Content-Type: application/json
...

# Response 2: db-svc -> backend-svc
HTTP/2 200
Content-Type: application/json
...

With HTTP/1.1, each of these calls would either open a new TCP connection or reuse an idle one from a keep-alive connection pool, and each connection would carry only one request at a time. With HTTP/2, a client opens a single TCP connection to each peer (frontend-svc to backend-svc, and backend-svc to db-svc), and all subsequent requests to that peer are multiplexed over it. If a proxy or service-mesh sidecar sits in the path, requests bound for different upstream services can also share the one connection to that proxy. This avoids paying the connection-setup cost on every RPC call.

The core problem HTTP/2 solves is the inefficiency of HTTP/1.1. HTTP/1.1’s "one request at a time per connection" model (pipelining exists in the spec but is rarely usable in practice) led to:

  • Head-of-Line Blocking: If a request on a connection is slow, it blocks all subsequent requests on that same connection.
  • Connection Overhead: Establishing TCP connections is expensive (three-way handshake), and TLS handshakes are even more so. Many short-lived connections chew up resources.
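The connection-overhead point can be made concrete with some rough arithmetic. The sketch below is a simplified model, not a measurement: it assumes a full TCP handshake costs one round trip, a full TLS 1.3 handshake one more, and a full TLS 1.2 handshake two more, and it ignores session resumption, TCP Fast Open, and CPU cost. The function names and the 0.5 ms round-trip time are hypothetical.

```python
# Simplified model (assumption): only handshake round trips are counted.
TLS_HANDSHAKE_RTTS = {None: 0, "1.3": 1, "1.2": 2}


def setup_rtts(tls):
    """Round trips spent on handshakes before the first request can be sent."""
    tcp_rtts = 1  # SYN / SYN-ACK; the final ACK can travel with data
    return tcp_rtts + TLS_HANDSHAKE_RTTS[tls]


def total_setup_ms(requests, rtt_ms, tls, reuse):
    """Total handshake time for a batch of requests.

    reuse=True models one long-lived connection; reuse=False models a
    fresh connection per request.
    """
    connections = 1 if reuse else requests
    return connections * setup_rtts(tls) * rtt_ms


if __name__ == "__main__":
    # Hypothetical in-cluster numbers: 100 requests, 0.5 ms round trip.
    print(total_setup_ms(100, 0.5, "1.3", reuse=False))  # 100.0
    print(total_setup_ms(100, 0.5, "1.3", reuse=True))   # 1.0
```

Note that a well-tuned HTTP/1.1 keep-alive pool recovers most of this saving too, which is part of why the latency comparison with HTTP/2 is less one-sided than it first appears.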

HTTP/2 addresses these with:

  • Multiplexing: Multiple requests and responses can be in flight simultaneously over a single TCP connection, each split into frames tagged with a stream ID. At the HTTP layer, a slow response no longer blocks the others.
  • Header Compression (HPACK): Reduces the size of HTTP headers, which are particularly verbose in microservice communication.
  • Server Push: Allows the server to send resources to the client that it anticipates the client will need, before the client explicitly requests them. (Less common for RPC, more for web assets).
  • Flow Control: Prevents a fast sender from overwhelming a slow receiver.
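To make multiplexing less abstract, here is a minimal sketch of how frames from two streams can share one byte stream. It uses the 9-byte HTTP/2 frame header (24-bit length, 8-bit type, 8-bit flags, 31-bit stream ID) and only DATA frames; a real exchange would start each stream with HPACK-encoded HEADERS frames. The helper names (encode_frame, interleave, reassemble) and the round-robin scheduling are illustrative assumptions, not any library's API.

```python
import struct

FRAME_DATA = 0x0        # DATA frame type
FLAG_END_STREAM = 0x1   # marks the last frame of a stream


def encode_frame(frame_type, flags, stream_id, payload):
    """Prefix payload with the 9-byte HTTP/2 frame header."""
    if len(payload) >= 1 << 24:
        raise ValueError("payload does not fit the 24-bit length field")
    if not 0 <= stream_id < 1 << 31:
        raise ValueError("stream ID must fit in 31 bits")
    length = struct.pack(">I", len(payload))[1:]  # keep the low 3 bytes
    return length + struct.pack(">BBI", frame_type, flags, stream_id) + payload


def decode_frames(wire):
    """Split a byte string into (type, flags, stream_id, payload) tuples."""
    frames, offset = [], 0
    while offset < len(wire):
        length = int.from_bytes(wire[offset:offset + 3], "big")
        frame_type, flags, stream_id = struct.unpack(
            ">BBI", wire[offset + 3:offset + 9])
        stream_id &= 0x7FFFFFFF  # the top bit is reserved
        payload = wire[offset + 9:offset + 9 + length]
        frames.append((frame_type, flags, stream_id, payload))
        offset += 9 + length
    return frames


def interleave(bodies, chunk_size):
    """Round-robin DATA frames from several streams onto one connection."""
    pending = dict(bodies)
    wire = b""
    while pending:
        for stream_id in list(pending):
            chunk = pending[stream_id][:chunk_size]
            rest = pending[stream_id][chunk_size:]
            flags = 0 if rest else FLAG_END_STREAM
            wire += encode_frame(FRAME_DATA, flags, stream_id, chunk)
            if rest:
                pending[stream_id] = rest
            else:
                del pending[stream_id]
    return wire


def reassemble(wire):
    """Rebuild each stream's body from the interleaved frames."""
    streams = {}
    for _type, _flags, stream_id, payload in decode_frames(wire):
        streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams


if __name__ == "__main__":
    bodies = {1: b"a" * 10, 3: b"b" * 4}  # client-initiated streams are odd
    wire = interleave(bodies, chunk_size=4)
    print([sid for _, _, sid, _ in decode_frames(wire)])  # [1, 3, 1, 1]
    print(reassemble(wire) == bodies)                     # True
```

The short response on stream 3 completes after the second frame even though stream 1 still has data to send, which is the HTTP-layer benefit multiplexing provides.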

The levers you actually control are mostly configuration settings on your HTTP/2 clients, servers, and proxies.

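  • Client-side settings: Libraries expose options such as MaxConcurrentStreams (how many requests can be outstanding on a single connection; on the wire, each endpoint advertises the limit its peer must respect) and InitialWindowSize (how many bytes a sender may transmit on a stream before it must wait for the receiver to grant more credit with a WINDOW_UPDATE frame).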
  • Server-side settings: Similar configurations apply, dictating how many streams the server will accept and how it handles incoming data.
  • TLS Configuration: While HTTP/2 can run over plain TCP (h2c), it’s almost always used with TLS (h2). Proper TLS cipher suite selection and certificate management are crucial.
  • Proxy Configuration: If you’re using a service mesh like Istio or Linkerd, or even a simple reverse proxy like Nginx or Envoy, their configurations for HTTP/2 settings will heavily influence behavior.
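Whatever names a given library or proxy uses for these options, the two stream-related ones end up on the wire as entries in a SETTINGS frame. As a rough sketch, the payload is a sequence of 6-byte entries, each a 16-bit identifier followed by a 32-bit value; the identifiers 0x3 (SETTINGS_MAX_CONCURRENT_STREAMS) and 0x4 (SETTINGS_INITIAL_WINDOW_SIZE) come from the HTTP/2 specification. The helper names and the value of 100 streams are assumptions chosen for illustration; 65535 is the protocol's default initial window.

```python
import struct

SETTINGS_MAX_CONCURRENT_STREAMS = 0x3
SETTINGS_INITIAL_WINDOW_SIZE = 0x4


def encode_settings(settings):
    """Encode {identifier: value} as a SETTINGS frame payload."""
    return b"".join(
        struct.pack(">HI", identifier, value)
        for identifier, value in settings.items()
    )


def decode_settings(payload):
    """Decode a SETTINGS frame payload back into {identifier: value}."""
    if len(payload) % 6 != 0:
        raise ValueError("SETTINGS payload must be a multiple of 6 bytes")
    return dict(struct.iter_unpack(">HI", payload))


if __name__ == "__main__":
    advertised = {
        SETTINGS_MAX_CONCURRENT_STREAMS: 100,   # hypothetical limit
        SETTINGS_INITIAL_WINDOW_SIZE: 65535,    # protocol default
    }
    payload = encode_settings(advertised)
    print(payload.hex())
    print(decode_settings(payload) == advertised)  # True
```

Because each endpoint advertises its own values, the effective limits on a connection are set by whichever side is receiving, which is why client, server, and proxy settings all need to be checked together.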

The most surprising thing about HTTP/2 multiplexing for internal microservices is how easily it can increase latency if not managed carefully. While it eliminates most TCP connection overhead, every stream on a connection shares the same TCP byte stream, congestion window, and connection-level flow-control window. A single lost or delayed packet stalls delivery for every stream queued behind it, and one large or slowly consumed response can use up the shared window and delay unrelated requests. This is head-of-line blocking again, moved from the HTTP layer down to the TCP layer, where HTTP/2 cannot see or fix it. A single misbehaving or overloaded peer can effectively throttle every request sharing its connection, a problem often masked by the perceived "efficiency" of HTTP/2.
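A toy model helps show why one bad packet hurts every request on a shared connection. The sketch below assumes only one thing about TCP: a segment is handed to the application once it and every earlier segment on that connection have arrived. The stream names, arrival times, and function names are made up for illustration, and the model ignores congestion control and flow control entirely.

```python
def in_order_delivery(arrivals):
    """Time each segment reaches the application under in-order delivery.

    arrivals[i] is when segment i arrives at the receiver; it can only be
    delivered once segments 0..i have all arrived.
    """
    delivered, latest = [], 0.0
    for arrival in arrivals:
        latest = max(latest, arrival)
        delivered.append(latest)
    return delivered


def completion_times(segments, shared):
    """Completion time per stream.

    segments is a list of (stream, arrival_ms) in send order. With
    shared=True all streams ride one connection; otherwise each stream
    gets a connection of its own.
    """
    if shared:
        connections = {"shared": list(segments)}
    else:
        connections = {}
        for stream, arrival in segments:
            connections.setdefault(stream, []).append((stream, arrival))

    done = {}
    for conn_segments in connections.values():
        delivered = in_order_delivery([t for _, t in conn_segments])
        for (stream, _), when in zip(conn_segments, delivered):
            done[stream] = max(done.get(stream, 0.0), when)
    return done


if __name__ == "__main__":
    # Stream A's first segment is lost and its retransmission arrives at 40 ms.
    segments = [("A", 40.0), ("B", 2.0), ("A", 3.0), ("B", 4.0)]
    print(completion_times(segments, shared=True))   # {'A': 40.0, 'B': 40.0}
    print(completion_times(segments, shared=False))  # {'A': 40.0, 'B': 4.0}
```

In the shared case, stream B's data arrived on time but waits behind A's retransmission; on separate connections B finishes at 4 ms, untouched by A's loss.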

The next concept you’ll likely grapple with is how to effectively monitor and debug performance issues within these multiplexed HTTP/2 streams.
