WebSockets aren’t just a way to send messages back and forth; they’re a fundamental shift from the request-response paradigm, enabling real-time, low-latency communication that makes the web feel alive.

Imagine a chat application. With traditional HTTP, the client has to issue a new request for every message it sends, and it has to keep polling the server to find out whether anything new has arrived. This is like sending a postcard for every sentence of a conversation: inefficient and slow. To fetch new messages, the client asks:

GET /chat/messages HTTP/1.1
Host: example.com

The server would then respond:

HTTP/1.1 200 OK
Content-Type: application/json

{"messages": [{"user": "Alice", "text": "Hey Bob!"}]}

If Bob wants to reply, he sends another request:

POST /chat/messages HTTP/1.1
Host: example.com
Content-Type: application/json

{"user": "Bob", "text": "Hi Alice, how are you?"}

And the server responds again. This full request-response cycle, headers included, repeats for every single message.

WebSockets change this by establishing a single, persistent connection between the client and the server. Once this connection is open, data can flow in both directions freely, without the overhead of repeated HTTP requests.

Here’s how the handshake typically looks, starting with an HTTP request:

GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

The server, if it supports WebSockets, responds with an HTTP 101 Switching Protocols status:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
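
To make the server's side of this exchange concrete, here is a minimal sketch of how a server might decide whether an incoming request is a WebSocket opening handshake. The function name `is_websocket_upgrade` and the plain-dict representation of headers are assumptions made for illustration; a real server library would do this for you.

```python
import base64
import binascii
from typing import Mapping


def is_websocket_upgrade(method: str, headers: Mapping[str, str]) -> bool:
    """Return True if the request looks like a WebSocket opening handshake."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}

    if method.upper() != "GET":
        return False
    if h.get("upgrade", "").lower() != "websocket":
        return False
    # Connection may list several tokens, e.g. "keep-alive, Upgrade".
    tokens = [t.strip().lower() for t in h.get("connection", "").split(",")]
    if "upgrade" not in tokens:
        return False
    if h.get("sec-websocket-version") != "13":
        return False
    # The key must be the base64 encoding of exactly 16 bytes.
    try:
        key = base64.b64decode(h.get("sec-websocket-key", ""), validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(key) == 16
```

If the check passes, the server answers with the 101 response shown above; otherwise it can treat the request as ordinary HTTP.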

After this initial handshake, the connection is no longer HTTP. It’s a WebSocket connection. Now, the client can send a message:

Hello server!

And the server can instantly reply:

Hello client!

This is happening over the same, open connection. The messages are framed differently, not as HTTP requests/responses, but as WebSocket frames.

The core problem WebSockets solve is the latency and overhead associated with traditional HTTP for applications requiring real-time interaction. Think about live sports score updates, stock tickers, collaborative editing tools (like Google Docs), online multiplayer games, or even just presence indicators (who’s online). For these, the delay introduced by polling, and by a full HTTP request-response cycle for every piece of data, is unacceptable.

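Internally, a WebSocket connection runs over a single, long-lived TCP connection. Once established, both the client and server can send data frames to each other at any time. The protocol defines a framing mechanism: each frame carries an opcode that distinguishes text, binary, and control frames (close, and ping/pong for keep-alive), a FIN bit, and a payload length, and frames sent by the client are additionally masked with a random 4-byte key. Reliable, ordered delivery comes from the underlying TCP connection.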
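
As an illustration of the framing described above, the following sketch encodes a single frame by hand. The names `encode_frame` and `client_text_frame` are hypothetical helpers, not part of any library; in practice a WebSocket library does this encoding internally.

```python
import os
import struct
from typing import Optional

OP_TEXT = 0x1
OP_BINARY = 0x2


def encode_frame(payload: bytes, opcode: int = OP_TEXT, fin: bool = True,
                 mask_key: Optional[bytes] = None) -> bytes:
    """Encode one WebSocket frame. Pass a 4-byte mask_key for client frames."""
    first = (0x80 if fin else 0x00) | (opcode & 0x0F)  # FIN bit + opcode
    mask_bit = 0x80 if mask_key is not None else 0x00
    n = len(payload)
    if n < 126:                      # length fits in 7 bits
        header = struct.pack("!BB", first, mask_bit | n)
    elif n < 1 << 16:                # 16-bit extended length
        header = struct.pack("!BBH", first, mask_bit | 126, n)
    else:                            # 64-bit extended length
        header = struct.pack("!BBQ", first, mask_bit | 127, n)
    if mask_key is None:
        return header + payload      # server-to-client frames are unmasked
    if len(mask_key) != 4:
        raise ValueError("mask_key must be exactly 4 bytes")
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return header + mask_key + masked


def client_text_frame(text: str) -> bytes:
    """Client frames must be masked with a fresh random key."""
    return encode_frame(text.encode("utf-8"), OP_TEXT, mask_key=os.urandom(4))
```

A short text message such as "Hello server!" therefore costs only a few bytes of framing on top of the payload, compared with a full set of HTTP headers.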

Consider the ping and pong frames. These are control frames that either the client or the server can send to check whether the other end is still alive and responsive. The protocol requires the receiver of a ping to answer with a pong carrying the same payload; what happens when no pong arrives is left to the application. In practice, most servers and libraries treat a missing pong within a reasonable timeout as a broken connection and close it. This is crucial for managing long-lived connections and preventing resources from being held open indefinitely for unresponsive clients.
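
The timeout logic can be kept separate from the networking code. The sketch below is one possible way to track it; the `Heartbeat` class, its method names, and the default interval and timeout values are all assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional


class Heartbeat:
    """Tracks when to send a ping and when to give up waiting for a pong."""

    def __init__(self, interval: float = 30.0, timeout: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval = interval      # idle seconds before a ping is due
        self.timeout = timeout        # seconds to wait for the matching pong
        self._clock = clock
        self._last_activity = clock()
        self._ping_sent_at: Optional[float] = None

    def should_ping(self) -> bool:
        """True when the peer has been quiet for `interval` and no ping is pending."""
        idle = self._clock() - self._last_activity
        return self._ping_sent_at is None and idle >= self.interval

    def ping_sent(self) -> None:
        self._ping_sent_at = self._clock()

    def pong_received(self) -> None:
        self._ping_sent_at = None
        self._last_activity = self._clock()

    def is_dead(self) -> bool:
        """True when a ping has been pending for longer than `timeout`."""
        if self._ping_sent_at is None:
            return False
        return self._clock() - self._ping_sent_at > self.timeout
```

A connection loop would call `should_ping()` periodically, send a ping frame when it returns true, and close the socket once `is_dead()` reports a missed pong.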

The Sec-WebSocket-Key in the initial handshake is a base64-encoded, 16-byte random value generated by the client. The server takes this key, concatenates it with a fixed GUID defined by the protocol (258EAFA5-E914-47DA-95CA-C5AB0DC85B11), and then calculates a SHA-1 hash of the combined string. The resulting hash is then Base64 encoded and sent back in the Sec-WebSocket-Accept header. This proves to the client that the server actually understood the WebSocket Upgrade request, rather than being a plain HTTP server or a caching proxy replaying an old response. It is not an authentication or encryption mechanism; protecting the connection against malicious intermediaries requires wss:// (WebSocket over TLS).
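
The calculation is short enough to write out in full. This is a minimal sketch using only Python's standard library; the function name `compute_accept` is an assumption for illustration.

```python
import base64
import hashlib

# Fixed GUID defined by the WebSocket protocol.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    # The key is used as the literal header string, not base64-decoded first.
    combined = (sec_websocket_key + WS_GUID).encode("ascii")
    digest = hashlib.sha1(combined).digest()
    return base64.b64encode(digest).decode("ascii")
```

A client can run the same computation on the key it sent and compare the result with the header it receives, rejecting the connection if the two differ.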

Beyond basic text and binary messages, the WebSocket protocol also defines message fragmentation. If a message is too large to be sent in a single frame, or its total size is not known when sending starts, it can be broken down into multiple frames. The first frame carries the message’s opcode, the following ones use the continuation opcode, and the FIN bit is set only on the final fragment. This allows for efficient transmission of large data payloads without overwhelming buffers.
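
Here is a rough sketch of how a sender might split one text message into fragments. It produces unmasked frames, as a server would send them; `fragment_text` and `_unmasked_frame` are hypothetical helper names, and the chunk size is an arbitrary choice by the caller.

```python
import struct
from typing import List

OP_CONTINUATION = 0x0
OP_TEXT = 0x1


def _unmasked_frame(payload: bytes, opcode: int, fin: bool) -> bytes:
    """Encode one unmasked (server-to-client) frame."""
    first = (0x80 if fin else 0x00) | opcode
    n = len(payload)
    if n < 126:
        header = struct.pack("!BB", first, n)
    elif n < 1 << 16:
        header = struct.pack("!BBH", first, 126, n)
    else:
        header = struct.pack("!BBQ", first, 127, n)
    return header + payload


def fragment_text(message: bytes, chunk_size: int) -> List[bytes]:
    """Split a UTF-8 encoded text message into a sequence of frames."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [message[i:i + chunk_size]
              for i in range(0, len(message), chunk_size)] or [b""]
    frames = []
    for index, chunk in enumerate(chunks):
        # Only the first fragment names the message type.
        opcode = OP_TEXT if index == 0 else OP_CONTINUATION
        # Only the last fragment has the FIN bit set.
        fin = index == len(chunks) - 1
        frames.append(_unmasked_frame(chunk, opcode, fin))
    return frames
```

The receiver concatenates the payloads in order and treats the result as one message once it sees the frame with FIN set.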

The primary levers you control as a developer are the message content and the connection lifecycle. You decide what data to send, when to send it, and how to react to incoming messages. You also manage the connection itself – establishing it, handling disconnections gracefully, and attempting reconnections if necessary. Libraries abstract away the low-level framing and protocol details, allowing you to focus on the application logic.
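
For the reconnection part of the lifecycle, a common approach (one option among several, not something the protocol prescribes) is exponential backoff with a cap, so that a struggling server is not flooded with retries. The generator below is a sketch; `reconnect_delays` and its default values are assumptions.

```python
import random
from typing import Iterator, Optional


def reconnect_delays(base: float = 1.0, cap: float = 30.0,
                     jitter: float = 0.0,
                     rng: Optional[random.Random] = None) -> Iterator[float]:
    """Yield wait times in seconds: base, 2*base, 4*base, ... up to cap.

    `jitter` is a fraction of the delay added at random, so that many
    clients do not all retry at the same instant.
    """
    rng = rng or random.Random()
    delay = base
    while True:
        yield delay + rng.uniform(0, jitter * delay)
        delay = min(cap, delay * 2)
```

A client would take the next delay from the generator after each failed attempt, and create a fresh generator once a connection succeeds.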

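The most surprising aspect of WebSockets for many is that they sit outside the browser’s usual cross-origin protections. The Same-Origin Policy (SOP) and CORS do not govern WebSocket connections: a page on any origin can open a connection to any WebSocket server, there is no preflight request, and the browser may attach the target site’s cookies to the handshake. What the browser does do is include an Origin header in the handshake request. This allows a WebSocket server on api.example.com to accept connections from a web page served from www.example.com without any CORS configuration. It also means the server must check the Origin header and authenticate the user itself; otherwise a malicious page could open an authenticated connection on a visitor’s behalf, an attack known as cross-site WebSocket hijacking.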
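
A server can decide which pages may open a connection by checking the Origin header that browsers attach to the handshake request. The sketch below shows one way to do that with an allow-list; `is_allowed_origin` is a hypothetical helper, and rejecting requests that carry no Origin header is a policy choice rather than a protocol requirement (non-browser clients often omit it).

```python
from typing import Iterable, Mapping


def is_allowed_origin(headers: Mapping[str, str],
                      allowed: Iterable[str]) -> bool:
    """Return True if the handshake's Origin header is on the allow-list."""
    # Header names are case-insensitive.
    origin = next((value for name, value in headers.items()
                   if name.lower() == "origin"), None)
    if origin is None:
        # Policy choice for this sketch: no Origin header means no access.
        return False
    # Scheme and host are case-insensitive, so compare in lower case.
    return origin.strip().lower() in {o.lower() for o in allowed}
```

With an allow-list containing `https://www.example.com`, a handshake from that page passes, while one from any other site should be refused before the 101 response is sent.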

The next step after mastering WebSockets is understanding how to scale them effectively, especially when dealing with thousands or millions of concurrent connections.
