The vast majority of web pages you see are delivered using a protocol that’s simpler than you might think, and it wasn’t originally designed for delivering rich content.

Let’s watch a real-time interaction. Imagine your browser (the client) wants to get a web page from a server.

Browser (Client):

GET /index.html HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Language: en-US,en;q=0.9
Connection: keep-alive

Server (www.example.com):

HTTP/1.1 200 OK
Date: Thu, 15 Jun 2023 10:30:00 GMT
Server: Apache/2.4.41 (Ubuntu)
Last-Modified: Wed, 14 Jun 2023 15:00:00 GMT
Content-Length: 1234
Content-Type: text/html; charset=UTF-8
Connection: keep-alive

<!DOCTYPE html>
<html>
<head>
    <title>Welcome!</title>
</head>
<body>
    <h1>Hello, Web!</h1>
</body>
</html>

This exchange is the heart of HTTP, the Hypertext Transfer Protocol. It’s a request-response protocol. The client sends a request, and the server sends back a response. That’s it. The "Hypertext" part refers to the ability to link documents, which is what makes the web navigable.

At its core, HTTP is a stateless protocol. This means each request from a client to a server is treated as an independent transaction. The server doesn’t remember anything about previous requests from the same client. This simplicity is a feature, allowing for easy scaling and robustness. However, it also means that maintaining user sessions or complex application states requires additional mechanisms, like cookies or server-side session management.

The request itself has a few key parts. The first line, GET /index.html HTTP/1.1, is the request line. It specifies the method (GET for retrieving data), the resource path (/index.html), and the protocol version (HTTP/1.1). Following this are headers, like Host (which site the request is for, needed when one IP address serves multiple sites) and User-Agent (which browser is making the request). These headers provide metadata about the request. A blank line marks the end of the headers.
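To make the structure concrete, here is a minimal Python sketch that splits a raw request into its parts. The parse_request function and the shortened sample request are illustrative assumptions, not part of any library, and a real parser would need to handle many more edge cases.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.1 request into (method, path, version, headers)."""
    # The request line and headers end at the first blank line (CRLF CRLF).
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: METHOD SP request-target SP HTTP-version
    method, path, version = lines[0].split(" ", 2)

    # Header names are case-insensitive, so normalise them to lowercase.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


# A shortened version of the browser request shown earlier.
raw_request = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Accept-Language: en-US,en;q=0.9\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)

method, path, version, headers = parse_request(raw_request)
print(method, path, version)  # GET /index.html HTTP/1.1
print(headers["host"])        # www.example.com
```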

The response mirrors this structure. The first line, HTTP/1.1 200 OK, is the status line. It indicates the protocol version, a status code (200, meaning success), and a reason phrase (OK). Subsequent headers, like Content-Length (how many bytes the body is) and Content-Type (what kind of data is in the body, like text/html), give the client information about the payload. After a blank line comes the response body, which in this case is the HTML of the web page.
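The same idea works for the response. The sketch below is a hedged illustration: parse_response and the tiny sample response are made up for this example, and it assumes the whole response is already in memory with a Content-Length header rather than chunked encoding.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into (version, status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: HTTP-version SP status-code SP reason-phrase
    # The reason phrase may be empty, so do not assume a third field exists.
    parts = lines[0].split(" ", 2)
    version, status = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Content-Length tells the client how many body bytes to read.
    length = int(headers.get("content-length", len(body)))
    return version, status, reason, headers, body[:length]


body_html = b"<h1>Hello, Web!</h1>"
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    + f"Content-Length: {len(body_html)}\r\n".encode("ascii")
    + b"\r\n"
    + body_html
)

version, status, reason, resp_headers, body = parse_response(raw_response)
print(status, reason)                 # 200 OK
print(resp_headers["content-type"])   # text/html; charset=UTF-8
```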

HTTP has evolved. HTTP/1.1 made persistent connections the default (keeping a TCP connection open for multiple requests) and added pipelining (sending multiple requests without waiting for each response, though it saw little use in practice) and more robust caching controls. HTTP/2, which is now widely adopted, significantly improves performance by introducing multiplexing (sending multiple requests and responses over a single connection concurrently) and header compression. It also defined server push, which browsers have since largely abandoned. HTTP/3, the latest iteration, uses QUIC (built on UDP) instead of TCP, which reduces connection setup latency and avoids TCP’s head-of-line blocking, where one lost packet stalls every stream on the connection.

The methods are the verbs of HTTP. GET is for retrieving data. POST is for submitting data to be processed (like form submissions). PUT is for creating or replacing the resource at a given URL. DELETE is for removing resources. HEAD is like GET but only returns headers, useful for checking resource existence or modification dates without downloading the body. OPTIONS queries the server about communication options for the target resource.
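The difference between GET and HEAD is easy to see with a throwaway local server. This is a rough sketch using only Python’s standard library. DemoHandler, PAGE, and the request helper are names invented for the example, and the server listens on 127.0.0.1 with a port picked by the operating system, so nothing here touches the network.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<h1>Hello, Web!</h1>"  # hypothetical page served for every path


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=UTF-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        # GET: headers followed by the body.
        self._send_headers()
        self.wfile.write(PAGE)

    def do_HEAD(self):
        # HEAD: the same headers, but no body.
        self._send_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def request(method: str, port: int):
    """Send one request and return (status, Content-Length header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/index.html")
    resp = conn.getresponse()
    result = (resp.status, resp.getheader("Content-Length"), resp.read())
    conn.close()
    return result


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: let the OS choose
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

get_result = request("GET", port)
head_result = request("HEAD", port)
delete_result = request("DELETE", port)  # this handler defines no do_DELETE

server.shutdown()
server.server_close()

print("GET   ", get_result[0], len(get_result[2]), "body bytes")
print("HEAD  ", head_result[0], len(head_result[2]), "body bytes")
print("DELETE", delete_result[0])
```

Both GET and HEAD report the same Content-Length, but only GET carries a body. Because this handler does not implement DELETE, Python’s http.server answers with 501 (Not Implemented), which shows that a server is free to support only the methods that make sense for a resource.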

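When you type a URL into your browser, a DNS lookup happens first to translate the domain name (like www.example.com) into an IP address. Then, a TCP connection is established to that IP address on port 80 (for HTTP) or 443 (for HTTPS). Finally, the HTTP request is sent, and the response is received and processed by your browser to render the page. HTTPS, the secure version, adds a TLS handshake after the TCP connection is established and encrypts the HTTP request and response with TLS (the successor to SSL). The DNS lookup and the TCP connection itself are not hidden by this encryption. HTTP/3 is the exception to the TCP step, since it runs over QUIC on UDP.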
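The steps above can be sketched with Python’s socket module for the plain HTTP case on port 80. The function names build_request and fetch are illustrative, the code sends Connection: close so it can simply read until the server hangs up, and it leaves out TLS, redirects, and error handling.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal GET request; 'Connection: close' keeps reading simple."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch(host: str, port: int = 80, path: str = "/") -> bytes:
    """Resolve the host, open a TCP connection, send a GET, return the raw reply."""
    # Step 1: DNS lookup, translating the name into one or more addresses.
    family, socktype, proto, _, address = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]

    # Step 2: TCP connection to the first address returned.
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(5)
        sock.connect(address)

        # Step 3: send the HTTP request.
        sock.sendall(build_request(host, path))

        # Step 4: read until the server closes the connection.
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


print(build_request("www.example.com", "/index.html").decode("ascii"))
```

Calling fetch("www.example.com") on a machine with network access would return the raw status line, headers, and body as bytes. For HTTPS, the socket would additionally need to be wrapped with the ssl module before the request is sent.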

One of the most misunderstood aspects of HTTP is how it handles state. Because it’s inherently stateless, when you log into a website, the server doesn’t magically remember you for the next page load. Instead, it sends back a Set-Cookie header with a session ID. Your browser then includes this cookie in subsequent requests to that same server. The server uses this ID to look up your session data and know who you are. This cookie-based mechanism is the foundation for most web authentication and session management, and it’s a crucial layer built on top of the stateless HTTP protocol.
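Here is a small sketch of that round trip using Python’s standard http.cookies module. The SESSIONS dictionary, the cookie name session_id, and the three helper functions are assumptions made for the example. A real application would also set attributes such as Secure and an expiry, and would store sessions somewhere more durable than a dictionary.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # hypothetical in-memory store: session ID -> session data


def login(username: str) -> str:
    """Server side: create a session and return the Set-Cookie header value."""
    session_id = secrets.token_hex(16)  # random, hard-to-guess identifier
    SESSIONS[session_id] = {"user": username}

    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True  # hide the cookie from page scripts
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()


def client_cookie_header(set_cookie_value: str) -> str:
    """Browser side: keep only name=value pairs for the Cookie request header."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


def current_user(cookie_header: str):
    """Server side, on a later request: map the cookie back to a session."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    session = SESSIONS.get(morsel.value) if morsel else None
    return session["user"] if session else None


set_cookie = login("alice")                      # sent as: Set-Cookie: <value>
cookie_header = client_cookie_header(set_cookie)  # sent back as: Cookie: <value>

print("Set-Cookie:", set_cookie)
print("Cookie:", cookie_header)
print("Recognised user:", current_user(cookie_header))
```

The server never trusts the protocol to remember anything. Each request is identified purely by the session ID the browser chooses to send back, and an unknown or missing ID simply means an anonymous visitor.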

The next step in understanding web communication involves looking at how these requests and responses are structured for more complex data exchange, like APIs.
