Bandwidth Math: Beyond Mbps

The network bandwidth you think you need is almost certainly wrong.

Let’s say you’re building a real-time collaborative whiteboard app. Your users are drawing, typing, and uploading images simultaneously. You’ve done some rough calculations: 50 active users, each sending 1 KB of data every second, comes to 50 KB/s, or about 0.4 Mbps. Easy, right? Except that’s not how network traffic works.

Here’s a look at our whiteboard app in action. Imagine a user, Alice, draws a line.

{
  "event_type": "draw_segment",
  "user_id": "alice_123",
  "timestamp": 1678886400123,
  "segment_data": {
    "start_x": 100,
    "start_y": 150,
    "end_x": 120,
    "end_y": 170,
    "color": "#FF0000",
    "thickness": 2
  }
}

This event is broadcast to the other 49 users. If your server blindly forwards it, each of those 49 users receives the same payload of roughly 200 bytes, so a single event costs the server about 49 × 200 bytes, or roughly 9.8 KB, of egress. Now consider what happens when Alice uploads an image. This isn’t a single event; it’s a stream of chunks, each with its own overhead.
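To make the gap between the naive estimate and broadcast traffic concrete, here is a minimal sketch of the arithmetic. It assumes 1 KB = 1000 bytes and a server that forwards every inbound message to every other connected user; the function names are illustrative and not part of any library.

```javascript
// Assumptions: 1 KB = 1000 bytes, and every inbound message is forwarded
// unchanged to all other connected users. Protocol overhead is ignored here.
const BITS_PER_BYTE = 8;

function toMbps(bytesPerSecond) {
  return (bytesPerSecond * BITS_PER_BYTE) / 1e6;
}

// Traffic arriving at the server: one stream per sender.
function ingressBytesPerSecond(users, bytesPerUserPerSecond) {
  return users * bytesPerUserPerSecond;
}

// Traffic leaving the server: each message is copied to the other (users - 1) clients.
function broadcastEgressBytesPerSecond(users, bytesPerUserPerSecond) {
  return users * bytesPerUserPerSecond * (users - 1);
}

const ingressMbps = toMbps(ingressBytesPerSecond(50, 1000));
const egressMbps = toMbps(broadcastEgressBytesPerSecond(50, 1000));

console.log(`ingress: ${ingressMbps.toFixed(1)} Mbps`);
console.log(`egress:  ${egressMbps.toFixed(1)} Mbps`);
```

Under these assumptions the 0.4 Mbps figure only describes what arrives at the server; what leaves it is 49 times larger.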

The core problem is understanding state synchronization versus event broadcasting. If your app broadcasts every granular event (like a single pixel change), you’ll drown in traffic. A more efficient approach is to broadcast only significant changes or, better yet, have clients synchronize their state directly or through a more optimized protocol.

Let’s look at our whiteboard again. Instead of broadcasting every draw_segment event, a better model might be to broadcast a segment_completed event, or even better, have clients query the current state of a drawing when they join or when a drawing is modified.

Consider this segment_completed event:

{
  "event_type": "segment_completed",
  "user_id": "alice_123",
  "drawing_id": "canvas_abc",
  "segment_id": "seg_xyz",
  "timestamp": 1678886405678
}

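This is somewhat smaller (about 130 bytes of compact JSON versus about 180 for draw_segment). The larger saving comes from the data path: the actual drawing data for seg_xyz is fetched by other clients only on demand, or batched and sent periodically.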
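One way to check payload sizes rather than guess is to serialize both events and count bytes. The sketch below assumes compact JSON encoded as UTF-8 with no transport framing; jsonBytes is a hypothetical helper name.

```javascript
// Size in bytes of an object once serialized as compact UTF-8 JSON.
// This measures the payload only; transport headers are not included.
function jsonBytes(obj) {
  return new TextEncoder().encode(JSON.stringify(obj)).length;
}

const drawSegment = {
  event_type: "draw_segment",
  user_id: "alice_123",
  timestamp: 1678886400123,
  segment_data: {
    start_x: 100,
    start_y: 150,
    end_x: 120,
    end_y: 170,
    color: "#FF0000",
    thickness: 2,
  },
};

const segmentCompleted = {
  event_type: "segment_completed",
  user_id: "alice_123",
  drawing_id: "canvas_abc",
  segment_id: "seg_xyz",
  timestamp: 1678886405678,
};

const drawBytes = jsonBytes(drawSegment);
const completedBytes = jsonBytes(segmentCompleted);

console.log({ drawBytes, completedBytes });
```

Measuring this way also shows how much of each message is field names rather than data, which is where a more compact encoding would help.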

The actual bandwidth needs are dictated by:

Serialization and Protocol Overhead: Every message, no matter how small, has headers (e.g., TCP/IP, HTTP, WebSocket frames). A 10-byte payload can easily become 60-100 bytes.
Redundancy: If you’re broadcasting events, every other client receives a copy of the same data. With 50 users, the server sends each event 49 times.
State vs. Event: Pushing every little change is inefficient. Pushing state diffs or allowing clients to pull state is often better.
Payload Size: Large payloads (images, video) dominate. Even with efficient compression, they’re significant.
Client Reconnection/Sync: When a client reconnects, it needs to fetch the current state, which can be a large initial burst.
Network Latency and Retries: Packet loss means retransmissions, increasing effective bandwidth usage.
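The protocol overhead in the first item can be estimated with a rough sketch like the one below. It assumes minimum IPv4 and TCP headers with no options, no TLS, no link-layer framing, and one WebSocket message per TCP segment; the WebSocket header sizes follow the framing rules in RFC 6455. The function names are illustrative.

```javascript
// Assumptions: IPv4 and TCP headers at their 20-byte minimum (no options),
// no TLS, no Ethernet framing, and each message fits in one TCP segment.
const IPV4_HEADER_BYTES = 20;
const TCP_HEADER_BYTES = 20;

// WebSocket frame header size per RFC 6455: 2 bytes base, plus 2 or 8 bytes
// of extended length for larger payloads, plus a 4-byte masking key on
// frames sent from client to server.
function webSocketHeaderBytes(payloadBytes, masked) {
  let header = 2;
  if (payloadBytes > 65535) {
    header += 8;
  } else if (payloadBytes > 125) {
    header += 2;
  }
  if (masked) {
    header += 4;
  }
  return header;
}

// Estimated IP packet size for a single WebSocket message.
function packetBytes(payloadBytes, masked) {
  return (
    IPV4_HEADER_BYTES +
    TCP_HEADER_BYTES +
    webSocketHeaderBytes(payloadBytes, masked) +
    payloadBytes
  );
}

console.log(packetBytes(10, false)); // server-to-client, 10-byte payload
console.log(packetBytes(10, true)); // client-to-server, 10-byte payload
console.log(packetBytes(200, false)); // server-to-client, 200-byte payload
```

Real connections usually carry more than this (TCP options, TLS records, link-layer headers, ACKs), so treat the result as a lower bound and confirm with a packet capture.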

To calculate realistically, you need to profile. Use tools like Wireshark or your application’s built-in network monitoring. For our whiteboard:

Measure actual event sizes: Send a draw_segment event and see its size over the wire. Let’s say it’s 250 bytes including headers.
Estimate event frequency per user: Alice might draw 5 segments per minute, upload an image every 10 minutes (1MB compressed).
Calculate peak simultaneous events: If 10 users draw simultaneously, that’s 10 * 5 events/min = 50 events/min.
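Factor in redundancy: 50 events/min * 250 bytes/event * (N-1) clients. If N=50, that’s 50 * 250 * 49 bytes/min = 612,500 bytes/min = ~0.08 Mbps of server egress just for drawing events, 49 times what the same events cost without fan-out.
Add image uploads: 1 MB / 10 min = ~0.013 Mbps per uploader on average. Forwarding each image to the other 49 clients brings that to ~0.65 Mbps of egress per uploader, and the instantaneous rate during a transfer is far higher than the average.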
Add other traffic: User presence, chat messages, etc.
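The steps above can be worked through in a few lines. This is a sketch using the example figures from this section (250 bytes per event on the wire, 10 users drawing 5 segments per minute, one image every 10 minutes), and it assumes 1 MB = 1,000,000 bytes and that every event and image is forwarded to all other clients. Note the division by 60 when converting per-minute totals to per-second rates.

```javascript
// Convert a per-minute byte count to megabits per second.
function bytesPerMinuteToMbps(bytesPerMinute) {
  return (bytesPerMinute * 8) / 60 / 1e6;
}

const clients = 50;
const recipients = clients - 1; // everyone except the sender

// Drawing events
const drawingUsers = 10;
const segmentsPerUserPerMinute = 5;
const bytesPerEventOnWire = 250; // measured size including headers (example value)
const eventsPerMinute = drawingUsers * segmentsPerUserPerMinute;
const drawEgressBytesPerMinute = eventsPerMinute * bytesPerEventOnWire * recipients;
const drawEgressMbps = bytesPerMinuteToMbps(drawEgressBytesPerMinute);

// Image uploads (assumption: 1 MB = 1,000,000 bytes)
const imageBytes = 1000000;
const minutesBetweenImages = 10;
const uploadMbpsPerUploader = bytesPerMinuteToMbps(imageBytes / minutesBetweenImages);
const imageEgressMbpsPerUploader = uploadMbpsPerUploader * recipients;

console.log(`drawing egress: ${drawEgressMbps.toFixed(3)} Mbps`);
console.log(`image upload per uploader: ${uploadMbpsPerUploader.toFixed(3)} Mbps`);
console.log(`image egress per uploader: ${imageEgressMbpsPerUploader.toFixed(3)} Mbps`);
```

These are averages over a minute or more. Budgeting for peaks means looking at the busiest second, not the mean.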

A common mistake is to assume a constant, low-bandwidth usage. Instead, think in terms of peak concurrent users and the maximum data churn those users can generate simultaneously. For a real-time app with 50 users, you might budget for 2-5 Mbps per active user during peak activity, accounting for redundancy and overhead, especially if large payloads are involved.

The most surprising thing about real-time communication bandwidth is how quickly simple event broadcasting becomes expensive once you factor in the number of recipients for each event, which is why the choice of state synchronization strategy matters so much. A single 200-byte event sent to 100 clients consumes 20 KB of network egress, not 200 bytes.

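If your application uses WebSockets, track bytes sent and bytes received on your WebSocket server. For example, in a Node.js/ws setup, you might keep a global counter for inbound traffic: let totalBytesReceived = 0; ws.on('message', (message) => { totalBytesReceived += message.length; /* ... */ });. The 'message' handler only sees inbound data, so count outbound bytes separately at the point where you call ws.send(), once per recipient. Both counters measure payload bytes and exclude frame and TCP/IP headers. Then aggregate across all connections and time windows to find your peak.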
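Aggregating over time windows can be sketched without any WebSocket library at all. ThroughputMeter below is a hypothetical helper, not part of ws: it buckets byte counts into fixed windows and reports the busiest one. Timestamps are passed in explicitly here so the example is deterministic.

```javascript
// Buckets byte counts into fixed-size time windows and reports the busiest one.
// Note: buckets are never evicted, so a long-running server would need to
// prune old windows; that is left out to keep the sketch short.
class ThroughputMeter {
  constructor(windowMs = 1000) {
    this.windowMs = windowMs;
    this.buckets = new Map(); // window index -> total bytes in that window
  }

  // Record bytes observed at the given time (defaults to now).
  record(bytes, timestampMs = Date.now()) {
    const index = Math.floor(timestampMs / this.windowMs);
    this.buckets.set(index, (this.buckets.get(index) || 0) + bytes);
  }

  // Peak rate across all recorded windows, in megabits per second.
  peakMbps() {
    let peakBytes = 0;
    for (const bytes of this.buckets.values()) {
      peakBytes = Math.max(peakBytes, bytes);
    }
    return (peakBytes * 8) / (this.windowMs / 1000) / 1e6;
  }
}

const meter = new ThroughputMeter(1000);
meter.record(200, 0); // small draw event
meter.record(200, 500); // another in the same one-second window
meter.record(1000000, 1200); // an image forwarded in the next window
meter.record(200, 2500);

console.log(`peak: ${meter.peakMbps()} Mbps`);
```

In a server you would call record() from the message handler for inbound bytes and wherever you send for outbound bytes, typically with one meter per direction.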

The next problem you’ll face is latency, which bandwidth alone doesn’t solve.