The network bandwidth you think you need is almost certainly wrong.
Let’s say you’re building a real-time collaborative whiteboard app. Your users are drawing, typing, and uploading images simultaneously. You’ve done some rough calculations: 50 active users, each sending 1 KB of data every second, that’s 50 KB/s, or about 0.4 Mbps. Easy, right? Except, that’s not how network traffic works.
Here’s a look at our whiteboard app in action. Imagine a user, Alice, draws a line.
{
  "event_type": "draw_segment",
  "user_id": "alice_123",
  "timestamp": 1678886400123,
  "segment_data": {
    "start_x": 100,
    "start_y": 150,
    "end_x": 120,
    "end_y": 170,
    "color": "#FF0000",
    "thickness": 2
  }
}
This event is broadcast to the 49 other users. If your server just blindly forwards it, each of those 49 users receives the same roughly 200-byte payload, so a single event from Alice costs the server about 9.8 KB of egress. Now consider what happens when Alice uploads an image. This isn’t a single event; it’s a stream of chunks, each with its own overhead.
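To make the fan-out cost of the draw event concrete, here is a minimal sketch that measures it as compact JSON and multiplies by the number of recipients. The helper names payloadBytes and broadcastEgressBytes are made up for illustration, and the result covers the payload only, not protocol headers.

```javascript
// The draw_segment event from above, as a plain object.
const drawSegment = {
  event_type: "draw_segment",
  user_id: "alice_123",
  timestamp: 1678886400123,
  segment_data: {
    start_x: 100, start_y: 150,
    end_x: 120, end_y: 170,
    color: "#FF0000", thickness: 2,
  },
};

// Size of the event once serialized as compact UTF-8 JSON (no headers).
function payloadBytes(event) {
  return Buffer.byteLength(JSON.stringify(event), "utf8");
}

// Bytes the server sends when it relays one event to every other user.
function broadcastEgressBytes(event, totalUsers) {
  return payloadBytes(event) * (totalUsers - 1);
}

console.log("payload bytes:", payloadBytes(drawSegment));
console.log("egress for 50 users:", broadcastEgressBytes(drawSegment, 50));
```

The compact form comes in a little under the 200 bytes quoted above; pretty-printed JSON or extra fields would push it higher.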
The core problem is understanding state synchronization versus event broadcasting. If your app broadcasts every granular event (like a single pixel change), you’ll drown in traffic. A more efficient approach is to broadcast only significant changes or, better yet, have clients synchronize their state directly or through a more optimized protocol.
Let’s look at our whiteboard again. Instead of broadcasting every draw_segment event, a better model might be to broadcast a segment_completed event, or even better, have clients query the current state of a drawing when they join or when a drawing is modified.
Consider this segment_completed event:
{
  "event_type": "segment_completed",
  "user_id": "alice_123",
  "drawing_id": "canvas_abc",
  "segment_id": "seg_xyz",
  "timestamp": 1678886405678
}
This notification is smaller, and its size stays fixed no matter how complex the segment is. The actual drawing data for seg_xyz would be fetched by other clients on demand, or batched and sent periodically.
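The batching option could look something like the sketch below. It is only an illustration: the EventBatcher class, the segment_batch event type, and the send callback are all hypothetical names, and a real server would call flush() on a timer such as setInterval.

```javascript
// Hypothetical batcher: queues segment data and emits one message per flush,
// so per-message overhead is paid once per batch instead of once per segment.
class EventBatcher {
  constructor(send) {
    this.send = send;   // callback that transmits one serialized message
    this.pending = [];  // segments waiting for the next flush
  }

  add(segment) {
    this.pending.push(segment);
  }

  // Sends everything queued so far as a single message.
  // Returns the payload size in bytes, or 0 if there was nothing to send.
  flush() {
    if (this.pending.length === 0) return 0;
    const message = JSON.stringify({
      event_type: "segment_batch", // assumed name, not from the original app
      segments: this.pending,
    });
    this.pending = [];
    this.send(message);
    return Buffer.byteLength(message, "utf8");
  }
}

// Usage: two segments go out as one message.
const sent = [];
const batcher = new EventBatcher((message) => sent.push(message));
batcher.add({ segment_id: "seg_xyz", start_x: 100, start_y: 150, end_x: 120, end_y: 170 });
batcher.add({ segment_id: "seg_abc", start_x: 120, start_y: 170, end_x: 140, end_y: 160 });
const flushedBytes = batcher.flush();
console.log(sent.length, "message,", flushedBytes, "bytes");
```

The trade-off is latency: a longer flush interval means fewer messages but a longer wait before other users see the stroke.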
The actual bandwidth needs are dictated by:
- Serialization and Protocol Overhead: Every message, no matter how small, has headers (e.g., TCP/IP, HTTP, WebSocket frames). A 10-byte payload can easily become 60-100 bytes.
- Redundancy: If you’re broadcasting events, multiple clients receive the same data. With 50 users, the server sends every event 49 times, once to each of the other users.
- State vs. Event: Pushing every little change is inefficient. Pushing state diffs or allowing clients to pull state is often better.
- Payload Size: Large payloads (images, video) dominate. Even with efficient compression, they’re significant.
- Client Reconnection/Sync: When a client reconnects, it needs to fetch the current state, which can be a large initial burst.
- Network Latency and Retries: Packet loss means retransmissions, increasing effective bandwidth usage.
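The header overhead in the first point can be estimated with a few lines of arithmetic. The sketch below assumes one WebSocket message per TCP segment, IPv4 and TCP headers of 20 bytes each with no options, and no TLS or link-layer framing, so real numbers will be somewhat higher. The function names are made up for illustration; the frame header sizes follow the WebSocket framing rules in RFC 6455.

```javascript
const IPV4_HEADER_BYTES = 20; // minimum IPv4 header, no options
const TCP_HEADER_BYTES = 20;  // minimum TCP header, no options

// WebSocket frame header size for a given payload length.
function webSocketHeaderBytes(payloadLength, masked) {
  let header = 2;                             // opcode byte + length byte
  if (payloadLength > 65535) header += 8;     // 64-bit extended length
  else if (payloadLength > 125) header += 2;  // 16-bit extended length
  if (masked) header += 4;                    // masking key (client-to-server frames)
  return header;
}

// Rough bytes on the wire for one message, under the assumptions above.
function estimateWireBytes(payloadLength, masked = false) {
  return IPV4_HEADER_BYTES + TCP_HEADER_BYTES +
    webSocketHeaderBytes(payloadLength, masked) + payloadLength;
}

console.log(estimateWireBytes(10));       // 52: server-to-client, 10-byte payload
console.log(estimateWireBytes(10, true)); // 56: client-to-server adds the mask key
console.log(estimateWireBytes(200));      // 244: a 200-byte event
```

Even under these optimistic assumptions, a 10-byte payload is more than five times larger on the wire, which is why many tiny messages cost more than a few larger ones.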
To calculate realistically, you need to profile. Use tools like Wireshark or your application’s built-in network monitoring. For our whiteboard:
- Measure actual event sizes: Send a draw_segment event and see its size over the wire. Let’s say it’s 250 bytes including headers.
- Estimate event frequency per user: Alice might draw 5 segments per minute, upload an image every 10 minutes (1MB compressed).
- Calculate peak simultaneous events: If 10 users draw simultaneously, that’s 10 * 5 events/min = 50 events/min.
- Factor in redundancy: 50 events/min * 250 bytes/event * (N-1) clients. If N=50, that’s 50 * 250 * 49 bytes/min = 612,500 bytes/min = ~0.08 Mbps just for drawing events. That is 49 times what the same events would cost without fan-out.
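- Add image uploads: 1 MB every 10 minutes is 8 Mb / 600 s = ~0.013 Mbps of upload per user. Forwarding each image to the other 49 clients costs ~0.65 Mbps of egress per uploading user.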
- Add other traffic: User presence, chat messages, etc.
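The arithmetic in the steps above can be sketched as a small function. This is a back-of-the-envelope helper, not a profiler: broadcastMbps is a made-up name, 1 MB is taken as 1,000,000 bytes, and the inputs are the example figures from the list rather than measurements.

```javascript
// Average server egress in Mbps when every event is relayed to all other clients.
function broadcastMbps({ eventsPerMinute, bytesPerEvent, totalClients }) {
  const bytesPerMinute = eventsPerMinute * bytesPerEvent * (totalClients - 1);
  return (bytesPerMinute * 8) / 60 / 1e6; // bytes/min -> bits/s -> Mbps
}

// Drawing: 10 users * 5 segments/min, 250 bytes each, 50 clients.
const drawingMbps = broadcastMbps({
  eventsPerMinute: 50,
  bytesPerEvent: 250,
  totalClients: 50,
});

// Images: one uploader sending 1 MB every 10 minutes, relayed to the other 49.
const imageFanOutMbps = broadcastMbps({
  eventsPerMinute: 1 / 10,
  bytesPerEvent: 1_000_000,
  totalClients: 50,
});

// The same upload as seen on the way in (no fan-out).
const imageUploadMbps = (1_000_000 * 8) / 600 / 1e6;

console.log(drawingMbps.toFixed(3));     // 0.082
console.log(imageUploadMbps.toFixed(3)); // 0.013
console.log(imageFanOutMbps.toFixed(3)); // 0.653
```

These are averages over a minute or more. Peaks are burstier, since an image relay happens over a few seconds rather than being spread across ten minutes.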
A common mistake is to assume a constant, low-bandwidth usage. Instead, think in terms of peak concurrent users and the maximum data churn those users can generate simultaneously. For a real-time app with 50 users, you might budget for 2-5 Mbps per active user during peak activity, accounting for redundancy and overhead, especially if large payloads are involved.
The most surprising thing about real-time communication bandwidth is how quickly simple event broadcasting adds up once you factor in the number of recipients for each event. A single 200-byte event sent to 100 clients consumes 20 KB of network egress, not 200 bytes.
If your application uses WebSockets, track the bytes sent and received on your WebSocket server. For example, in a Node.js/ws setup, you might keep global counters: let totalBytesReceived = 0; let totalBytesSent = 0; ws.on('message', (message) => { totalBytesReceived += Buffer.byteLength(message); /* ... */ });. Wherever you call ws.send(data), add Buffer.byteLength(data) to totalBytesSent. These counters measure payload bytes only, not frame or TCP/IP headers. Then, aggregate this across all connections and time windows to find your peak.
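One way to do the time-window aggregation is to sum byte counts into fixed windows and report the busiest one. The sketch below is an assumption about how you might structure it, not part of the ws library: PeakBandwidthTracker and its methods are hypothetical, and it keeps every window in memory, so a long-running server would want to prune old entries.

```javascript
// Hypothetical tracker: sums bytes into fixed time windows and reports the peak.
class PeakBandwidthTracker {
  constructor(windowMs = 1000) {
    this.windowMs = windowMs;
    this.buckets = new Map(); // window index -> total bytes in that window
  }

  // Call with the byte count of each message sent (or received).
  record(byteCount, nowMs = Date.now()) {
    const windowIndex = Math.floor(nowMs / this.windowMs);
    this.buckets.set(windowIndex, (this.buckets.get(windowIndex) || 0) + byteCount);
  }

  // Largest number of bytes seen in any single window.
  peakBytesPerWindow() {
    let peak = 0;
    for (const bytes of this.buckets.values()) peak = Math.max(peak, bytes);
    return peak;
  }

  // The same peak expressed in megabits per second.
  peakMbps() {
    return (this.peakBytesPerWindow() * 8) / (this.windowMs / 1000) / 1e6;
  }
}

// Usage with explicit timestamps so the result is reproducible.
const tracker = new PeakBandwidthTracker(1000);
tracker.record(250, 0);     // window 0
tracker.record(250, 500);   // window 0
tracker.record(1000, 1200); // window 1
console.log(tracker.peakBytesPerWindow()); // 1000
console.log(tracker.peakMbps());           // 0.008
```

In a server you would share one tracker across all connections and call record() with the byte length of each outgoing message, since egress is usually the number that hurts first.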
The next problem you’ll face is latency, which bandwidth alone doesn’t solve.