Design WhatsApp

End-to-End Encryption, WebSocket Messaging, Delivery Receipts, Group Fan-out, and Presence at Scale

WhatsApp handles 100B+ messages per day across 2B+ monthly active users. The core challenges are: implementing end-to-end encryption with the Signal Protocol so that even the server cannot read messages; maintaining persistent WebSocket connections for real-time delivery; tracking delivery status (sent, delivered, read) with receipts; fanning group messages out to up to 1024 members; building a presence system that tracks online/offline status across billions of users; and handling media transfer (images, video, documents) over separate upload/download paths. At scale, that means millions of concurrent WebSocket connections per server and, at roughly 200 bytes per text message, about 20 TB of text payload per day before media is counted.

End-to-End Encryption Visualizer

WhatsApp uses the Signal Protocol for E2E encryption. Each user generates a public/private key pair. A shared secret is derived via Diffie-Hellman key exchange. The server only sees encrypted ciphertext -- it can never read the plaintext.

Interactive visualizer: Alice and Bob each hold a public/private key pair and derive a shared secret via key exchange. The message is shown in three forms: plaintext, encrypted (on wire), and decrypted.
The server only sees the encrypted bytes -- it cannot decrypt the message.
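The key-agreement idea can be sketched with textbook finite-field Diffie-Hellman. This is a toy under loud assumptions: the real Signal Protocol uses elliptic-curve keys with X3DH and the Double Ratchet, and the small prime, the generator, and the SHAKE-256 XOR keystream below are illustrative stand-ins, not secure choices. The names `KeyPair`, `shared_key`, and `toy_cipher` are hypothetical.

```python
import hashlib
import secrets
from dataclasses import dataclass

# Toy group parameters (assumption): a 127-bit Mersenne prime and generator 3.
# Far too small for real use; chosen only to keep the example readable.
P = 2**127 - 1
G = 3


@dataclass(frozen=True)
class KeyPair:
    private: int
    public: int


def generate_keypair() -> KeyPair:
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    return KeyPair(private, pow(G, private, P))


def shared_key(own: KeyPair, peer_public: int) -> bytes:
    """Both sides arrive at G^(a*b) mod P, then hash it into a 32-byte key."""
    secret = pow(peer_public, own.private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream. Calling it twice with the same key
    returns the original bytes. A stand-in for a real authenticated cipher."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


alice, bob = generate_keypair(), generate_keypair()
k_alice = shared_key(alice, bob.public)  # Alice: her private key + Bob's public key
k_bob = shared_key(bob, alice.public)    # Bob: his private key + Alice's public key

ciphertext = toy_cipher(k_alice, b"Hello! Are you there?")  # all the server relays
plaintext = toy_cipher(k_bob, ciphertext)                   # recovered on Bob's device
```

Only the public keys and `ciphertext` ever cross the server in this sketch. Neither private key leaves its device, which is why the relay cannot derive the shared key.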

Message Delivery Status Simulator

WhatsApp uses a three-stage delivery receipt system: a single check mark means the server received the message, double check marks mean the recipient's device received it, and blue double check marks mean the recipient opened and read the message.

Interactive simulator: the example message "Hello! Are you there?" (10:42 AM) travels from the sender to the server and on to the recipient, updating the check marks at each stage.
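The three receipt stages can be modeled as a small forward-only state machine. The sketch below is one possible model, not WhatsApp's implementation: the `PENDING` state and the rule that late or duplicate receipts are ignored are assumptions added for illustration, and `Status` and `MessageStatus` are hypothetical names.

```python
from enum import IntEnum


class Status(IntEnum):
    PENDING = 0    # assumption: not yet acknowledged by the server
    SENT = 1       # single check: the server received the message
    DELIVERED = 2  # double check: the recipient's device received it
    READ = 3       # blue double check: the recipient opened it


class MessageStatus:
    """Tracks one message's receipt state; the state only moves forward."""

    def __init__(self) -> None:
        self.status = Status.PENDING

    def apply(self, receipt: Status) -> bool:
        """Apply a receipt and return True if the state advanced.
        Late or duplicate receipts (e.g. DELIVERED after READ) are ignored."""
        if receipt > self.status:
            self.status = receipt
            return True
        return False


msg = MessageStatus()
msg.apply(Status.SENT)
msg.apply(Status.READ)               # read receipt overtakes the delivery receipt
late = msg.apply(Status.DELIVERED)   # arrives out of order and is ignored
```

Making the transition monotonic means receipts can be retried or reordered by the network without the sender's UI ever moving backwards.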

Capacity Estimation Calculator

Back-of-the-envelope math for a WhatsApp-scale messaging system. Adjust parameters to see how throughput, storage, bandwidth, and infrastructure requirements change.

Calculator outputs: messages/sec, peak messages/sec, storage/day, storage/year, media storage/day, bandwidth, WebSocket connections, and server count.
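The arithmetic behind such a calculator can be sketched as follows. Message volume, message size, media size, and the number of online users are taken from figures elsewhere in this write-up. The peak factor, the share of messages that carry media, and the connections-per-server figure are assumptions picked only to make the numbers concrete.

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Inputs:
    messages_per_day: float = 100e9   # from the text: 100B+ messages/day
    avg_message_bytes: int = 200      # from the text: ~200 bytes per text message
    avg_media_bytes: int = 200_000    # from the text: media averages 200KB+
    concurrent_users: float = 500e6   # from the presence example: 500M online
    peak_factor: float = 3.0          # assumption
    media_fraction: float = 0.05      # assumption: share of messages with media
    conns_per_server: float = 1e6     # assumption: "millions per server"


def estimate(i: Inputs) -> dict:
    avg_mps = i.messages_per_day / SECONDS_PER_DAY
    text_bytes_per_day = i.messages_per_day * i.avg_message_bytes
    media_bytes_per_day = i.messages_per_day * i.media_fraction * i.avg_media_bytes
    return {
        "messages_per_sec": avg_mps,
        "peak_messages_per_sec": avg_mps * i.peak_factor,
        "storage_per_day_tb": text_bytes_per_day / 1e12,
        "storage_per_year_pb": text_bytes_per_day * 365 / 1e15,
        "text_bandwidth_gbps": avg_mps * i.avg_message_bytes * 8 / 1e9,
        "websocket_servers": math.ceil(i.concurrent_users / i.conns_per_server),
        "media_per_day_pb": media_bytes_per_day / 1e15,
    }


result = estimate(Inputs())
```

With these inputs the text path comes to roughly 1.16M messages/sec and 20 TB/day, while the assumed media share adds about 1 PB/day, which is why media gets its own storage path.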

Group Messaging Fan-out

When a message is sent to a group, the server must deliver it to every member. Compare two strategies: fan-out writes (copy message to each member's queue) vs group-level storage with pointers. Larger groups make the trade-off clearer.

Fan-out on Write

Copy each message to every member's inbox queue.
  • Read latency: O(1)

vs

Group Storage + Pointers

Store once; each member holds a pointer to the group log.
  • Read latency: O(log n)

The simulator compares writes/hour and the storage multiplier for both strategies as group size and message rate change.
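The write amplification of the two strategies can be seen in a small in-memory sketch. Both classes are hypothetical illustrations that use Python lists in place of real queues and logs, so they show write counts rather than real read latency.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


class FanOutOnWrite:
    """Copy every message into each member's inbox: one write per member."""

    def __init__(self) -> None:
        self.inboxes: Dict[str, List[str]] = defaultdict(list)
        self.writes = 0

    def send(self, members: Iterable[str], message: str) -> None:
        for member in members:
            self.inboxes[member].append(message)
            self.writes += 1

    def read(self, member: str) -> List[str]:
        """Drain and return the member's unread messages."""
        unread, self.inboxes[member] = self.inboxes[member], []
        return unread


class GroupLog:
    """Store each message once; each member keeps a cursor into the log."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.cursors: Dict[str, int] = defaultdict(int)
        self.writes = 0

    def send(self, members: Iterable[str], message: str) -> None:
        self.log.append(message)  # a single write, regardless of group size
        self.writes += 1

    def read(self, member: str) -> List[str]:
        """Return everything after the member's cursor, then advance it."""
        start = self.cursors[member]
        self.cursors[member] = len(self.log)
        return self.log[start:]


members = [f"user{n}" for n in range(1024)]  # the maximum group size cited above
messages = [f"msg{n}" for n in range(10)]

fan_out, group_log = FanOutOnWrite(), GroupLog()
for m in messages:
    fan_out.send(members, m)
    group_log.send(members, m)
```

For a full 1024-member group, ten messages cost 10,240 inbox writes under fan-out on write and 10 log writes under group storage. The pointer approach pays for that saving with per-member cursor state and a lookup into the shared log on every read.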

Presence System (Online/Offline)

Tracking who is online is expensive at scale. Each user sends periodic heartbeats; the server marks them offline after a timeout. Optimizations: only track presence for contacts, batch updates, and use a pub/sub model to push status changes to interested subscribers.

Why presence is expensive: with 2B users, if 500M are online and each sends a heartbeat every 30s, that is 500M / 30 ≈ 16.7M heartbeats/sec. Each status change must also be pushed to every contact who is online. Optimizations: subscribe to presence only for users visible on screen, batch updates every 5s, and use a distributed pub/sub system (Redis Pub/Sub or Kafka) to fan out status changes regionally.
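A heartbeat-with-timeout tracker can be sketched in a few lines. This is a single-process illustration with an in-memory dict, not a distributed design. The 60s timeout (two missed 30s heartbeats) is an assumption, and `PresenceTracker` and `heartbeats_per_sec` are hypothetical names.

```python
import time
from typing import Callable, Dict, Iterable, Set


class PresenceTracker:
    """Marks a user online while their last heartbeat is within the timeout."""

    def __init__(self, timeout_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s  # assumption: two missed 30s heartbeats
        self.clock = clock          # injectable so the example is deterministic
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, user_id: str) -> None:
        self.last_seen[user_id] = self.clock()

    def is_online(self, user_id: str) -> bool:
        seen = self.last_seen.get(user_id)
        return seen is not None and self.clock() - seen <= self.timeout_s

    def online_contacts(self, contacts: Iterable[str]) -> Set[str]:
        """Only check the users the caller cares about, not the whole user base."""
        return {c for c in contacts if self.is_online(c)}


def heartbeats_per_sec(online_users: float, interval_s: float) -> float:
    return online_users / interval_s


now = [0.0]  # fake clock, advanced by hand
tracker = PresenceTracker(timeout_s=60.0, clock=lambda: now[0])

tracker.heartbeat("alice")  # t = 0
now[0] = 30.0
tracker.heartbeat("bob")    # t = 30
now[0] = 70.0               # alice silent for 70s, bob for 40s

visible = tracker.online_contacts(["alice", "bob", "carol"])
load = heartbeats_per_sec(500e6, 30)
```

At t = 70 only bob is still within the timeout, and `load` reproduces the 16.7M heartbeats/sec figure from the paragraph above.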

High-Level Architecture

WhatsApp's architecture is built around persistent WebSocket connections for real-time delivery, with a message queue for reliable async processing and separate services for users, presence, and media.

Mobile Client → Load Balancer → WebSocket Servers
  • Message Queue (Kafka): async delivery, ordering
  • User Service: auth, contacts, profiles
  • Presence Service: online status, heartbeats
  • Cassandra: messages, chat history
  • Redis Cluster: sessions, presence, routing
  • Object Storage (S3): media files
WebSocket Servers: maintain persistent connections (millions per server). When a message arrives, the server looks up the recipient's connection in Redis and delivers it directly. If the recipient is offline, the message is queued in Kafka.
Message Queue: ensures reliable delivery. Messages are persisted in Kafka, partitioned by recipient_id. When the recipient reconnects, queued messages are drained in order.
Cassandra: stores messages partitioned by (chat_id, time_bucket) for efficient range scans. Messages are encrypted client-side, so the server stores only opaque ciphertext. Undelivered messages carry a TTL of 30 days.
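Presence Service: uses Redis sorted sets scored by last-heartbeat timestamp. Heartbeats every 30s update the score, and members whose score is older than the timeout are treated as offline and pruned (Redis expires whole keys, not individual sorted-set members). A separate pub/sub channel per user fans out status changes to online contacts.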
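The (chat_id, time_bucket) partitioning can be illustrated with a small key-derivation sketch. The daily bucket size is an assumption, since the bucket width is not stated here, and `time_bucket`, `partition_key`, and `partitions_for_range` are hypothetical helper names rather than any driver's API.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


def time_bucket(ts: datetime) -> str:
    """Daily bucket label in UTC (bucket size is an assumption)."""
    return ts.astimezone(timezone.utc).date().isoformat()


def partition_key(chat_id: str, ts: datetime) -> Tuple[str, str]:
    """Partition a message lands in: all of one chat's messages for one day."""
    return (chat_id, time_bucket(ts))


def partitions_for_range(chat_id: str, start: datetime,
                         end: datetime) -> List[Tuple[str, str]]:
    """Every partition a scan over [start, end] must touch, oldest first."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    keys = []
    while day <= last:
        keys.append((chat_id, day.isoformat()))
        day += timedelta(days=1)
    return keys


start = datetime(2024, 3, 1, 23, 30, tzinfo=timezone.utc)
end = datetime(2024, 3, 3, 0, 15, tzinfo=timezone.utc)

key = partition_key("chat42", start)
scan = partitions_for_range("chat42", start, end)
```

Bucketing by time keeps any single partition from growing without bound for a long-lived chat, and a history query only has to read the handful of buckets that overlap the requested window.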

Key Design Decisions

WebSocket vs Long Polling

WebSocket (this design's choice)
  • Full-duplex, persistent connection
  • Sub-100ms message delivery
  • Server can push without client request
  • Efficient for high-frequency messaging
  • Connection state must be tracked
vs
Long Polling
  • HTTP-based, simpler infra
  • Higher latency (seconds)
  • Client initiates every request
  • Better for low-frequency updates
  • Stateless, easier load balancing

Message Storage: Cassandra vs MySQL

Cassandra (this design's choice)
  • Write-optimized, append-only
  • Linear horizontal scaling
  • Tunable consistency (ONE for writes)
  • Time-series data model fits messages
  • No single point of failure
vs
MySQL (Sharded)
  • ACID transactions for critical ops
  • Rich query capabilities
  • Complex sharding logic needed
  • Resharding is painful
  • Better for user metadata

Media Handling

Media is handled separately from text messages. The sender uploads the encrypted media file to object storage (S3) via a dedicated media service, receives a URL, and sends the URL as part of the message. The recipient downloads the media independently. This keeps the message pipeline lightweight -- text messages are ~200 bytes while media files average 200KB+. Media is encrypted client-side with a per-file key included in the message metadata.
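The separate media path can be sketched end to end with an in-memory stand-in for object storage. Everything named here is hypothetical: `BLOB_STORE` replaces S3, the URL uses a placeholder host, `toy_cipher` stands in for a real authenticated cipher, and the ciphertext digest in the message is an added assumption for integrity checking.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Dict

BLOB_STORE: Dict[str, bytes] = {}  # hypothetical stand-in for object storage


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream; applying it twice restores the input.
    Illustrative only, not a secure construction."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


@dataclass
class MediaMessage:
    url: str         # where the encrypted blob lives
    file_key: bytes  # per-file key, carried in the message metadata
    sha256: str      # assumption: digest of the ciphertext for integrity


def upload_media(plaintext: bytes) -> MediaMessage:
    """Sender side: encrypt locally, upload ciphertext, build the small message."""
    file_key = secrets.token_bytes(32)
    blob = toy_cipher(file_key, plaintext)
    digest = hashlib.sha256(blob).hexdigest()
    url = f"https://media.example.invalid/{digest}"  # placeholder URL
    BLOB_STORE[url] = blob
    return MediaMessage(url, file_key, digest)


def download_media(msg: MediaMessage) -> bytes:
    """Recipient side: fetch the blob, verify it, decrypt with the per-file key."""
    blob = BLOB_STORE[msg.url]
    if hashlib.sha256(blob).hexdigest() != msg.sha256:
        raise ValueError("media blob failed integrity check")
    return toy_cipher(msg.file_key, blob)


photo = b"\x89PNG" + bytes(1000)  # dummy file contents
message = upload_media(photo)
restored = download_media(message)
```

The `MediaMessage` that travels through the message pipeline stays small no matter how large the file is, and the object store only ever holds ciphertext.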

Connection Routing

Each user's WebSocket connects to one server. Redis maps user_id → server_id for routing. When Alice sends a message to Bob, her server looks up Bob's server in Redis and forwards the message. If Bob is connected in a different datacenter, the message is routed over an internal message bus. During deploys, connections are drained: new connections go to the new servers while existing ones are given a 30s timeout to finish gracefully.
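The lookup-then-forward flow, with the offline fallback from the architecture section, can be sketched as below. Plain dicts stand in for the Redis routing map and the offline queue, and the `Router` class and its method names are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Message = Tuple[str, str, str]  # (sender, recipient, body)


class Router:
    def __init__(self) -> None:
        self.user_to_server: Dict[str, str] = {}  # stand-in for the Redis map
        self.delivered: Dict[str, List[Message]] = defaultdict(list)
        self.offline_queue: Dict[str, Deque[Message]] = defaultdict(deque)

    def connect(self, user_id: str, server_id: str) -> None:
        """Register the connection, then drain queued messages in order."""
        self.user_to_server[user_id] = server_id
        queue = self.offline_queue.pop(user_id, deque())
        while queue:
            self.delivered[server_id].append(queue.popleft())

    def disconnect(self, user_id: str) -> None:
        self.user_to_server.pop(user_id, None)

    def send(self, sender: str, recipient: str, body: str) -> str:
        """Forward to the recipient's server, or queue if they are offline."""
        message = (sender, recipient, body)
        server_id = self.user_to_server.get(recipient)
        if server_id is None:
            self.offline_queue[recipient].append(message)
            return "queued"
        self.delivered[server_id].append(message)
        return f"forwarded:{server_id}"


router = Router()
router.connect("alice", "ws-1")
first = router.send("alice", "bob", "hi")       # bob has no connection yet
router.connect("bob", "ws-7")                   # queued message drains to ws-7
second = router.send("alice", "bob", "again")   # now routed directly
```

In a real deployment the mapping must be removed or overwritten when a connection drops or moves during a deploy, otherwise messages get forwarded to a server that no longer holds the socket.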

The WhatsApp design question tests real-time messaging, end-to-end encryption, fan-out strategies, presence at scale, and the WebSocket vs polling trade-off. Practice explaining delivery guarantees and the group messaging architecture.