Design Real-time Collaboration (Google Docs-like)
A real-time collaborative editor lets two or more humans type into the same document and see each other's changes within ~100 ms. The hard problems are convergence (concurrent edits resolve to the same state on every client), presence (cursors and selections of other users), offline support (edits made on a plane sync correctly when WiFi returns), and scale (one shared doc with 50 active editors). The defining choice is OT vs CRDT and the storage model that follows.
Architecture
Capacity Estimation
| Metric | Value | Notes |
|---|---|---|
| Active docs | ~10 M | open in tabs at peak |
| Avg concurrent editors / doc | 2–5 | p99 ~50 |
| Op rate / active doc | 1–5/s | typing burst |
| WebSocket conns | ~30 M | doc viewers |
| Op size | ~50 B | insert/delete primitive |
| Latency budget (typing) | < 100 ms | echo latency |
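As a rough cross-check of the table, the aggregate load can be worked out directly. This is a back-of-envelope sketch that assumes the worst case where every active doc is being typed in at the same moment; the function names are illustrative only.

```python
# Figures copied from the capacity table.
ACTIVE_DOCS = 10_000_000
OPS_PER_DOC_PER_SEC = (1, 5)   # low and high end of a typing burst
OP_SIZE_BYTES = 50
WEBSOCKET_CONNS = 30_000_000

def aggregate_op_rate(ops_per_doc: int) -> int:
    """Ops per second across the fleet if every active doc is typing at once."""
    return ACTIVE_DOCS * ops_per_doc

def ingest_bandwidth_bytes(ops_per_doc: int) -> int:
    """Inbound op payload in bytes per second, ignoring framing overhead."""
    return aggregate_op_rate(ops_per_doc) * OP_SIZE_BYTES

low, high = OPS_PER_DOC_PER_SEC
print(aggregate_op_rate(low), aggregate_op_rate(high))
# 10000000 50000000
print(ingest_bandwidth_bytes(low) / 1e9, ingest_bandwidth_bytes(high) / 1e9)
# 0.5 2.5  (GB/s)
print(WEBSOCKET_CONNS / ACTIVE_DOCS)
# 3.0  (connections per doc, inside the 2-5 editor range)
```

Real traffic should sit well below this ceiling, since most open tabs are idle at any given instant.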
OT vs CRDT
Two algorithm families solve the same problem (concurrent edits converge):
- OT (Operational Transformation) — ops carry positions; the server transforms incoming ops against already-applied concurrent ones. Simple ops, requires a central authority for canonical order. Used by Google Docs.
- CRDT — each character has a unique ID; concurrent ops are reconciled by ID order. Heavier per-character metadata, but no central order required. Used by Notion, Linear, Yjs ecosystem.
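To make the OT bullet concrete, here is a minimal sketch of transforming one insert against a concurrent insert. It covers inserts only, and the `site` tie-break is an assumption of this example; production OT systems also handle deletes and formatting and are considerably more involved.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document the op was created against
    text: str
    site: str   # client ID, used only to break ties at equal positions

def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]

def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent op `against`."""
    shifts = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    return Insert(op.pos + len(against.text), op.text, op.site) if shifts else op

def both_orders(base: str, a: Insert, b: Insert):
    """Apply the two concurrent ops in either order, transforming the second."""
    a_first = apply(apply(base, a), transform(b, a))
    b_first = apply(apply(base, b), transform(a, b))
    return a_first, b_first

print(both_orders("helo", Insert(3, "l", "A"), Insert(4, "!", "B")))
# ('hello!', 'hello!')
print(both_orders("", Insert(0, "X", "A"), Insert(0, "Y", "B")))
# ('XY', 'XY')
```

Both application orders reach the same text, which is the convergence property the central server relies on when it picks a canonical order.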
For an interview today, CRDT (Yjs) is the modern default unless you have specific needs — the library is mature, the literature is settled, and offline support is natural. Pick OT only if you're emulating Google Docs and have a strict central server.
See the collaborative editing system design page for more depth on the tradeoffs.
Server-mediated Sync
The data flow:
- Client A types; local CRDT inserts; UI updates immediately (optimistic).
- Client A sends op over WebSocket to the doc server.
- Server appends to op log, broadcasts to other clients via pub/sub.
- Client B receives, applies to local CRDT, UI updates.
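The four steps above can be sketched with an in-memory stand-in for the doc server. The `DocServer` class and its callback-based `join` are hypothetical; a real deployment would put WebSocket connections and a durable log behind the same shape.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Op = dict
Deliver = Callable[[int, Op], None]   # called with (sequence number, op)

class DocServer:
    def __init__(self) -> None:
        self.op_log: Dict[str, List[Op]] = defaultdict(list)
        self.clients: Dict[str, Dict[str, Deliver]] = defaultdict(dict)

    def join(self, doc_id: str, client_id: str, deliver: Deliver) -> None:
        self.clients[doc_id][client_id] = deliver

    def submit(self, doc_id: str, sender_id: str, op: Op) -> int:
        """Append to the doc's log, then fan out to everyone but the sender."""
        log = self.op_log[doc_id]
        log.append(op)
        seq = len(log)                 # sequence numbers start at 1
        for client_id, deliver in self.clients[doc_id].items():
            if client_id != sender_id:
                deliver(seq, op)
        return seq

server = DocServer()
inbox_a, inbox_b = [], []
server.join("doc-1", "A", lambda seq, op: inbox_a.append((seq, op)))
server.join("doc-1", "B", lambda seq, op: inbox_b.append((seq, op)))

# Client A has already applied this op locally (optimistic update).
server.submit("doc-1", "A", {"type": "insert", "text": "h"})
print(inbox_a)  # []
print(inbox_b)  # [(1, {'type': 'insert', 'text': 'h'})]
```

The sender is skipped in the fan-out because its UI already reflects the edit.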
Sticky sessions: all clients of one doc connect to the same doc-server instance. Routing via consistent hash on doc_id; failover requires a replica that has been receiving the op log. This is the most operationally tricky bit — getting failover smooth without losing presence or sequence.
Conflict Resolution
With CRDT, conflict cannot occur in the algorithmic sense: every op has a unique ID and concurrent ops merge deterministically. What feels like a conflict in product terms is the user's perception:
- Two users typing in the same paragraph: intermixed text. Usually fine.
- Two users replacing the same word: both replacements survive the merge and appear side by side, in whatever order the CRDT's ID tie-break picks. Usually unexpected.
Block-level CRDTs (Notion) reduce the conflict perception by structuring the document: edits in different blocks never interleave at the character level.
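The deterministic merge can be illustrated with a toy sequence CRDT in the style of RGA. This is a sketch, not the algorithm Yjs uses, and it makes several simplifying assumptions: insert-only (no tombstones), ops arrive after the character they attach to, and a linear scan for lookups.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Id = Tuple[int, str]                # (Lamport counter, site): unique per character
Op = Tuple[Id, Optional[Id], str]   # (new id, id of the character to its left, char)

@dataclass
class Elem:
    id: Id
    ch: str

class Replica:
    def __init__(self, site: str) -> None:
        self.site = site
        self.clock = 0
        self.seq: List[Elem] = []

    def text(self) -> str:
        return "".join(e.ch for e in self.seq)

    def local_insert(self, after: Optional[Id], ch: str) -> Op:
        op: Op = ((self.clock + 1, self.site), after, ch)
        self.apply(op)
        return op

    def apply(self, op: Op) -> None:
        new_id, after, ch = op
        if any(e.id == new_id for e in self.seq):
            return                  # duplicate delivery is a no-op
        self.clock = max(self.clock, new_id[0])
        i = 0
        if after is not None:
            i = next(k for k, e in enumerate(self.seq) if e.id == after) + 1
        # Skip concurrent inserts at the same spot that carry a larger ID,
        # so every replica chooses the same slot regardless of arrival order.
        while i < len(self.seq) and self.seq[i].id > new_id:
            i += 1
        self.seq.insert(i, Elem(new_id, ch))

a, b = Replica("A"), Replica("B")
h = a.local_insert(None, "h")
i_ = a.local_insert(h[0], "i")
b.apply(h)
b.apply(i_)

op_a = a.local_insert(i_[0], "!")   # concurrent edits after the same character
op_b = b.local_insert(i_[0], "?")
a.apply(op_b)
b.apply(op_a)
b.apply(op_a)                       # a retried op changes nothing

print(a.text(), b.text())
# hi?! hi?!
```

Both users' characters survive and both replicas agree on the order, which is exactly the "intermixed text" outcome described above.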
Presence Cursors
Other users' cursors are an ephemeral channel orthogonal to ops:
- Each client publishes (user_id, cursor_anchor, cursor_head, color) on every keystroke (rate-limited to ~10 Hz).
- Server fans out to peers; not persisted.
- Anchor / head positions are CRDT-relative IDs, not byte offsets — otherwise remote inserts move your displayed cursor under you.
An active selection (highlight) is the range (anchor, head); it is just as transient and travels over the same channel.
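The ~10 Hz rate limit can be sketched as a small client-side throttle that keeps only the newest cursor state. `PresenceThrottle` and its `send` callback are hypothetical names; the injectable `clock` exists so the behaviour can be exercised without sleeping.

```python
import time
from typing import Callable, Optional

Presence = dict   # e.g. user_id, cursor_anchor, cursor_head, color

class PresenceThrottle:
    def __init__(
        self,
        send: Callable[[Presence], None],
        max_hz: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._min_interval = 1.0 / max_hz
        self._clock = clock
        self._last_sent: Optional[float] = None
        self._pending: Optional[Presence] = None

    def update(self, presence: Presence) -> None:
        """Record the newest cursor state; older unsent states are dropped."""
        self._pending = presence
        self.flush()

    def flush(self) -> bool:
        """Send the pending state if the rate limit allows. Call from a timer too."""
        if self._pending is None:
            return False
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < self._min_interval:
            return False
        self._send(self._pending)
        self._pending = None
        self._last_sent = now
        return True

now = [0.0]
sent = []
throttle = PresenceThrottle(sent.append, clock=lambda: now[0])
throttle.update({"cursor_head": 1})   # sent immediately
now[0] = 0.05
throttle.update({"cursor_head": 2})   # held back, too soon
throttle.update({"cursor_head": 3})   # replaces the held update
now[0] = 0.2
throttle.flush()                      # timer tick sends the newest state
print(sent)
# [{'cursor_head': 1}, {'cursor_head': 3}]
```

Dropping intermediate positions is acceptable here because presence is ephemeral: only the latest cursor matters to peers.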
Offline Support
The CRDT's killer feature: edits made offline merge correctly on reconnect. Implementation:
- Client buffers ops locally (IndexedDB / LocalStorage) tagged with a state vector indicating what it's seen from each peer.
- On reconnect, client sends its state vector; server returns ops the client missed; client applies; client sends its buffered ops.
- UX: visually show a "syncing" indicator; allow editing during sync.
Edits made during a long offline stretch (days) still merge correctly; the only thing that breaks is user expectation when 1000 of someone else's ops come flooding in.
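The reconnect handshake can be sketched with state vectors as plain dictionaries. This assumes each site numbers its ops 1, 2, 3, ... with no gaps, so "highest counter seen" fully describes what a peer has. The names are illustrative, and the client's buffered ops are found here by diffing against the server's vector, which is equivalent when the buffer holds exactly the unsent ops.

```python
from typing import Dict, List, TypedDict

class Op(TypedDict):
    site: str
    counter: int   # per-site, starts at 1, no gaps
    payload: str

StateVector = Dict[str, int]   # site -> highest counter seen from that site

def state_vector(ops: List[Op]) -> StateVector:
    sv: StateVector = {}
    for op in ops:
        sv[op["site"]] = max(sv.get(op["site"], 0), op["counter"])
    return sv

def missing_ops(ops: List[Op], remote_sv: StateVector) -> List[Op]:
    """Ops in `ops` that a peer with `remote_sv` has not seen yet."""
    return [op for op in ops if op["counter"] > remote_sv.get(op["site"], 0)]

def reconnect(client_ops: List[Op], server_ops: List[Op]):
    """One sync round; returns the merged logs as (client, server)."""
    to_client = missing_ops(server_ops, state_vector(client_ops))
    to_server = missing_ops(client_ops, state_vector(server_ops))
    return client_ops + to_client, server_ops + to_server

def op(site: str, counter: int) -> Op:
    return {"site": site, "counter": counter, "payload": f"{site}{counter}"}

# Client A went offline after seeing B1, then typed A3 and A4 on the plane.
client = [op("A", 1), op("A", 2), op("B", 1), op("A", 3), op("A", 4)]
server = [op("A", 1), op("A", 2), op("B", 1), op("B", 2), op("B", 3)]

client, server = reconnect(client, server)
print(state_vector(client) == state_vector(server) == {"A": 4, "B": 3})
# True
```

Only the ops each side lacks cross the wire, so the cost of a reconnect scales with the time spent offline rather than with document size.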
Storage: Op Log + Snapshots
- Op log — append-only stream per doc, indexed by sequence number. Postgres table or S3-backed write-ahead log.
- Snapshot — materialized doc state at op N; speeds up cold-load. Write a new snapshot every 1000 ops; older ops can be archived.
- Garbage collection — CRDT tombstones (deleted character metadata) accumulate; periodic compaction reduces size.
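A minimal in-memory sketch of the op log plus snapshot pattern follows. A plain string stands in for the CRDT state and the op shapes are invented for the example; the point is only how a cold load picks the nearest snapshot and replays the tail.

```python
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, int, object]   # ("ins", pos, text) or ("del", pos, length)

def apply_op(text: str, op: Op) -> str:
    kind, pos, arg = op
    if kind == "ins":
        return text[:pos] + arg + text[pos:]
    return text[:pos] + text[pos + arg:]

class DocStore:
    def __init__(self, snapshot_every: int = 1000) -> None:
        self.snapshot_every = snapshot_every
        self.ops: List[Op] = []                   # ops[i] has sequence number i + 1
        self.snapshots: Dict[int, str] = {0: ""}  # seq -> materialized state

    def append(self, op: Op) -> int:
        self.ops.append(op)
        seq = len(self.ops)
        if seq % self.snapshot_every == 0:
            self.snapshots[seq] = self.load(seq)
        return seq

    def load(self, upto: Optional[int] = None) -> str:
        """State at sequence `upto` (latest by default): nearest snapshot + replay."""
        upto = len(self.ops) if upto is None else upto
        base = max(s for s in self.snapshots if s <= upto)
        text = self.snapshots[base]
        for op in self.ops[base:upto]:
            text = apply_op(text, op)
        return text

store = DocStore(snapshot_every=3)   # tiny interval so the demo takes a snapshot
store.append(("ins", 0, "abc"))
store.append(("ins", 3, "def"))
store.append(("del", 0, 1))          # seq 3: snapshot "bcdef" is written here
store.append(("ins", 5, "!"))

print(store.load())        # bcdef!
print(store.load(upto=2))  # abcdef
print(store.snapshots)     # {0: '', 3: 'bcdef'}
```

Loading at an earlier sequence number uses the same path, which is also how historical versions can be rendered from the log.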
Failure Modes
- Server failover loses presence — new server has no presence state. Re-broadcast presence on reconnect; eventually consistent.
- Op duplication — client retries on flaky network; CRDT idempotency saves you (same op ID, applied once).
- Snapshot corruption — reconstruct from full op log; expensive but correct.
- Hot doc — viral collaborative doc with 500 editors; one server saturates. Limit concurrent editors per doc; spread reads to read replicas.
Scaling: Sharding Docs Across Servers
One doc-server holds all clients of one doc — that is the sticky-session constraint. To scale to 10 M concurrent docs, shard:
- Consistent hash by doc_id: place servers on a hash ring and route each doc to the first server at or after hash(doc_id). Plain hash mod N would remap almost every doc whenever N changes. Adding a node moves ~1/N docs; the moving docs experience a ~10 s reconnect during migration.
- Per-doc ownership lease: the server holding a doc has a lease in etcd; on lease loss, ownership migrates. Prevents two servers from owning the same doc.
- Read replicas: for view-only viewers (read-only ACL), serve from a follower that subscribes to the leader's op stream. Offloads the leader for high-readership docs.
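The ~1/N claim holds for a hash ring but not for a plain `hash % N`, and a short experiment shows the difference. This is a sketch with virtual nodes; the server names, the vnode count of 100, and the use of SHA-256 are assumptions of the example.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple

def h64(key: str) -> int:
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self.ring: List[Tuple[int, str]] = []   # sorted (point on ring, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self.ring, (h64(f"{node}#{i}"), node))

    def owner(self, doc_id: str) -> str:
        """First ring point at or after the doc's hash, wrapping around."""
        i = bisect.bisect_left(self.ring, (h64(doc_id), ""))
        return self.ring[i % len(self.ring)][1]

docs = [f"doc-{i}" for i in range(10_000)]
ring = HashRing(f"server-{i}" for i in range(10))
before = {d: ring.owner(d) for d in docs}

ring.add("server-10")
moved = [d for d in docs if ring.owner(d) != before[d]]
mod_moved = [d for d in docs if h64(d) % 10 != h64(d) % 11]

print(len(moved) / len(docs))       # roughly 1/11 of docs move on the ring
print(len(mod_moved) / len(docs))   # roughly 10/11 move with hash mod N
print(all(ring.owner(d) == "server-10" for d in moved))   # True
```

Every doc that moves lands on the new server, so existing servers never trade docs among themselves during a scale-out.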
Cross-server fan-out is via Redis pub/sub: when server A processes an op for doc X, it publishes on doc:X; servers B, C, D with viewers of doc X subscribe and forward to their connected clients. Most docs are single-server so this rarely fires.
Comments and Suggestions
Beyond the document body, real collab products have comments anchored to ranges, suggestions (proposed edits the doc owner accepts/rejects), and revision history. Each is a separate CRDT structure:
- Comments: thread tree anchored to a CRDT range; the range tracks across edits via the same anchor IDs.
- Suggestions: proposed insertions/deletions held in a parallel pending state; on accept, applied as ops; on reject, discarded.
- Named versions: snapshot the op-log seq at the moment a user clicks "save version"; expose as a navigable history.
FAQ
How do you handle large pastes?
One bulk-insert op containing the pasted text; modern CRDTs compress consecutive inserts — size impact is < 2× the byte length.
What about images and embeds?
Block-level CRDT: images are blocks with metadata. The image bytes go to S3; the block stores a reference. Inline images in text are awkward; most editors avoid them.
Permissions / sharing model?
Per-doc ACL: owner, editor, commenter, viewer. Enforced at the WebSocket auth layer and op-acceptance layer. Read-only viewers receive ops but cannot send.
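The op-acceptance check can be sketched as an ordered role comparison. The action names and the `MIN_ROLE` table are assumptions of this example, chosen to mirror the four roles listed above.

```python
from enum import IntEnum
from typing import Dict

class Role(IntEnum):
    VIEWER = 1
    COMMENTER = 2
    EDITOR = 3
    OWNER = 4

# Lowest role allowed to perform each action (hypothetical action names).
MIN_ROLE = {
    "receive_ops": Role.VIEWER,
    "comment": Role.COMMENTER,
    "send_op": Role.EDITOR,
}

def allowed(acl: Dict[str, Role], user_id: str, action: str) -> bool:
    """True if the user is on the doc's ACL with a high enough role."""
    role = acl.get(user_id)
    return role is not None and role >= MIN_ROLE[action]

acl = {"ana": Role.OWNER, "bo": Role.EDITOR, "cy": Role.VIEWER}
print(allowed(acl, "cy", "receive_ops"))  # True
print(allowed(acl, "cy", "send_op"))      # False
print(allowed(acl, "bo", "send_op"))      # True
print(allowed(acl, "dee", "receive_ops")) # False
```

Running the same check at both the WebSocket handshake and on each incoming op means a role downgrade takes effect even on connections that are already open.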
Audit / version history?
The op log is the audit trail. Render historical snapshots by replaying ops up to a target seq. Naming versions is a separate UX feature on top.