Architecture

[Diagram: Editor A and Editor B connect over WebSocket to a sticky doc server (transform / merge, ordered op log, snapshot writer, presence channel). The doc server is backed by Postgres / S3 (snapshots + op log) and Redis pub/sub (cross-server fan-out).]

Capacity Estimation

Metric | Value | Notes
Active docs | ~10 M | open in tabs at peak
Avg concurrent editors / doc | 2–5 | p99 ~50
Op rate / active doc | 1–5/s | typing burst
WebSocket conns | ~30 M | doc viewers
Op size | ~50 B | insert/delete primitive
Latency budget (typing) | < 100 ms | echo latency
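
As a rough cross-check of these figures, the aggregate op rate and raw op bandwidth can be computed directly. The sketch below assumes every active doc is receiving ops at the same moment, which is a deliberate worst case and not a number from the table.

```python
# Back-of-envelope numbers taken from the capacity table.
ACTIVE_DOCS = 10_000_000          # ~10 M active docs
OPS_PER_DOC_PER_SEC = (1, 5)      # 1 to 5 ops/s per active doc
OP_SIZE_BYTES = 50                # ~50 B per op
WEBSOCKET_CONNS = 30_000_000      # ~30 M connections


def peak_ops_per_sec(rate: int) -> int:
    """Worst case (assumption): every active doc is being typed in at once."""
    return ACTIVE_DOCS * rate


def ingress_bytes_per_sec(rate: int) -> int:
    """Raw op payload arriving at the doc servers, before any fan-out."""
    return peak_ops_per_sec(rate) * OP_SIZE_BYTES


low, high = OPS_PER_DOC_PER_SEC
print(f"ops/s: {peak_ops_per_sec(low):,} to {peak_ops_per_sec(high):,}")
print(f"ingress GB/s: {ingress_bytes_per_sec(low) / 1e9:.1f} to "
      f"{ingress_bytes_per_sec(high) / 1e9:.1f}")
print(f"connections per doc: {WEBSOCKET_CONNS / ACTIVE_DOCS:.0f}")
```

Under that assumption the inbound stream is 10 to 50 M ops/s, or 0.5 to 2.5 GB/s of op payload, and the connection count works out to about 3 per doc, which is consistent with the 2–5 editor average.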

OT vs CRDT

Two algorithm families solve the same problem (concurrent edits converge):

  • OT (Operational Transformation) — ops carry positions; the server transforms incoming ops against already-applied concurrent ones. Simple ops, requires a central authority for canonical order. Used by Google Docs.
  • CRDT — each character has a unique ID; concurrent ops are reconciled by ID order. Heavier per-character metadata, but no central order required. Used by Notion, Linear, Yjs ecosystem.
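
To make the OT bullet concrete, here is a minimal sketch of transforming one insert against a concurrent insert. It covers inserts only; the `Insert` type, the `site` tie-breaker, and the function names are illustrative assumptions, not the API of any particular OT library.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Insert:
    pos: int    # character offset in the document
    text: str
    site: str   # hypothetical client ID, used only to break ties


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent op `against`."""
    lands_before = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if lands_before:
        # The other insert shifted our target position to the right.
        return replace(op, pos=op.pos + len(against.text))
    return op


doc = "abc"
a = Insert(pos=1, text="X", site="A")
b = Insert(pos=2, text="Y", site="B")

# Order 1: a first, then b transformed against a.
via_a = apply(apply(doc, a), transform(b, a))
# Order 2: b first, then a transformed against b.
via_b = apply(apply(doc, b), transform(a, b))
assert via_a == via_b == "aXbYc"
```

Both application orders converge because each side shifts the later op by the length of the earlier one. A production OT system also needs delete and mixed insert/delete transforms, which is where most of the complexity sits.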

For an interview today, CRDT (Yjs) is the modern default unless you have specific needs: the library is mature, the approach is well documented, and offline support comes naturally. Pick OT only if you're emulating Google Docs and have a strict central server.

See the collaborative editing system design page for more depth on the tradeoffs.

Server-mediated Sync

The data flow:

  1. Client A types; local CRDT inserts; UI updates immediately (optimistic).
  2. Client A sends op over WebSocket to the doc server.
  3. Server appends to op log, broadcasts to other clients via pub/sub.
  4. Client B receives, applies to local CRDT, UI updates.
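
Steps 2 to 4 can be sketched as a single-document server held in memory. This is an illustration under stated assumptions: `DocServer`, the `Send` callback (standing in for a WebSocket send), and the entry layout are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable

Send = Callable[[dict], None]   # stand-in for a WebSocket send function


@dataclass
class DocServer:
    """In-memory sketch of the server side of the flow for one document."""
    op_log: list[dict] = field(default_factory=list)
    clients: dict[str, Send] = field(default_factory=dict)

    def connect(self, client_id: str, send: Send) -> None:
        self.clients[client_id] = send

    def receive(self, sender_id: str, op: dict) -> int:
        # Append to the ordered op log; the position is the sequence number.
        seq = len(self.op_log) + 1
        entry = {"seq": seq, "from": sender_id, "op": op}
        self.op_log.append(entry)
        # Broadcast to every other connected client.
        for client_id, send in self.clients.items():
            if client_id != sender_id:
                send(entry)
        return seq


server = DocServer()
inbox_a: list[dict] = []
inbox_b: list[dict] = []
server.connect("A", inbox_a.append)
server.connect("B", inbox_b.append)
server.receive("A", {"type": "insert", "text": "h"})
```

The sender is skipped in the broadcast because it already applied the op optimistically in step 1.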

Sticky sessions: all clients of one doc connect to the same doc-server instance. Routing via consistent hash on doc_id; failover requires a replica that has been receiving the op log. This is the most operationally tricky bit — getting failover smooth without losing presence or sequence.

Conflict Resolution

With CRDT, conflict cannot occur in the algorithmic sense: every op has a unique ID and concurrent ops merge deterministically. What feels like a conflict in product terms is the user's perception:

  • Two users typing in the same paragraph — intermixed text. Usually fine.
  • Two users replacing the same word: both deletions apply and both replacement words survive, side by side in ID order. Usually unexpected.
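
The "same word replaced twice" case can be reproduced with a toy sequence CRDT. The sketch below is a simplified RGA-style structure written for illustration; it is not how Yjs integrates items internally, and the `(counter, site)` ID layout and op tuples are assumptions.

```python
ROOT = (0, "")   # sentinel ID that precedes every character


class ToySequenceCrdt:
    """Toy RGA-style sequence; IDs are (counter, site) tuples."""

    def __init__(self) -> None:
        self.chars: dict[tuple, str] = {}
        self.children: dict[tuple, list[tuple]] = {}
        self.tombstones: set[tuple] = set()

    def apply(self, op: tuple) -> None:
        kind, op_id, *rest = op
        if kind == "del":
            self.tombstones.add(op_id)        # deleting twice is harmless
        elif op_id not in self.chars:         # duplicate inserts are ignored
            parent, ch = rest
            self.chars[op_id] = ch
            siblings = self.children.setdefault(parent, [])
            siblings.append(op_id)
            siblings.sort(reverse=True)       # deterministic order by ID

    def text(self) -> str:
        out: list[str] = []

        def walk(node: tuple) -> None:
            for child in self.children.get(node, []):
                if child not in self.tombstones:
                    out.append(self.chars[child])
                walk(child)

        walk(ROOT)
        return "".join(out)


def chain(word: str, site: str, start: int) -> list[tuple]:
    """Insert ops for `word` typed left to right at the start of the doc."""
    ops, parent = [], ROOT
    for offset, ch in enumerate(word):
        op_id = (start + offset, site)
        ops.append(("ins", op_id, parent, ch))
        parent = op_id
    return ops


def replay(ops: list[tuple]) -> str:
    replica = ToySequenceCrdt()
    for op in ops:
        replica.apply(op)
    return replica.text()


base = chain("cat", "S", 1)
delete_cat = [("del", (n, "S")) for n in (1, 2, 3)]
alice = delete_cat + chain("dog", "A", 4)   # Alice replaces "cat" with "dog"
bob = delete_cat + chain("cow", "B", 4)     # Bob replaces "cat" with "cow"

one = replay(base + alice + bob)
two = replay(base + bob + alice + alice)    # other order, with a duplicate delivery
assert one == two == "cowdog"
```

Both replicas converge on the same text regardless of delivery order, so there is no algorithmic conflict, but neither user asked for "cowdog".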

Block-level CRDTs (Notion) reduce the conflict perception by structuring the document: edits in different blocks never interleave at the character level.

Presence Cursors

Other users' cursors are an ephemeral channel orthogonal to ops:

  • Each client publishes (user_id, cursor_anchor, cursor_head, color) on every keystroke (rate-limited to ~10 Hz).
  • Server fans out to peers; not persisted.
  • Anchor / head positions are CRDT-relative IDs, not byte offsets — otherwise remote inserts move your displayed cursor under you.

Active selection (highlight) is a range (anchor, head); it is transient and travels over the same channel.
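
A short sketch of why ID-relative cursor positions survive remote inserts. The `Presence` fields follow the tuple in the list above; the ID format and `to_offset` helper are hypothetical, and a real implementation would also have to resolve anchors that point at deleted characters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Presence:
    user_id: str
    anchor_id: tuple   # ID of the character the cursor sits before
    head_id: tuple     # equals anchor_id for a plain caret
    color: str


def to_offset(visible_ids: list[tuple], char_id: tuple) -> int:
    """Resolve an ID-based cursor position to a local offset for rendering."""
    return visible_ids.index(char_id)


# Document "abc" as an ordered list of hypothetical character IDs.
ids = [(1, "S"), (2, "S"), (3, "S")]
bob = Presence("bob", anchor_id=(3, "S"), head_id=(3, "S"), color="#e91e63")
before = to_offset(ids, bob.anchor_id)

# A remote user inserts two characters at the start of the document.
ids = [(4, "A"), (5, "A")] + ids
after = to_offset(ids, bob.anchor_id)
```

The rendered offset moves from 2 to 4, but the cursor still sits before the same character. A stored integer offset of 2 would now point two characters too far left.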

Offline Support

The CRDT's killer feature: edits made offline merge correctly on reconnect. Implementation:

  • Client buffers ops locally (IndexedDB / LocalStorage) tagged with a state vector indicating what it's seen from each peer.
  • On reconnect, client sends its state vector; server returns ops the client missed; client applies; client sends its buffered ops.
  • UX: visually show a "syncing" indicator; allow editing during sync.
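
The reconnect handshake reduces to a set difference driven by state vectors. The sketch below uses a plain dict from site to highest counter seen; it mirrors the idea only and does not use the Yjs wire format or API. `Op`, `state_vector`, and `missing` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    site: str      # which client created the op
    counter: int   # per-site counter, starting at 1
    payload: str


StateVector = dict[str, int]   # site -> highest counter seen from that site


def state_vector(ops: list[Op]) -> StateVector:
    sv: StateVector = {}
    for op in ops:
        sv[op.site] = max(sv.get(op.site, 0), op.counter)
    return sv


def missing(ops: list[Op], seen: StateVector) -> list[Op]:
    """Ops in `ops` that the holder of `seen` has not received yet."""
    return [op for op in ops if op.counter > seen.get(op.site, 0)]


server_ops = [Op("A", 1, "h"), Op("B", 1, "x"), Op("B", 2, "y")]
# Client A made ops 2 and 3 while offline.
client_ops = [Op("A", 1, "h"), Op("A", 2, "i"), Op("A", 3, "!")]

to_client = missing(server_ops, state_vector(client_ops))   # server -> client
to_server = missing(client_ops, state_vector(server_ops))   # client -> server
```

Here the server sends B's two ops and the client uploads its two buffered ops; nothing already known to the other side crosses the wire.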

Edits made during a long offline period (days) still merge correctly. The only thing that breaks is user expectation when 1000 of someone else's ops come flooding in.

Storage: Op Log + Snapshots

  • Op log — append-only stream per doc, indexed by sequence number. Postgres table or S3-backed write-ahead log.
  • Snapshot — materialized doc state at op N; speeds up cold-load. Write a new snapshot every 1000 ops; older ops can be archived.
  • Garbage collection — CRDT tombstones (deleted character metadata) accumulate; periodic compaction reduces size.
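
The interaction between the op log and snapshots can be sketched as "load the newest snapshot at or before the target, then replay the tail". In this illustration each op simply appends text, which is a simplification; `DocStore` and its fields are hypothetical stand-ins for the Postgres / S3 storage.

```python
from dataclasses import dataclass, field
from typing import Optional

SNAPSHOT_EVERY = 1000   # interval from the text; tune per workload


@dataclass
class DocStore:
    """In-memory stand-in for the op-log table and the snapshot bucket."""
    snapshot_every: int = SNAPSHOT_EVERY
    op_log: list[str] = field(default_factory=list)           # op at index i has seq i + 1
    snapshots: dict[int, str] = field(default_factory=dict)   # seq -> materialized text

    def append(self, text: str) -> int:
        self.op_log.append(text)
        seq = len(self.op_log)
        if seq % self.snapshot_every == 0:
            self.snapshots[seq] = self.load(seq)
        return seq

    def load(self, upto: Optional[int] = None) -> str:
        """Materialize the doc at seq `upto` (default: latest)."""
        target = len(self.op_log) if upto is None else upto
        base_seq = max((s for s in self.snapshots if s <= target), default=0)
        state = self.snapshots.get(base_seq, "")
        # Replay only the tail of the log after the chosen snapshot.
        for op in self.op_log[base_seq:target]:
            state += op
        return state


store = DocStore(snapshot_every=3)   # small interval so the demo snapshots twice
for ch in "abcdefgh":
    store.append(ch)
```

Passing an explicit `upto` gives the historical view at that sequence number, which is the same mechanism a version-history feature would rely on.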

Failure Modes

  • Server failover loses presence — new server has no presence state. Re-broadcast presence on reconnect; eventually consistent.
  • Op duplication — client retries on flaky network; CRDT idempotency saves you (same op ID, applied once).
  • Snapshot corruption — reconstruct from full op log; expensive but correct.
  • Hot doc — viral collaborative doc with 500 editors; one server saturates. Limit concurrent editors per doc; spread reads to read replicas.
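
For the op-duplication case, the same idempotency can be enforced at the server's log so a retried op is stored once. A minimal sketch, assuming each op carries a unique ID string; `DedupingLog` and the `"A:17"` ID format are hypothetical.

```python
class DedupingLog:
    """Append-only log that ignores an op ID it has already stored."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str]] = []
        self._seen: set[str] = set()

    def append(self, op_id: str, payload: str) -> bool:
        if op_id in self._seen:
            return False          # retry of an op we already have
        self._seen.add(op_id)
        self.entries.append((op_id, payload))
        return True


log = DedupingLog()
first = log.append("A:17", "insert 'x'")
retry = log.append("A:17", "insert 'x'")   # same op resent after a timeout
```

The return value lets the server acknowledge the retry without rebroadcasting it.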

Scaling: Sharding Docs Across Servers

One doc-server holds all clients of one doc — that is the sticky-session constraint. To scale to 10 M concurrent docs, shard:

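  • Consistent hash by doc_id: route via a hash ring rather than plain hash mod N, which would remap most docs whenever N changes. Adding a node to a ring of N moves roughly 1/(N+1) of the docs; each moved doc sees a ~10 s reconnect during migration.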
  • Per-doc ownership lease — the server holding a doc has a lease in etcd; on lease loss, ownership migrates. Prevents two servers from owning the same doc.
  • Read replicas — for view-only viewers (read-only ACL), serve from a follower that subscribes to the leader's op stream. Offloads the leader for high-readership docs.
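
A minimal hash-ring sketch shows the routing property the first bullet depends on: when a node is added, only a small fraction of docs change owner, and all of them move to the new node. The virtual-node count, server names, and MD5-based hash are illustrative choices, not requirements.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes: list[str], vnodes: int = 100) -> None:
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def owner(self, doc_id: str) -> str:
        # First ring point clockwise from the doc's hash, wrapping around.
        i = bisect.bisect_right(self._points, _hash(doc_id)) % len(self._ring)
        return self._ring[i][1]


docs = [f"doc-{n}" for n in range(2000)]
before = HashRing(["s1", "s2", "s3", "s4"])
after = HashRing(["s1", "s2", "s3", "s4", "s5"])

moved = [d for d in docs if before.owner(d) != after.owner(d)]
moved_fraction = len(moved) / len(docs)

# For contrast: plain modulo routing when going from 4 to 5 servers.
moved_mod = sum(1 for d in docs if _hash(d) % 4 != _hash(d) % 5) / len(docs)
```

With the ring, `moved_fraction` lands near one fifth; with modulo routing, `moved_mod` is around four fifths, which would force most open docs to reconnect at once.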

Cross-server fan-out is via Redis pub/sub: when server A processes an op for doc X, it publishes on doc:X; servers B, C, D with viewers of doc X subscribe and forward to their connected clients. Most docs are single-server so this rarely fires.
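
The fan-out path can be sketched with an in-memory stand-in for Redis pub/sub, so the example runs without a Redis server. `InMemoryPubSub` is a hypothetical class; only the `doc:X` channel naming comes from the text. Its `publish` returns the number of receivers, similar in spirit to what Redis `PUBLISH` reports.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class InMemoryPubSub:
    """Stand-in for Redis pub/sub; not a Redis client."""

    def __init__(self) -> None:
        self._subs: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subs[channel].append(handler)

    def publish(self, channel: str, message: dict) -> int:
        handlers = self._subs.get(channel, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


def channel_for(doc_id: str) -> str:
    return f"doc:{doc_id}"


bus = InMemoryPubSub()
viewers_on_b: list[dict] = []   # ops server B forwards to its connected viewers
bus.subscribe(channel_for("X"), viewers_on_b.append)

# Server A owns doc X, processes an op, and publishes it.
receivers = bus.publish(channel_for("X"), {"seq": 1, "op": "insert"})
# Doc Y has no remote viewers, so the publish reaches nobody.
idle = bus.publish(channel_for("Y"), {"seq": 1, "op": "insert"})
```

The second publish illustrates the common case: a doc served entirely by one server has no subscribers on its channel.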

Comments and Suggestions

Beyond the document body, real collab products have comments anchored to ranges, suggestions (proposed edits the doc owner accepts/rejects), and revision history. Each is a separate structure maintained alongside the body CRDT:

  • Comments: thread tree anchored to a CRDT range; the range tracks across edits via the same anchor IDs.
  • Suggestions: proposed insertions/deletions held in a parallel pending state; on accept, applied as ops; on reject, discarded.
  • Named versions: snapshot the op-log seq at the moment a user clicks "save version"; expose as a navigable history.
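
The suggestion lifecycle from the list above can be sketched as a pending map beside the applied ops. `SuggestionQueue`, the string op format, and the method names are hypothetical and only illustrate the accept/reject split.

```python
from dataclasses import dataclass, field


@dataclass
class SuggestionQueue:
    applied: list[str] = field(default_factory=list)               # ops that reached the doc
    pending: dict[str, list[str]] = field(default_factory=dict)    # suggestion id -> proposed ops

    def propose(self, suggestion_id: str, ops: list[str]) -> None:
        self.pending[suggestion_id] = list(ops)

    def accept(self, suggestion_id: str) -> None:
        # On accept, the held ops are applied like any other ops.
        self.applied.extend(self.pending.pop(suggestion_id))

    def reject(self, suggestion_id: str) -> None:
        # On reject, the held ops are discarded and never touch the doc.
        self.pending.pop(suggestion_id)


queue = SuggestionQueue()
queue.propose("s1", ["insert 'very '"])
queue.propose("s2", ["delete 'really '"])
queue.accept("s1")
queue.reject("s2")
```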

FAQ

How do you handle large pastes?

One bulk-insert op containing the pasted text; modern CRDTs compress consecutive inserts — size impact is < 2× the byte length.

What about images and embeds?

Block-level CRDT: images are blocks with metadata. The image bytes go to S3; the block stores a reference. Inline images in text are awkward; most editors avoid them.

Permissions / sharing model?

Per-doc ACL: owner, editor, commenter, viewer. Enforced at the WebSocket auth layer and op-acceptance layer. Read-only viewers receive ops but cannot send.
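
A minimal sketch of the op-acceptance check. The four roles come from the text; the split into "edit" and "comment" op kinds, and the assumption that commenters may send comment ops but not edits, are illustrative.

```python
from enum import Enum


class Role(Enum):
    OWNER = "owner"
    EDITOR = "editor"
    COMMENTER = "commenter"
    VIEWER = "viewer"


CAN_EDIT = {Role.OWNER, Role.EDITOR}
CAN_COMMENT = CAN_EDIT | {Role.COMMENTER}   # assumption: editors can also comment


def accept_op(role: Role, op_kind: str) -> bool:
    """Decide whether an incoming op from a client with `role` is accepted."""
    if op_kind == "edit":
        return role in CAN_EDIT
    if op_kind == "comment":
        return role in CAN_COMMENT
    return False   # unknown op kinds are rejected


viewer_edit = accept_op(Role.VIEWER, "edit")
editor_edit = accept_op(Role.EDITOR, "edit")
commenter_comment = accept_op(Role.COMMENTER, "comment")
```

Running this check at op acceptance, in addition to the WebSocket auth layer, means a role downgraded mid-session stops being able to write even on an already open connection.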

Audit / version history?

The op log is the audit trail. Render historical snapshots by replaying ops up to a target seq. Naming versions is a separate UX feature on top.