Architecture

Feed / app Graph API edges, assoc, ranges Edge cache (TAO) leader-follower per shard writes through leader Sharded MySQL edge_list, assoc tables Graph DB JanusGraph / Neo4j Recommendations FoF, common interest, batch + online Graph compute Pregel / GraphX PageRank, communities

Capacity Estimation

MetricValueNotes
Users~3 BFacebook-scale
Edges~500 Bfollowers + interests + actions
Edge reads/s~50 Mcache-hit dominated
Edge writes/s~500 Kfollows, likes, comments
Avg degree~150per-user friends
p99 degree~50 Mcelebrities
Storage~50 TBjust edges, with metadata

Storage: TAO, JanusGraph, Neo4j

TAO (Facebook's "The Associations and Objects") is the canonical example: a write-through cache layer over sharded MySQL, exposing two primitives — objects (typed nodes) and associations (typed directed edges with metadata). Reads hit a leader-follower cache hierarchy; writes go through the shard's leader, then asynchronously to followers. TAO trades strict consistency for read latency and bandwidth efficiency — the eventual-consistency window is < 1 s in practice.

JanusGraph is open-source, pluggable storage (Cassandra, HBase) and indexing (Elasticsearch). Strong on Gremlin traversal. Operationally heavier than TAO but you do not have to build it.

Neo4j is single-master, ACID, expressive Cypher. Fits ~10 B-edge graphs on one beefy box; horizontal scale via fabric is a recent and rough capability. Best for analytical graph queries on enterprise-scale data, not consumer social.

Practical lesson: most consumer-scale social graphs end up looking like TAO regardless of starting choice — an edge-list table sharded by source vertex, with a write-through cache. The "graph DB" label matters less than the access pattern.

Shortest Path and Friend-of-Friend

Six-degrees queries on a billion-node graph cannot do online BFS. The trick is bidirectional BFS: expand from both endpoints simultaneously, terminate when frontiers meet. Cuts the search space from bd to 2 · bd/2. For social graphs at average degree 150 and depth 4, that is 1504=506M vs 2×1502=45K nodes touched.

Friend-of-friend is the friendly cousin: enumerate neighbors of neighbors, intersect with a candidate set, score, return top-k. The cost is dominated by celebrity edges (see below); production systems precompute FoF for the top fraction of users via batch Spark jobs.

Graph Partitioning

You cannot fit the graph on one machine. Partition by node: each shard owns a subset of users + their outgoing edges. Then the question is which partitioning:

  • Hash by user_id — uniform load, but every traversal crosses many shards.
  • Community-based (METIS, Louvain) — minimizes cross-shard edges but expensive to compute and rebalance as the graph evolves.
  • Geography — users by country / region. Locality is good for in-region queries; falls apart for international friend graphs.

Most production systems hash and accept the cross-shard traversal cost, because the alternative requires global recomputation as users join/leave.

The Celebrity Problem

One user with 100 M followers means one shard has 100 M edges anchored at that user. Three flavors of pain:

  • Read amplification — "is X following Y?" for celebrity Y hits a giant edge list. Maintain a reverse index sharded by follower so the lookup is O(1) on the follower side.
  • Fan-out on write — celebrity posts; you need to deliver to 100 M home feeds. Pure push fan-out blocks the writer for tens of minutes. Hybrid: push to most users, pull for celebrity-followed users (their feed is assembled at read time by querying celebrities they follow).
  • Hot shard — the celebrity's shard is hammered by every "who follows X?" query. Replicate hot user records to multiple shards; route queries randomly.

Empirical heuristic from Twitter's feed design: anyone with > 10 K followers is treated as a celebrity for fan-out purposes.

Reciprocal vs Directional Follows

Facebook (friends) is reciprocal: a friendship is two directed edges, both must agree. Twitter (followers) is directional: A follows B without B's consent. Storage:

  • Reciprocal: store as a single edge per pair (A–B) keyed canonically (smaller-id first); a friend list is a range query on either endpoint.
  • Directional: store two edge lists per user — following (out-edges) and followers (in-edges). Doubles storage; halves query cost on common reads.

Mixed graphs (LinkedIn: connections + follows) keep the two relationships in separate tables; never overload one column.

Failure Modes

  • Replication lag spike — user adds friend, refreshes, friend not visible. Read-your-writes via session-pinning to the leader for that user's window.
  • Mass unfollow campaign — bot army unfollows a celebrity; 10 M edge deletes pile up on one shard. Throttle delete throughput per target.
  • Schema migration on edges — adding a column to edge_list across hundreds of shards. Online schema-change tools (gh-ost, pt-osc) are mandatory.
  • Garbage collection — dead accounts leave dangling edges. Background sweep marks orphans; a soft-delete window protects against false positives.

FAQ

Should I use a graph database?

Only if your queries are deeply traversal-heavy (5+ hops, complex pattern matching). For "list my friends" and "do A and B share a friend?", a sharded relational store with a cache (TAO-style) is simpler and faster.

How do you do PageRank on a social graph?

Offline. Pregel/Spark GraphX once daily; serve the score as a feature in ranking. Online recomputation is infeasible.

What about end-to-end consistency?

Within a session, pin reads to the leader for < 1 minute after a write. Cross-session, accept eventual consistency — users do not notice 1 s lag on a follow.

How does fan-out interact with feed ranking?

Fan-out delivers candidate posts; ranking happens at read time over the candidate set. Separating production (fan-out) from consumption (ranking) lets you change ranking without re-fanning history.