Design Twitter / X

Tweet Fanout Strategies, Timeline Generation, Search Indexing, Capacity Estimation, and Hybrid Push-Pull Architecture

The design below assumes Twitter/X-scale traffic: roughly 500M daily active users and 500M+ tweets per day. The core challenges are choosing between push (fan-out on write) and pull (fan-out on read) for timeline delivery, handling the celebrity problem where a single account may have 100M+ followers, building a real-time search index over the full tweet corpus, designing a timeline cache in Redis that serves home timelines in under 50ms, and scaling a system where the read-to-write ratio exceeds 600:1. The hybrid fanout approach (push for normal users, pull for celebrities) is the defining architectural insight.

Tweet Fanout Simulator

When a user tweets, how does it reach followers? In push (fan-out on write), the tweet is written to every follower's timeline cache. In pull (fan-out on read), followers fetch tweets on demand. Comparing the two approaches at different follower counts shows why Twitter uses a hybrid model.

The comparison covers three metrics for each approach, push (fan-out on write) and pull (fan-out on read): cache writes needed, latency per tweet, and read cost.
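The comparison can be approximated with a few lines of arithmetic. The sketch below is illustrative only: the `FanoutCost` fields, the function names, and the default throughput of 100,000 cache writes per second are assumptions chosen for the example, not measured Twitter figures.

```python
from dataclasses import dataclass


@dataclass
class FanoutCost:
    cache_writes: int         # timeline-cache writes triggered by one tweet
    fanout_latency_ms: float  # time until the tweet sits in every follower's cache
    reads_per_timeline: int   # lookups needed to build one home timeline


def push_cost(followers: int, writes_per_sec: int = 100_000) -> FanoutCost:
    """Fan-out on write: one cache write per follower, one cache read per timeline."""
    latency_ms = followers * 1000 / writes_per_sec  # assumed sustained write throughput
    return FanoutCost(followers, latency_ms, 1)


def pull_cost(followees: int) -> FanoutCost:
    """Fan-out on read: no cache writes, one lookup per followed account."""
    return FanoutCost(0, 0.0, followees)
```

Under these assumptions, `push_cost(100_000_000)` reports 100 million cache writes and about 1,000 seconds of fan-out time for a single tweet, while `pull_cost(500)` reports zero writes but 500 lookups every time the timeline is built. That asymmetry is what motivates the hybrid model.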

Timeline Generation Pipeline

Trace how a tweet flows through the system from creation to appearing in followers' timelines. A tweet from a normal user takes the push path; a tweet from a celebrity takes the hybrid path.

1. Tweet Posted: Client sends POST /api/tweet with text + media IDs.
2. Write to Tweets Table: Persist tweet_id, user_id, text, created_at to DB.
3. Fan-out Service: Check follower count against threshold.
4. Push / Pull Decision: Route based on user type.
5. Timeline Delivery: Tweets land in follower timelines.
6. Timeline Read: Follower opens app, merges push cache + pull celebrity tweets.
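The write side of this pipeline (persist, check the threshold, route, deliver) can be sketched in memory. Everything here is a stand-in chosen for illustration: the dictionaries replace the tweets table and the Redis timeline cache, and `post_tweet`, `CELEBRITY_THRESHOLD`, and `TIMELINE_LENGTH` are hypothetical names and values.

```python
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 100_000  # assumed follower cutoff between push and pull
TIMELINE_LENGTH = 800          # assumed cap on cached entries per user

tweets = {}                           # tweet_id -> record (stands in for the tweets table)
followers = defaultdict(set)          # author_id -> set of follower ids
celebrity_tweets = defaultdict(list)  # author_id -> tweet ids fetched at read time
timeline_cache = defaultdict(lambda: deque(maxlen=TIMELINE_LENGTH))  # stands in for Redis


def post_tweet(tweet_id, user_id, text, threshold=CELEBRITY_THRESHOLD):
    """Persist a tweet, then push it to follower caches or leave it for pull."""
    tweets[tweet_id] = {"user_id": user_id, "text": text}
    if len(followers[user_id]) >= threshold:
        # Celebrity path: skip the fan-out, readers merge this in at read time.
        celebrity_tweets[user_id].append(tweet_id)
        return "pull"
    # Normal path: write the tweet id into every follower's cached timeline.
    for follower_id in followers[user_id]:
        timeline_cache[follower_id].appendleft(tweet_id)
    return "push"
```

A production fan-out service would run the delivery loop asynchronously (for example from a queue) rather than inside the request; the synchronous loop here only shows the routing decision.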

Capacity Estimation Calculator

Back-of-the-envelope math for Twitter at scale. QPS, storage, cache, and bandwidth requirements all follow from a handful of input parameters.

Quantities to estimate: write QPS, read QPS, peak write QPS, peak read QPS, storage per day, storage per year, cache needed, and read bandwidth.
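A small calculator makes the arithmetic concrete. The user count, tweet volume, and 600:1 read-to-write ratio come from the overview; the peak factor, tweet size, cached timeline length, and tweets per read are assumptions picked for illustration, and the function name `estimate_capacity` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def estimate_capacity(
    daily_active_users=500_000_000,
    tweets_per_day=500_000_000,
    read_write_ratio=600,
    peak_factor=3,            # assumed peak-to-average multiplier
    tweet_bytes=300,          # assumed average stored size (text + metadata)
    cached_ids_per_user=800,  # assumed timeline length kept in cache
    id_bytes=8,               # 64-bit tweet IDs
    tweets_per_read=20,       # assumed tweets returned per timeline request
):
    """Return average and peak QPS plus storage, cache, and bandwidth estimates."""
    write_qps = tweets_per_day / SECONDS_PER_DAY
    read_qps = write_qps * read_write_ratio
    storage_per_day = tweets_per_day * tweet_bytes
    return {
        "write_qps": write_qps,
        "read_qps": read_qps,
        "peak_write_qps": write_qps * peak_factor,
        "peak_read_qps": read_qps * peak_factor,
        "storage_per_day_bytes": storage_per_day,
        "storage_per_year_bytes": storage_per_day * 365,
        "cache_bytes": daily_active_users * cached_ids_per_user * id_bytes,
        "read_bandwidth_bytes_per_sec": read_qps * tweets_per_read * tweet_bytes,
    }
```

With the defaults this gives roughly 5.8K writes per second, 3.5M reads per second, 150 GB of new tweet text per day (about 55 TB per year, excluding media), 3.2 TB of cached tweet IDs, and about 21 GB/s of read bandwidth. The outputs are only as good as the assumed inputs, so treat them as order-of-magnitude figures.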

Search Indexing Simulator

Search is backed by an inverted index: each tweet is tokenized, and each word maps to a list of the tweet IDs containing it. The index grows as more tweets are added.
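A minimal in-memory version of this idea is sketched below. The tokenizer (lowercasing, a regex split, a tiny stop-word list) and the `InvertedIndex` class are simplifications invented for the example; a production index would add stemming, ranking, and compressed posting lists.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and"}  # tiny illustrative list


def tokenize(text):
    """Lowercase, keep word-like runs (including #hashtags and @mentions), drop stop words."""
    words = re.findall(r"[a-z0-9#@_]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(list)  # token -> list of tweet ids

    def add(self, tweet_id, text):
        # set() so a word repeated within one tweet is indexed once
        for token in set(tokenize(text)):
            self.postings[token].append(tweet_id)

    def search(self, query):
        """Return the ids of tweets containing every query token (AND semantics)."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], []))
        for token in tokens[1:]:
            result &= set(self.postings.get(token, []))
        return result
```

After indexing "Redis is fast" as tweet 1 and "Kafka and Redis at scale" as tweet 2, a search for "redis" matches both tweets and a search for "redis kafka" matches only tweet 2.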


Architecture

  • Client
  • API Gateway
  • App Servers
  • Redis (Timeline): Home Timeline Cache
  • Database (Tweets): Tweet Storage
  • Fan-out Service: Push / Hybrid
  • Elasticsearch: Search Index
  • Kafka: Analytics

Key Design Decisions

Push vs Pull vs Hybrid Fanout

Push (Fan-out on Write)
  • + Timeline reads are O(1) from cache
  • + Sub-millisecond read latency
  • - Celebrity tweets need N million writes
  • - Wasted writes for inactive followers
vs
Pull (Fan-out on Read)
  • + Tweet write is O(1)
  • + No wasted work for inactive users
  • - Slow: must query all followees
  • - N+1 query problem at read time
Hybrid (Twitter's approach): Push for users with fewer than 100K followers, pull-on-read merge for celebrities above that threshold. When a follower opens their timeline, the pre-built cache is merged in real time with the latest tweets from the celebrities they follow.
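The read-time merge in the hybrid model can be sketched with a k-way merge. This assumes tweet IDs are time-sortable (a larger ID means a newer tweet) and that each input list is already ordered newest first; the function name `home_timeline` and its parameters are hypothetical.

```python
import heapq
from itertools import islice


def home_timeline(cached_ids, celebrity_feeds, limit=20):
    """Merge the pushed timeline cache with pulled celebrity feeds, newest first.

    cached_ids: tweet ids already pushed into this user's cache, newest first.
    celebrity_feeds: one newest-first list of tweet ids per followed celebrity.
    """
    # heapq.merge is lazy, so only the first `limit` items are ever materialized.
    merged = heapq.merge(cached_ids, *celebrity_feeds, reverse=True)
    return list(islice(merged, limit))
```

For example, `home_timeline([90, 70, 10], [[95, 60], [80]], limit=4)` returns `[95, 90, 80, 70]`. Because the merge is lazy, the cost depends on the page size and the number of followed celebrities rather than on the total number of tweets.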

Tweet Storage: SQL vs NoSQL

Tweet data (text, user_id, timestamp) fits well in a sharded relational DB. Twitter historically used MySQL with a custom sharding framework (Gizzard). The tweets table is append-heavy and rarely updated. Shard by user_id to keep one user's tweets together for user-timeline queries, or by tweet_id to spread load evenly for random access. Use Snowflake IDs, which are globally unique and roughly time-sortable.
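To illustrate why Snowflake-style IDs sort by time, here is a simplified generator using the commonly described 64-bit layout: 41 bits of millisecond timestamp, 10 bits of worker ID, and 12 bits of per-millisecond sequence. The class and helper names are hypothetical, and this is a sketch of the technique rather than Twitter's implementation.

```python
import threading
import time

EPOCH_MS = 1288834974657  # custom epoch used by Twitter's Snowflake (Nov 2010)
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards; refusing to generate ids")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # 4096 ids already issued this millisecond: wait for the next one
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self.sequence
            )


def timestamp_ms(snowflake_id):
    """Recover the creation time (Unix ms) embedded in an id."""
    return (snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS
```

Because the timestamp occupies the high bits, sorting by ID approximates sorting by creation time, which is what lets a timeline be ordered without a separate timestamp column. IDs from different workers within the same millisecond are ordered by worker ID, not by true arrival order.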

Search Architecture

Twitter's search runs on Earlybird, a real-time inverted-index engine built on a customized Lucene. Tweets are tokenized, stemmed, and indexed in near real-time. The index is partitioned by time (recent tweets on faster hardware) and by hash for horizontal scaling. Each query fans out to all partitions, and the partial results are merged by relevance and recency.
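The fan-out and merge step can be sketched as a scatter-gather over partitions. The data model here is an assumption made for the example: each partition is a dictionary mapping a term to `(relevance, tweet_id)` pairs, with time-sortable tweet IDs so that ties in relevance go to the newer tweet. Function names are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, term, k):
    """Return the top-k (relevance, tweet_id) hits from one partition."""
    return heapq.nlargest(k, partition.get(term, []))


def search_all(partitions, term, k=10):
    """Query every partition in parallel, then merge the partial top-k lists."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = pool.map(lambda p: search_partition(p, term, k), partitions)
        hits = [hit for partial in partials for hit in partial]
    # Tuples compare by relevance first, then by tweet id (newer wins ties).
    return heapq.nlargest(k, hits)
```

Each partition only returns its own top k, so the coordinator merges at most k times the number of partitions candidates regardless of index size. A real system would make network calls with timeouts here instead of local function calls.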

The hybrid fanout model is the defining insight: master the push vs pull trade-offs and celebrity scaling.