Design Instagram

Photo Storage, CDN Distribution, Feed Ranking, Stories Architecture, and Image Processing Pipeline

Instagram serves 2B+ monthly active users uploading 100M+ photos per day. The core challenges: building an image processing pipeline that resizes and optimizes every upload into multiple formats, distributing billions of images globally via a CDN with sub-100ms latency, designing a feed ranking algorithm that balances recency, engagement, and relationship signals, implementing ephemeral Stories with 24-hour TTL, and managing a social graph where celebrity accounts have 500M+ followers. At scale, that means petabytes of new storage per year and millions of read QPS served from edge caches.

Image Upload Pipeline Simulator

Every photo upload triggers a multi-step pipeline: the client uploads the raw image, the API gateway validates and routes it, the image processor generates multiple resolutions, objects are stored in S3, CDN edges are warmed, and metadata is persisted to the database.

1. Client Upload: Raw JPEG/HEIC from mobile app
2. API Gateway: Auth, rate limit, route to image service
3. Image Processor: Resize to 150px thumb, 640px feed, 1080px full
4. Object Storage (S3): Store 3 versions + original in S3 buckets
5. CDN Distribution: Push to edge PoPs, warm cache for followers
6. Metadata to DB: photo_id, user_id, caption, location, timestamps
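
The resize step can be sketched as a small function that computes the output dimensions for each variant. This is only an illustration: the variant names, the choice to preserve aspect ratio by width, and the decision not to upscale small images are assumptions, not details from the pipeline description.

```python
# Target widths taken from the Image Processor step; the variant names are illustrative.
VARIANT_WIDTHS = {"thumb": 150, "feed": 640, "full": 1080}


def variant_dimensions(width: int, height: int) -> dict[str, tuple[int, int]]:
    """Return (width, height) for each variant, preserving aspect ratio.

    Assumption: images narrower than a target width are not upscaled.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    sizes = {}
    for name, target in VARIANT_WIDTHS.items():
        new_width = min(target, width)
        # Scale the height by the same factor as the width, never below 1px.
        new_height = max(1, round(height * new_width / width))
        sizes[name] = (new_width, new_height)
    return sizes


if __name__ == "__main__":
    for name, size in variant_dimensions(3000, 2000).items():
        print(name, size)
```

A real image service would hand these dimensions to an image library and also re-encode into the target formats; that part is omitted here.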

CDN Architecture Visualizer

Images are served from the nearest CDN Point of Presence (PoP). A cache hit returns the image in single-digit milliseconds; a miss requires fetching from the origin, adding hundreds of milliseconds.

[Diagram: the origin (S3) feeds three PoPs (US-East, EU-West, Asia), each serving users in its own region (US, EU, Asia).]
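
To make the hit/miss trade-off concrete, here is a minimal sketch of a single PoP as an in-memory cache in front of an origin fetch, plus the expected-latency formula. The class name `EdgePoP`, the `origin_fetch` callback, and the 5 ms and 300 ms figures in the usage lines are hypothetical values chosen for illustration. A real PoP would also evict entries and honor cache-control headers.

```python
class EdgePoP:
    """One CDN point of presence backed by an origin lookup function."""

    def __init__(self, origin_fetch):
        self._origin_fetch = origin_fetch  # called only on a cache miss
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Return (value, status), where status is "HIT" or "MISS"."""
        if key in self._cache:
            self.hits += 1
            return self._cache[key], "HIT"
        self.misses += 1
        value = self._origin_fetch(key)
        self._cache[key] = value  # populate the edge for later requests
        return value, "MISS"

    @property
    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def expected_latency_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Average latency as a weighted mix of hit and miss latency."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


if __name__ == "__main__":
    pop = EdgePoP(lambda key: f"bytes-of-{key}")
    print(pop.get("photo_1_640.jpg")[1])  # first request goes to the origin
    print(pop.get("photo_1_640.jpg")[1])  # second request is served from the edge
    # Assumed latencies: 5 ms on a hit, 300 ms on a miss.
    print(expected_latency_ms(0.95, hit_ms=5, miss_ms=300))
```

Even at a 95% hit ratio, the misses dominate the average under these assumed latencies, which is why the cache is warmed before followers request a new photo.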

Feed Ranking Algorithm

Instagram's feed is ranked, not chronological. Each post's score is computed as: score = w1*recency + w2*engagement + w3*relationship + w4*contentType. The weights w1 through w4 control how strongly each signal influences the ordering, so changing them reorders the feed.
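
The scoring formula can be written directly as code. This is a sketch under assumptions: each signal is taken to be pre-normalized to the range 0 to 1, and the `Post` and `Weights` types and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: str
    recency: float       # assumed normalized to [0, 1], 1 = newest
    engagement: float    # assumed normalized to [0, 1]
    relationship: float  # assumed normalized to [0, 1]
    content_type: float  # assumed normalized to [0, 1]


@dataclass(frozen=True)
class Weights:
    recency: float
    engagement: float
    relationship: float
    content_type: float


def score(post: Post, w: Weights) -> float:
    """score = w1*recency + w2*engagement + w3*relationship + w4*contentType."""
    return (
        w.recency * post.recency
        + w.engagement * post.engagement
        + w.relationship * post.relationship
        + w.content_type * post.content_type
    )


def rank(posts: list[Post], w: Weights) -> list[Post]:
    """Return posts ordered from highest to lowest score."""
    return sorted(posts, key=lambda p: score(p, w), reverse=True)


if __name__ == "__main__":
    posts = [
        Post("fresh", recency=1.0, engagement=0.0, relationship=0.0, content_type=0.0),
        Post("viral", recency=0.0, engagement=1.0, relationship=0.0, content_type=0.0),
    ]
    recency_heavy = Weights(recency=1.0, engagement=0.5, relationship=0.0, content_type=0.0)
    engagement_heavy = Weights(recency=0.5, engagement=1.0, relationship=0.0, content_type=0.0)
    print([p.post_id for p in rank(posts, recency_heavy)])
    print([p.post_id for p in rank(posts, engagement_heavy)])
```

Swapping the weight set flips the order of the two sample posts, which is the reordering effect the formula describes.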

Capacity Estimation Calculator

Back-of-the-envelope math for Instagram at scale. The estimate covers daily uploads, storage per day and per year, average and peak write QPS, average and peak read QPS, and CDN bandwidth.
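
A minimal sketch of the estimate follows. Only the 100M uploads per day comes from the introduction; the per-photo size, read-to-write ratio, average served size, and peak factor are placeholder assumptions to be replaced with your own numbers.

```python
SECONDS_PER_DAY = 86_400


def estimate_capacity(
    daily_uploads: int = 100_000_000,   # from the introduction
    bytes_per_photo: int = 2_000_000,   # assumption: original + 3 variants
    reads_per_write: int = 1_000,       # assumption
    avg_served_bytes: int = 200_000,    # assumption: typical feed-size image
    peak_factor: float = 3.0,           # assumption: peak vs average load
) -> dict[str, float]:
    """Return storage, QPS, and bandwidth figures for the given inputs."""
    storage_per_day = daily_uploads * bytes_per_photo
    write_qps = daily_uploads / SECONDS_PER_DAY
    read_qps = write_qps * reads_per_write
    return {
        "daily_uploads": daily_uploads,
        "storage_per_day_bytes": storage_per_day,
        "storage_per_year_bytes": storage_per_day * 365,
        "write_qps": write_qps,
        "read_qps": read_qps,
        "cdn_bandwidth_bytes_per_s": read_qps * avg_served_bytes,
        "peak_write_qps": write_qps * peak_factor,
        "peak_read_qps": read_qps * peak_factor,
    }


if __name__ == "__main__":
    for name, value in estimate_capacity().items():
        print(f"{name}: {value:,.0f}")
```

With these placeholder inputs the result is about 200 TB of new storage per day (roughly 73 PB per year) and about 1,157 average write QPS. The read figures depend entirely on the assumed read-to-write ratio.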

Stories System Architecture

Stories are ephemeral content with a 24-hour TTL. Design decisions around storage, delivery, and expiration significantly impact system complexity and cost.

Storage: Separate vs Shared

Separate Storage
  • + Optimized TTL cleanup
  • + Independent scaling
  • + Different replication policy
  • - Data duplication for reposts
vs
Shared with Posts
  • + Unified media pipeline
  • + Simpler ops
  • - TTL logic mixed in
  • - Over-provisioned durability
Recommendation: Separate storage with shared image processing pipeline. Stories use cheaper storage with lower durability (no need for 11 nines).

Delivery: Push vs Pull

Push (Fan-out)
  • + Instant story ring updates
  • + Pre-computed story feeds
  • - Write amplification
  • - Wasted for inactive users
vs
Pull (On-demand)
  • + No wasted writes
  • + Simpler pipeline
  • - Higher read latency
  • - Thundering herd on open
Recommendation: Hybrid. Push to active followers' story feeds in Redis. Pull for inactive users on app open. Celebrity stories always pull.

24-Hour TTL: Expiration Strategies

Lazy Expiration
  • Check TTL on read, skip expired
  • Background cleanup job hourly
  • Simple, eventually consistent
vs
Redis TTL + CDC
  • Redis EXPIRE for feed entries
  • Change Data Capture deletes S3 objects
  • Precise, cost-efficient storage
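Recommendation: Redis TTL for feed entries (auto-expire), combined with an S3 lifecycle policy to delete expired media objects. S3 lifecycle rules expire objects in whole days rather than hours, so a 1-day rule removes media roughly one to two days after upload. If media must be gone by 25 hours (a 1-hour grace period), run a scheduled cleanup job instead.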
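
The lazy-expiration option, combined with a grace-period sweep, can be sketched with an in-memory store. This is a stand-in for illustration only: `StoryStore` and its methods are hypothetical names, and a production system would keep this state in Redis and object storage rather than a Python dict.

```python
STORY_TTL_S = 24 * 3600  # stories are visible for 24 hours
GRACE_S = 3600           # media is deleted 1 hour after it stops being visible


class StoryStore:
    """In-memory stand-in for story metadata, keyed by story_id."""

    def __init__(self):
        self._posted_at = {}  # story_id -> post time in epoch seconds

    def add(self, story_id: str, posted_at: int) -> None:
        self._posted_at[story_id] = posted_at

    def visible(self, now: int) -> list[str]:
        """Lazy expiration: check the TTL on read and skip expired stories."""
        return [sid for sid, t in self._posted_at.items() if now - t < STORY_TTL_S]

    def sweep(self, now: int) -> list[str]:
        """Background cleanup: delete stories past TTL plus the grace period."""
        expired = [
            sid for sid, t in self._posted_at.items()
            if now - t >= STORY_TTL_S + GRACE_S
        ]
        for sid in expired:
            del self._posted_at[sid]
        return expired  # the caller would delete the matching media objects


if __name__ == "__main__":
    store = StoryStore()
    store.add("s1", posted_at=0)
    print(store.visible(now=3600))     # still inside the 24-hour window
    print(store.visible(now=86_400))   # hidden on read once the TTL passes
    print(store.sweep(now=90_000))     # removed after the grace period
```

Reads never return an expired story even if the sweep is late, so the cleanup job only affects storage cost, not correctness.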

High-Level Architecture

The system decomposes into independent services connected through a message queue, with shared infrastructure for storage, caching, and content delivery.

  • Mobile / Web Client
  • API Gateway
  • Image Service: Upload, resize, optimize
  • Feed Service: Ranking, pagination
  • Stories Service: TTL, story ring
  • User Service: Profile, follow graph
  • Object Storage (S3): Photos, videos, stories
  • CDN (CloudFront): Edge caching, PoPs
  • PostgreSQL / Cassandra: User, photo metadata
  • Redis Cluster: Feed cache, sessions
  • Kafka / SQS: Async processing
Image Service receives uploads, pushes resize jobs to Kafka, workers generate 3 sizes, store to S3, then trigger CDN cache warming (a newly uploaded object has nothing cached to invalidate).
Feed Service uses fan-out on write for regular users (pre-compute feed in Redis) and fan-out on read for celebrities (merge at query time).
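Stories Service uses Redis sorted sets for the story ring, scored by timestamp so expired entries can be trimmed by score range (Redis TTLs apply to whole keys, not to individual sorted-set members). S3 lifecycle policies auto-delete expired media.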
Database sharded by user_id. Photo metadata in Cassandra for write throughput. User profiles in PostgreSQL for relational queries.

Key Design Decisions

Image Storage Strategy

Store images in object storage (S3) with a CDN layer, never in the database. The DB only stores metadata: photo_id, user_id, S3 URL, dimensions, and timestamps. Object storage provides 11 nines durability, effectively unlimited capacity, and costs ~$0.023/GB/month vs $0.10+/GB/month for database storage. The CDN serves 95%+ of reads, reducing S3 egress costs and latency.

Feed Generation Strategy

Fan-out on write for normal users: when they post, push the post_id into every follower's Redis feed. Fan-out on read for celebrities (>500K followers in this design): their posts are merged into feeds at read time to avoid millions of writes per post. The threshold is tuned based on write capacity; a lower cutoff such as ~10K followers shifts more work to read time in exchange for fewer writes. This hybrid approach handles both the celebrity problem and the common case efficiently.
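
The hybrid strategy can be sketched with in-memory dictionaries standing in for Redis and the post store. `FeedService` and its methods are hypothetical names; the 500K default comes from the text and is expected to be tuned.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 500_000  # follower cutoff from the text; tunable


class FeedService:
    """Hybrid fan-out: push for regular authors, pull for celebrities."""

    def __init__(self, threshold: int = CELEBRITY_THRESHOLD):
        self.threshold = threshold
        self.followers = defaultdict(set)         # author -> follower ids
        self.following = defaultdict(set)         # user -> authors followed
        self.feeds = defaultdict(list)            # user -> pushed (ts, post_id)
        self.posts_by_author = defaultdict(list)  # author -> (ts, post_id)

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) > self.threshold

    def publish(self, author: str, post_id: str, ts: int) -> int:
        """Store the post and return how many feed writes were made."""
        self.posts_by_author[author].append((ts, post_id))
        if self.is_celebrity(author):
            return 0  # fan-out on read: followers merge this at query time
        for follower in self.followers[author]:
            self.feeds[follower].append((ts, post_id))
        return len(self.followers[author])

    def read_feed(self, user: str, limit: int = 20) -> list[str]:
        """Merge the pushed feed with posts pulled from followed celebrities."""
        # A set removes duplicates if an author crossed the threshold
        # after some of their posts had already been pushed.
        entries = set(self.feeds[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                entries.update(self.posts_by_author[author])
        newest_first = sorted(entries, reverse=True)
        return [post_id for _, post_id in newest_first[:limit]]
```

Publishing from a celebrity account costs one write regardless of follower count, while each follower's read pays for the merge. The sketch orders the feed by timestamp only; ranking would be applied on top of the merged candidates.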

Sharding Strategy

Shard by user_id using consistent hashing. All photos, followers, and feed data for a user live on the same shard, enabling single-shard queries for profile views and feed generation. Cross-shard queries (search, explore) use a separate index. Photo IDs embed the shard key: shard_id (16 bits) + timestamp (32 bits) + sequence (16 bits). This keeps data locality high and avoids scatter-gather for the hot path.
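
The photo ID layout can be illustrated with plain bit operations. The sketch assumes the three fields are packed in the order listed, most significant first, and that the timestamp is in whole seconds; neither detail is stated in the text.

```python
SHARD_BITS = 16
TIMESTAMP_BITS = 32
SEQUENCE_BITS = 16


def make_photo_id(shard_id: int, timestamp_s: int, sequence: int) -> int:
    """Pack shard_id | timestamp | sequence into one 64-bit integer."""
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id does not fit in 16 bits")
    if not 0 <= timestamp_s < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 32 bits")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence does not fit in 16 bits")
    return (
        (shard_id << (TIMESTAMP_BITS + SEQUENCE_BITS))
        | (timestamp_s << SEQUENCE_BITS)
        | sequence
    )


def parse_photo_id(photo_id: int) -> tuple[int, int, int]:
    """Return (shard_id, timestamp_s, sequence) from a packed photo ID."""
    sequence = photo_id & ((1 << SEQUENCE_BITS) - 1)
    timestamp_s = (photo_id >> SEQUENCE_BITS) & ((1 << TIMESTAMP_BITS) - 1)
    shard_id = photo_id >> (TIMESTAMP_BITS + SEQUENCE_BITS)
    return shard_id, timestamp_s, sequence


if __name__ == "__main__":
    pid = make_photo_id(shard_id=513, timestamp_s=1_700_000_000, sequence=42)
    print(pid, parse_photo_id(pid))
```

Because the shard is recoverable from the ID alone, a request for a photo can be routed without a lookup. With the shard in the high bits, IDs sort by shard first and by time only within a shard.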

Consistency Model

Eventual consistency is acceptable for Instagram. A post appearing 2-3 seconds late in a follower's feed is fine. Like counts can lag. The critical path (upload confirmation, follow/unfollow) is strongly consistent via synchronous writes. Feed and story delivery use async fan-out through Kafka. Read-your-own-writes consistency is maintained by reading from the leader for the posting user's own profile view.

The Instagram design question tests image processing pipelines, CDN architecture, hybrid fan-out strategies, and ephemeral content systems. Practice explaining storage estimation and the trade-offs of the celebrity problem.