Architecture

[Architecture diagram: the client calls POST /shorten (mint key, store mapping) and GET /:key (cache → DB → 301 redirect). Supporting components: ID allocator; Redis cache; KV store / DynamoDB; click analytics (Kafka → HLL, aggregator).]

Capacity Estimation

Metric          | Value    | Notes
----------------|----------|-------------------------------
New URLs/day    | ~10 M    | Bitly-scale
Redirects/day   | ~10 B    | 1000:1 read/write
Peak read QPS   | ~250 K   | ~2× the ~116 K daily average
Storage / 5 yr  | ~7.5 TB  | 500 B/row × 15 B URLs
Cache (20% hot) | ~1.5 TB  | memory across cluster
Key length      | 7 chars  | 62^7 ≈ 3.5 T URLs
Redirect p99    | < 20 ms  | edge cache + DB
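
The estimates above come down to a few multiplications. The sketch below redoes that arithmetic from the table's own inputs (10 M new URLs/day, 1000:1 read/write, 500 B/row, 15 B URLs over five years, 20% hot), so the numbers can be rechecked when an assumption changes. The constant names are illustrative only.

```python
# Back-of-the-envelope capacity arithmetic. All inputs are the table's assumptions.
NEW_URLS_PER_DAY = 10_000_000
READ_WRITE_RATIO = 1_000
BYTES_PER_ROW = 500
URLS_5_YEARS = 15_000_000_000   # the table's rounded five-year URL count
HOT_FRACTION = 0.20
SECONDS_PER_DAY = 86_400

redirects_per_day = NEW_URLS_PER_DAY * READ_WRITE_RATIO
avg_read_qps = redirects_per_day / SECONDS_PER_DAY
storage_bytes = URLS_5_YEARS * BYTES_PER_ROW
hot_cache_bytes = storage_bytes * HOT_FRACTION
keyspace_7 = 62 ** 7            # number of distinct 7-char base62 keys

print(f"redirects/day   : {redirects_per_day:,}")
print(f"average read QPS: {avg_read_qps:,.0f}")
print(f"storage (5 yr)  : {storage_bytes / 1e12:.1f} TB")
print(f"hot cache       : {hot_cache_bytes / 1e12:.1f} TB")
print(f"7-char keyspace : {keyspace_7:,}")
```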

Key Generation: Base62 vs Hash vs Sequential

  • Base62 of an auto-increment ID — allocate ID from a counter (or pre-allocated batches via ZooKeeper / Redis), encode in [0-9a-zA-Z]. Pros: collision-free, compact, predictable length. Cons: enumerable (an attacker scrapes consecutive IDs); reveals creation order.
  • Hash of URL — SHA-256 truncated to 8 base62 chars. Pros: idempotent (same URL = same key, dedup for free). Cons: with 62^8 ≈ 2.2 × 10^14 possible keys, the birthday bound makes a collision likely after roughly 15–20 million URLs, so collision handling is mandatory.
  • Random key — generate a random 7-char base62 string; write it with a set-if-absent operation (a SETNX-style conditional write); retry on collision. Pros: not enumerable, no central counter. Cons: collision probability grows as the keyspace fills, so every write needs a conditional put and a retry path.
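  • Snowflake-style — (timestamp || machine || sequence), encoded base62. IDs are roughly time-ordered and each machine mints them locally, so distributed allocation needs no coordination. Cons: a 64-bit ID encodes to as many as 11 base62 characters, longer than the 7-char target.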
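
As a minimal sketch of the first and third options, the code below encodes a numeric ID in base62 and mints a random key with retry on collision. The function names are hypothetical, and the plain dict stands in for a store that offers an atomic set-if-absent; a real deployment would need that conditional write to be atomic.

```python
import secrets
import string

# Alphabet in the order [0-9a-zA-Z], as described above.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62_encode(n: int) -> str:
    """Encode a non-negative integer ID as a base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def base62_decode(key: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in key:
        n = n * 62 + ALPHABET.index(ch)
    return n

def mint_random_key(store: dict, long_url: str, length: int = 7,
                    max_attempts: int = 5) -> str:
    """Random-key strategy: pick a key, keep it only if it is unused."""
    for _ in range(max_attempts):
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Assumption: this check-then-set would be one atomic operation
        # (a conditional put) in a real store.
        if key not in store:
            store[key] = long_url
            return key
    raise RuntimeError("could not find a free key; keyspace may be too full")

print(base62_encode(125_000_000))  # a sequential ID rendered as a short key
```

Sequential IDs stay below 8 characters until the counter reaches 62^7, which is why the capacity table can promise a 7-character key.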

Bitly uses base62 of a sequential ID with allocation via a coordination service. Hash-based dedup is added on the application side: the same target URL submitted by the same user returns the same short key. Different users shortening the same URL get different short keys (so each user sees their own analytics).

Storage and Cache Layer

The working set is the small fraction of URLs that are actively trafficked. The redirect path is layered accordingly:

  • Redis as L1: keyed by short, value = long_url. The cluster is sized to hold the hot ~20% of all-time URLs (see the capacity estimate). ~1 ms reads.
  • DynamoDB / Cassandra as the durable store: partition key = short, attributes long_url, created_at, owner, expires_at. KV access pattern is the dominant query.
  • CDN edge for the truly hot ones (a viral marketing campaign): cache the 301 response at Cloudflare; redirect resolves at the PoP without hitting your origin.
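
The cache-then-database lookup described above is a cache-aside read. A minimal sketch follows, with plain dicts standing in for Redis and the durable store; the class and method names are hypothetical.

```python
from typing import Optional

class RedirectResolver:
    """Cache-aside lookup: try the cache, fall back to the DB, then fill the cache."""

    def __init__(self, cache: dict, db: dict):
        self.cache = cache   # stand-in for Redis (short -> long_url)
        self.db = db         # stand-in for DynamoDB / Cassandra

    def resolve(self, short: str) -> Optional[str]:
        long_url = self.cache.get(short)
        if long_url is not None:
            return long_url              # cache hit: no DB round trip
        long_url = self.db.get(short)
        if long_url is not None:
            self.cache[short] = long_url # populate the cache for later reads
        return long_url                  # None means the key does not exist

resolver = RedirectResolver(cache={}, db={"abc1234": "https://example.com/landing"})
print(resolver.resolve("abc1234"))
```

Unknown keys are deliberately not cached here; whether to cache negative lookups is a separate design choice this sketch leaves open.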

301 vs 302 Redirects and Analytics

The redirect HTTP code matters for analytics:

  • 301 Moved Permanently — browsers cache the redirect; subsequent clicks bypass your server. Faster UX; loses analytics on repeat visits.
  • 302 Found (or 307) — not cached by default; every click hits your server. Slower repeat visits; full click visibility.

Most analytics-driven shorteners (Bitly, lnk.in) use 301 with a short cache (Cache-Control: max-age=120) or 302 outright. Pick by whether analytics or latency is the primary product metric.
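
One way to express that choice in code is a small helper that builds the status and headers for either policy. This is a sketch: the function name and the analytics_first flag are hypothetical, and sending Cache-Control: no-store with the 302 is an added assumption to make the no-caching intent explicit.

```python
from typing import Dict, Tuple

def build_redirect(long_url: str, analytics_first: bool,
                   max_age_s: int = 120) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a redirect under the chosen policy."""
    headers = {"Location": long_url}
    if analytics_first:
        # 302: every click should reach the server so it can be counted.
        headers["Cache-Control"] = "no-store"   # assumption, not from the text
        return 302, headers
    # 301 with a short cache lifetime: repeat clicks within the window
    # are served from the browser or CDN cache and are not counted.
    headers["Cache-Control"] = f"max-age={max_age_s}"
    return 301, headers

print(build_redirect("https://example.com/landing", analytics_first=False))
```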

Click Analytics with HyperLogLog

Per-URL click tracking quickly explodes: 10 B clicks/day across billions of URLs. Two separate problems:

  • Total clicks — trivial counter; INCR per redirect. Sample if precise count is unnecessary.
  • Unique visitors — deduplicating IPs naively requires a per-URL set with potentially millions of entries. HyperLogLog approximates the cardinality of a set in ~12 KB with a standard error of about 0.81% (Redis's implementation). Redis has PFADD / PFCOUNT built in.
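
To show what a HyperLogLog does internally, here is a small pure-Python sketch of the basic algorithm with 2^14 registers. It is an illustration, not Redis's implementation: Redis packs the registers into about 12 KB and applies further corrections, while this version uses a plain list and SHA-1 as an assumed 64-bit hash source.

```python
import hashlib
import math

class HyperLogLog:
    """Basic HyperLogLog; standard error is roughly 1.04 / sqrt(m)."""

    def __init__(self, p: int = 14):
        self.p = p                    # number of index bits
        self.m = 1 << p               # number of registers (16384 for p=14)
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Take 64 bits of a SHA-1 digest as the hash value.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64-p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self) -> int:
        alpha = 0.7213 / (1 + 1.079 / self.m)       # bias constant for m >= 128
        estimate = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)

hll = HyperLogLog()
for visitor in ("203.0.113.1", "203.0.113.2", "203.0.113.1"):
    hll.add(visitor)                  # the repeated visitor is not double-counted
print(hll.count())
```

Adding the same item twice never changes a register, which is what makes the structure suitable for counting unique visitors on a hot redirect path.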

For temporal aggregates (hourly clicks, geographic distribution), pipe redirects to Kafka, do streaming aggregation in Flink / Spark Streaming, write hourly rollups to ClickHouse. The redirect path itself stays sub-10 ms; analytics is async.

Custom Vanity Slugs

Premium feature: sho.rt/my-cool-link. Implementation:

  • Reserve a slug namespace separate from the auto-generated keys (e.g., custom slugs are 4–30 chars and never exactly 7; auto-generated keys are exactly 7).
  • Atomic insert-if-absent: INSERT ... ON CONFLICT DO NOTHING; check the affected row count (0 means the slug is already taken).
  • Reserved-word blocklist: do not let users register admin, api, login.
  • Profanity filter (multilingual) at registration time.
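
The reservation steps above can be sketched with SQLite, which accepts the same INSERT ... ON CONFLICT DO NOTHING form (SQLite 3.24 or newer). The table layout, the claim_slug helper, and the allowed character set are assumptions made for the example; the profanity filter and the separation from auto-generated keys are not shown.

```python
import re
import sqlite3

RESERVED = {"admin", "api", "login"}          # blocklist from the text
SLUG_RE = re.compile(r"[A-Za-z0-9-]{4,30}")   # assumed allowed characters

def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE slugs (slug TEXT PRIMARY KEY, long_url TEXT, owner TEXT)"
    )
    return conn

def claim_slug(conn: sqlite3.Connection, slug: str, long_url: str, owner: str) -> bool:
    """Return True if the slug was free and is now owned by `owner`."""
    if slug.lower() in RESERVED or not SLUG_RE.fullmatch(slug):
        return False
    cur = conn.execute(
        "INSERT INTO slugs (slug, long_url, owner) VALUES (?, ?, ?) "
        "ON CONFLICT(slug) DO NOTHING",
        (slug, long_url, owner),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when the slug already existed.
    return cur.rowcount == 1

conn = open_db()
print(claim_slug(conn, "my-cool-link", "https://example.com/a", "alice"))
print(claim_slug(conn, "my-cool-link", "https://example.com/b", "bob"))
```

Because the uniqueness check and the write happen in one statement, two users racing for the same slug cannot both succeed.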

Abuse Prevention

Every URL shortener becomes a phishing vector. Defenses:

  • URL scanning at submit time — check Google Safe Browsing, PhishTank, internal blocklists. Reject or quarantine known-malicious targets.
  • Rate limiting and reputation — rate-limit by submitter IP / account; throttle anonymous submissions hard.
  • Click-time interstitial — for new / unverified URLs, show "you are about to visit example.com, continue?" Loses some UX, blocks one-click drive-by.
  • Takedown pipeline — abuse@ inbox → ticket → flip the URL to a warning page within hours, not days.
  • SOC2 / abuse reports — without an active abuse program, your own domain ends up on Safe Browsing, breaking every legitimate user.
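
The per-submitter rate limiting listed above is often built as a token bucket. The sketch below is one possible shape, with the clock passed in explicitly so it is easy to test; the class names and the specific limits (stricter for anonymous submitters) are illustrative assumptions.

```python
from typing import Dict

class TokenBucket:
    """Allows `burst` requests at once, then refills at `rate_per_s`."""

    def __init__(self, rate_per_s: float, burst: int, now: float = 0.0):
        self.rate = rate_per_s
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

class SubmitLimiter:
    """One bucket per submitter (IP or account)."""

    def __init__(self) -> None:
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, submitter: str, authenticated: bool, now: float) -> bool:
        if submitter not in self.buckets:
            # Hypothetical limits: anonymous submitters get far less headroom.
            rate, burst = (1.0, 20) if authenticated else (0.1, 2)
            self.buckets[submitter] = TokenBucket(rate, burst, now)
        return self.buckets[submitter].allow(now)

limiter = SubmitLimiter()
print([limiter.allow("ip:203.0.113.7", False, now=0.0) for _ in range(3)])
```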

Link Expiration

Some use cases need a TTL: a marketing campaign for 30 days, a password reset for 1 hour. Implementation: an expires_at column, with a matching TTL on the cache entry and the DB row. A background sweep deletes expired rows; the redirect path serves the link only while expires_at > now() and returns 410 Gone otherwise.

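DynamoDB's native TTL works, but it deletes items asynchronously rather than at the exact expiry time, so keep the redirect-path check. For Cassandra, set a TTL on the row. Avoid scanning the whole table to find expired keys.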
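
A minimal sketch of the redirect-path check follows. The row is modelled as a dict with the long_url and expires_at attributes named earlier; the function name, the 404 for unknown keys, and the 301 for live links are assumptions of the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

def redirect_status(row: Optional[dict], now: datetime) -> Tuple[int, Optional[str]]:
    """Decide the response for a looked-up row: (status, Location or None)."""
    if row is None:
        return 404, None                 # unknown key (assumed behaviour)
    expires_at = row.get("expires_at")   # None means the link never expires
    if expires_at is not None and expires_at <= now:
        return 410, None                 # expired: Gone, even if not yet swept
    return 301, row["long_url"]

now = datetime.now(timezone.utc)
reset_link = {"long_url": "https://example.com/reset",
              "expires_at": now + timedelta(hours=1)}
print(redirect_status(reset_link, now))
```

Checking at read time means correctness does not depend on how promptly the background sweep or the store's native TTL removes the row.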

The 6→8 Character Migration

Bitly's historical pain: original keys were 6 characters. 62^6 ≈ 56.8 B URLs, which is not enough at long-term scale. They migrated to 7–8 character keys for new URLs while keeping old 6-char URLs forever. Lessons:

  • Variable-length keys are required from day one — never assume fixed length.
  • Prefix-based dispatching — the 6-char and 7-char ranges occupy disjoint namespaces (different starting characters or explicit prefix), so old and new coexist without ambiguity.
  • Cache invalidation is irrelevant — old keys stay valid; you do not migrate values, only the allocator.
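
To make the keyspace pressure concrete, the sketch below computes how long each key length lasts under purely sequential allocation. The 10 M new URLs/day rate is the assumption from the capacity estimate, and the helper name is hypothetical; random keys would run into collisions much earlier than these figures suggest.

```python
NEW_URLS_PER_DAY = 10_000_000   # assumption carried over from the capacity estimate

def years_to_exhaust(key_length: int, per_day: int = NEW_URLS_PER_DAY) -> float:
    """Years until a sequential allocator uses every key of this length."""
    keyspace = 62 ** key_length
    return keyspace / per_day / 365

for length in (6, 7, 8):
    print(f"{length} chars: {62 ** length:,} keys, "
          f"{years_to_exhaust(length):,.1f} years at {NEW_URLS_PER_DAY:,}/day")
```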

Failure Modes

  • Hot key — viral link gets 1 M req/s on one cache shard. Replicate hot key to N shards; CDN-cache at the edge; rate-limit per source.
  • ID allocator outage — cannot mint new keys. Pre-allocate batches per writer process so the allocator can be down for hours without affecting writes.
  • Database unreachable on redirect path — cache miss + DB down = 5xx. Serve stale-while-error from cache; never block the redirect on the DB if cache had it recently.
  • SEO / link rot — old shortened URLs across the web are invaluable; one bad migration kills tens of millions of inbound links. Treat short codes as a permanent commitment.
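
The batch pre-allocation mitigation for an allocator outage can be sketched as follows. CentralCounter is a stand-in for the coordination service (ZooKeeper or Redis in the text), and both class names are hypothetical; each writer reserves a range of IDs up front and hands them out locally until the range runs dry.

```python
import threading
from typing import Tuple

class CentralCounter:
    """Stand-in for the coordination service that hands out ID ranges."""

    def __init__(self) -> None:
        self._next = 0
        self._lock = threading.Lock()

    def reserve(self, size: int) -> Tuple[int, int]:
        """Atomically reserve the half-open range [start, start + size)."""
        with self._lock:
            start = self._next
            self._next += size
            return start, start + size

class BatchedAllocator:
    """Per-writer allocator: contacts the counter only once per batch."""

    def __init__(self, counter: CentralCounter, batch_size: int = 1000):
        self.counter = counter
        self.batch_size = batch_size
        self._next = 0
        self._end = 0   # empty range forces a reservation on first use

    def next_id(self) -> int:
        if self._next >= self._end:
            # Only this call needs the central service to be reachable.
            self._next, self._end = self.counter.reserve(self.batch_size)
        allocated = self._next
        self._next += 1
        return allocated

counter = CentralCounter()
writer_a = BatchedAllocator(counter, batch_size=3)
writer_b = BatchedAllocator(counter, batch_size=3)
print(writer_a.next_id(), writer_b.next_id(), writer_a.next_id())
```

The trade-off is gaps: IDs left in a batch when a writer crashes are never issued, which is harmless for short keys but means the counter is not a count of URLs.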

FAQ

Why not just use UUIDs?

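UUIDs are 36 characters in canonical form, and still 22 when re-encoded in base62, because they carry 128 bits. The whole point is short. Base62 of a numeric ID covers about 3.5 trillion URLs in 7 chars, which is all the uniqueness this system needs.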

How do you handle deletion?

Soft delete (set deleted_at) so you can recover from accidental purges; return 410 Gone for deleted keys. Never reuse the short key for a different long URL: browsers and CDNs that cached the old 301 would keep sending visitors to the old target.

Should I store the entire long URL?

Yes; do not normalize or prettify. Users paste full URLs and expect them back unchanged. The DB row is small even at 2 KB long URLs.

Geo-aware redirects?

Optional premium feature: route to different long URLs by visitor country / language. Requires a more complex storage row and edge-aware redirect.