Database Sharding

Horizontal Partitioning Strategies, Shard Key Selection, and Scalable Data Distribution

Sharding splits a database into horizontal partitions called shards, each stored on a separate node. The goal: scale writes, storage, and reads beyond what a single database can handle, typically past 1M rows/sec or 10TB. The core challenge: choosing a shard key that evenly distributes data, minimizes cross-shard operations, and survives node additions without massive reshuffling. Get it wrong and you'll create hotspots, rebalancing storms, or impossible JOINs. Get it right and you can scale linearly.

When Do You Need Sharding?

(Interactive calculator in the original: enter total storage and raw writes/sec to see shards needed for writes, shards needed for storage, total nodes at RF=3, and a shard recommendation.)
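
A minimal sketch of the arithmetic behind that calculator, in Python. The per-shard capacity constants are illustrative assumptions, not benchmarks; tune them to your own hardware and storage engine.

import math

PER_SHARD_WRITES_PER_SEC = 10_000   # assumed single-shard write ceiling
PER_SHARD_STORAGE_TB = 2            # assumed usable storage per shard
REPLICATION_FACTOR = 3              # RF=3, as in the calculator above

def shard_plan(total_storage_tb: float, raw_writes_per_sec: float) -> dict:
    # Size for whichever dimension (writes or storage) needs more shards.
    shards_for_writes = math.ceil(raw_writes_per_sec / PER_SHARD_WRITES_PER_SEC)
    shards_for_storage = math.ceil(total_storage_tb / PER_SHARD_STORAGE_TB)
    shards = max(shards_for_writes, shards_for_storage, 1)
    return {
        "shards_for_writes": shards_for_writes,
        "shards_for_storage": shards_for_storage,
        "shards": shards,
        "total_nodes": shards * REPLICATION_FACTOR,
    }

print(shard_plan(total_storage_tb=12, raw_writes_per_sec=45_000))
# {'shards_for_writes': 5, 'shards_for_storage': 6, 'shards': 6, 'total_nodes': 18}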

Shard Key Strategies

The shard key determines how data is distributed. Pick the wrong one and you'll create bottlenecks.

Hash-Based

Even Distribution

hash(key) mod N maps keys uniformly across shards. Best for: random access patterns, user IDs, tenant IDs.

shard = hash(user_id) % num_shards
✓ Even distribution   ✓ Simple
✗ Range scans span all shards   ✗ Resharding requires full remap

Range-Based

Query-Friendly

Shard by ranges of the key: e.g., A-E on shard 1, F-J on shard 2. Best for: time-series, geographic data.

shard = getRangeShard(timestamp)
✓ Efficient range scans   ✓ Easy to understand
✗ Hotspots (newest partition)   ✗ Uneven if access is skewed

Directory-Based

Most Flexible

Lookup table maps keys → shards. Best for: multi-tenant SaaS, mixed access patterns, gradual migration.

shard = lookup_table.get(shard_key)
✓ No key remapping on add   ✓ Flexible routing
✗ Lookup overhead   ✗ Lookup table becomes SPOF
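
To make the three strategies concrete, here is a minimal sketch of all three routers side by side. The four-shard setup, the date split points, and the tenant directory are invented for illustration; none of this is a specific library's API.

import bisect
import hashlib

NUM_SHARDS = 4  # assumed cluster size

def hash_shard(key: str) -> int:
    # Hash-based: a stable hash (not Python's per-process salted hash()) mod N.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Range-based: sorted upper bounds map key ranges to shards 0..3.
RANGE_BOUNDS = ["2023-01", "2024-01", "2025-01"]  # keys past the last bound land on shard 3

def range_shard(month_key: str) -> int:
    return bisect.bisect_right(RANGE_BOUNDS, month_key)

# Directory-based: an explicit, editable lookup table.
DIRECTORY = {"tenant_a": 0, "tenant_b": 2}

def directory_shard(tenant: str) -> int:
    return DIRECTORY[tenant]  # a missing tenant simply hasn't been placed yet

print(hash_shard("user:1234"), range_shard("2024-07"), directory_shard("tenant_b"))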

Consistent Hashing Ring

(Interactive demo in the original: add or remove nodes and compare how many keys are remapped under naive modulo (hash % N) versus consistent hashing.)
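
A compact consistent-hash ring in Python, sketched under two assumptions: MD5 as the ring hash and 64 virtual nodes per physical node. Both are common choices, neither is mandatory.

import bisect
import hashlib

VNODES = 64  # virtual nodes per physical node; smooths the key distribution

def ring_hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes=()):
        self._ring = []  # sorted list of (point, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each physical node owns VNODES points scattered around the ring.
        for i in range(VNODES):
            bisect.insort(self._ring, (ring_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        # A key belongs to the first ring point clockwise from its hash.
        points = [p for p, _ in self._ring]
        i = bisect.bisect_right(points, ring_hash(key)) % len(self._ring)
        return self._ring[i][1]

ring = HashRing([f"shard-{i}" for i in range(4)])
print(ring.get_node("user:1234"))  # e.g., shard-2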

Resharding: Adding a Node

When you add a shard, naive modulo reassigns most keys. Consistent hashing reassigns only ~1/N.

(Interactive demo in the original: going from 4 shards to 5, it shows the percentage of keys moved under naive modulo, the percentage moved under consistent hashing, and the keys saved from remapping.)
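
Reusing the HashRing and ring_hash helpers from the sketch above, you can reproduce those numbers yourself; the 100,000 sample keys are arbitrary.

# Measure remapping when growing from 4 to 5 shards.
keys = [f"user:{i}" for i in range(100_000)]

naive_moved = sum(ring_hash(k) % 4 != ring_hash(k) % 5 for k in keys)

ring = HashRing([f"shard-{i}" for i in range(4)])
before = {k: ring.get_node(k) for k in keys}
ring.add_node("shard-4")
moved = sum(before[k] != ring.get_node(k) for k in keys)

print(f"naive modulo moved {naive_moved / len(keys):.0%}")   # ~80% of keys
print(f"consistent hashing moved {moved / len(keys):.0%}")   # ~20% of keys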

Cross-Shard Query Patterns

Some queries must touch multiple shards. Choose the right strategy based on your access pattern.
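
The workhorse pattern is scatter-gather: fan the query out to every shard in parallel, then merge, re-sort, and re-limit in the application. A sketch, assuming each shard object exposes a hypothetical query() method returning rows as dicts, with created_at as an equally hypothetical sort column:

from concurrent.futures import ThreadPoolExecutor

def scatter_gather(shards, sql, limit=10):
    # Fan out: run the same query on every shard concurrently.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda shard: shard.query(sql), shards))
    # Gather: merge partial results, then re-apply ORDER BY / LIMIT globally.
    rows = [row for part in partials for row in part]
    return sorted(rows, key=lambda r: r["created_at"], reverse=True)[:limit]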

Hot Key Mitigation

A single popular shard key (e.g., celebrity user) can overwhelm one shard. Here's how to handle it.

🔀 Key Splitting

user:1234 → user:1234:0, user:1234:1, user:1234:2

When: one key has >20% of traffic

📋 Local Cache

Redis L1 cache on each shard node for hot keys

When: read-heavy hot keys

      πŸ—³οΈ Replica Read

      Route hot key reads to read replicas, writes to primary

      βœ— Replication lag may show stale reads

🧂 Salting

Add random suffix to distribute load: user:1234:salt → shard N

When: hot key pattern is predictable
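
A sketch combining key splitting for reads with salted sub-keys for writes; the split factor of 4 and the key format are made up for illustration.

import random

SPLITS = 4  # assumed number of sub-keys for a known-hot key

def write_key(base: str) -> str:
    # Writes pick a random sub-key, spreading load across shards.
    return f"{base}:{random.randrange(SPLITS)}"

def read_keys(base: str) -> list[str]:
    # Reads fan in across every sub-key and merge in the application.
    return [f"{base}:{i}" for i in range(SPLITS)]

print(write_key("user:1234"))  # e.g., user:1234:2
print(read_keys("user:1234"))  # ['user:1234:0', ..., 'user:1234:3']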

Shard Key Decision Framework

📊 Query Pattern First

Identify your top 3 most common queries. Can they be answered by a single shard? If yes, use a shard key aligned to that query. If all queries are cross-shard, lean toward denormalization.

      βš–οΈ Write Distribution

      Will writes hit a few keys heavily (time-series, social media)? Use hash-based for even write distribution. Writes uniformly distributed? Range-based works fine.

🔒 Cardinality

Shard key must have high cardinality (thousands of distinct values or more). User ID is good; country code (~200 values) is not. Low cardinality = few large shards.

      πŸ›‘οΈ Resharding Cost

      Plan for 2Γ— expected growth to avoid resharding in <12 months. If resharding is likely, use directory-based or consistent hashing from day one.

✅ Golden Rule

If you can't answer "which shard does this query hit?" before executing, your shard key needs work. Every unnecessary scatter-gather query multiplies latency by the number of shards.

Frequently Asked Questions

When should I shard vs. just adding read replicas?

Add replicas first. Replicas handle read scaling, improve availability, and require zero application changes. Shard when: (1) you hit write throughput limits, (2) storage exceeds a single node, or (3) p99 latency spikes under write load. Typically this is 5–50M+ rows/day or 1TB+ of hot data.

What happens to foreign keys and JOINs?

Foreign keys don't automatically span shards. You have three options: denormalize (duplicate data for co-location), scatter-gather queries (run the sub-query on each shard and merge the results), or application-level joins (fetch from each shard and join in your service layer). Denormalization is most common in sharded systems.
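
An application-level join in miniature; the two literal result sets below stand in for rows fetched from different shards.

# Results fetched from the user's shard and the catalog shard, respectively.
orders = [{"id": 1, "user_id": 42, "product_id": 7},
          {"id": 2, "user_id": 42, "product_id": 9}]
products = [{"id": 7, "name": "keyboard"}, {"id": 9, "name": "mouse"}]

# The join happens in the service layer, not in the database.
by_id = {p["id"]: p for p in products}
joined = [{**order, "product": by_id[order["product_id"]]} for order in orders]
print(joined)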

Can I shard without a middleware proxy?

Yes. Some databases (MongoDB, Cassandra) handle shard routing natively. Proxy-based sharding (e.g., Vitess) adds a routing layer that your app talks to like a single database. Distributed SQL databases (CockroachDB, YugabyteDB, Spanner) shard transparently under the hood. A proxy adds latency but gives you full control; native sharding is simpler but less flexible.

How do I choose between hash and range sharding?

Hash sharding when: queries are random-access by ID, writes are uniformly distributed, you want even load. Range sharding when: queries naturally cluster (time ranges, alphabetical ranges, geographic regions), you frequently scan contiguous ranges, and data distribution is naturally chunkable.

What's the ideal number of shards?

Design for 2× your expected 18-month growth. Start with enough shards that each can handle your peak load with 50% headroom. For most mid-size systems: 4–16 shards is a good range. Too few shards = insufficient scale. Too many = coordination overhead, more cross-shard queries, harder operations. You can always add shards with consistent hashing.

How do I do cross-shard transactions?

Distributed transactions (two-phase commit, the saga pattern) are expensive and slow. The sharding philosophy prefers eventual consistency: writes go to one shard, and async events propagate to others. For true ACID across shards, use a distributed transaction coordinator (Vitess, CockroachDB, Google Spanner). Prefer single-shard transactions plus async reconciliation wherever possible.
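
A toy version of that philosophy using the transactional-outbox idea: the write and its outgoing event commit together on one shard, and a relay propagates the event later. Everything below is an in-memory stand-in, not a real driver or bus API.

# One shard's state: local data plus an outbox, committed together.
shard = {"balances": {42: 100}, "outbox": []}

def debit(user_id: int, amount: int) -> None:
    # Single-shard "transaction": both mutations succeed or neither does.
    if shard["balances"][user_id] < amount:
        raise ValueError("insufficient funds")
    shard["balances"][user_id] -= amount
    shard["outbox"].append({"event": "debited", "user": user_id, "amount": amount})

def relay() -> None:
    # In production this is a separate process draining the outbox to an
    # event bus; other shards consume the events and converge asynchronously.
    while shard["outbox"]:
        print("publish:", shard["outbox"].pop(0))

debit(42, 30)
relay()  # publish: {'event': 'debited', 'user': 42, 'amount': 30}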