Ride Sharing

A ride-sharing platform must answer "who is the closest driver to this rider, right now?" hundreds of millions of times a day under tight latency. The core problem is geospatial: index a sea of moving driver pings, match them to riders, price the trip dynamically, and predict ETA — while keeping per-region fan-out costs bounded as the number of active drivers in a city grows. The interesting decisions are cell shape (S2 squares vs H3 hexagons), fan-out granularity, and matching policy.

Architecture

Capacity Estimation

Metric	Value	Notes
Active drivers worldwide	~5 M	Uber-scale peak
Location updates/s	~1 M	5 s heartbeat × 5 M
Active drivers per city center	~10 K	Manhattan, peak
Ride requests/s peak	~30 K	Friday 6 PM aggregated
Match latency p95	< 2 s	request to offer
Geo index memory	~5 GB	5 M drivers × 1 KB state

Geospatial Indexing: S2, Geohash, H3

Three flavors, same goal: convert (lat, lon) to a cell ID so neighbors share a prefix and range queries are cheap.

Geohash — base32-encoded interleaved lat/lon bits; longer prefix = smaller cell. Easy to teach, but cell shapes are rectangular and aspect-ratio distorted at the equator vs poles, and neighboring cells can have wildly different lengths.
S2 (Google) — projects the sphere onto an inscribed cube and tiles each face with quadtree cells (Hilbert curve). Square-ish cells, well-defined hierarchy, used by Google Maps and Foursquare.
H3 (Uber) — hexagonal tiling. Hexagons have uniform neighbor distance (every neighbor is the same distance from the center), no awkward corners, and clean "k-ring" expansion. Uber moved from quadtrees to H3 specifically for the consistency of neighbor queries.

For ride matching, H3 wins: the rider's cell + 1-ring (6 neighbors) covers the natural search radius without the rectangular bias. Choose resolution by city density: H3 res 8 (~0.7 km²) for typical cities, res 9 for Manhattan-scale hotspots.

Driver-Rider Matching

The naive matcher finds the closest driver. Reality is messier: you optimize a multi-objective function over a window:

Distance — ETA from driver to rider.
Driver utilization — spread work to keep drivers earning, not all-to-one.
Pickup vs dropoff direction — prefer drivers heading toward the rider's destination (less deadhead).
Acceptance rate — assign to drivers likely to accept, reducing reoffer latency.

Implementation: batched bipartite matching. Hold ride requests for 1–5 seconds, build a cost matrix of (rider, driver) ETAs, run Hungarian algorithm or its approximations. This is strictly better than the per-request greedy at the cost of a small accepted delay.

After matching, the offer goes to the driver. If declined within 10 s, requeue and try the next-best. Persistent decline causes a soft penalty on the driver's acceptance score.

Surge Pricing

Surge balances supply and demand: when riders > drivers in a cell, the price multiplier rises to (a) deter price-insensitive demand, (b) attract drivers from neighboring cells. The classic formula:

multiplier = max(1.0, demand_rate / supply_rate * elasticity_factor) per (H3 cell, 5-minute window). Smoothed with EWMA to avoid step-function jumps that confuse riders.

The hard part is spillover: a high-surge cell sucks drivers from neighbors, which then surge. The system must enforce a floor (drivers per cell minimum) and gradient bounds. Empirically Uber publishes 1.0–5.0× bounds; airports and stadiums get separate policies.

ETA Calculation

Per-request ETA is a routing graph problem: shortest path on a road network with edge weights = current travel time. The graph has tens of millions of edges; Dijkstra alone is too slow. Production systems use contraction hierarchies or customizable contraction hierarchies (CCH) for sub-millisecond queries on continental graphs.

Edge weights come from a stream of GPS pings from active vehicles fed into a real-time pipeline (Kafka + Flink, often). Per-edge median travel time over the last 5 minutes, with seasonal baselines for cold edges. ML models add features (weather, events, road closures); RMSE on Uber's ETA is reportedly ~30 s on a 10-min trip.

Real-time Location Updates

Drivers stream GPS at 4–5 s intervals via a long-lived gRPC stream (or WebSocket). Server-side: ingest at the edge, sample/compress, write to:

In-memory geo index — latest position per driver, for matching. Redis GEOADD works for small fleets; for millions of drivers, a custom sharded structure keyed by H3 cell.
Kafka topic — for downstream consumers (ETA model retraining, analytics, fraud).
Trip path log — for the active trip only, written to a compact time-series store; needed for receipts and disputes.

Battery is a real concern: motion-sensitive heartbeating cuts updates when stationary. Server adapts: if a driver hasn't moved in 60 s, reduce frequency to 30 s.

The Fan-out Problem

"How many drivers in this H3 cell?" is asked thousands of times per second per dense cell. If the geo index is sharded by driver_id, every query fans out to all shards. Better: shard by H3 cell, so each cell's driver set lives on one shard. Trade-off: cell-shard skew (Manhattan shard is hot) requires the hottest cells to be replicated.

Uber's public papers describe a hierarchical structure: ringpop (gossip-based) routes by H3 cell parent, with hot cells split. The general lesson: locality of the query beats locality of the entity.

Failure Modes

Surge runaway — bug in supply estimate sends multiplier to 50×. Hard ceiling and human-approval gate above 5×.
Stale driver pings — driver lost connection but stays in the index, gets matched, never accepts. TTL pings with 30 s timeout; on TTL expiry, evict from index.
Boundary cell starvation — rider sits exactly on a cell boundary; only drivers in their cell are searched, missing close drivers in the neighbor. Always search the rider's cell + 1-ring.
Cross-region disaster — entire region goes down. Active-active per-city deployment: each city is a self-contained shard; a regional outage kills only that city.

FAQ

Why hexagons over squares?

Hexagons have a single neighbor distance (6 neighbors all equidistant); squares have two (4 edge + 4 corner). For radius search and clustering, hexagons remove a class of edge-case bugs.

Why batched matching, not greedy?

Greedy is locally optimal but globally bad: it gives close drivers to early requests and leaves later requests stranded. A 2–5 second batch window plus Hungarian matching reduces total wait time across all riders.

How do you handle the airport queue?

Special "virtual queue" cells: drivers entering the airport perimeter are queued in arrival order; the next ride is offered FIFO. Without this, drivers race to the curb.

Where does ML fit?

ETA prediction (gradient-boosted trees on traffic features), surge demand forecasting, fraud detection (driver/rider account abuse). Matching itself is mostly classical optimization with ML-tweaked weights.