Design Uber / Ride-Sharing
Geospatial Indexing, Ride Matching, Surge Pricing, ETA Calculation, and Real-Time Location Tracking
Uber processes 25M+ rides per day across 10,000+ cities. The core challenges: tracking millions of drivers sending location updates every few seconds via WebSockets, performing geospatial queries to match riders with nearby drivers in real time using geohash or H3 indexing, computing dynamic ETA factoring in traffic, road networks, and historical patterns, implementing surge pricing that balances supply and demand, and building a matching algorithm that optimizes for rider wait time, driver utilization, and fairness. At peak, the system handles millions of location updates per second and completes matches in under 5 seconds.
Geospatial Indexing Visualizer (Geohash)
Geohash divides the world into a grid of cells. Each character added to the hash increases precision. Nearby locations share common prefixes, enabling efficient spatial queries.
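The prefix property described above can be seen in a minimal sketch of the standard geohash encoding: longitude and latitude ranges are halved alternately, and every 5 bits become one base-32 character. The function name `geohash_encode` and its default precision are illustrative choices, not part of any particular library.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # bits accumulated for the current character
    bit_count = 0
    use_lon = True    # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # 5 bits per base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)

if __name__ == "__main__":
    # A coarser hash is always a prefix of a finer one for the same point.
    print(geohash_encode(57.64911, 10.40744, 4))
    print(geohash_encode(57.64911, 10.40744, 6))
```

Because each extra character only subdivides the current cell, a driver's precision-6 hash always starts with its precision-4 hash, which is what makes prefix lookups work for coarse proximity queries.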
Ride Matching Algorithm Simulator
When a rider requests a ride, the system finds nearby drivers, scores them by distance, rating, and acceptance rate, then matches the best candidate.
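The find, score, and match steps can be sketched as follows. This is an illustration under stated assumptions: the `Driver` fields, the flat x/y grid, the 5-unit search radius, and the 0.6 / 0.25 / 0.15 weights are all hypothetical values chosen for the example, not Uber's actual scoring.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class Driver:
    driver_id: str
    x: float                # grid position (hypothetical units, e.g. km)
    y: float
    rating: float           # 1.0 to 5.0
    acceptance_rate: float  # 0.0 to 1.0

def score(driver: Driver, rider_x: float, rider_y: float,
          max_radius: float = 5.0) -> Optional[float]:
    """Return a score in [0, 1], or None if the driver is out of range."""
    dist = math.hypot(driver.x - rider_x, driver.y - rider_y)
    if dist > max_radius:
        return None
    proximity = 1.0 - dist / max_radius  # 1.0 when the driver is at the rider
    # Assumed weights: distance matters most, then rating, then acceptance.
    return (0.6 * proximity
            + 0.25 * (driver.rating / 5.0)
            + 0.15 * driver.acceptance_rate)

def match(drivers: Iterable[Driver], rider_x: float, rider_y: float,
          max_radius: float = 5.0) -> Optional[Driver]:
    """Pick the highest-scoring in-range driver, or None if nobody is nearby."""
    best, best_score = None, -1.0
    for d in drivers:
        s = score(d, rider_x, rider_y, max_radius)
        if s is not None and s > best_score:
            best, best_score = d, s
    return best
```

A production matcher would typically score by estimated pickup time rather than straight-line distance and would batch requests to optimize globally; the greedy per-request choice here is only the simplest form of the idea.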
Capacity Estimation
Estimate the infrastructure requirements for a ride-sharing platform at scale.
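A back-of-the-envelope version of this estimate can be sketched from the figures used elsewhere in this article: 25M rides per day, 5M active drivers, one location update every 3-5 seconds (4 seconds taken as the midpoint), and 100K operations/sec per Redis node. The 100-byte update payload is an assumption made for the example.

```python
import math

def estimate(active_drivers: int = 5_000_000,
             update_interval_s: float = 4.0,
             rides_per_day: int = 25_000_000,
             bytes_per_update: int = 100,       # assumed payload size
             ops_per_node: int = 100_000) -> dict:
    """Rough capacity numbers for the location-tracking path."""
    updates_per_s = active_drivers / update_interval_s
    return {
        "location_updates_per_s": updates_per_s,
        "avg_ride_requests_per_s": rides_per_day / 86_400,
        "ingest_mb_per_s": updates_per_s * bytes_per_update / 1e6,
        # Lower bound: writes only, no headroom for reads or peak load.
        "min_location_shards": math.ceil(updates_per_s / ops_per_node),
    }

if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.1f}")
```

With these inputs the write path dominates: about 1.25M location updates per second against roughly 290 ride requests per second on average, so the location store needs at least 13 shards before any allowance for queries or peak traffic.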
ETA Calculation Engine
ETA combines straight-line distance with road network factors and real-time traffic conditions.
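As a rough sketch of how these three inputs combine, the example below multiplies the haversine (great-circle) distance by a road detour factor, converts it to time at a free-flow speed, and scales the result by a time-of-day traffic multiplier. The detour factor of 1.4, the 40 km/h speed, and the rush-hour windows are assumed values for illustration; a real engine would route over the road graph and use live and historical traffic data.

```python
import math

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometers."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def traffic_multiplier(hour: int) -> float:
    """Assumed time-of-day slowdown; hour is 0-23 local time."""
    if 7 <= hour < 10 or 16 <= hour < 19:
        return 1.6   # rush hour
    if hour >= 22 or hour < 5:
        return 0.9   # late night, lighter traffic
    return 1.0

def eta_minutes(start: tuple, end: tuple, hour: int,
                detour_factor: float = 1.4,
                free_flow_kmh: float = 40.0) -> float:
    """Estimate travel time in minutes between (lat, lon) points."""
    road_km = haversine_km(*start, *end) * detour_factor
    return road_km / free_flow_kmh * 60.0 * traffic_multiplier(hour)
```

Holding the route fixed, the same trip takes 1.6 times longer at 08:00 than at 13:00 under these assumed multipliers.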
Surge Pricing Calculator
Surge pricing dynamically adjusts fares based on the ratio of ride requests (demand) to available drivers (supply). When demand exceeds supply, prices rise, which draws more drivers into the area and tempers demand until the two rebalance.
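One simple way to turn the demand-to-supply ratio into a fare multiplier is to clamp it between 1.0 and a cap. This is a minimal sketch, assuming a linear mapping, a 3.0x cap, and rounding to one decimal place; real systems typically smooth the signal over time and per geographic cell.

```python
def surge_multiplier(requests: int, available_drivers: int, cap: float = 3.0) -> float:
    """Map demand/supply in one area to a fare multiplier in [1.0, cap]."""
    if requests <= 0:
        return 1.0          # no demand, no surge
    if available_drivers <= 0:
        return cap          # demand with no supply: charge the maximum
    ratio = requests / available_drivers
    return round(min(max(ratio, 1.0), cap), 1)

def surged_fare(base_fare: float, requests: int, available_drivers: int) -> float:
    """Apply the multiplier to a base fare."""
    return round(base_fare * surge_multiplier(requests, available_drivers), 2)
```

For example, 150 requests against 100 available drivers yields a 1.5x multiplier, so a 12.00 base fare becomes 18.00.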
Architecture
Key Design Decisions
Geohash vs QuadTree vs H3
Geohash
- String-based, easy to store in Redis
- Prefix matching for proximity
- Edge cases at cell boundaries (adjacent points can have different prefixes)
- Simple to implement
QuadTree
- Adaptive resolution per region
- Better for non-uniform distribution
- In-memory tree structure
- Complex to distribute across nodes
H3
- Hexagonal grid, near-uniform distances between neighboring cells
- Hierarchical resolution levels
- Fewer boundary edge cases (every neighbor shares an edge)
- Open source, battle-tested
Location Storage: Redis vs In-Memory
Redis with GEO commands is the recommended choice. GEOADD stores driver locations, and GEORADIUS (deprecated since Redis 6.2 in favor of GEOSEARCH) queries nearby drivers within a given radius. A single Redis node handles 100K+ operations/sec. For extreme scale, shard by geohash prefix. In-process memory stores are faster but lose data on restart and are harder to distribute.
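The store-and-query pattern might look like the sketch below. It assumes `client` is a `redis.Redis` instance from the redis-py package (version 4 or later) and uses `geosearch`, which issues GEOSEARCH, the replacement for GEORADIUS in Redis 6.2+. The key name `drivers:locations` and both helper functions are hypothetical.

```python
DRIVER_KEY = "drivers:locations"  # hypothetical key holding all driver positions

def record_location(client, driver_id: str, lon: float, lat: float) -> None:
    """Upsert a driver's position. Note that Redis takes longitude first."""
    # redis-py expects a flat (lon, lat, member) sequence for GEOADD.
    client.geoadd(DRIVER_KEY, (lon, lat, driver_id))

def nearby_drivers(client, lon: float, lat: float,
                   radius_km: float = 3.0, limit: int = 10) -> list:
    """Return up to `limit` driver ids within `radius_km`, nearest first."""
    return client.geosearch(
        DRIVER_KEY,
        longitude=lon,
        latitude=lat,
        radius=radius_km,
        unit="km",
        sort="ASC",
        count=limit,
    )

# Usage, assuming a reachable Redis server:
#   import redis
#   client = redis.Redis(host="localhost", port=6379, decode_responses=True)
#   record_location(client, "driver-42", -122.4194, 37.7749)
#   nearby_drivers(client, -122.4194, 37.7749, radius_km=2.0)
```

Sharding by geohash prefix would replace the single key with one key per prefix, at the cost of querying neighboring shards when the search radius crosses a cell boundary.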
Push vs Pull for Driver Locations
Push model via WebSockets: drivers push location updates every 3-5 seconds, and the server does not poll. WebSocket connections are persistent, reducing connection overhead. For 5M active drivers, that means ~5M concurrent WebSocket connections distributed across a fleet of connection servers, carrying roughly 1M-1.7M location updates per second. Use consistent hashing to route each driver to a specific server.
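The routing step can be illustrated with a small consistent-hash ring. This is a minimal sketch, assuming MD5 as the hash function and 100 virtual nodes per server; the class name `HashRing` and the server names are hypothetical.

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring mapping driver ids to connection servers."""

    def __init__(self, servers=(), vnodes: int = 100):
        self._ring = []  # sorted list of (hash, server) points
        for server in servers:
            self.add(server, vnodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add(self, server: str, vnodes: int = 100) -> None:
        # Virtual nodes spread each server around the ring for better balance.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server: str) -> None:
        self._ring = [point for point in self._ring if point[1] != server]

    def lookup(self, driver_id: str) -> str:
        """Return the first server clockwise from the driver's hash."""
        if not self._ring:
            raise LookupError("no servers in ring")
        idx = bisect.bisect_right(self._ring, (self._hash(driver_id), ""))
        return self._ring[idx % len(self._ring)][1]

if __name__ == "__main__":
    ring = HashRing(["ws-1", "ws-2", "ws-3"])
    print(ring.lookup("driver-42"))
```

When a connection server is removed, only the drivers that were mapped to it are reassigned; everyone else keeps their existing server, which limits reconnect storms during deploys or failures.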