Design a Ride-Sharing Service (Uber)


We’re designing a ride-sharing service like Uber or Lyft. This is a fantastic interview question because it hits real-time systems, geospatial queries, matching algorithms, and distributed state — all in one problem.

The core challenge: a rider requests a ride, and within seconds we need to find the nearest available driver, match them, track both in real time, calculate the fare, and process payment. All of this has to work at 20 million rides per day across hundreds of cities.

Step 1: Requirements

Functional Requirements

  • Riders can request a ride by setting pickup and dropoff locations
  • The system matches riders with the nearest available driver
  • Real-time tracking of the driver’s location (for both rider and driver)
  • ETA calculation for driver arrival and trip duration
  • Fare estimation before the ride and final fare calculation after
  • Payment processing at the end of the ride
  • Rating system for riders and drivers
  • Ride history for both riders and drivers

Non-Functional Requirements

  • Low latency matching — find a driver in < 10 seconds
  • Real-time updates — location updates should reflect in < 1 second
  • High availability — riders stranded without a working app is a very bad look
  • Consistency for payments — we can never double-charge or lose a payment
  • Scale — 20M rides/day, 5M active drivers, location updates every 4 seconds

Step 2: Estimation

Assumptions:

  • 100M total riders, 5M active drivers
  • 20M rides per day
  • Each active driver sends location every 4 seconds
  • Average ride duration: 15 minutes
  • Peak hours: 3x average load

QPS:

Ride requests:      20M / 86,400 ≈ 230 requests/sec
Peak ride requests: 230 × 3 ≈ 700 requests/sec

Location updates:   5M drivers × (1 update / 4 sec) = 1.25M updates/sec
Peak:               ~2M updates/sec

That location update number is wild. 1.25 million writes per second just for driver locations. This is the single hardest scaling challenge in this system.

Storage:

Location update size: ~100 bytes (driver_id, lat, lng, timestamp, heading)
Location writes/day:  1.25M/sec × 86,400 = ~108B writes/day
If we keep 30 days of history: 108B × 100 bytes × 30 = ~324 TB

Ride data: 20M rides/day × 1 KB per ride = 20 GB/day

We don’t need to store every single location update forever. Current driver locations live in Redis (real-time). Historical location traces can go to a time-series database for analytics and trip reconstruction.
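
The arithmetic in this step is easy to get wrong under interview pressure, so here is a small sanity check in Python. It is only a sketch that plugs in the assumptions listed above; the variable names are illustrative.

```python
# Back-of-envelope numbers from the assumptions in this step.
SECONDS_PER_DAY = 86_400

rides_per_day = 20_000_000
active_drivers = 5_000_000
update_interval_sec = 4
update_size_bytes = 100      # driver_id, lat, lng, timestamp, heading
retention_days = 30
peak_factor = 3              # peak ride load relative to average

ride_qps = rides_per_day / SECONDS_PER_DAY
peak_ride_qps = ride_qps * peak_factor
location_qps = active_drivers / update_interval_sec
location_writes_per_day = location_qps * SECONDS_PER_DAY
history_tb = location_writes_per_day * update_size_bytes * retention_days / 1e12
ride_gb_per_day = rides_per_day * 1_000 / 1e9   # 1 KB per ride

print(f"ride requests/sec:       {ride_qps:,.0f}")
print(f"peak ride requests/sec:  {peak_ride_qps:,.0f}")
print(f"location updates/sec:    {location_qps:,.0f}")
print(f"location writes/day:     {location_writes_per_day:,.0f}")
print(f"30-day location history: {history_tb:,.0f} TB")
print(f"ride data per day:       {ride_gb_per_day:,.0f} GB")
```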

Step 3: High-Level Design

Ride-Sharing: High-Level Architecture

  • Clients: Rider App and Driver App, each talking to the backend over REST + WebSocket
  • Entry point: API Gateway / Load Balancer
  • Matching Service: find nearby drivers
  • Location Service: track all drivers
  • Trip Service: manage ride lifecycle
  • Pricing Service: fare + surge
  • Payment Service: charge rider
  • Storage and messaging: Redis (driver locations), PostgreSQL (rides, users), Kafka (events)

Ride flow: Rider requests → Matching finds driver → Trip manages ride → Payment charges at end
Location flow: Driver app → Location Service → Redis (every 4 sec) → pushed to rider via WebSocket

The ride lifecycle:

  1. Rider requests a ride — sends pickup/dropoff location to the API
  2. Pricing Service — calculates the estimated fare (including surge pricing if applicable)
  3. Matching Service — queries the Location Service for nearby available drivers, picks the best one
  4. Driver notified — push notification + WebSocket update. Driver accepts or declines.
  5. Trip starts — driver heads to pickup. Rider sees driver’s real-time location.
  6. In transit — driver picks up rider, heads to destination. Location tracked throughout.
  7. Trip ends — driver marks ride complete. Final fare calculated based on actual distance/time.
  8. Payment — rider charged, driver paid (minus platform fee)

Step 4: API Design

POST /api/v1/rides/estimate
  Body: { "pickup": { "lat": 37.7749, "lng": -122.4194 },
          "dropoff": { "lat": 37.7849, "lng": -122.4094 } }
  Response: { "estimated_fare": "$12.50", "estimated_time": "15 min",
              "surge_multiplier": 1.2 }

POST /api/v1/rides/request
  Body: { "pickup": { "lat": 37.7749, "lng": -122.4194 },
          "dropoff": { "lat": 37.7849, "lng": -122.4094 },
          "ride_type": "standard" }
  Response: { "ride_id": "ride_456", "status": "matching",
              "estimated_pickup": "4 min" }

GET /api/v1/rides/{ride_id}
  → Current ride status, driver info, location, ETA

POST /api/v1/rides/{ride_id}/cancel
POST /api/v1/rides/{ride_id}/rate
  Body: { "rating": 5, "comment": "Great ride!" }

-- Driver endpoints:
PUT /api/v1/drivers/location
  Body: { "lat": 37.7750, "lng": -122.4195, "heading": 180, "speed": 30 }

POST /api/v1/rides/{ride_id}/accept
POST /api/v1/rides/{ride_id}/start      -- driver picked up the rider
POST /api/v1/rides/{ride_id}/complete    -- driver arrived at destination

Real-time communication happens over WebSocket. Both the rider and driver apps maintain a persistent WebSocket connection. Through this, we push:

  • Driver location updates to the rider
  • Ride status changes (driver assigned, arriving, trip started, etc.)
  • Navigation updates to the driver

Step 5: Data Model

-- Users table (PostgreSQL)
CREATE TABLE users (
    user_id         BIGINT PRIMARY KEY,
    type            VARCHAR(10),             -- 'rider' or 'driver'
    name            VARCHAR(100),
    email           VARCHAR(255) UNIQUE,
    phone           VARCHAR(20),
    rating          DECIMAL(3,2) DEFAULT 5.0,
    total_rides     INT DEFAULT 0,
    created_at      TIMESTAMP
);

-- Driver details (PostgreSQL)
CREATE TABLE drivers (
    driver_id       BIGINT PRIMARY KEY REFERENCES users(user_id),
    vehicle_make    VARCHAR(50),
    vehicle_model   VARCHAR(50),
    vehicle_plate   VARCHAR(20),
    vehicle_color   VARCHAR(30),
    license_number  VARCHAR(50),
    status          VARCHAR(20),             -- 'available', 'busy', 'offline'
    current_city    VARCHAR(50)
);

-- Rides table (PostgreSQL)
CREATE TABLE rides (
    ride_id         BIGINT PRIMARY KEY,
    rider_id        BIGINT NOT NULL,
    driver_id       BIGINT,
    status          VARCHAR(20),             -- 'matching', 'accepted', 'arriving',
                                             -- 'in_progress', 'completed', 'cancelled'
    pickup_lat      DECIMAL(10,7),
    pickup_lng      DECIMAL(10,7),
    dropoff_lat     DECIMAL(10,7),
    dropoff_lng     DECIMAL(10,7),
    estimated_fare  DECIMAL(10,2),
    actual_fare     DECIMAL(10,2),
    surge_multiplier DECIMAL(3,2) DEFAULT 1.0,
    distance_km     DECIMAL(10,2),
    duration_min    INT,
    requested_at    TIMESTAMP,
    started_at      TIMESTAMP,
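    completed_at    TIMESTAMP
);

-- PostgreSQL does not accept inline INDEX clauses inside CREATE TABLE,
-- so the indexes are created as separate statements
CREATE INDEX idx_rider  ON rides (rider_id, requested_at DESC);
CREATE INDEX idx_driver ON rides (driver_id, requested_at DESC);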

-- Payments table (PostgreSQL)
CREATE TABLE payments (
    payment_id      BIGINT PRIMARY KEY,
    ride_id         BIGINT UNIQUE NOT NULL,
    rider_id        BIGINT NOT NULL,
    driver_id       BIGINT NOT NULL,
    amount          DECIMAL(10,2),
    platform_fee    DECIMAL(10,2),
    driver_payout   DECIMAL(10,2),
    status          VARCHAR(20),             -- 'pending', 'charged', 'paid_out', 'refunded'
    payment_method  VARCHAR(20),
    processed_at    TIMESTAMP
);

-- Driver locations (Redis — real-time, not persistent)
-- Using Redis GEO commands for geospatial queries
-- GEOADD drivers:available {lng} {lat} {driver_id}
-- GEORADIUS drivers:available {lng} {lat} 5 km COUNT 20 ASC

Step 6: Deep Dives

Deep Dive 1: Geospatial Indexing — Finding Nearby Drivers

When a rider requests a ride, we need to answer: “Which available drivers are within 5 km of this location?” And we need to answer it in milliseconds, across millions of drivers.

Option A: Brute force (don’t do this)

Scan all 5M drivers, calculate the distance to the rider for each one, filter by radius. That’s O(n) for every request. Way too slow.
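
To make the cost concrete, here is roughly what the brute-force scan looks like. This is a sketch, not production code: it uses the haversine formula for great-circle distance, and the function names and the third driver (`driver_07`) are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_brute_force(drivers, lat, lng, radius_km=5.0):
    """Scan every driver: one distance computation per driver, per request."""
    hits = []
    for driver_id, (d_lat, d_lng) in drivers.items():
        dist = haversine_km(lat, lng, d_lat, d_lng)
        if dist <= radius_km:
            hits.append((dist, driver_id))
    return [driver_id for _, driver_id in sorted(hits)]


drivers = {
    "driver_42": (37.7749, -122.4194),
    "driver_99": (37.7850, -122.4095),
    "driver_07": (37.3382, -121.8863),  # hypothetical driver in San Jose, far outside 5 km
}
print(nearby_brute_force(drivers, 37.7749, -122.4194))
```

With 5M drivers and a few hundred ride requests per second, that inner loop runs on the order of a billion times per second, which is why we need an index instead.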

Option B: Geohashing

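Think of it like a zip code for GPS coordinates. We divide the entire earth into a grid, and each cell gets a hash string. The clever part: cells that share a long common prefix are geographically close. The reverse doesn’t always hold (two adjacent cells on opposite sides of a grid boundary can have very different hashes), which is why we also search neighboring cells.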

Geohash: 9q8yyk → a grid cell in San Francisco
Geohash: 9q8yym → the cell right next to it

They share prefix "9q8yy" → they're neighbors

How we use it:

  1. When a driver sends a location update, we compute their geohash and store it
  2. When a rider requests a ride, we compute the rider’s geohash
  3. We search the rider’s geohash cell AND all neighboring cells for available drivers
  4. Since geohash cells have a fixed size, this narrows our search from 5M drivers to maybe 50-100 in the area
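
The steps above can be sketched in plain Python. This is a minimal illustration of the standard geohash bit-interleaving scheme, not a tuned library: the function names and the `drivers_by_cell` dictionary are assumptions, and the neighbor lookup ignores edge cases at the poles and the antimeridian.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def encode(lat, lng, precision=6):
    """Encode a coordinate by interleaving longitude/latitude bisection bits."""
    lat_rng, lng_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, value, nbits, use_lng = [], 0, 0, True
    while len(chars) < precision:
        rng, coord = (lng_rng, lng) if use_lng else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid      # keep the upper half
        else:
            value <<= 1
            rng[1] = mid      # keep the lower half
        use_lng = not use_lng
        nbits += 1
        if nbits == 5:        # every 5 bits become one base32 character
            chars.append(BASE32[value])
            value, nbits = 0, 0
    return "".join(chars)


def bounds(geohash):
    """Return (lat_lo, lat_hi, lng_lo, lng_hi) of a geohash cell."""
    lat_rng, lng_rng = [-90.0, 90.0], [-180.0, 180.0]
    use_lng = True
    for ch in geohash:
        idx = BASE32.index(ch)
        for shift in range(4, -1, -1):
            rng = lng_rng if use_lng else lat_rng
            mid = (rng[0] + rng[1]) / 2
            if (idx >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lng = not use_lng
    return lat_rng[0], lat_rng[1], lng_rng[0], lng_rng[1]


def cell_and_neighbors(lat, lng, precision=6):
    """The cell containing the point plus its 8 surrounding cells."""
    lat_lo, lat_hi, lng_lo, lng_hi = bounds(encode(lat, lng, precision))
    height, width = lat_hi - lat_lo, lng_hi - lng_lo
    c_lat, c_lng = (lat_lo + lat_hi) / 2, (lng_lo + lng_hi) / 2
    return {encode(c_lat + dy * height, c_lng + dx * width, precision)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)}


# Step 1: index drivers by cell. Precision 5 cells are roughly 5 km on a side,
# which suits a 5 km search radius.
PRECISION = 5
drivers_by_cell = {}
for driver_id, (d_lat, d_lng) in {"driver_42": (37.7749, -122.4194),
                                  "driver_99": (37.7850, -122.4095)}.items():
    drivers_by_cell.setdefault(encode(d_lat, d_lng, PRECISION), set()).add(driver_id)

# Steps 2-3: compute the rider's cell, then search it and its neighbors.
search_cells = cell_and_neighbors(37.7749, -122.4194, PRECISION)
candidates = set().union(*(drivers_by_cell.get(c, set()) for c in search_cells))
print(encode(37.7749, -122.4194), sorted(candidates))
```

The precision is the knob to watch: pick cells much smaller than the search radius and the 3×3 block of cells will miss drivers who are actually in range.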

Option C: Redis GEO (what we’d actually use)

Redis has built-in geospatial support using a sorted set with geohash encoding under the hood.

GEOADD drivers:available -122.4194 37.7749 driver_42
GEOADD drivers:available -122.4095 37.7850 driver_99

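GEORADIUS drivers:available -122.4194 37.7749 5 km COUNT 20 ASC
→ Returns up to 20 of the closest drivers within 5 km, sorted nearest first

-- Redis 6.2+ deprecates GEORADIUS in favor of GEOSEARCH; the equivalent query:
GEOSEARCH drivers:available FROMLONLAT -122.4194 37.7749 BYRADIUS 5 km ASC COUNT 20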

In simple language, Redis GEO does the geohashing for us. We just say “add this driver at this coordinate” and “find me drivers near this point.” It’s incredibly fast because it’s all in-memory and uses a sorted set internally.

Why not a quadtree? Quadtrees work great too — Uber actually used a custom quadtree for a while. But Redis GEO is simpler to operate and scales well for most ride-sharing scenarios. At Uber’s scale, they moved to a custom solution (H3 — a hexagonal grid system), but for an interview, Redis GEO or geohashing is the right answer.

Deep Dive 2: Driver-Rider Matching

Finding nearby drivers is step one. But which driver do we actually assign? The closest one isn’t always the best choice.

Simple approach: Closest driver

Find the nearest available driver, send them the request. If they decline, move to the next closest. Simple, but not optimal.

Better approach: ETA-based matching

The closest driver by straight-line distance might be on the other side of a highway. A driver slightly farther away might actually arrive sooner because of road layout.

Driver A: 1.2 km away (straight line), but ETA = 8 minutes (blocked by river)
Driver B: 1.8 km away (straight line), but ETA = 4 minutes (clear road)

→ We should pick Driver B

We compute the actual driving ETA (using a routing engine like OSRM or Google Maps Directions API) for the top 5-10 closest drivers, then pick the one with the shortest ETA.

Advanced approach: Scoring function

Large platforms such as Uber are generally described as scoring candidates on several factors rather than ETA alone. An illustrative form:

Score = w1 × (1 / ETA)                  -- shorter ETA is better
      + w2 × driver_rating              -- higher-rated drivers preferred
      + w3 × acceptance_rate            -- drivers who accept more rides
      + w4 × earnings_fairness          -- distribute rides fairly

The matching service computes this score for the top candidates and sends the request to the highest-scoring driver. If they don’t accept within 10 seconds, it moves to the next one.
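
Here is one way the scoring step could look in code. Treat it as a sketch: the weights, the 0.5-minute ETA floor, and the `Candidate` fields are all assumptions chosen for illustration, and the two drivers reuse the ETAs from the Driver A / Driver B example above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    driver_id: str
    eta_min: float            # driving ETA to pickup, in minutes
    rating: float             # 1.0 to 5.0
    acceptance_rate: float    # 0.0 to 1.0
    earnings_fairness: float  # 0.0 to 1.0, higher means the driver is due more rides


# Hypothetical weights; real values would be tuned against marketplace data.
W_ETA, W_RATING, W_ACCEPT, W_FAIR = 10.0, 1.0, 2.0, 1.0


def score(c: Candidate) -> float:
    eta = max(c.eta_min, 0.5)  # assumed floor, avoids dividing by zero
    return (W_ETA * (1.0 / eta)
            + W_RATING * c.rating
            + W_ACCEPT * c.acceptance_rate
            + W_FAIR * c.earnings_fairness)


def rank(candidates):
    """Highest score first; the request goes to rank(...)[0], then [1], and so on."""
    return sorted(candidates, key=score, reverse=True)


driver_a = Candidate("driver_A", eta_min=8, rating=4.9,
                     acceptance_rate=0.9, earnings_fairness=0.5)
driver_b = Candidate("driver_B", eta_min=4, rating=4.7,
                     acceptance_rate=0.8, earnings_fairness=0.5)
ranked = rank([driver_a, driver_b])
print([c.driver_id for c in ranked])
```

With these weights Driver B wins despite the slightly lower rating, because the ETA term dominates. Changing the weights changes who wins, which is exactly the tuning lever the scoring approach gives us.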

Batch matching:

During peak hours, there might be many riders and drivers in the same area. Instead of matching one-by-one, we can batch — collect all ride requests and available drivers in a time window (say 2 seconds), and solve the optimal matching problem for the whole batch. This gives globally better matches but adds a small delay.
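
A tiny example shows why batching can beat one-by-one matching. The sketch below compares a greedy matcher with a brute-force optimal one on a made-up 2×2 ETA matrix. Brute force is only viable for toy batches; a real system would more likely use an assignment solver such as the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`). It also assumes there are at least as many drivers as riders.

```python
from itertools import permutations


def greedy_match(eta):
    """Assign riders in arrival order to their closest remaining driver."""
    taken, pairs = set(), {}
    for rider, row in enumerate(eta):
        free = [d for d in range(len(row)) if d not in taken]
        best = min(free, key=lambda d: row[d])
        taken.add(best)
        pairs[rider] = best
    return pairs


def batch_match(eta):
    """Try every assignment and keep the lowest total ETA (O(n!), toy sizes only)."""
    n_riders, n_drivers = len(eta), len(eta[0])
    best_total, best_perm = float("inf"), None
    for perm in permutations(range(n_drivers), n_riders):
        total = sum(eta[r][d] for r, d in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm
    return dict(enumerate(best_perm))


def total_eta(eta, pairs):
    return sum(eta[r][d] for r, d in pairs.items())


# eta[rider][driver] in minutes; values are invented for illustration.
eta = [
    [2, 3],    # rider 0
    [3, 10],   # rider 1
]
print("greedy:", greedy_match(eta), total_eta(eta, greedy_match(eta)))
print("batch: ", batch_match(eta), total_eta(eta, batch_match(eta)))
```

Greedy gives rider 0 their best driver and leaves rider 1 with a 10-minute wait. The batch solution gives rider 0 a slightly worse driver and halves the total wait.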

Deep Dive 3: Real-Time Location Tracking

Every active driver sends their GPS location to our system every 4 seconds. At 5M active drivers, that’s 1.25M location updates per second. How do we handle this firehose?

The write path:

  1. Driver app sends location to the API gateway
  2. API gateway routes to the Location Service
  3. Location Service updates Redis (current location) AND publishes to Kafka (event stream)

Driver → Location Service → Redis GEOADD (current position)
                          → Kafka topic: "driver-locations" (for history/analytics)

Why Redis for current locations?

We only care about the current location for matching, so we don’t need a durable database for it. Redis is in-memory, so each write typically completes in well under a millisecond. If Redis loses data, the next location update (4 seconds later) repopulates it. No big deal.

Why Kafka for the stream?

We publish every location update to Kafka for multiple consumers:

  • Trip Service — to track the active ride and compute distance/fare
  • ETA Service — to update arrival estimates
  • Analytics — to build heatmaps, optimize driver positioning
  • Fraud detection — to verify the driver is actually driving the route

Pushing location to the rider:

When a rider is waiting for their driver, we need to push the driver’s location to the rider’s app in real time.

Driver sends location every 4 sec
→ Location Service updates Redis
→ Location Service publishes to Kafka
→ Trip consumer reads from Kafka
→ Trip consumer pushes to rider via WebSocket
→ Rider's app updates the map

The rider sees the little car moving on the map, updating every 4 seconds. Smooth enough to feel real-time.

Scaling location updates:

1.25M writes/sec is a lot. We can handle it by:

  • Sharding Redis by city: each city gets its own Redis cluster. Drivers in NYC only exist in the NYC shard. This also makes sense because we’d never match a driver in NYC with a rider in London.
  • Kafka partitioning by city: same idea. Each city is a partition (or set of partitions).
  • Batching on the client: instead of sending every single GPS point, the driver app can batch 2-3 points and send them together. This cuts request QPS by 2-3x, at the cost of the newest position being 8-12 seconds old instead of 4, so it fits analytics traffic better than the matching path.

Step 7: Scaling

Location Service:

  • Shard by city/region — each region gets its own Redis cluster
  • 5M drivers across maybe 500 cities = ~10K drivers per city on average
  • A single Redis instance can handle 100K+ ops/sec. Even busy cities are fine.
  • For mega-cities (NYC, London, Mumbai), use Redis Cluster with multiple shards

Matching Service:

  • Stateless — can scale horizontally with more instances
  • The bottleneck is the geospatial query + ETA computation
  • Cache ETA results for common routes (e.g., airport to downtown)
  • During peak hours, spin up more matching workers

Trip Service:

  • Each active ride is a state machine: matching → accepted → arriving → in_progress → completed, with cancelled as the other terminal state
  • Store active rides in Redis for fast status updates
  • Persist completed rides to PostgreSQL
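
A ride state machine can be as small as a table of allowed transitions. The sketch below uses the status values from the rides table; which states may move to cancelled is an assumption (here, any state before pickup), and the `Ride` class is hypothetical.

```python
# Allowed transitions, keyed by current status.
ALLOWED = {
    "matching":    {"accepted", "cancelled"},
    "accepted":    {"arriving", "cancelled"},
    "arriving":    {"in_progress", "cancelled"},
    "in_progress": {"completed"},
    "completed":   set(),   # terminal
    "cancelled":   set(),   # terminal
}


class InvalidTransition(Exception):
    pass


class Ride:
    def __init__(self, ride_id):
        self.ride_id = ride_id
        self.status = "matching"

    def transition(self, new_status):
        """Move to new_status, or raise if the move is not allowed."""
        if new_status not in ALLOWED[self.status]:
            raise InvalidTransition(f"{self.status} -> {new_status}")
        self.status = new_status


ride = Ride("ride_456")
for step in ("accepted", "arriving", "in_progress", "completed"):
    ride.transition(step)
print(ride.status)
```

Rejecting illegal moves in one place keeps duplicate or out-of-order events (a late "accept" arriving after a cancel, for example) from corrupting ride state.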

Surge pricing:

  • Divide each city into hexagonal zones
  • Track supply (available drivers) and demand (ride requests) per zone in real time
  • When demand exceeds supply, apply a surge multiplier (1.2x, 1.5x, 2x)
  • Surge data lives in Redis — it changes every few minutes
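
As a rough sketch, the per-zone surge calculation could look like this. The multipliers (1.2x, 1.5x, 2x) come from the list above, but the demand/supply thresholds that trigger them, the zone names, and the counts are invented for illustration.

```python
def surge_multiplier(open_requests: int, available_drivers: int) -> float:
    """Map a zone's demand/supply ratio to a surge multiplier."""
    if open_requests == 0:
        return 1.0
    if available_drivers == 0:
        return 2.0            # demand with no supply: cap at the top tier
    ratio = open_requests / available_drivers
    if ratio <= 1.0:
        return 1.0
    if ratio <= 1.5:          # hypothetical threshold
        return 1.2
    if ratio <= 2.0:          # hypothetical threshold
        return 1.5
    return 2.0


# zone_id -> (open ride requests, available drivers)
zones = {"zone_a": (40, 50), "zone_b": (60, 45), "zone_c": (90, 30)}
surge_by_zone = {zone: surge_multiplier(demand, supply)
                 for zone, (demand, supply) in zones.items()}
print(surge_by_zone)
```

In the real system the counts would be recomputed per zone every few minutes and the resulting multipliers written to Redis for the Pricing Service to read.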

Payment processing:

  • Process payments asynchronously after the ride ends
  • Use a payment queue to handle retries and failures
  • Double-charge prevention: use idempotency keys on every payment request
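  • Track payment state separately from ride state: the trip is marked “completed” when the driver ends it, and a saga then moves the payment through pending → charged → paid_out, retrying or compensating (refund) on failure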
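
Idempotency keys are easiest to understand in code. Below is an in-memory sketch: the `PaymentService` class, the key format, and the `charges_made` counter (a stand-in for the call to the payment processor) are all hypothetical. In production the check-and-store step has to be atomic, for example by relying on the UNIQUE constraint on `ride_id` in the payments table.

```python
class PaymentService:
    def __init__(self):
        self._results = {}      # idempotency_key -> stored result
        self.charges_made = 0   # stand-in for real calls to the payment processor

    def charge(self, idempotency_key, rider_id, amount_cents):
        """Charge once per key; a retry with the same key returns the stored result."""
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"status": "charged", "rider_id": rider_id,
                  "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


payments = PaymentService()
key = "ride_456:charge"                     # one key per ride and operation
first = payments.charge(key, rider_id=1001, amount_cents=1250)
retry = payments.charge(key, rider_id=1001, amount_cents=1250)  # e.g. after a timeout
print(first == retry, payments.charges_made)
```

The retry after a timeout is the case that matters: the client can’t tell whether the first attempt went through, so it must be safe to send the same request again.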

Database scaling:

  • Rides table: partition by date range (current month in hot storage, older in archive)
  • Read replicas for analytics queries
  • The users table is relatively small — standard PostgreSQL with caching handles it

Multi-region deployment:

  • Each region operates independently (a ride in NYC doesn’t need to talk to London)
  • User accounts are global (replicated across regions)
  • When a user travels, their account data is fetched from the global store and cached locally

In simple language, a ride-sharing system is built around three core problems: knowing where all the drivers are (Location Service + Redis GEO), finding the best driver for a rider (Matching Service with geospatial queries + ETA), and managing the ride from start to finish (Trip Service as a state machine). The location firehose (1M+ updates/sec) is the biggest scaling challenge, and we solve it by sharding by city and using Redis for current positions. Everything else — payments, pricing, ratings — is standard microservice territory.