3 minute read

Google maps helps user find directions and navigate to their destination, and it has enormous complexity of google map, it’s important to nail down which features we should support.

Step1: Scope the problem

Let’s assume we will focus on 3 key features:

  1. user location update
  2. navigation service, including ETA service
  3. map rendering

Server Throughput

To estimate the server throughput, let’s review the types of requests we need to support.

There are two types of request:

  1. navigation request
  2. location update request

Assume peak QPS is five times the avergage.

Step2: High level design

location service

responsible for recording user location update.

There are several benefits if we can know user’s location periodically:

  1. improve our system performance
  2. monitor live traffic
  3. provide more accurate ETA

Location history can be buffered on client and send to server in batch.

Even though, the write volume is still very high, we need a database that is optimized for high write volume

navigation service

the calculated route doesn’t need to be the fastest one, but accuracy is critical.

map rendering

when a client needs a map tile, it first determines the map tile collection to use based on zoom level, then computes the map tile URL and download from CDN.

overall flows should looks like:

  1. mobile user calls the map tile service to fetch the tile URLs
  2. load balancer forwards the request to map tile service
  3. map tile service takes the client’s location and zoom level then return 9 URLs of tiles to client
  4. client downloads the tile from the CDN

Step3: Design deep dive

Routing tiles

First we should run a periodic offline processing pipeline, what we called as routing tile service, to transform the raw road dataset into routing tiles.

The output of above process is routing tiles, each tile contains a list of graph nodes and edges represent the intersections and roads within the area covered by the tile.

We would only use the database as storage, it seems it’s expensive to store bits of data, we also don’t need any features for routing tiles.

More efficient way to store these tiles in object storage like S3, and cache it aggresively on the routing service that uses these tiles.

Location service

User location data is essential, it supports many use cases. like personalization or traffic monitoring.

To support those use cases, we log this information into message queue such as Kafka.

Other service consume the location data stream from Kafka for various use cases.

Navigation service

responsible for finding the fastest and accurate routings.

geo-coding service

we need to have a service to resolve an address to a location of latitude and longitude pair.

Navigation service will rely on this service to transform the origin and destination and pass them to downstream service.

route planner service

compute the suggested route that is optimized based on some standards

shortest-path service

Find top-k shortest path only based on road structure, without considering the traffic and current conditions.

Algorithm starts from origin position, traverse the graph data structure, find several possible paths

ETA service

ETA service evaluate each paths provided by shortest-path service, taking the traffic into account and get the time estimation.

ranker service

ranker service will apply some filters defined by users like avoid troll etc.

protocol

WebSocket is generally considered to be a better option for server to push event to client stably.

Step4: Wrap Up

In this article we introduced a simple version google map. It has two main flows:

  1. user location update
  2. navigation service

User location update has been introduced multiple times before, attention need to paid is we can buffer the update event on client.

More important feature is navigation, several service need to be collaborated, like the routing tile processing, path finding service, ETA service and ranking service