Problem

If our application and its data reside in only one data center, users elsewhere in the world pay for the distance: each request has to travel a long way to reach us, and each response has to travel all the way back, which inevitably causes high latency.

What is a CDN

A CDN (content delivery network) is the solution to the problem above. It is a group of geographically distributed proxy servers placed closer to end users.

The origin server periodically updates the content on the proxy servers, so the proxy servers can serve requests from end users quickly.

CDN Functional Requirements

From the perspective of the CDN system, it should fulfill the following functional requirements:

  • Retrieve: the CDN should be able to fetch content/data from the origin server.
  • Respond: the CDN should be able to serve requests from end users.

Design

Now we can start designing the overall CDN system based on its requirements.

First we identify the components we need, then we dive into how they integrate.

Components

  • client: the end user making requests
  • proxy servers: the servers we place near the users
  • routing service: directs the client to the nearest proxy server that contains the needed content
  • origin server: the server that holds all the content
  • distribution service: distributes content from the origin server to the proxy servers
  • management service: measures important metrics such as latency and server load, and keeps accounting information

Integrations

The overall workflow among the components above could be:

  1. The origin server publishes content to the distribution service.
  2. The distribution service distributes the content across the proxy servers and tells the routing service which proxy server holds which content.
  3. When a client sends a request to the routing service, the routing service finds a suitable proxy server and returns its IP to the client.
  4. The client sends the request to that proxy server, which returns the requested resource and reports the request to the management service.
  5. If the proxy server doesn't have the resource, it redirects the client to the origin server.
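
The five steps above can be sketched as a small in-memory simulation. This is only an illustration, not a real CDN: all class and function names (OriginServer, ProxyServer, RoutingService, ManagementService, distribute, fetch) are made up for this post, the "redirect" in step 5 is modeled by simply reading from the origin, and the routing service picks the first proxy that holds the content instead of the nearest one.

```python
from dataclasses import dataclass, field


@dataclass
class OriginServer:
    # Holds every piece of content, keyed by path.
    content: dict[str, str] = field(default_factory=dict)


@dataclass
class ProxyServer:
    ip: str
    cache: dict[str, str] = field(default_factory=dict)


@dataclass
class ManagementService:
    # Hypothetical metrics: how often a proxy could or could not serve a request.
    hits: int = 0
    misses: int = 0


@dataclass
class RoutingService:
    # Maps each content key to the IPs of the proxies that hold it.
    index: dict[str, list[str]] = field(default_factory=dict)

    def resolve(self, key: str, default_ip: str) -> str:
        # Assumption: take the first holder; fall back to a default proxy.
        holders = self.index.get(key)
        return holders[0] if holders else default_ip


def distribute(origin, proxies, routing):
    """Steps 1-2: copy origin content to every proxy and update the routing index."""
    for key, value in origin.content.items():
        for proxy in proxies:
            proxy.cache[key] = value
        routing.index[key] = [proxy.ip for proxy in proxies]


def fetch(key, origin, proxies_by_ip, routing, metrics, default_ip):
    """Steps 3-5: resolve a proxy, serve from its cache, or fall back to the origin."""
    proxy = proxies_by_ip[routing.resolve(key, default_ip)]
    if key in proxy.cache:
        metrics.hits += 1
        return proxy.cache[key], proxy.ip
    metrics.misses += 1
    return origin.content[key], "origin"


if __name__ == "__main__":
    origin = OriginServer({"/logo.png": "logo-bytes"})
    proxies = [ProxyServer("10.0.0.1"), ProxyServer("10.0.0.2")]
    routing, metrics = RoutingService(), ManagementService()
    distribute(origin, proxies, routing)
    by_ip = {proxy.ip: proxy for proxy in proxies}

    # Content added after the last distribution is not on any proxy yet.
    origin.content["/new.css"] = "css-bytes"
    print(fetch("/logo.png", origin, by_ip, routing, metrics, "10.0.0.1"))
    print(fetch("/new.css", origin, by_ip, routing, metrics, "10.0.0.1"))
    print(metrics)
```

The second request shows step 5: the content was published after the last distribution, so the proxy misses and the origin serves it.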

Evaluation

Let's evaluate the design above. But first, we need to understand which properties should be evaluated.

Performance

First of all, we should know how effective the system is. Remember, the reason we introduced the CDN is to reduce latency, so the more content is served by the CDN rather than by the origin server, the more effective it is.

To achieve better performance, the design does the following:

  1. We put the proxy servers closer to the users.
  2. The origin server publishes content to the proxy servers periodically.
  3. Through the routing service, we give the client the nearest proxy server.
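
As a rough sketch of point 3, the routing service could pick the proxy with the smallest great-circle distance to the client. This is an assumption made for illustration: the post does not say how "nearest" is measured, and real systems may use network latency rather than geography. The function names and the coordinates below are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_proxy(client_location, proxies):
    """Return the name of the proxy closest to the client.

    `proxies` maps a proxy name to its (latitude, longitude).
    """
    return min(proxies, key=lambda name: haversine_km(client_location, proxies[name]))


if __name__ == "__main__":
    # Approximate, illustrative coordinates.
    proxies = {
        "frankfurt": (50.11, 8.68),
        "singapore": (1.35, 103.82),
        "virginia": (38.95, -77.45),
    }
    print(nearest_proxy((48.86, 2.35), proxies))   # a client near Paris
    print(nearest_proxy((-6.20, 106.80), proxies))  # a client near Jakarta
```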

Consistency

No doubt we have increased the availability of the system. But availability and consistency are like two sides of a coin: with more copies of the content spread across proxy servers, a copy can fall out of date with the origin server.

There are several ways to keep the content as consistent as possible:

  1. The proxy servers can periodically fetch content from the origin server.
  2. We set a TTL (time to live) on each piece of content; content that has exceeded its TTL cannot be served to clients.
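
A minimal sketch of the TTL rule on a single proxy might look like the following. The TTLCache class and its parameters are hypothetical, and it makes one assumption beyond the text: when an entry has expired, the proxy fetches a fresh copy from the origin on demand instead of refusing the request. The clock is injectable so the expiry can be demonstrated without waiting.

```python
import time


class TTLCache:
    """A proxy-side cache that never serves an entry older than its TTL."""

    def __init__(self, ttl_seconds, fetch_from_origin, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.fetch_from_origin = fetch_from_origin  # callable: key -> value
        self.clock = clock
        self.entries = {}  # key -> (value, time the value was stored)

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]  # still fresh: serve the cached copy
        # Missing or expired: refresh from the origin and restart the TTL.
        value = self.fetch_from_origin(key)
        self.entries[key] = (value, now)
        return value


if __name__ == "__main__":
    fake_now = [0.0]
    versions = {"/index.html": "v1"}
    cache = TTLCache(60, versions.__getitem__, clock=lambda: fake_now[0])

    print(cache.get("/index.html"))  # fetched from the origin
    versions["/index.html"] = "v2"   # the origin changes
    fake_now[0] = 30.0
    print(cache.get("/index.html"))  # within the TTL: the stale copy is still served
    fake_now[0] = 61.0
    print(cache.get("/index.html"))  # TTL exceeded: refreshed from the origin
```

The middle call shows the remaining inconsistency window: until the TTL runs out, the proxy can still serve the old version, so a shorter TTL buys more consistency at the cost of more requests to the origin.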

Retrospect

The CDN pattern is very common in the real world. I think almost every application leverages a CDN-like flow, more or less.

The 5 components are classic as well: the client, the origin server, the middle (proxy) servers, the distribution service connecting the middle servers to the origin server, and the routing service connecting the client to the middle servers.