System Design Case Study: YouTube
Everyone knows YouTube. It has a very rich feature set, and it is impossible to cover all of it in a single article.
In this article, we discuss only a specific subset of YouTube's features.
Step1: Scope the problem
Let’s assume the problem scope is:
- ability to upload videos
- smooth video streaming
- can leverage existing cloud services
Step2: High Level Design
CDN and blob storage are the cloud services we will leverage.
Within a limited time frame, choosing the right technology for the job is more important than explaining in detail how that technology works.
At the highest level, the system can be broken into two flows:
- video uploading
- video watching
Video Uploading
- Metadata DB: video metadata is stored in the Metadata DB.
- Original Storage: a blob storage system used to store the original video.
- Transcoding Server: converts a video from one format into other formats, so that we can provide the best video stream for different devices and bandwidth capabilities.
- CDN: videos are cached in the CDN; when you watch a video, it is streamed from the CDN.
- Completion Queue: a message queue that stores information about the transcoding results.
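To make the role of the Completion Queue concrete, here is a minimal sketch of how transcoding results could flow back into the Metadata DB. It is an illustration only: the in-memory dict and `queue.Queue` stand in for real services, and every function name and field (`status`, `formats`) is an assumption, not part of the design above.

```python
import queue

# Hypothetical in-memory stand-ins for the Metadata DB and the Completion Queue.
metadata_db = {}
completion_queue = queue.Queue()

def register_upload(video_id, title):
    """Record metadata for a newly uploaded video (fields are assumed)."""
    metadata_db[video_id] = {"title": title, "status": "uploaded", "formats": []}

def publish_transcode_result(video_id, fmt, ok):
    """Called by a transcoding server when one output format is finished."""
    completion_queue.put({"video_id": video_id, "format": fmt, "ok": ok})

def drain_completion_queue():
    """Completion handler: apply queued transcoding results to the Metadata DB."""
    while not completion_queue.empty():
        event = completion_queue.get()
        record = metadata_db[event["video_id"]]
        if event["ok"]:
            record["formats"].append(event["format"])
            record["status"] = "ready"
        elif not record["formats"]:
            # Mark as failed only if no format has succeeded so far.
            record["status"] = "failed"
```

In this sketch the transcoding servers never write to the Metadata DB directly; they only publish events, and a separate handler applies them.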
Step3: Detailed Design
In the high level design, several areas deserve more discussion, for example:
Video Transcoding Details
Transcoding a video is computationally expensive and time consuming.
To make this process easier for developers, we can implement it using a “Directed Acyclic Graph” (DAG) programming model.
For example, the original video can be split into “video”, “audio” and “metadata”. These three parts can be processed in parallel.
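The DAG idea can be sketched with Python's standard `graphlib` module. This is a simplified illustration, not a production scheduler: the task names `split` and `merge` and the `run_task` placeholder are assumptions added for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical DAG: each task maps to the set of tasks it depends on.
dag = {
    "split": set(),
    "video": {"split"},
    "audio": {"split"},
    "metadata": {"split"},
    "merge": {"video", "audio", "metadata"},
}

def run_task(name):
    # Placeholder for real work such as encoding video or audio.
    return f"{name} done"

def run_dag(graph, max_workers=3):
    """Run tasks stage by stage; tasks in the same stage run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    stages = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())  # tasks whose dependencies are done
            list(pool.map(run_task, ready))     # blocks until the stage finishes
            sorter.done(*ready)
            stages.append(ready)
    return stages
```

Calling `run_dag(dag)` groups the three middle tasks into a single stage, which is where the parallelism comes from.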
- pre-processor:
  - split video: videos can be split into small chunks to increase parallelism
  - DAG generation: generate DAG configuration files
- DAG scheduler:
  - split the DAG into stages and put them in the task queue in the resource manager
- resource manager: responsible for managing the efficiency of resource allocation. It contains 3 main components:
  - task queue: a priority queue containing the tasks to be executed
  - worker queue: a priority queue containing worker utilization info
  - running queue: contains info about the currently running tasks
- task worker: runs the tasks based on the DAG configuration file
- encoded video: the final output of the transcoding service
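The three queues of the resource manager can be sketched as follows. This is a toy model under stated assumptions: a lower priority number means "run first", the worker with the lowest utilization is picked first, and the class and method names are hypothetical.

```python
import heapq

class ResourceManager:
    """Toy resource manager with a task queue, a worker queue and a running queue."""

    def __init__(self):
        self.task_queue = []     # min-heap of (priority, task); lower number runs first
        self.worker_queue = []   # min-heap of (utilization, worker); least loaded first
        self.running_queue = {}  # task -> worker currently running it

    def add_task(self, task, priority):
        heapq.heappush(self.task_queue, (priority, task))

    def add_worker(self, worker, utilization):
        heapq.heappush(self.worker_queue, (utilization, worker))

    def schedule(self):
        """Assign the highest-priority task to the least-utilized worker."""
        if not self.task_queue or not self.worker_queue:
            return None
        _, task = heapq.heappop(self.task_queue)
        _, worker = heapq.heappop(self.worker_queue)
        self.running_queue[task] = worker
        return task, worker

    def finish(self, task, utilization):
        """Remove a finished task and return its worker to the worker queue."""
        worker = self.running_queue.pop(task)
        self.add_worker(worker, utilization)
```

A real resource manager would also handle worker failures and task retries; those are left out to keep the sketch short.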
To make the system more loosely coupled, we can introduce a message queue (MQ).
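A small sketch of this decoupling, assuming a queue placed between the pre-processor and a task worker: the in-process `queue.Queue` stands in for a real message queue, and the function names and `STOP` sentinel are hypothetical.

```python
import queue
import threading

# Hypothetical queue between two components of the transcoding pipeline.
stage_queue = queue.Queue()
STOP = object()  # sentinel telling the consumer that no more messages will arrive

def pre_processor(chunk_ids):
    """Producer: publishes work without knowing who consumes it."""
    for chunk_id in chunk_ids:
        stage_queue.put(chunk_id)
    stage_queue.put(STOP)

def task_worker(results):
    """Consumer: pulls messages at its own pace."""
    while True:
        chunk_id = stage_queue.get()
        if chunk_id is STOP:
            break
        results.append(f"encoded-{chunk_id}")

def run_pipeline(chunk_ids):
    results = []
    consumer = threading.Thread(target=task_worker, args=(results,))
    consumer.start()
    pre_processor(chunk_ids)
    consumer.join()
    return results
```

Neither side calls the other directly, so either one can be scaled or replaced without changing the other.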
Step4: Wrap Up
There are some interesting topics not covered above, for example:
- video copyright protection: we can leverage a DRM (Digital Rights Management) system to protect copyrights
- visual watermarking: we put an image overlay on top of the video
- streaming latency: to speed up the video watching experience, we can leverage a CDN to bring popular videos closer to users.