System Design Case Study: News Feed System
What is News Feed?
According to the Facebook help page, News Feed is the constantly updating list of stories in the middle of your home page.
Designing one is a very popular interview question.
Step1: Understand the problem
A news feed system is complex, so let’s focus on only two features:
- publish a post
- generate the news feed when a user opens the app
Each post can contain text, video, or images.
Step2: High Level Design
We can start with the API design.
Based on the requirements, our service should expose two APIs, each with its parameters listed below it:
- POST /publish
  - content
  - auth_token
- GET /newsfeed
  - auth_token
Both APIs can be HTTP-based.
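To make the two endpoints concrete, here is a minimal sketch of a request dispatcher. The token table, the `Response` type, the status codes, and the function names are all assumptions made for illustration; the design above only fixes the paths and their parameters.

```python
from dataclasses import dataclass, field

# Hypothetical token table. A real service would verify tokens against an auth system.
VALID_TOKENS = {"token-alice": "alice", "token-bob": "bob"}


@dataclass
class Response:
    status: int
    body: dict = field(default_factory=dict)


def authenticate(auth_token):
    """Return the user id for a known token, or None."""
    return VALID_TOKENS.get(auth_token)


def handle_request(method, path, params):
    """Route a request to one of the two endpoints in the design."""
    user_id = authenticate(params.get("auth_token"))
    if user_id is None:
        return Response(401, {"error": "invalid auth_token"})

    if method == "POST" and path == "/publish":
        content = params.get("content")
        if not content:
            return Response(400, {"error": "content is required"})
        # The publish flow would run here.
        return Response(201, {"author": user_id, "content": content})

    if method == "GET" and path == "/newsfeed":
        # The browse flow would run here; an empty feed stands in for it.
        return Response(200, {"user": user_id, "posts": []})

    return Response(404, {"error": "unknown endpoint"})
```

Both endpoints authenticate first, because the `auth_token` identifies the author on publish and the feed owner on read.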
We therefore need two flows, one for each API call.
Publish Flow
- the HTTP request first hits the load balancer
- the load balancer forwards the request to a web server
- the web server handles the business logic, then forwards the request to the “publish service”
- the publish service
  - stores the post in the post DB
  - stores the media in the blob DB
  - pushes the post to the fanout service
- the fanout service
  - fetches the user’s friends from the Graph DB
  - pushes the messages into the Message Queue (MQ)
- a fanout worker
  - fetches the message from the MQ
  - stores the post in each friend’s News Feed cache
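The publish flow can be sketched end to end with in-memory stand-ins. Everything here is an assumption for illustration: the dictionaries replace the post DB, blob DB, and Graph DB, a `deque` replaces the message queue, and the function names are hypothetical.

```python
from collections import defaultdict, deque
from itertools import count

post_db = {}                    # post_id -> post record
blob_db = {}                    # media key -> raw bytes
graph_db = {"alice": ["bob", "carol"], "bob": ["alice"], "carol": []}
message_queue = deque()         # (friend_id, post_id) messages
feed_cache = defaultdict(list)  # user_id -> post ids, newest first
_next_id = count(1)


def publish(author, text, media=None):
    """Publish service: store media and post, then hand off to fanout."""
    post_id = next(_next_id)
    media_key = None
    if media is not None:
        media_key = f"media/{post_id}"
        blob_db[media_key] = media
    post_db[post_id] = {
        "id": post_id, "author": author, "text": text, "media_key": media_key,
    }
    fanout(author, post_id)
    return post_id


def fanout(author, post_id):
    """Fanout service: look up friends and enqueue one message per friend."""
    for friend in graph_db.get(author, []):
        message_queue.append((friend, post_id))


def run_fanout_worker():
    """Fanout worker: drain the queue into each friend's feed cache."""
    while message_queue:
        friend, post_id = message_queue.popleft()
        feed_cache[friend].insert(0, post_id)
```

Only post IDs go into the feed cache in this sketch, which keeps the cache small; the post bodies stay in the post DB.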
Browse Flow
- the user’s API request first hits the load balancer
- the load balancer forwards the request to a web server
- the web server handles the business logic, then forwards the request to the “news feed service”
- the news feed service
  - retrieves the feed that was pushed for the user (a list of post IDs) from the News Feed cache
  - retrieves the full post data for those post IDs from the DB
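A minimal sketch of the news feed service's read path, assuming the feed cache holds only post IDs and the post DB holds the full records. The sample data, the `limit` parameter, and the function name are hypothetical.

```python
# In-memory stand-ins for the post DB and the News Feed cache (assumed shapes).
post_db = {
    1: {"id": 1, "author": "alice", "text": "hello"},
    2: {"id": 2, "author": "carol", "text": "lunch"},
}
feed_cache = {"bob": [2, 1, 99]}  # newest first; post 99 no longer exists


def get_news_feed(user_id, limit=20):
    """Read the cached post ids, then hydrate them from the post DB."""
    post_ids = feed_cache.get(user_id, [])[:limit]
    posts = (post_db.get(post_id) for post_id in post_ids)
    # Skip ids whose post is missing, for example because it was deleted.
    return [post for post in posts if post is not None]
```

For example, `get_news_feed("bob")` returns the records for posts 2 and 1 in that order and silently drops the missing post 99.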
Step3: Deep Dive
Fanout Service
Fanout is the process of delivering a post to all of the author’s friends.
There are two fanout models, push and pull, and each has its pros and cons:
- push model: a new post is delivered to friends’ caches immediately when it is published
  - pros: the news feed is precomputed in real time, so it can be fetched quickly
  - cons:
    - if a user has many friends, fetching the friend list and delivering the post to all of them is time-consuming
    - precomputing news feeds for inactive users wastes resources
- pull model: all posts from friends are fetched when the user opens the news feed
  - pros: no resources are spent on inactive users, and the many-friends problem of the push model is avoided
  - cons: fetching posts at read time makes loading the news feed slower
So we usually adopt a hybrid solution: the push model for active users and the pull model for inactive users.
For posts from “popular” users, we let followers fetch the posts on demand to avoid overloading the system.
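The hybrid approach can be sketched as follows. The follower threshold, the `active_users` set, the sample graph, and all function names are assumptions for illustration; how a real system classifies users as "active" or "popular" is not specified here.

```python
from collections import defaultdict

POPULAR_THRESHOLD = 3  # assumed cutoff: authors with this many followers are "popular"

# author -> followers (hypothetical sample graph)
followers = {"star": ["ann", "ben", "cat"], "ann": ["ben", "cat"], "ben": ["ann"], "cat": []}
active_users = {"ann", "ben"}        # "cat" is inactive
posts_by_author = defaultdict(list)  # author -> [(timestamp, author, text)]
feed_cache = defaultdict(list)       # user -> pushed (timestamp, author, text) entries


def following(user):
    """Authors that the given user follows."""
    return [author for author, fans in followers.items() if user in fans]


def is_popular(author):
    return len(followers.get(author, [])) >= POPULAR_THRESHOLD


def publish(author, timestamp, text):
    post = (timestamp, author, text)
    posts_by_author[author].append(post)
    if is_popular(author):
        return  # popular authors are never pushed; followers pull on demand
    for fan in followers.get(author, []):
        if fan in active_users:  # push only to active users
            feed_cache[fan].append(post)


def read_feed(user):
    if user in active_users:
        feed = list(feed_cache.get(user, []))  # pushed entries
        pull_from = [a for a in following(user) if is_popular(a)]
    else:
        feed = []                              # nothing was precomputed
        pull_from = following(user)
    for author in pull_from:
        feed.extend(posts_by_author[author])
    return sorted(feed, reverse=True)          # newest first
```

With this split, the write cost of a post is bounded by the number of active followers of a non-popular author, while the extra read cost is paid only by inactive users and by followers of popular authors.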
Cache
Caching is extremely important in a news feed system. Besides the News Feed cache used in the flows above, we can also add a cache at the data layer to improve performance.
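One possible shape for a data-layer cache is a cache-aside wrapper in front of the post DB. The cache-aside strategy, the LRU eviction policy, the capacity, and the class name are all assumptions; the design above does not prescribe any of them.

```python
from collections import OrderedDict


class PostCache:
    """Cache-aside wrapper around a post DB with LRU eviction (illustrative only)."""

    def __init__(self, db, capacity=2):
        self.db = db                  # any mapping of post_id -> post
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry first
        self.db_reads = 0             # counts cache misses, for demonstration

    def get(self, post_id):
        if post_id in self.entries:
            self.entries.move_to_end(post_id)  # mark as most recently used
            return self.entries[post_id]
        # Cache miss: read from the DB and remember the result.
        self.db_reads += 1
        post = self.db.get(post_id)
        if post is not None:
            self.entries[post_id] = post
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used
        return post
```

Repeated reads of a hot post are then served from memory, and only misses reach the DB.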