What is News Feed?

According to the Facebook help page, News Feed is the constantly updating list of stories in the middle of your home page.

This is a very popular interview question.

Step1: Understand the problem

A News Feed system is complex, so let’s focus on only two features:

  • publish a post
  • generate the news feed when the user opens the app

Each post can contain text, video, or images.

Step2: High Level Design

We can start from API design.

Based on the requirements, our service should expose two APIs:

  • POST /publish
    • content
    • auth_token
  • GET /newsfeed
    • auth_token

These APIs can be HTTP-based.
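As a rough illustration of the two endpoints, the sketch below models them as plain Python functions over in-memory data. The names `TOKENS`, `POSTS`, `Response`, `publish`, and `get_newsfeed` are hypothetical, and treating every other user as a friend is a simplifying assumption; a real service would sit behind an HTTP framework and a proper auth layer.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical token table: auth_token -> user_id (stand-in for real auth).
TOKENS: Dict[str, int] = {"token-alice": 1, "token-bob": 2}

# In-memory stand-in for the post store, oldest post first.
POSTS: List[dict] = []


@dataclass
class Response:
    status: int
    body: dict = field(default_factory=dict)


def publish(auth_token: str, content: str) -> Response:
    """POST /publish: authenticate the caller, then record the post."""
    user_id = TOKENS.get(auth_token)
    if user_id is None:
        return Response(401, {"error": "invalid auth_token"})
    post = {"post_id": len(POSTS) + 1, "author_id": user_id, "content": content}
    POSTS.append(post)
    return Response(201, {"post_id": post["post_id"]})


def get_newsfeed(auth_token: str) -> Response:
    """GET /newsfeed: authenticate the caller, then return other users' posts."""
    user_id = TOKENS.get(auth_token)
    if user_id is None:
        return Response(401, {"error": "invalid auth_token"})
    # Simplification: every other user counts as a friend; newest first.
    feed = [p for p in reversed(POSTS) if p["author_id"] != user_id]
    return Response(200, {"posts": feed})
```

Both calls take the `auth_token` listed above, so the server can derive the user from the token instead of trusting a user ID sent by the client.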

This gives us two flows, one for each API call.

Publish Flow

  1. The HTTP request first hits the load balancer
  2. The load balancer routes the request to a web server
  3. The web server handles the business logic, then forwards the request to the “publish service”
  4. The publish service will
    1. store the post in the post DB
    2. store the media in the blob DB
    3. push the post to the fanout service
  5. The fanout service will
    1. fetch the user’s friends from the Graph DB
    2. push the messages into the Message Queue
    3. have a fanout worker fetch each message from the MQ
    4. store the post in each friend’s News Feed cache
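The publish flow can be sketched roughly as follows, with dictionaries and a `deque` standing in for the post DB, blob DB, Graph DB, message queue, and News Feed cache. All names here (`publish_service`, `fanout_service`, `fanout_worker`, and the stores) are hypothetical, and the sketch assumes the cache holds post IDs rather than full posts.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Optional, Tuple

# Hypothetical in-memory stand-ins for the storage systems in the flow.
post_db: Dict[int, dict] = {}                          # post DB
blob_db: Dict[int, bytes] = {}                         # blob DB for media
graph_db: Dict[int, List[int]] = {1: [2, 3]}           # user_id -> friend ids
message_queue: Deque[Tuple[int, int]] = deque()        # (friend_id, post_id)
feed_cache: Dict[int, List[int]] = defaultdict(list)   # user_id -> post ids, newest first


def fanout_service(author_id: int, post_id: int) -> None:
    """Look up the author's friends and enqueue one message per friend."""
    for friend_id in graph_db.get(author_id, []):
        message_queue.append((friend_id, post_id))


def publish_service(author_id: int, text: str, media: Optional[bytes] = None) -> int:
    """Store the post and its media, then hand the post to the fanout service."""
    post_id = len(post_db) + 1
    post_db[post_id] = {"author_id": author_id, "text": text}
    if media is not None:
        blob_db[post_id] = media
    fanout_service(author_id, post_id)
    return post_id


def fanout_worker() -> None:
    """Drain the queue and prepend each post id to the friend's feed cache."""
    while message_queue:
        friend_id, post_id = message_queue.popleft()
        feed_cache[friend_id].insert(0, post_id)
```

The queue decouples publishing from delivery: `publish_service` returns once the messages are enqueued, and the worker fills the caches afterwards.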

Browse Flow

  1. The user sends an API request, which hits the load balancer
  2. The load balancer routes the request to a web server
  3. The web server handles the business logic, then forwards the request to the “news feed service”
  4. The news feed service retrieves the pushed feed for the user
  5. It then retrieves the full post data for each post ID from the DB
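The last two steps of the browse flow might look like the sketch below: read the pushed post IDs from the feed cache, then look up each post in the DB. The stores, the sample data, and the `news_feed_service` function with its `limit` parameter are hypothetical.

```python
from typing import Dict, List

# Hypothetical in-memory stand-ins with a little sample data.
feed_cache: Dict[int, List[int]] = {2: [3, 1]}   # user_id -> post ids, newest first
post_db: Dict[int, dict] = {
    1: {"author_id": 1, "text": "first"},
    3: {"author_id": 1, "text": "second"},
}


def news_feed_service(user_id: int, limit: int = 20) -> List[dict]:
    """Return up to `limit` full posts for the user, newest first."""
    post_ids = feed_cache.get(user_id, [])[:limit]
    # Look up the full post for each id; skip ids missing from the DB
    # (for example, a post deleted after it was fanned out).
    return [{"post_id": pid, **post_db[pid]} for pid in post_ids if pid in post_db]
```

Keeping only IDs in the feed cache keeps each entry small; the cost is the extra lookup per post at read time.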

Step3: Deep Dive

Fanout Service

Fanout is the process of delivering a post to all of the author’s friends.

There are two fanout models, push and pull, and each has its pros and cons:

  • push model: a new post is delivered to friends’ caches as soon as it is published
    • pros: the news feed is pre-computed at write time, so it can be fetched fast
    • cons: if a user has many friends, fetching the friend list and delivering the post to all of them is time-consuming
      • pre-computing news feeds for inactive users wastes resources
  • pull model: posts from friends are fetched when the user opens the news feed
    • pros: no resources are spent on inactive users, and the expensive write-time fanout of the push model is avoided
    • cons: fetching posts at read time makes loading the feed slower

So we usually adopt a hybrid solution: the push model for active users and the pull model for inactive users.

Also, for posts from “popular” users, we let followers fetch the posts on demand to avoid overloading the system.
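A minimal sketch of the hybrid approach, under several assumptions: the `POPULAR_THRESHOLD` cutoff, the `active_users` set, and all function and store names are hypothetical, and post IDs are assumed to increase over time so that sorting by ID gives newest-first order.

```python
from collections import defaultdict
from typing import Dict, List, Set

POPULAR_THRESHOLD = 2  # hypothetical cutoff; real systems would use a far larger number

followers: Dict[int, List[int]] = {1: [10, 11], 2: [10, 11, 12]}  # author -> followers
active_users: Set[int] = {10}                                     # hypothetical activity flag
feed_cache: Dict[int, List[int]] = defaultdict(list)              # pushed post ids per user
posts_by_author: Dict[int, List[int]] = defaultdict(list)         # source for pulls


def publish(author_id: int, post_id: int) -> None:
    """Push to active followers unless the author is popular."""
    posts_by_author[author_id].append(post_id)
    fans = followers.get(author_id, [])
    if len(fans) > POPULAR_THRESHOLD:
        return  # popular author: followers pull on demand instead
    for follower_id in fans:
        if follower_id in active_users:
            feed_cache[follower_id].append(post_id)


def read_feed(user_id: int) -> List[int]:
    """Merge pushed post ids with posts that must be pulled at read time."""
    post_ids = set(feed_cache[user_id])
    for author_id, fans in followers.items():
        if user_id not in fans:
            continue
        is_popular = len(fans) > POPULAR_THRESHOLD
        # Pull when nothing was pushed: popular author or inactive reader.
        if is_popular or user_id not in active_users:
            post_ids.update(posts_by_author[author_id])
    return sorted(post_ids, reverse=True)  # assumes larger id == newer post
```

With the sample data, author 2 counts as popular, so nothing is pushed for their posts, and user 11 is inactive, so their whole feed is assembled at read time.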

Cache

Caching is extremely important in a News Feed system. In our design, we can also add a cache at the data layer to improve performance.
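One common way to add such a cache is the cache-aside pattern, sketched below; the source does not say which caching strategy is intended, so this is only one possible reading. The stores, the `stats` counter, and `get_post` are hypothetical, and eviction and invalidation are left out.

```python
from typing import Dict, Optional

post_db: Dict[int, dict] = {1: {"text": "hello"}}   # hypothetical slow data store
post_cache: Dict[int, dict] = {}                     # hypothetical in-memory cache
stats = {"db_reads": 0}                              # counts reads that reach the DB


def get_post(post_id: int) -> Optional[dict]:
    """Cache-aside read: try the cache first, fall back to the DB on a miss."""
    if post_id in post_cache:
        return post_cache[post_id]
    stats["db_reads"] += 1
    post = post_db.get(post_id)
    if post is not None:
        post_cache[post_id] = post  # populate the cache for later reads
    return post
```

Repeated reads of the same post then hit the cache, and only the first read reaches the DB.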