System Design: Building Blocks - Distributed MQ
## Problem
Communication between two components is very common in modern applications. Often, when component A passes a task to component B, A has to wait for B to complete the task before it can continue its own work.
If component B processes the task slowly, the performance of A suffers as well.
So if component A does not need to wait for B, we can decouple the two to protect A's performance.
This is where a message queue comes in.
What is a Message Queue?
A message queue is an intermediate component between two interacting entities, known as the producer and the consumer.
The producer produces messages and places them in the queue, while the consumer retrieves messages from the queue and processes them.
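The producer/consumer relationship can be sketched on a single machine with Python's standard `queue.Queue`. This is only an illustration of the decoupling idea, not a distributed queue; the function names and the `None` sentinel are assumptions made for the example.

```python
import queue
import threading


def producer(q: queue.Queue, n: int) -> None:
    """Place n messages on the queue without waiting for them to be processed."""
    for i in range(n):
        q.put(f"task-{i}")
    q.put(None)  # sentinel (assumption): tells the consumer nothing more is coming


def consumer(q: queue.Queue, results: list) -> None:
    """Retrieve messages one at a time and process them."""
    while True:
        msg = q.get()
        if msg is None:
            break
        results.append(msg.upper())  # stand-in for real processing


def run_demo(n: int = 3) -> list:
    q = queue.Queue()
    results = []
    t_cons = threading.Thread(target=consumer, args=(q, results))
    t_prod = threading.Thread(target=producer, args=(q, n))
    t_cons.start()
    t_prod.start()
    t_prod.join()
    t_cons.join()
    return results
```

The producer returns as soon as its messages are enqueued; how fast the consumer works no longer affects it.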
Design
Requirements
- Queue-wise
  - create message queue
  - delete message queue
- Message-wise
  - sort messages
  - send message
  - receive message
  - delete message
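The requirements above could map onto an interface roughly like the following. It is a minimal in-memory sketch: the class name `QueueService`, the method names, and the choice to keep messages sorted by arrival order are all assumptions for illustration, not part of the design described here.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Message:
    msg_id: int
    body: str


class QueueService:
    """Hypothetical in-memory stand-in for the queue API."""

    def __init__(self) -> None:
        self._queues = {}       # queue name -> deque of Message, oldest first
        self._ids = count(1)    # monotonically increasing message ids

    def create_queue(self, name: str) -> None:
        if name in self._queues:
            raise ValueError(f"queue {name!r} already exists")
        self._queues[name] = deque()

    def delete_queue(self, name: str) -> None:
        del self._queues[name]

    def send_message(self, name: str, body: str) -> int:
        msg = Message(next(self._ids), body)
        self._queues[name].append(msg)  # appending keeps arrival order
        return msg.msg_id

    def receive_message(self, name: str) -> Optional[Message]:
        q = self._queues[name]
        return q[0] if q else None      # oldest message; it stays until deleted

    def delete_message(self, name: str, msg_id: int) -> bool:
        q = self._queues[name]
        for m in q:
            if m.msg_id == msg_id:
                q.remove(m)
                return True
        return False
```

In this sketch, receiving does not remove a message; the consumer deletes it explicitly once processing has succeeded, which is why `receive message` and `delete message` appear as separate requirements.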
Challenge
Concurrency needs to be considered at the following two stages:
- when multiple messages arrive at the same time
- when multiple consumers request messages concurrently
One solution is locking, which is neither scalable nor performant, because every producer and consumer contends for the same lock.
Another solution is to serialize the requests using a buffer at both ends of the queue.
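The buffering approach can be sketched as follows for the producer end: producers only append to a buffer, and a single writer thread applies the buffered requests to the message store one at a time. The names `SerializedQueue` and `run_producers` are hypothetical. Note that Python's `queue.Queue` synchronizes internally, so this shows the shape of the idea (a single writer behind a buffer) rather than a lock-free implementation.

```python
import queue
import threading


class SerializedQueue:
    """Hypothetical queue whose storage is written by one thread only."""

    def __init__(self) -> None:
        self._inbox = queue.Queue()  # buffer at the producer end
        self._store = []             # touched only by the writer thread
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def _drain(self) -> None:
        # Apply buffered requests one at a time, in arrival order.
        while True:
            msg = self._inbox.get()
            if msg is None:          # sentinel: stop draining
                break
            self._store.append(msg)

    def send(self, msg) -> None:
        self._inbox.put(msg)         # producers only ever touch the buffer

    def close(self) -> list:
        self._inbox.put(None)
        self._writer.join()
        return list(self._store)


def run_producers(n_producers: int, per_producer: int) -> list:
    q = SerializedQueue()

    def produce(pid: int) -> None:
        for seq in range(per_producer):
            q.send((pid, seq))

    threads = [threading.Thread(target=produce, args=(p,)) for p in range(n_producers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return q.close()
```

A matching buffer at the consumer end would collect receive requests and hand out messages from a single reader in the same way.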
Components
- load balancer: receives requests from producers and consumers, then forwards each one to one of the front-end servers.
- front-end service: comprises stateless machines distributed across data centers. It is responsible for:
  - request validation / authentication
  - caching
  - request dispatching
  - usage data collection
- metadata service: responsible for storing, retrieving, and updating the metadata of queues.
Integration
- when the front-end service receives a message, it consults the metadata service to determine the host to which the message needs to be sent.
- the message is then forwarded to that host and replicated on the relevant hosts to guard against availability issues.
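The two integration steps can be sketched as below. All class names, the primary/replica split, and the synchronous replication loop are assumptions made for the example; a real system would do this over the network and handle host failures.

```python
class MetadataService:
    """Hypothetical: maps a queue name to its primary host and replica hosts."""

    def __init__(self) -> None:
        self._placement = {}

    def register(self, queue_name: str, primary: str, replicas: list) -> None:
        self._placement[queue_name] = (primary, list(replicas))

    def lookup(self, queue_name: str) -> tuple:
        return self._placement[queue_name]


class Host:
    """Hypothetical back-end host that stores messages per queue."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.messages = {}  # queue name -> list of message bodies

    def store(self, queue_name: str, body: str) -> None:
        self.messages.setdefault(queue_name, []).append(body)


class FrontEnd:
    """Hypothetical stateless front-end: looks up placement, then forwards."""

    def __init__(self, metadata: MetadataService, hosts: dict) -> None:
        self._metadata = metadata
        self._hosts = hosts  # host name -> Host

    def send(self, queue_name: str, body: str) -> str:
        # Step 1: ask the metadata service where this queue lives.
        primary, replicas = self._metadata.lookup(queue_name)
        # Step 2: forward to the primary host, then copy to each replica.
        self._hosts[primary].store(queue_name, body)
        for replica in replicas:
            self._hosts[replica].store(queue_name, body)
        return primary
```

Because the front-end keeps no state of its own, any front-end server behind the load balancer can handle any request.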
Evaluation
Given the main goal of introducing the message queue, performance and availability are the main metrics we care about.
Availability
Our data components, including the metadata and the actual messages, are properly replicated, so the loss of a single host does not make the data unavailable.
Performance
For better performance, we use caches, data replication, and partitioning, which reduce data read and write times.
Performance could be improved further if we relax the ordering of messages.
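One common way to combine partitioning with relaxed ordering is to keep messages in order only within a partition, so partitions can be read and written independently. The sketch below assumes messages carry a key and are assigned to partitions by a stable hash; the class name `PartitionedQueue` and the use of `zlib.crc32` are illustrative choices, not something the design above prescribes.

```python
import zlib
from collections import deque
from typing import Optional


class PartitionedQueue:
    """Hypothetical queue split into independent FIFO partitions."""

    def __init__(self, num_partitions: int) -> None:
        self._partitions = [deque() for _ in range(num_partitions)]

    def _index(self, key: str) -> int:
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        return zlib.crc32(key.encode("utf-8")) % len(self._partitions)

    def send(self, key: str, body: str) -> int:
        idx = self._index(key)
        self._partitions[idx].append((key, body))
        return idx  # the partition the message landed in

    def receive(self, partition: int) -> Optional[tuple]:
        q = self._partitions[partition]
        return q.popleft() if q else None
```

Messages with the same key always land in the same partition and keep their relative order, while no order is guaranteed across partitions. That is the relaxation that lets partitions be served in parallel.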
Scalability
Our design components are horizontally scalable: we can add or remove capacity based on the requirements.