Why do we need logging?

Everyone knows what logging is, but not everyone knows why we need it.

Logging is essential for understanding the flow of events in a distributed system.

It helps pinpoint a system's problems and figure out their root cause.

And because the system itself is distributed, the logging has to be distributed as well.

Design

Requirement

  • gather the messages: the logging system should be able to collect information from the other parts of the system
  • search the logs: we should be able to find the logs we need in a short time

Components

In a distributed system, clients across the globe generate events by calling the logging service of the logging system.

Each node then aggregates its logs locally. The remaining components are:

  • log accumulator: the service that collects the logs from each logging node and dumps them into distributed storage
  • distributed storage: logs need to be stored somewhere so they can be searched; a blob store works here
  • indexer: to make queries easy, we need to build an appropriate index on the logs
  • visualizer
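
To make the indexer a bit more concrete, here is a minimal sketch of one possible approach, an in-memory inverted index that maps each word to the logs containing it. The class name LogIndex and its methods are made up for illustration; a real indexer would sit on top of the distributed storage and would more likely be an existing search engine.

```python
from collections import defaultdict
from typing import List


class LogIndex:
    """Toy inverted index: token -> ids of the logs that contain it."""

    def __init__(self) -> None:
        self._logs: List[str] = []
        self._index = defaultdict(set)

    def add(self, message: str) -> int:
        """Store a log line and index every whitespace-separated token."""
        log_id = len(self._logs)
        self._logs.append(message)
        for token in message.lower().split():
            self._index[token].add(log_id)
        return log_id

    def search(self, *terms: str) -> List[str]:
        """Return the logs that contain all of the given terms."""
        if not terms:
            return []
        ids = None
        for term in terms:
            matches = self._index.get(term.lower(), set())
            ids = matches if ids is None else ids & matches
        return [self._logs[i] for i in sorted(ids)]
```

With this, a search only touches the entries for the queried words instead of scanning every log, which is what makes finding logs "in a short time" feasible.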

Integration

Let's assume two applications, App1 and App2, adopt our logging system.

Each application runs several microservices.

Every microservice can generate logs, so we derive a unique ID from app_id and service_id.

From the perspective of our system, the flow could look like this:

  • receive the event from the client
  • assign a unique ID to the event
  • store the event in local storage
  • accumulate the local logs
  • push the accumulated logs into a distributed pub-sub system, which takes care of the scalability issue
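
The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: LoggingNode, the publish callback, batch_size and the "app_id:service_id:sequence" ID format are all invented here, and a plain Python list stands in for local storage.

```python
import itertools
import time
from typing import Callable, Dict, List


class LoggingNode:
    """One logging node: receives events, buffers them locally, publishes in batches."""

    def __init__(self, publish: Callable[[List[Dict]], None], batch_size: int = 3) -> None:
        self._publish = publish          # hypothetical hook into the pub-sub system
        self._batch_size = batch_size
        self._local: List[Dict] = []     # stand-in for local storage
        self._seq = itertools.count(1)   # per-node sequence number

    def receive(self, app_id: str, service_id: str, message: str) -> str:
        """Receive an event, assign it an ID, and store it locally."""
        event_id = f"{app_id}:{service_id}:{next(self._seq)}"
        self._local.append({"id": event_id, "ts": time.time(), "message": message})
        if len(self._local) >= self._batch_size:
            self.flush()
        return event_id

    def flush(self) -> None:
        """Push the accumulated logs to the pub-sub system in one batch."""
        if self._local:
            batch, self._local = self._local, []
            self._publish(batch)
```

Note that the sequence number is only unique within one node, so a real deployment would probably also need a node identifier (or some other globally unique component) in the ID.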

Evaluation

Scalability

A pub-sub system helps with scalability because it decouples the log producers from the consumers, so each side can scale independently.

The pub-sub system itself should also be horizontally scalable.

Performance

We should ensure that the logging system doesn't affect the performance of the other parts of the system.

To achieve this, we can send the logs to the pub-sub system asynchronously on a low-priority thread.

However, we may lose data this way, so it's a trade-off between user-perceived latency and the durability of the log data.
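
Here is a small sketch of that trade-off, assuming a bounded in-memory queue drained by a background thread. AsyncLogSender, the send callback and capacity are hypothetical names. Python's threading module has no thread-priority setting, so a daemon thread stands in for the low-priority thread here.

```python
import queue
import threading
from typing import Callable


class AsyncLogSender:
    """Sends logs in the background; drops them instead of blocking the caller."""

    def __init__(self, send: Callable[[str], None], capacity: int = 1000) -> None:
        self._send = send                      # hypothetical call into the pub-sub client
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0                       # number of logs lost to a full queue
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def log(self, message: str) -> bool:
        """Never blocks: returns False if the log had to be dropped."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _run(self) -> None:
        while True:
            message = self._queue.get()
            if message is None:                # sentinel from close()
                break
            self._send(message)

    def close(self) -> None:
        """Drain what is already queued, then stop the worker."""
        self._queue.put(None)
        self._worker.join()
```

The caller only ever pays for an in-memory enqueue, so request latency stays low. The price is that logs are lost when the queue is full, and anything still queued is lost if the process dies before close() is called.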