System Design — Scaling from Zero to Millions Of Users

 

https://medium.com/geekculture/system-design-scaling-from-zero-to-millions-of-users-deca270ef784

 

Note: I have read this great book System Design Interview — An insider’s guide by Alex Xu in depth. So most of my definitions and images…

Essential reference when designing, scaling out, or changing a system.

 

Load Balancer

A load balancer evenly distributes incoming traffic among the web servers defined in a load-balanced set. Users connect to the public IP of the load balancer, which forwards each request to one of the servers over its private IP.

Image source System Design Interview — An insider’s guide by Alex Xu
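As a rough illustration of "evenly distributes", here is a minimal sketch that assumes a round-robin policy. The article does not say which policy the load balancer uses, and the class name and private IPs below are made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out server addresses in rotation (hypothetical sketch)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def pick(self):
        # Each call returns the next server in the rotation,
        # so requests are spread evenly across the set.
        return next(self._rotation)


# The private IPs are illustrative only.
balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2"])
assignments = [balancer.pick() for _ in range(4)]
```

With two servers, four consecutive requests alternate between them. Real load balancers usually add health checks and may weigh servers differently.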

 

To support failover and redundancy in the data tier, we can follow the Data Replication strategy.

Wikipedia: “Database replication can be used in many database management systems, usually with a master/slave relationship between the original (master) and the copies (slaves)”.

A master database generally only supports write operations. A slave database gets copies of the data from the master database and only supports read operations.

Most applications require a much higher ratio of reads to writes; thus, the number of slave databases in a system is usually larger than the number of master databases.

Image source System Design Interview — An insider’s guide by Alex Xu
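The read/write split described above can be sketched as follows. This is a toy model under stated assumptions: in-memory dicts stand in for databases, replication is done synchronously on every write (real systems often replicate asynchronously), and the class and method names are hypothetical. The `replicas` list plays the role of the slave databases.

```python
import random


class ReplicatedDatabase:
    """Toy model: writes go to the master, reads go to a replica."""

    def __init__(self, replica_count):
        if replica_count < 1:
            raise ValueError("at least one replica is required")
        self.master = {}
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value):
        # Only the master accepts writes.
        self.master[key] = value
        # Copy the change to every replica (synchronous for simplicity).
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads are spread across the replicas, never the master.
        replica = random.choice(self.replicas)
        return replica.get(key)


db = ReplicatedDatabase(replica_count=3)
db.write("user:1", "alice")
value = db.read("user:1")
```

Because reads usually outnumber writes, adding replicas increases read capacity without touching the write path.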

 

After all the above discussions, our system design:

Image source System Design Interview — An insider’s guide by Alex Xu

Now that we have a basic understanding of the web and data tiers, it is time to improve the load/response time.

 

Caching

A cache is a temporary storage area that stores the result of expensive responses or frequently accessed data in memory so that subsequent requests are served more quickly. The application performance is greatly affected by calling the database repeatedly. The cache can help in solving this problem.

When a request arrives, the web server first checks whether the data is present in the cache. If it is (a cache hit), the web server reads it from the cache and sends it directly to the client. If it is not (a cache miss), the web server queries the database, stores the response in the cache, and then sends the data back to the client.

Image source System Design Interview — An insider’s guide by Alex Xu
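The hit/miss flow above can be sketched in a few lines. This is a minimal illustration, assuming a plain dict stands in for both the cache and the database; the class name and the hit/miss counters are additions for the example.

```python
class CachedReader:
    """Checks the cache first and falls back to the database on a miss."""

    def __init__(self, database):
        self._database = database  # stand-in for a real database client
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            # Cache hit: answer without touching the database.
            self.hits += 1
            return self._cache[key]
        # Cache miss: query the database, then remember the result.
        self.misses += 1
        value = self._database[key]
        self._cache[key] = value
        return value


reader = CachedReader({"user:1": "alice"})
first = reader.get("user:1")   # miss, loaded from the database
second = reader.get("user:1")  # hit, served from the cache
```

A production cache would also need an expiration and eviction policy, which this sketch leaves out.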

CDN

A CDN is a network of geographically dispersed servers used to deliver static content. CDN servers cache static content like images, videos, CSS, JavaScript files, etc.

Note: There is also a concept of dynamic content caching which enables the caching of HTML pages that are based on request path, query strings, cookies, and request headers.

Workflow of CDN:

Image source System Design Interview — An insider’s guide by Alex Xu
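The workflow figure is not reproduced here, so the following is only a sketch of one common CDN flow and not necessarily the one in the figure: an edge server returns its cached copy while it is still fresh, and otherwise fetches the file from the origin. The time-to-live (TTL), the `EdgeServer` class, and the injected clock are assumptions made for the example.

```python
class EdgeServer:
    """Caches static files fetched from an origin, with a TTL (sketch)."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock):
        self._fetch = fetch_from_origin  # callable: path -> content
        self._ttl = ttl_seconds
        self._clock = clock              # callable returning seconds
        self._store = {}                 # path -> (content, expires_at)

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[1] > now:
            # Fresh copy at the edge: no trip to the origin.
            return entry[0]
        # Missing or expired: fetch from the origin and cache it again.
        content = self._fetch(path)
        self._store[path] = (content, now + self._ttl)
        return content


origin_calls = []


def origin(path):
    origin_calls.append(path)
    return f"<bytes of {path}>"


now = [0.0]  # fake clock so the example is deterministic
edge = EdgeServer(origin, ttl_seconds=60, clock=lambda: now[0])
edge.get("/logo.png")  # not cached yet, goes to the origin
edge.get("/logo.png")  # served from the edge
now[0] = 61.0
edge.get("/logo.png")  # TTL expired, goes to the origin again
```

Three requests result in two origin fetches, which is the point of placing static content on a CDN.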

 

Now let’s add cache and CDN to our existing system design:

Image source System Design Interview — An insider’s guide by Alex Xu

 

Database Scaling

Database scaling can also be done in two ways: vertical scaling and horizontal scaling.

Vertical scaling, also known as scaling up, means adding more power (CPU, RAM, disk, etc.) to an existing machine. Very powerful database servers are available, with around 24 TB of RAM, and such servers can store and handle large amounts of data.

But it comes with a few limitations: hardware limits restrict the user base a single server can handle, there is a higher risk of a single point of failure (SPOF), and the cost rises steeply as you move up to such powerful database servers.

Horizontal scaling, also known as sharding, is the practice of adding more servers.

Image source System Design Interview — An insider’s guide by Alex Xu

 

User data is allocated to a database server based on a particular sharding key. Anytime you access data, a hash function is used to find the corresponding shard.

In our example, user_id % 4 is used as the hash function. If the result equals 0, shard 0 is used to store and fetch data. If the result equals 1, shard 1 is used and so on…

Image source System Design Interview — An insider’s guide by Alex Xu
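The `user_id % 4` rule from the example can be sketched as below. In-memory dicts stand in for the four database servers, and the function names are hypothetical.

```python
SHARD_COUNT = 4

# One dict per shard stands in for a separate database server.
shards = [dict() for _ in range(SHARD_COUNT)]


def shard_for(user_id):
    # The hash function from the example: user_id % 4.
    return user_id % SHARD_COUNT


def save_user(user_id, record):
    shards[shard_for(user_id)][user_id] = record


def load_user(user_id):
    # The same function is applied on reads to find the right shard.
    return shards[shard_for(user_id)].get(user_id)


save_user(5, {"name": "alice"})
loaded = load_user(5)
```

Here user 5 lands on shard 1 because 5 % 4 equals 1, and every later read for user 5 is routed to the same shard.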

Sharding is a good strategy, but it has quite a few limitations:

  1. Resharding data: Resharding is needed when 1) a single shard can no longer hold more data due to rapid growth, or 2) certain shards are exhausted faster than others due to uneven data distribution. When shard exhaustion happens, the sharding function has to be updated and data has to be moved around. Consistent hashing is commonly used to address this, and it deserves an article of its own.
    To read about it, check out this article: A Guide To Consistent Hashing
  2. Celebrity problem: Excessive access to a specific shard could cause server overload. Imagine data for Messi, CR7, and Neymar all ending up on the same shard. For social applications, that shard will be overwhelmed with read operations. To solve this problem, we may need to allocate a shard for each celebrity. Each shard might even require further partitioning.
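To make the resharding cost concrete, the sketch below compares two things: how many users change shards when the modulo function goes from 4 to 5 shards, and how a simple consistent hash ring behaves when a fifth node is added. The `HashRing` class, the `shard-N` labels, the 100 virtual nodes per shard, and the use of SHA-256 are all assumptions for illustration, not details from the article.

```python
import bisect
import hashlib


def moved_fraction_modulo(user_ids, old_count, new_count):
    """Fraction of users whose shard changes under user_id % N."""
    moved = sum(1 for uid in user_ids if uid % old_count != uid % new_count)
    return moved / len(user_ids)


class HashRing:
    """Minimal consistent hash ring with virtual nodes (sketch)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _position(label):
        digest = hashlib.sha256(label.encode("utf-8")).hexdigest()
        return int(digest, 16)

    def add_node(self, node):
        # Each node is placed at several positions to even out the load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._position(f"{node}#{i}"), node))

    def node_for(self, key):
        # Walk clockwise to the first virtual node at or after the key.
        pos = self._position(str(key))
        index = bisect.bisect_left(self._ring, (pos, ""))
        if index == len(self._ring):
            index = 0  # wrap around the ring
        return self._ring[index][1]


user_ids = list(range(20000))
modulo_moved = moved_fraction_modulo(user_ids, 4, 5)

ring = HashRing([f"shard-{i}" for i in range(4)])
before = {uid: ring.node_for(uid) for uid in user_ids}
ring.add_node("shard-4")
after = {uid: ring.node_for(uid) for uid in user_ids}
changed = [uid for uid in user_ids if before[uid] != after[uid]]
ring_moved = len(changed) / len(user_ids)
```

With the modulo function, 80% of these user IDs change shards when a fifth shard is added. On the ring, the only keys that move are the ones taken over by the new node, which is typically a much smaller share.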

 

Logging, metrics, automation

Monitoring error logs is important because it helps to identify errors and problems in the system as our system and daily active users (DAU) grow.

Metrics: Collecting different types of metrics helps us gain business insights and understand the health status of the system.
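As a small illustration of error logging and metrics together, the sketch below wraps a request handler, counts requests and errors, records latency, and logs failures with Python's standard `logging` module. The logger name, metric names, and wrapper function are hypothetical, and a real system would export these to a monitoring tool rather than keep them in memory.

```python
import logging
import time
from collections import Counter

logger = logging.getLogger("webapp")  # hypothetical logger name
metrics = Counter()                   # in-memory counters for the sketch
latencies_ms = []


def handle_with_monitoring(handler, request):
    """Runs a handler while recording counts, errors, and latency."""
    start = time.perf_counter()
    metrics["requests_total"] += 1
    try:
        return handler(request)
    except Exception:
        metrics["errors_total"] += 1
        # logger.exception records the message plus the stack trace.
        logger.exception("request failed: %r", request)
        raise
    finally:
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
```

Counters like these give a quick view of system health (error rate, latency), while the error log keeps the detail needed to debug a specific failure.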

When a system gets big and complex, we need to build or leverage automation tools to improve productivity. Each code check-in can be passed through some default checks and verified through automation, allowing teams to detect problems early. Besides, automating your build, test, deploy process, etc. could improve developer productivity significantly.

With all these things in place our final system design:

Image source System Design Interview — An insider’s guide by Alex Xu

 
