Scaling Memcache Deep Dive for System Design

Explore challenges and solutions in scaling Memcache for large distributed systems. Understand data replication, handling hotspots, maintaining cache hit rates, and balancing latency with consistency across server, cluster, regional, and cross-regional levels.

We'll cover the following...

Key-value stores and big data
Why scale Memcache?
Requirements
- Functional requirements
- Non-functional requirements
Problem statement
Bird's eye view of Scaling Memcache

Key-value stores and big data

When designing large-scale social media websites, we need to understand that there are huge volumes of data and that a number of requests are being made. The large scale of data on such platforms entails very intense load and spontaneous I/O requests, which might not be optimized using prefetchingPrefetching loads data into the cache before the user requests it..

This is where caching in memory can be used to allow for linear scaling. A key-value store is one way to cache data. Key-value stores can run on network-enabled systems that share information with other machines without requiring specialized hardware. This allows architects to scale their key-value stores by adding more front-end servers. One such key-value store is Memcached.

Note: We'll call the distributed key-value store as MemcacheMemcache without the 'd'. and the source code or the running binary as Memcached.

Why scale Memcache?

Memcache allows developers to add caching to their applications while having simple operations like set, get, and delete. Although it comes with a lot of features for single-server use, there are major hurdles in scaling it to a global level since it doesn't have built-in features (like replication, or clustering) that allow it to scale out for large-scale applications.

Scaling Memcache requires us to solve novel problems like:

Managing hot spots that arise due to highly popular content.
Batching requests to reduce network load.
Managing stale sets and thundering herds to reduce load.

Requirements

To keep up with the cache demands of a large-scale service, we need to consider the following requirements.

Functional requirements

In key-value stores, we have the usual requirements for the set, get and delete functions. Additionally, we have the following needs:

Accessing and updating highly viewed content: Sometimes, many clients start to access the same key-value pairs in Memcached servers. This can bring an enormous load on the Memcached servers themselves, and can also cause too many cache misses, which can in turn may bring the load to the storage layer. Our system should be able to manage such workloads efficiently.
Handling different workloads: Different key-value pairs have different characteristics. Some have very large keys while others are very popular for just a few hours. We need efficient methods to manage these differences.
Maintaining hit rate: We want to maintain the hit rate while the service is live.
Reducing network load: At the cluster level, one web request can make hundreds of Memcache calls within a cluster, which could lead to an extensive load on the network. Our system should have an intelligent placement of Memcache components and good request routing to appropriate components.
Insulating storage layer: Too many cache misses can overload the storage layer, so we want to reduce the frequency of the key-value store fetches to the database.

Non-functional requirements

Near real-time communication: Latency should be low throughout the system, but we'll have to make a trade-off to maintain consistency.
Real-time feed generation from multiple sources: Front-end servers should be able to communicate with hundreds of Memcached servers quickly to respond to a request.
Memory efficient: The caching system from the cluster level to the cross-regional level should be memory efficient.
Replication: To cater to high-demand popular content, replication becomes necessary to process millions of user requests per second. Replication occurs on the regional and cross-regional levels.
Eventual consistency: All replica regions across the globe should eventually become consistent.
Read-after-write consistency: In a local context, clients ought to be able to update an item and then view the changes made to an item.
Handling failures: We need to safely handle a cluster failure to avoid a cascading failure of clusters.

Problem statement

When we have multiple key-value (Memcached) servers throughout the world, how can we replicate data and maintain its consistency while balancing the tradeoffs (for example, between consistency and latency)?

When providing service worldwide, considerations must be made on different levels as each level has different requirements. Because a component of latency is directly related to the distance (propagation delay), we make different tradeoffs as the distance between Memcache nodes changes.

The system is similar to an onion, where we want to ensure that each layer is healthy so that it can support the layer above it. For example, if there is a problem with the server-level implementation, all the layers above will be impacted, but at a much larger scale.

We can analyze and optimize at four different levels.

Single server
Cluster level (multiple servers)
Region level (multiple clusters, also known as a data center)
Cross Regional Level (multiple data centers)

The image below compares the architecture of a single-node Memcached system with the architecture used to scale Memcached into a distributed key-value store.

Differentiating the Layers

Levels	Problems
Single server level	Single-server bottleneck Non-disruptive software upgrades Better memory efficiency
Cluster level	Replication within frontend clusters to accommodate user traffic Reducing the number of network round trips Reducing incast congestion Replication within pools of clusters Handling stale sets and thundering herds Varying workloads Read-after-write consistency
Region level	Sending invalidations to frontend clusters (where replication occurs) from storage clusters Handling incast congestion due to too many Memcached servers Bringing new cold clusters online
Cross-regional level	Making replications and writes over vast distances, with eventual-consistency

Different requirements of each level

	Single Server Level	Cluster Level	Region Level	Cross Regional Level
Replication	No replication is required	We require replication within cluster pools to efficiently serve read-heavy workload	We want replica clusters to route around failures to provide high availability	We want cross regional level replication to reduce latency caused by vast distances
Latency	Negligible latency	We want to reduce latency as much as possible for highly popular data	Synchronous replication within a data center is usually fast	Asynchronous replication is one way to reduce latency
Consistency	Read-your-write consistency at the server level	Read-your-writes consistency	Read-your-writes consistency	We provide eventual consistency across data centers

1.Prologue

2.File Systems

3.Google File System (GFS)

4.Google Colossus File System

5.Facebook's Tectonic File System

6.Databases

7.Google Bigtable

8.Google Megastore

9.Google Spanner

10.Key-value Stores

11.Many-core Key-value Store

12.Scaling Memcache

13.SILT

14.Amazon DynamoDB

15.Concurrency Management

16.Two-phase Locking (2PL)

17.Google Chubby Locking Service

18.ZooKeeper

19.Big Data Processing: Batch to Stream Processing

20.MapReduce

21.Spark

22.Kafka

23.Consensus

24.Understanding Consensus: Two Generals, FLP, & Byzantine Generals

25.Two-phase Commit

26.State Machine Replication

27.Paxos

28.Raft

29.Epilogue