Cache-cache¹ strategies

Updated: at 08:56 PM (16 min read)

This post is a cleaned-up version of an investigation I did two years ago (May 2022) as part of a feature I was developing that had a hard requirement for a cache. The main reasons were to absorb traffic bursts and because the service would get its data from a downstream service whose data is vended daily. In this post, we will explore what caching is, when to use it, and some strategies to take into account when considering caching in your software.

abstract-cache-cache-strategies-main-img.jpg

What is caching?

In computing, a cache is a high-speed data storage layer that stores a subset of data, typically transient in nature, so that future requests for that data are served faster than is possible by accessing the data’s primary storage location. Caching allows you to efficiently reuse previously retrieved or computed data[2].

Caching aims to improve the performance and scalability of a system. It does this by temporarily copying frequently accessed data to fast storage that is located close to the application. As a result, it improves response times for client applications by serving data more quickly.

When to use caching?

The basic reason to use a cache is this: when you have a pool of data that you need to retrieve very frequently, and that data changes infrequently, you end up accessing the same data repeatedly over a period of time. In that situation, you should consider using a cache.

Several factors[3] can lead a software developer to consider adding a cache layer to their system. Some of them are listed below:

When not to use caching?

When requests typically require a unique query to the dependent service, with unique per-request results, a cache would have a negligible hit rate and would do no good.

How to use caching?

Before deep diving into how to use caching, let’s define some terms.

Cache hit

A cache hit is a state in which data requested for processing by a component or application is found in the cache memory. It is a faster means of delivering data to the processor, as the cache already contains the requested data.

cache hit path

A cache hit serves data more quickly, as the data can be retrieved by reading the cache memory. Cache hits can also occur in disk caches, where the requested data was stored when it was first queried.

Cache miss

A cache miss is a state in which the data requested for processing by a component or application is not found in the cache memory. It causes execution delays by requiring the program or application to fetch the data from other cache levels or the main memory.

cache miss

A cache miss occurs either because the data was never placed in the cache, or because the data was removed (“evicted”) from the cache by either the caching system itself or an external application that specifically made that eviction request. Eviction by the caching system itself occurs when space needs to be freed up to add new data to the cache, or if the time-to-live policy on the data expired.

How to apply caching

Caching is applicable to a wide variety of use cases, but fully exploiting caching requires some planning. When deciding whether to cache a piece of data, consider the following questions.

Caching Strategies

Read Strategies

Lazy Loading or Cache aside

Lazy loading, or cache-aside, is a caching strategy that loads data into the cache only when necessary. If the application needs data for some key x, it searches the cache first. If the data is present, it is returned; otherwise, the application retrieves the data from the data source, puts it into the cache, and then returns it.

cache aside

This approach has some advantages.

There are some disadvantages you can consider before choosing this approach.
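
As a rough illustration, here is a minimal cache-aside sketch in Python. The in-process dictionary, the TTL handling, and the `fetch_from_db` callable standing in for the data source are all assumptions made for the example, not part of the original post.

```python
import time


class CacheAside:
    """Minimal cache-aside sketch: the application itself checks the cache,
    falls back to the data source on a miss, and populates the cache."""

    def __init__(self, fetch_from_db, ttl_seconds=300):
        self._fetch_from_db = fetch_from_db   # stand-in for the real data source call
        self._ttl = ttl_seconds
        self._store = {}                      # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None:
            value, expires_at = entry
            if time.time() < expires_at:      # cache hit: serve from memory
                return value
        # cache miss: read from the data source, populate the cache, then return
        value = self._fetch_from_db(key)
        self._store[key] = (value, time.time() + self._ttl)
        return value


# Usage: cache = CacheAside(fetch_from_db=lambda key: load_user_from_database(key))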

Read through

Read-through is a caching strategy where the cache is positioned as an intermediary between the application and the underlying data source (such as a database). When the application needs data, it asks the cache. If the data is already in the cache (a cache hit), it is returned immediately, offering low-latency access. If the data is not in the cache (a cache miss), the cache itself takes on the responsibility of fetching the data from the database, storing it, and then returning it to the application. In essence, the cache “reads through” to the database when necessary, ensuring the application always has seamless access to the data, whether it is cached or not.

read through

This approach has some advantages.

Like the other strategies, this approach has some disadvantages.
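
A minimal read-through sketch, assuming a `loader` callable that stands in for database access. The key difference from cache-aside is that the application only ever talks to the cache, and the cache loads misses itself.

```python
class ReadThroughCache:
    """Sketch of read-through: the cache, not the application, loads missing keys."""

    def __init__(self, loader):
        self._loader = loader   # the cache is configured with a way to reach the data source
        self._store = {}

    def get(self, key):
        if key in self._store:        # cache hit
            return self._store[key]
        value = self._loader(key)     # cache miss: the cache "reads through" to the source
        self._store[key] = value
        return value


# The application never queries the database directly:
# cache = ReadThroughCache(loader=lambda key: database_find(key))
# user = cache.get("user:42")
```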

Write strategies

Write-through

The write-through, or inline, cache adds or updates data in the cache whenever data is written to the database.

write through

This approach has certain advantages over lazy loading.

However, write-through caching also has some disadvantages.
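
A minimal write-through sketch; the `database` object with a `write` method is a hypothetical stand-in for the real data store.

```python
class WriteThroughCache:
    """Sketch of write-through: every write updates the data source and the cache together."""

    def __init__(self, database):
        self._db = database
        self._store = {}

    def put(self, key, value):
        self._db.write(key, value)   # write to the data source first
        self._store[key] = value     # then keep the cache in step, so reads never see stale data

    def get(self, key):
        return self._store.get(key)  # data written through is already present
```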

Write behind

Write-behind caching writes data directly to the caching system. Then, after a configured interval, the written data is asynchronously synced to the underlying data source. The caching service has to maintain a queue of write operations so that they can be synced in order of insertion.

write behind

This approach has certain advantages over the others.

Like the other approaches, it has some disadvantages too. The main one is the eventual consistency between the database and the caching system, so any operation performed directly on the database, or any join, may use stale data.
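
A minimal write-behind sketch using a background thread and a queue to preserve insertion order; the `database` object and the flush interval are assumptions for the example.

```python
import queue
import threading
import time


class WriteBehindCache:
    """Sketch of write-behind: writes land in the cache immediately and are
    flushed to the data source asynchronously, in insertion order."""

    def __init__(self, database, flush_interval=5.0):
        self._db = database
        self._store = {}
        self._pending = queue.Queue()    # FIFO queue preserves the order of writes
        self._interval = flush_interval
        threading.Thread(target=self._flush_loop, daemon=True).start()

    def put(self, key, value):
        self._store[key] = value         # write only to the cache; the caller returns immediately
        self._pending.put((key, value))

    def get(self, key):
        return self._store.get(key)

    def _flush_loop(self):
        while True:
            time.sleep(self._interval)
            while not self._pending.empty():
                key, value = self._pending.get()
                self._db.write(key, value)   # synced to the data source after the interval
```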

Refresh Ahead

Refresh-ahead caching is a technique in which cached data is refreshed before it expires. The cache is essentially refreshed at a configured interval, just before the next likely cache access. Refreshing the data might take some time due to network latency, and meanwhile, in a very read-heavy system, a few thousand read operations may already have been served from the cache within just a few milliseconds.

refresh ahead

The main advantages of this approach are:

One of the disadvantages of this method is that it is probably a little hard to implement, since the service takes on extra pressure to refresh all the keys as they are accessed. But in a read-heavy environment, it is worth it.
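
A minimal refresh-ahead sketch; the refresh threshold (80% of the TTL), the background thread, and the `loader` callable are all assumptions for illustration.

```python
import threading
import time


class RefreshAheadCache:
    """Sketch of refresh-ahead: entries close to expiry are refreshed in the
    background so readers keep getting cached answers."""

    def __init__(self, loader, ttl=60.0, refresh_factor=0.8):
        self._loader = loader
        self._ttl = ttl
        self._refresh_after = ttl * refresh_factor   # refresh once 80% of the TTL has elapsed
        self._store = {}                             # key -> (value, loaded_at)
        self._lock = threading.Lock()

    def get(self, key):
        now = time.time()
        entry = self._store.get(key)
        if entry is None or now - entry[1] > self._ttl:
            return self._load(key)                   # hard miss or fully expired: load synchronously
        value, loaded_at = entry
        if now - loaded_at > self._refresh_after:
            # still fresh enough to serve, but refresh in the background for later readers
            threading.Thread(target=self._load, args=(key,), daemon=True).start()
        return value

    def _load(self, key):
        value = self._loader(key)
        with self._lock:
            self._store[key] = (value, time.time())
        return value
```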

Write around

With write-around caching, writes go directly to the underlying data source and bypass the cache entirely; only data that is subsequently read gets cached, typically via a cache-aside or read-through read path. This keeps data that is written once and rarely read from crowding out more useful entries, at the cost of a guaranteed cache miss on the first read after a write.

write around
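
A minimal write-around sketch under the same assumptions as the earlier examples (a hypothetical `database` object with `read` and `write` methods); writes bypass the cache, and the read path behaves like cache-aside.

```python
class WriteAroundCache:
    """Sketch of write-around: writes bypass the cache; only reads populate it."""

    def __init__(self, database):
        self._db = database
        self._store = {}

    def put(self, key, value):
        self._db.write(key, value)   # write goes straight to the data source
        self._store.pop(key, None)   # drop any cached copy so readers do not see stale data

    def get(self, key):
        if key in self._store:       # cache hit
            return self._store[key]
        value = self._db.read(key)   # first read after a write is always a miss
        self._store[key] = value
        return value
```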

Adding Time-to-live (TTL)

TTL is an integer value that specifies the number of seconds until the key expires. When an application attempts to read an expired key, it is treated as though the key was not found.
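
A small sketch of TTL handling on top of a plain dictionary; the expiry bookkeeping shown here is illustrative, and real caches such as Redis or Memcached handle it for you.

```python
import time

_store = {}   # key -> (value, expires_at)


def set_with_ttl(key, value, ttl_seconds):
    _store[key] = (value, time.time() + ttl_seconds)


def get(key):
    entry = _store.get(key)
    if entry is None:
        return None              # key was never cached
    value, expires_at = entry
    if time.time() >= expires_at:
        del _store[key]          # an expired key behaves exactly like a missing key
        return None
    return value
```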

Cache expiration

Cache expiration can get really complex quickly. Unfortunately, there is no panacea for this issue. But there are a few strategies that you can use:

Eviction Policy

Evictions occur when the cache’s memory is full or exceeds its maximum memory setting, causing the engine to select keys to evict in order to reclaim memory. The keys that are chosen are based on the eviction policy that is selected.

A good strategy for selecting an appropriate eviction policy is to consider the data stored in your cluster and the outcome of keys being evicted. Generally, LRU-based policies are more common for basic caching use cases, but depending on your objectives, you may want to leverage a TTL-based or random eviction policy if that better suits your requirements.
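
As an illustration of an LRU-based policy, here is a tiny LRU cache sketch built on Python’s OrderedDict; the maximum entry count stands in for a real memory limit.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU sketch: when the cache is full, the least recently used key is evicted."""

    def __init__(self, max_entries=1024):
        self._max = max_entries
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)          # mark the key as most recently used
        return self._store[key]

    def put(self, key, value):
        self._store[key] = value
        self._store.move_to_end(key)
        if len(self._store) > self._max:
            self._store.popitem(last=False)   # evict the least recently used entry
```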

The thundering herd or the cache stampede

The thundering herd effect is what happens when many application processes simultaneously request a cache key, get a cache miss, and then each hits the same database query in parallel. The more expensive this query is, the bigger impact it has on the database. If the query involved is a top 10 query that requires ranking a large dataset, the impact can be a significant hit.
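
One common mitigation, which the original post does not spell out and is shown here purely as an illustration, is to let only one caller per key recompute the missing value while the others wait (sometimes called request coalescing or single flight). A rough sketch with per-key locks:

```python
import threading

_cache = {}
_locks = {}
_locks_guard = threading.Lock()


def get_or_load(key, load_from_db):
    """Single-flight sketch: only one thread per key runs the expensive query on a miss."""
    value = _cache.get(key)
    if value is not None:
        return value
    with _locks_guard:
        lock = _locks.setdefault(key, threading.Lock())
    with lock:
        value = _cache.get(key)        # re-check: another caller may have filled it while we waited
        if value is None:
            value = load_from_db(key)  # only one caller per key hits the database
            _cache[key] = value
    return value
```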

Caching technologies

There are two big families of caching technologies: local caches and external caches.

Local caches

On-box caches, commonly implemented in process memory, are relatively quick and easy to implement and can provide significant improvements with minimal work. Local caches come with no additional operational overhead, so they are fairly low-risk to integrate into an existing service.
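
For example, a local in-process cache can be as simple as Python’s functools.lru_cache; the downstream call below is a hypothetical stand-in.

```python
import functools


def fetch_from_downstream(product_id):
    # hypothetical stand-in for a call to the downstream service
    return {"id": product_id}


@functools.lru_cache(maxsize=4096)
def get_product(product_id):
    # each process keeps its own copy of up to 4096 results in memory,
    # evicted in LRU order once the limit is reached
    return fetch_from_downstream(product_id)
```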

In-memory caches come with some downsides, among which we have:

External caches

An external cache stores cached data in a separate fleet. Cache coherence issues are reduced because the external cache holds the value used by all servers in the fleet. The overall load on downstream services is reduced compared to in-memory caches and is not proportional to fleet size. Cold-start issues during events like deployments are not present, since the external cache remains populated throughout the deployment. Finally, external caches provide more available storage space than in-memory caches, reducing occurrences of cache eviction due to space constraints.
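
As an example of talking to an external cache, here is a small cache-aside read path using the redis-py client; the endpoint, key naming, and five-minute TTL are assumptions for illustration.

```python
import json

import redis

# hypothetical ElastiCache/Redis endpoint shared by every server in the fleet
r = redis.Redis(host="cache.example.internal", port=6379)


def get_user(user_id, load_from_db):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:                # hit: every server sees the same cached value
        return json.loads(cached)
    user = load_from_db(user_id)          # miss: fall back to the data source
    r.set(key, json.dumps(user), ex=300)  # populate with a 5-minute TTL
    return user
```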

External caches also come with their own downsides.

Cache classification

Caching solutions can be classified by layer.

| Layer | Use case | Technologies | Solutions |
| --- | --- | --- | --- |
| Client-side | Accelerate retrieval of web content from websites | HTTP cache headers, browsers | Browser specific |
| DNS | Domain to IP resolution | DNS servers | Route 53 |
| Web | Accelerate retrieval of web content from web/app servers; manage web sessions | HTTP cache headers, CDNs, reverse proxies, web accelerators, key/value stores | CloudFront, ElastiCache (Redis or Memcached) |
| App | Accelerate application performance and data access | Key/value data stores, local caches | ElastiCache (Redis or Memcached), MemoryDB for Redis |
| Database | Reduce latency associated with database query requests | Database buffers, key/value data stores | ElastiCache (Redis or Memcached), MemoryDB for Redis |

Conclusion

Caching is a powerful tool that can dramatically improve the performance and scalability of your applications when used correctly. By storing frequently accessed data closer to the application, it reduces latency and offloads pressure from downstream services. However, implementing a cache requires careful consideration of factors like data consistency, eviction strategies, and cache expiration. The choice of caching strategy—whether it’s lazy loading, read-through, write-through, or others—depends on your specific use case and access patterns.

It’s crucial to strike a balance between performance gains and the complexities that caching introduces. From the thundering herd problem to stale data risks, each caching solution comes with its own trade-offs. Understanding these nuances and planning accordingly will ensure that your caching implementation provides the maximum benefit without introducing new challenges.

Appendix

  1. Cache-cache, “hide and seek” in English, is the name of the game we all played during our childhood, in which at least two players conceal themselves in a set environment, to be found by one or more seekers. The use of the term “cache-cache” in the title is purely for pun purposes.
  2. Caching challenges and strategies
  3. Caching challenges and strategies
  4. The performance impact of “Russian doll” caching
  5. Page replacement algorithm
  6. This blog post’s image was generated using DALL-E 2. The prompt used to generate the image is the following: An abstract scene of children playing hide and seek. The image has a blend of cinematographic and drawing-like elements, with vibrant yet soft colors. The children are in a playful, imaginative environment, with exaggerated shadows and whimsical shapes. Some children are hiding behind stylized trees and surreal objects while others are counting with eyes closed. The atmosphere is dreamy, with motion blur to suggest movement and excitement, blending realism with artistic, hand-drawn textures.