Distributed Caching Strategies: Redis, Cache Invalidation, and Stampede Prevention
Caching is one of the highest-leverage performance optimizations available — a well-designed cache can reduce database load by 90% and cut response times from 200ms to 5ms. However, distributed caching introduces complexity around consistency, invalidation, and failure modes that single-server caching doesn't face. This guide covers the caching patterns that work in production, the ones that don't, and how to handle some of the hardest problems in distributed systems.
Cache-Aside (Lazy Loading): The Default Pattern
Cache-aside is the most common caching pattern because it’s simple and flexible. The application checks the cache first. On a cache hit, return the cached value. On a miss, load from the database, store in cache, and return. The application controls all caching logic — the cache and database are independent systems.
```python
import json

import redis


class CacheAside:
    def __init__(self, redis_client, db_client, default_ttl=3600):
        self.cache = redis_client
        self.db = db_client
        self.default_ttl = default_ttl

    def get_user(self, user_id):
        cache_key = f"user:{user_id}"
        # Step 1: Check cache (handles both negative and positive entries)
        cached = self.cache.get(cache_key)
        if cached in (b"null", "null"):
            return None  # Negative cache hit
        if cached:
            return json.loads(cached)  # Cache hit
        # Step 2: Cache miss — load from database
        user = self.db.query("SELECT * FROM users WHERE id = %s", user_id)
        if user is None:
            # Cache a negative result (short TTL) to prevent repeated DB queries
            self.cache.setex(cache_key, 300, "null")
            return None
        # Step 3: Store in cache with TTL
        self.cache.setex(cache_key, self.default_ttl, json.dumps(user))
        return user

    def update_user(self, user_id, data):
        # Update database first, then invalidate cache
        self.db.execute("UPDATE users SET ... WHERE id = %s", user_id)
        self.cache.delete(f"user:{user_id}")
        # Don't set the new value — let the next read populate it.
        # This avoids race conditions between concurrent updates.
```

The key insight with cache-aside: on write, delete the cache entry rather than updating it. If you update the cache, concurrent writes can leave stale data. Deleting forces the next read to reload from the database, which is always authoritative. Additionally, always set a TTL as a safety net — even if your invalidation logic has bugs, stale data eventually expires.
Write-Through and Write-Behind Patterns
Write-through caching writes to the cache and database simultaneously on every update. This ensures the cache is always fresh but adds write latency since every write hits both systems. It works well when you read frequently and write infrequently — user profiles, product catalogs, and configuration data.
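A minimal sketch of write-through, assuming the same hypothetical `redis_client`/`db_client` interfaces as the cache-aside example (the `WriteThroughCache` name and `write_user` method are invented for illustration):

```python
import json


class WriteThroughCache:
    """Write-through sketch: every write updates database and cache together."""

    def __init__(self, redis_client, db_client, ttl=3600):
        self.cache = redis_client
        self.db = db_client
        self.ttl = ttl

    def write_user(self, user_id, user):
        # Write to the source of truth first...
        self.db.execute("UPDATE users SET ... WHERE id = %s", user_id)
        # ...then refresh the cache, so the next read is a guaranteed hit
        self.cache.setex(f"user:{user_id}", self.ttl, json.dumps(user))
```

The trade-off is visible in the code: every write now pays two round trips, which is why the pattern suits read-heavy, write-light data.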
Write-behind (write-back) caching writes to the cache immediately and asynchronously flushes to the database later. This provides the lowest write latency but risks data loss if the cache fails before flushing. Moreover, implementing write-behind correctly requires careful handling of ordering, batching, and failure recovery.
```python
import asyncio
import json
import time


class WriteBehindCache:
    """Write-behind cache with batched async persistence"""

    def __init__(self, redis_client, db_client, flush_interval=5):
        self.cache = redis_client
        self.db = db_client
        self.flush_interval = flush_interval
        self.pending_writes = "pending_writes"  # Redis sorted set

    def write(self, key, value):
        pipe = self.cache.pipeline()
        # Write to cache immediately
        pipe.set(key, json.dumps(value))
        # Add to pending writes queue (score = timestamp for ordering)
        pipe.zadd(self.pending_writes, {key: time.time()})
        pipe.execute()

    async def flush_to_database(self):
        """Periodically flush pending writes to database"""
        while True:
            # Get the oldest pending writes, up to 100 at a time
            pending = self.cache.zrangebyscore(
                self.pending_writes, "-inf", "+inf", start=0, num=100
            )
            if pending:
                batch = []
                for key in pending:
                    value = self.cache.get(key)
                    if value:
                        batch.append((key, json.loads(value)))
                # Batch write to database
                self.db.bulk_upsert(batch)
                # Remove from pending queue
                self.cache.zrem(self.pending_writes, *pending)
            await asyncio.sleep(self.flush_interval)
```

Cache Invalidation: The Hard Problem
There are only two hard things in Computer Science: cache invalidation and naming things. Cache invalidation is hard because distributed systems don’t have a single, consistent view of time. Here are the patterns that work in practice:
TTL-based expiration: The simplest approach. Set a TTL on every cache entry and accept that data may be stale for up to TTL seconds. For many applications, serving data that’s 60 seconds old is perfectly acceptable.
Event-driven invalidation: When the database changes, publish an event that triggers cache deletion. Use database triggers, change data capture (CDC), or application-level events. This is more complex but provides near-real-time cache freshness.
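As a sketch of the application-level variant: the write path publishes an event over Redis pub/sub, and a small subscriber process deletes the affected keys. The channel name and event format here are invented for illustration, and `r` stands for any redis.Redis-compatible client:

```python
import json

INVALIDATION_CHANNEL = "cache-invalidation"  # name invented for this sketch


def publish_invalidation(r, entity_type, entity_id):
    """Write path: publish an event after a successful DB commit."""
    event = json.dumps({"type": entity_type, "id": entity_id})
    r.publish(INVALIDATION_CHANNEL, event)


def handle_invalidation(r, raw_event):
    """Subscriber side: delete the cache key named by one event."""
    event = json.loads(raw_event)
    r.delete(f"{event['type']}:{event['id']}")


def run_invalidator(r):
    """Long-running subscriber loop."""
    pubsub = r.pubsub()
    pubsub.subscribe(INVALIDATION_CHANNEL)
    for message in pubsub.listen():
        if message["type"] == "message":  # skip subscribe confirmations
            handle_invalidation(r, message["data"])
```

Note that Redis pub/sub is fire-and-forget: a subscriber that is down misses events, which is exactly why the TTL safety net from the cache-aside section still matters.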
Version-based invalidation: Include a version number in the cache key. When data changes, increment the version. Old cache entries naturally expire while new reads use the new version key. This avoids race conditions between invalidation and population.
```python
# Version-based cache invalidation
class VersionedCache:
    def __init__(self, redis_client):
        # Assumes a client created with decode_responses=True
        self.cache = redis_client

    def get(self, entity_type, entity_id):
        version = self.cache.get(f"{entity_type}:version:{entity_id}") or "1"
        cache_key = f"{entity_type}:{entity_id}:v{version}"
        return self.cache.get(cache_key)

    def invalidate(self, entity_type, entity_id):
        # Increment version — old cached value is orphaned (expires via TTL)
        self.cache.incr(f"{entity_type}:version:{entity_id}")
        # No need to delete the old cache entry — it won't be read again
```

Cache Stampede Prevention
A cache stampede occurs when a popular cache entry expires and hundreds of concurrent requests all miss the cache simultaneously, flooding the database with identical queries. This can cascade into a full database outage. Several techniques prevent stampedes:
```python
import json
import threading
import time


class StampedeProtectedCache:
    def __init__(self, cache, db, lock_timeout=5):
        self.cache = cache
        self.db = db
        self.lock_timeout = lock_timeout

    def get_with_lock(self, key, loader_fn, ttl=3600):
        """Pattern 1: Locking — only one request rebuilds the cache"""
        while True:
            value = self.cache.get(key)
            if value:
                return json.loads(value)
            lock_key = f"lock:{key}"
            # Try to acquire lock (SET NX with expiry)
            acquired = self.cache.set(lock_key, "1", nx=True, ex=self.lock_timeout)
            if acquired:
                # This request rebuilds the cache
                try:
                    value = loader_fn()
                    self.cache.setex(key, ttl, json.dumps(value))
                    return value
                finally:
                    self.cache.delete(lock_key)
            # Another request is rebuilding — wait and retry
            time.sleep(0.1)

    def get_with_early_refresh(self, key, loader_fn, ttl=3600, refresh_at=0.8):
        """Pattern 2: Early refresh — serve stale data while rebuilding"""
        value = self.cache.get(key)
        remaining_ttl = self.cache.ttl(key)
        if value and remaining_ttl > ttl * (1 - refresh_at):
            return json.loads(value)  # Still fresh
        if value:
            # Cache still valid but close to expiry — refresh in background
            # (a production version would take the lock here too, so only
            # one thread refreshes)
            threading.Thread(
                target=self._refresh, args=(key, loader_fn, ttl)
            ).start()
            return json.loads(value)  # Return stale data while refreshing
        # Cache miss — load synchronously
        return self._refresh(key, loader_fn, ttl)

    def _refresh(self, key, loader_fn, ttl):
        value = loader_fn()
        self.cache.setex(key, ttl, json.dumps(value))
        return value
```

The locking pattern is the most reliable but adds latency for waiting requests. Early refresh works well for high-traffic keys: requests that arrive near expiry rebuild the cache before it lapses, spreading the load. Furthermore, serving slightly stale data during refresh is usually acceptable and prevents the stampede entirely.
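The early-refresh idea can also be made fully probabilistic, so that the chance of recomputation rises smoothly as expiry approaches (sometimes called XFetch). A sketch, with an invented storage format that keeps the loader's recomputation time alongside the value:

```python
import json
import math
import random
import time


def get_with_probabilistic_refresh(cache, key, loader_fn, ttl=3600, beta=1.0):
    """Probabilistic early expiration sketch.

    Stores [value, delta, expiry] together, where delta is how long the
    last recomputation took; slower loaders refresh earlier.
    """
    raw = cache.get(key)
    if raw:
        value, delta, expiry = json.loads(raw)
        # -log(U) is exponentially distributed; the closer we are to expiry
        # (and the larger delta and beta), the likelier an early refresh.
        # 1 - random() lies in (0, 1], which avoids log(0).
        u = 1.0 - random.random()
        if time.time() - delta * beta * math.log(u) < expiry:
            return value
    # Refresh: either a miss, or this request won the early-refresh lottery
    start = time.time()
    value = loader_fn()
    delta = time.time() - start
    cache.setex(key, ttl, json.dumps([value, delta, start + ttl]))
    return value
```

With `beta=1.0` this is a reasonable default; raising beta makes refreshes happen earlier, lowering it pushes them closer to expiry.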
Redis Cluster: Scaling Beyond One Node
A single Redis node can handle 100,000+ operations per second, but when you need more throughput or more memory, Redis Cluster distributes data across multiple nodes using hash slots. Each key is assigned to one of 16,384 hash slots, and slots are distributed across nodes, so a 3-node cluster provides roughly 3x the throughput and 3x the memory capacity.
Key considerations for Redis Cluster: multi-key operations (MGET, transactions, Lua scripts) only work when all keys hash to the same slot. Use hash tags (e.g., {user:123}:profile, {user:123}:sessions) to co-locate related keys in the same slot. Cross-slot operations like SUNION across keys in different hash slots fail with a CROSSSLOT error.
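Hash tags work because slot assignment hashes only the substring between the first { and } when a non-empty tag is present: the slot is CRC16 of that substring, modulo 16384. A short sketch showing why {user:123}:profile and {user:123}:sessions always share a slot:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XModem (poly 0x1021, init 0), the checksum Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Redis Cluster slot for a key, honoring {hash tag} rules."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # non-empty tag only
            key = key[start + 1:end]  # hash just the tag
    return crc16(key.encode()) % 16384
```

Both example keys reduce to hashing "user:123", so they are guaranteed to land in the same slot and remain usable in one MGET or transaction.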
In conclusion, distributed caching is essential for performance but demands careful attention to invalidation, consistency, and failure modes. Start with cache-aside and TTL-based expiration — it handles 90% of use cases. Add stampede protection for high-traffic keys and event-driven invalidation when you need stronger consistency. The cache should always be treated as ephemeral — your application must work (slowly) if the cache disappears entirely.