Distributed Caching Strategies: Redis, Cache Invalidation, and Stampede Prevention
Caching is one of the highest-leverage performance optimizations available — a well-designed cache can reduce database load by 90% and cut response times from 200ms to 5ms. However, distributed caching introduces complexity around consistency, invalidation, and failure modes that single-server caching doesn't face. This guide covers the caching patterns that work in production, the ones that don't, and how to handle some of the hardest problems in distributed systems.
Cache-Aside (Lazy Loading): The Default Pattern
Cache-aside is the most common caching pattern because it’s simple and flexible. The application checks the cache first. On a cache hit, return the cached value. On a miss, load from the database, store in cache, and return. The application controls all caching logic — the cache and database are independent systems.
```python
import json

import redis


class CacheAside:
    def __init__(self, redis_client, db_client, default_ttl=3600):
        self.cache = redis_client
        self.db = db_client
        self.default_ttl = default_ttl

    def get_user(self, user_id):
        cache_key = f"user:{user_id}"
        # Step 1: Check cache (handles both negative and positive entries)
        cached = self.cache.get(cache_key)
        if cached in (b"null", "null"):
            return None  # Negative cache hit
        if cached:
            return json.loads(cached)  # Cache hit
        # Step 2: Cache miss — load from database
        user = self.db.query("SELECT * FROM users WHERE id = %s", user_id)
        if user is None:
            # Cache a negative result (short TTL) to prevent repeated DB queries
            self.cache.setex(cache_key, 300, "null")
            return None
        # Step 3: Store in cache with TTL
        self.cache.setex(cache_key, self.default_ttl, json.dumps(user))
        return user

    def update_user(self, user_id, data):
        # Update database first, then invalidate cache
        self.db.execute("UPDATE users SET ... WHERE id = %s", user_id)
        self.cache.delete(f"user:{user_id}")
        # Don't set the new value — let the next read populate it.
        # This avoids race conditions between concurrent updates.
```

The key insight with cache-aside: on write, delete the cache entry rather than updating it. If you update the cache, concurrent writes can leave stale data. Deleting forces the next read to reload from the database, which is always authoritative. Additionally, always set a TTL as a safety net — even if your invalidation logic has bugs, stale data eventually expires.
Write-Through and Write-Behind Patterns
Write-through caching writes to the cache and database simultaneously on every update. This ensures the cache is always fresh but adds write latency since every write hits both systems. It works well when you read frequently and write infrequently — user profiles, product catalogs, and configuration data.
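A minimal sketch of write-through, assuming the same hypothetical `redis_client`/`db_client` interfaces as the cache-aside example (the `WriteThroughCache` name and `write_user` method are invented for illustration):

```python
import json


class WriteThroughCache:
    """Write-through sketch: every write updates database and cache together."""

    def __init__(self, redis_client, db_client, ttl=3600):
        self.cache = redis_client
        self.db = db_client
        self.ttl = ttl

    def write_user(self, user_id, user):
        # Write to the source of truth first...
        self.db.execute("UPDATE users SET ... WHERE id = %s", user_id)
        # ...then refresh the cache, so the next read is a guaranteed hit
        self.cache.setex(f"user:{user_id}", self.ttl, json.dumps(user))
```

The trade-off is visible in the code: every write now pays two round trips, which is why the pattern suits read-heavy, write-light data.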
Write-behind (write-back) caching writes to the cache immediately and asynchronously flushes to the database later. This provides the lowest write latency but risks data loss if the cache fails before flushing. Moreover, implementing write-behind correctly requires careful handling of ordering, batching, and failure recovery.
```python
import asyncio
import json
import time


class WriteBehindCache:
    """Write-behind cache with batched async persistence"""

    def __init__(self, redis_client, db_client, flush_interval=5):
        self.cache = redis_client
        self.db = db_client
        self.flush_interval = flush_interval
        self.pending_writes = "pending_writes"  # Redis sorted set

    def write(self, key, value):
        pipe = self.cache.pipeline()
        # Write to cache immediately
        pipe.set(key, json.dumps(value))
        # Add to pending writes queue (score = timestamp for ordering)
        pipe.zadd(self.pending_writes, {key: time.time()})
        pipe.execute()

    async def flush_to_database(self):
        """Periodically flush pending writes to database"""
        while True:
            # Get the oldest pending writes, up to 100 at a time
            pending = self.cache.zrangebyscore(
                self.pending_writes, "-inf", "+inf", start=0, num=100
            )
            if pending:
                batch = []
                for key in pending:
                    value = self.cache.get(key)
                    if value:
                        batch.append((key, json.loads(value)))
                # Batch write to database
                self.db.bulk_upsert(batch)
                # Remove from pending queue
                self.cache.zrem(self.pending_writes, *pending)
            await asyncio.sleep(self.flush_interval)
```

Cache Invalidation: The Hard Problem
There are only two hard things in Computer Science: cache invalidation and naming things. Cache invalidation is hard because distributed systems don’t have a single, consistent view of time. Here are the patterns that work in practice:
TTL-based expiration: The simplest approach. Set a TTL on every cache entry and accept that data may be stale for up to TTL seconds. For many applications, serving data that’s 60 seconds old is perfectly acceptable.
Event-driven invalidation: When the database changes, publish an event that triggers cache deletion. Use database triggers, change data capture (CDC), or application-level events. This is more complex but provides near-real-time cache freshness.
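As a sketch of the application-level variant: the write path publishes an event over Redis pub/sub, and a small subscriber process deletes the affected keys. The channel name and event format here are invented for illustration, and `r` stands for any redis.Redis-compatible client:

```python
import json

INVALIDATION_CHANNEL = "cache-invalidation"  # name invented for this sketch


def publish_invalidation(r, entity_type, entity_id):
    """Write path: publish an event after a successful DB commit."""
    event = json.dumps({"type": entity_type, "id": entity_id})
    r.publish(INVALIDATION_CHANNEL, event)


def handle_invalidation(r, raw_event):
    """Subscriber side: delete the cache key named by one event."""
    event = json.loads(raw_event)
    r.delete(f"{event['type']}:{event['id']}")


def run_invalidator(r):
    """Long-running subscriber loop."""
    pubsub = r.pubsub()
    pubsub.subscribe(INVALIDATION_CHANNEL)
    for message in pubsub.listen():
        if message["type"] == "message":  # skip subscribe confirmations
            handle_invalidation(r, message["data"])
```

Note that Redis pub/sub is fire-and-forget: a subscriber that is down misses events, which is exactly why the TTL safety net from the cache-aside section still matters.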
Version-based invalidation: Include a version number in the cache key. When data changes, increment the version. Old cache entries naturally expire while new reads use the new version key. This avoids race conditions between invalidation and population.
```python
# Version-based cache invalidation
class VersionedCache:
    def __init__(self, redis_client):
        # Assumes a client created with decode_responses=True
        self.cache = redis_client

    def get(self, entity_type, entity_id):
        version = self.cache.get(f"{entity_type}:version:{entity_id}") or "1"
        cache_key = f"{entity_type}:{entity_id}:v{version}"
        return self.cache.get(cache_key)

    def invalidate(self, entity_type, entity_id):
        # Increment version — old cached value is orphaned (expires via TTL)
        self.cache.incr(f"{entity_type}:version:{entity_id}")
        # No need to delete the old cache entry — it won't be read again
```

Cache Stampede Prevention
A cache stampede occurs when a popular cache entry expires and hundreds of concurrent requests all miss the cache simultaneously, flooding the database with identical queries. This can cascade into a full database outage. Several techniques prevent stampedes:
```python
import json
import threading
import time


class StampedeProtectedCache:
    def __init__(self, cache, db, lock_timeout=5):
        self.cache = cache
        self.db = db
        self.lock_timeout = lock_timeout

    def get_with_lock(self, key, loader_fn, ttl=3600):
        """Pattern 1: Locking — only one request rebuilds the cache"""
        while True:
            value = self.cache.get(key)
            if value:
                return json.loads(value)
            lock_key = f"lock:{key}"
            # Try to acquire lock (SET NX with expiry)
            acquired = self.cache.set(lock_key, "1", nx=True, ex=self.lock_timeout)
            if acquired:
                # This request rebuilds the cache
                try:
                    value = loader_fn()
                    self.cache.setex(key, ttl, json.dumps(value))
                    return value
                finally:
                    self.cache.delete(lock_key)
            # Another request is rebuilding — wait and retry
            time.sleep(0.1)

    def get_with_early_refresh(self, key, loader_fn, ttl=3600, refresh_at=0.8):
        """Pattern 2: Early refresh — serve stale data while rebuilding"""
        value = self.cache.get(key)
        remaining_ttl = self.cache.ttl(key)
        if value and remaining_ttl > ttl * (1 - refresh_at):
            return json.loads(value)  # Still fresh
        if value:
            # Cache still valid but close to expiry — refresh in background
            # (a production version would take the lock here too, so only
            # one thread refreshes)
            threading.Thread(
                target=self._refresh, args=(key, loader_fn, ttl)
            ).start()
            return json.loads(value)  # Return stale data while refreshing
        # Cache miss — load synchronously
        return self._refresh(key, loader_fn, ttl)

    def _refresh(self, key, loader_fn, ttl):
        value = loader_fn()
        self.cache.setex(key, ttl, json.dumps(value))
        return value
```

The locking pattern is the most reliable but adds latency for waiting requests. Early refresh works well for high-traffic keys: requests that arrive near expiry rebuild the cache before it lapses, spreading the load. Furthermore, serving slightly stale data during refresh is usually acceptable and prevents the stampede entirely.
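The early-refresh idea can also be made fully probabilistic, so that the chance of recomputation rises smoothly as expiry approaches (sometimes called XFetch). A sketch, with an invented storage format that keeps the loader's recomputation time alongside the value:

```python
import json
import math
import random
import time


def get_with_probabilistic_refresh(cache, key, loader_fn, ttl=3600, beta=1.0):
    """Probabilistic early expiration sketch.

    Stores [value, delta, expiry] together, where delta is how long the
    last recomputation took; slower loaders refresh earlier.
    """
    raw = cache.get(key)
    if raw:
        value, delta, expiry = json.loads(raw)
        # -log(U) is exponentially distributed; the closer we are to expiry
        # (and the larger delta and beta), the likelier an early refresh.
        # 1 - random() lies in (0, 1], which avoids log(0).
        u = 1.0 - random.random()
        if time.time() - delta * beta * math.log(u) < expiry:
            return value
    # Refresh: either a miss, or this request won the early-refresh lottery
    start = time.time()
    value = loader_fn()
    delta = time.time() - start
    cache.setex(key, ttl, json.dumps([value, delta, start + ttl]))
    return value
```

With `beta=1.0` this is a reasonable default; raising beta makes refreshes happen earlier, lowering it pushes them closer to expiry.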
Redis Cluster: Scaling Beyond One Node
A single Redis node can handle 100,000+ operations per second, but when you need more throughput or more memory, Redis Cluster distributes data across multiple nodes using hash slots. Each key is assigned to one of 16,384 hash slots, and slots are distributed across nodes, so a 3-node cluster provides roughly 3x the throughput and 3x the memory capacity.
Key considerations for Redis Cluster: multi-key operations (MGET, transactions, Lua scripts) only work when all keys hash to the same slot. Use hash tags (e.g., {user:123}:profile, {user:123}:sessions) to co-locate related keys in the same slot. Cross-slot operations like SUNION across keys in different hash slots fail with a CROSSSLOT error.
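Hash tags work because slot assignment hashes only the substring between the first { and } when a non-empty tag is present: the slot is CRC16 of that substring, modulo 16384. A short sketch showing why {user:123}:profile and {user:123}:sessions always share a slot:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XModem (poly 0x1021, init 0), the checksum Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Redis Cluster slot for a key, honoring {hash tag} rules."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # non-empty tag only
            key = key[start + 1:end]  # hash just the tag
    return crc16(key.encode()) % 16384
```

Both example keys reduce to hashing "user:123", so they are guaranteed to land in the same slot and remain usable in one MGET or transaction.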
In conclusion, distributed caching is essential for performance but demands careful attention to invalidation, consistency, and failure modes. Start with cache-aside and TTL-based expiration — it handles 90% of use cases. Add stampede protection for high-traffic keys and event-driven invalidation when you need stronger consistency. The cache should always be treated as ephemeral — your application must work (slowly) if the cache disappears entirely.