Stale-While-Revalidate: The Caching Pattern That Makes 0ms Feel Normal
The Practica · 2026-06-09 · via The Practical Developer

You have a perfectly tuned cache-aside layer. Redis responds in 2ms. Your p50 latency is 8ms. You are proud of the cache stampede protection you added after the last incident, the single-flight lock that keeps Postgres from getting 4,000 identical queries at once when a hot key expires.

But here is the thing you still feel: every cache miss. Even with single-flight, the first request after a key expires waits for the full database query. 380ms if the query is fast. 2 seconds if it is slow. The user staring at a loading spinner does not care about the elegant single-flight logic that saved the other 3,999 requests. They care that this one request took 2 seconds.

Stale-while-revalidate (SWR) is the pattern that eliminates that wait. Instead of blocking the request while an expired entry is refreshed, you serve the stale (expired) value immediately and refresh the cache in the background. The user gets a response in cache-hit time. The cache gets refreshed asynchronously. The word “miss” effectively disappears from your vocabulary.

This post is the SWR pattern implemented in Node.js with Redis: the TTL strategy that enables it, the revalidation lock that prevents thundering herds on the async refresh, the stale buffer that keeps data fresh enough, and the metrics that tell you whether your SWR window is right. By the end you will have a cache client that makes a user wait for the database only on a true cold miss.

The problem with cache-expire-and-block

Standard cache-aside works like this:

1. Check cache for key
2. If found and not expired -> return it (hit)
3. If expired or missing -> query database, write cache, return (miss)

Step 3 is the problem. The user cannot get a response until the database finishes. Even with single-flight, where only one request hits the database and the rest wait on a promise, the first user in line waits for the full query.

Here is what that looks like in code you have probably written:

async function getWithTTL<T>(key: string, ttlSec: number, fetch: () => Promise<T>): Promise<T> {
  const cached = await redis.get(key);
  if (cached !== null) {
    return JSON.parse(cached);
  }

  // Cache miss. User waits.
  const value = await fetch();
  await redis.set(key, JSON.stringify(value), 'EX', ttlSec);
  return value;
}

That await fetch() is the pause. For a 200ms database query, the user waits 200ms. For a 2-second aggregation, they wait 2 seconds. If your TTL is 60 seconds and your key is hot (1,000 req/s), this happens 1,440 times a day for every key. Over a fleet of 200 endpoints, that is a lot of slow responses.
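The 1,440 figure is seconds per day divided by the TTL: with single-flight in place, each expiry of a continuously requested key produces one blocking miss. A minimal sketch of that arithmetic, where the helper name is illustrative and not part of the client built in this post:

```typescript
// Hypothetical helper: blocking misses per day for one hot key under
// expire-and-block caching with single-flight (one blocking miss per expiry).
const SECONDS_PER_DAY = 24 * 60 * 60;

function blockingMissesPerDay(ttlSec: number): number {
  if (ttlSec <= 0) throw new RangeError('ttlSec must be positive');
  return Math.floor(SECONDS_PER_DAY / ttlSec);
}

console.log(blockingMissesPerDay(60)); // 1440
console.log(blockingMissesPerDay(30)); // 2880
```

Halving the TTL doubles the number of users who hit the slow path, which is why shortening the TTL alone does not fix the latency problem.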

The naive fix is to increase the TTL. But long TTLs mean stale data. The dashboard shows numbers from 15 minutes ago. The inventory count is wrong. The user refreshes and sees a different result. Long TTLs trade freshness for speed, and eventually someone files a bug about “the data is not updating.”

SWR solves this by separating the cache into two layers: the “serve from” value and the “fresh until” timestamp.

How SWR works

The core idea comes from HTTP RFC 5861 (which defines the stale-while-revalidate Cache-Control directive) and has been popularized by client libraries like SWR and React Query. The server-side version works like this:

1. Check cache for key
2. If found and TTL not expired -> return immediately (fresh hit)
3. If found but TTL is expired but within stale window -> return immediately (stale hit), schedule async refresh
4. If not found or beyond stale window -> query database, write cache, return (cold miss)

The key insight: step 3 returns the stale data instantly. The user gets a response in cache-read time (2ms). Meanwhile, a background promise refreshes the cache. Requests that arrive after the refresh completes get fresh data.

This changes the cache hit/miss curve from a binary cliff into a soft slope:

  • Fresh hit: 0 wait
  • Stale hit (SWR): 0 wait
  • Cold miss (cache never populated): full query wait (rare, happens once per key)

In practice, cold misses only happen on first access after a deployment or for long-tail keys that expire completely. For hot keys, the SWR window keeps them perpetually warm.
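The three outcomes above reduce to two timestamp comparisons. The following is a small sketch of that decision in isolation, with no Redis involved; the function and type names are illustrative assumptions:

```typescript
type CacheState = 'fresh' | 'stale' | 'cold';

// Timestamps are Unix milliseconds, the same unit Date.now() returns.
// A null timestamp stands for "no entry in the cache".
function classify(
  nowMs: number,
  expiresAtMs: number | null,
  staleAtMs: number | null
): CacheState {
  if (expiresAtMs === null || staleAtMs === null) return 'cold';
  if (nowMs < expiresAtMs) return 'fresh'; // serve as-is
  if (nowMs < staleAtMs) return 'stale';   // serve, then refresh in background
  return 'cold';                           // block on the database
}

const writtenAtMs = 1_000_000;
const freshUntilMs = writtenAtMs + 30_000; // 30s TTL
const usableUntilMs = writtenAtMs + 90_000; // plus 60s stale window

console.log(classify(writtenAtMs + 10_000, freshUntilMs, usableUntilMs)); // fresh
console.log(classify(writtenAtMs + 45_000, freshUntilMs, usableUntilMs)); // stale
console.log(classify(writtenAtMs + 95_000, freshUntilMs, usableUntilMs)); // cold
```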

The Redis SWR implementation

The trick is storing the expiration timestamp alongside the data so you can distinguish “fresh” from “stale but usable.” A Redis hash works perfectly:

interface SwrEntry<T> {
  data: T;
  expiresAt: number;   // Unix ms: when fresh -> stale
  staleAt: number;     // Unix ms: when stale -> dead
}

// Cache module
import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

// Track in-flight revalidations to avoid duplicate refreshes
const pendingRefreshes = new Map<string, Promise<void>>();

export async function swrGet<T>(
  key: string,
  ttlSec: number,
  staleSec: number,
  fetch: () => Promise<T>
): Promise<T> {
  const now = Date.now();

  // Read the raw hash
  const raw = await redis.hGetAll(key);

  // Cold start: never cached, or stale window expired
  if (!raw || !raw.data) {
    const value = await fetch();
    const entry: SwrEntry<T> = {
      data: value,
      expiresAt: now + ttlSec * 1000,
      staleAt: now + (ttlSec + staleSec) * 1000,
    };
    await redis.hSet(key, {
      data: JSON.stringify(value),
      expiresAt: String(entry.expiresAt),
      staleAt: String(entry.staleAt),
    });
    await redis.expire(key, ttlSec + staleSec);
    return value;
  }

  const expiresAt = Number(raw.expiresAt);
  const staleAt = Number(raw.staleAt);
  const data: T = JSON.parse(raw.data);

  // Case 1: Still fresh, return immediately
  if (now < expiresAt) {
    return data;
  }

  // Case 2: Stale but within SWR window, return stale + refresh async
  if (now < staleAt) {
    // Fire-and-forget refresh (with dedup)
    scheduleRefresh(key, ttlSec, staleSec, fetch);
    return data;
  }

  // Case 3: Beyond stale window, cold refresh
  const value = await fetch();
  const entry: SwrEntry<T> = {
    data: value,
    expiresAt: now + ttlSec * 1000,
    staleAt: now + (ttlSec + staleSec) * 1000,
  };
  await redis.hSet(key, {
    data: JSON.stringify(value),
    expiresAt: String(entry.expiresAt),
    staleAt: String(entry.staleAt),
  });
  await redis.expire(key, ttlSec + staleSec);
  return value;
}

The scheduleRefresh function handles the async revalidation:

function scheduleRefresh<T>(
  key: string,
  ttlSec: number,
  staleSec: number,
  fetch: () => Promise<T>
): void {
  // Deduplicate: if a refresh is already in flight, skip
  if (pendingRefreshes.has(key)) return;

  const refreshPromise = (async () => {
    try {
      const value = await fetch();
      const now = Date.now();
      const entry: SwrEntry<T> = {
        data: value,
        expiresAt: now + ttlSec * 1000,
        staleAt: now + (ttlSec + staleSec) * 1000,
      };
      await redis.hSet(key, {
        data: JSON.stringify(value),
        expiresAt: String(entry.expiresAt),
        staleAt: String(entry.staleAt),
      });
      await redis.expire(key, ttlSec + staleSec);
    } catch (err) {
      // Refresh failed. The stale data stays in cache.
      // The next request will try again.
      console.error(`SWR refresh failed for key ${key}:`, err);
    } finally {
      pendingRefreshes.delete(key);
    }
  })();

  pendingRefreshes.set(key, refreshPromise);

  // Prevent unhandled rejection by attaching a noop catch
  refreshPromise.catch(() => {});
}

This is deliberately simple and synchronous-looking for the caller. The swrGet function returns a value in cache-read time (cold start aside). The refresh happens out of band. If the refresh fails, the stale data remains in cache and the next request will schedule another refresh. The entry only disappears if refreshes keep failing until the key's Redis TTL (ttlSec + staleSec) runs out.

What about the thundering herd on the async refresh?

Single-flight on cache miss is the standard defense against a thundering herd. SWR has a similar problem: when a hot key goes stale, 10,000 requests can read the same stale value within milliseconds, and each one would schedule its own refresh. The pendingRefreshes Map handles this: within one Node.js process, only one refresh per key is in flight at a time, regardless of how many requests read stale data. The Map is process-local, so a fleet of N instances can still issue up to N concurrent refreshes for the same key.

But there is a subtler problem. What if the SWR window is 30 seconds, the database is slow (800ms per query), and all 10,000 requests arrive within that 800ms window? The first request triggers the async refresh. The other 9,999 skip because pendingRefreshes already has the key. One database query. Good.
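The dedup behaviour described here can be checked without Redis. Below is a minimal in-memory sketch, assuming the loader is an async function (so it cannot throw synchronously); `scheduleOnce`, `inFlight`, and `slowLoad` are illustrative names, not part of the client above:

```typescript
// In-memory stand-in for the pendingRefreshes Map.
const inFlight = new Map<string, Promise<void>>();
let loaderCalls = 0;

function scheduleOnce(key: string, load: () => Promise<unknown>): void {
  if (inFlight.has(key)) return; // a refresh for this key is already running

  const p = (async () => {
    try {
      await load(); // load() is invoked synchronously, then we wait for it
    } catch (err) {
      console.error(`refresh failed for ${key}:`, err);
    } finally {
      inFlight.delete(key);
    }
  })();

  inFlight.set(key, p);
}

// Simulated slow database query (50ms).
const slowLoad = async (): Promise<void> => {
  loaderCalls += 1;
  await new Promise((resolve) => setTimeout(resolve, 50));
};

// 10,000 stale reads arriving while the first refresh is still running.
for (let i = 0; i < 10_000; i++) {
  scheduleOnce('dashboard:42', slowLoad);
}

console.log(loaderCalls); // 1
```

Only the first call finds the Map empty; the other 9,999 return at the `has` check.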

But what if the refresh takes 2 seconds and the stale window expires before the refresh completes? Then the next batch of requests after the stale window expires hits a cold miss and blocks on a synchronous refresh. This is the SWR equivalent of a cache stampede.

The fix is to extend the stale window when a refresh is in flight:

async function scheduleRefresh<T>(
  key: string,
  ttlSec: number,
  staleSec: number,
  fetch: () => Promise<T>
): Promise<void> {
  if (pendingRefreshes.has(key)) return;

  const refreshPromise = (async () => {
    try {
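      // Extend staleAt and the key's Redis TTL so concurrent readers keep
      // getting stale data. Without the EXPIRE, Redis would still delete the
      // key at the deadline set by the previous write.
      const now = Date.now();
      await redis.hSet(key, 'staleAt', String(now + staleSec * 1000));
      await redis.expire(key, staleSec);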

      const value = await fetch();
      const freshNow = Date.now();
      const entry: SwrEntry<T> = {
        data: value,
        expiresAt: freshNow + ttlSec * 1000,
        staleAt: freshNow + (ttlSec + staleSec) * 1000,
      };
      await redis.hSet(key, {
        data: JSON.stringify(value),
        expiresAt: String(entry.expiresAt),
        staleAt: String(entry.staleAt),
      });
      await redis.expire(key, ttlSec + staleSec);
    } catch (err) {
      console.error(`SWR refresh failed for key ${key}:`, err);
    } finally {
      pendingRefreshes.delete(key);
    }
  })();

  pendingRefreshes.set(key, refreshPromise);
  refreshPromise.catch(() => {});
}

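The await redis.hSet(key, 'staleAt', ...) at the start of the refresh pushes the stale window forward, so any request that arrives while the refresh is in flight still returns the stale data. The hash field is only half of it: the key's own Redis TTL was set with EXPIRE at write time, so it has to be extended as well, or Redis deletes the hash at the original deadline no matter what staleAt says. With both extended, a cold miss happens only when the key is absent and no refresh is in flight: on initial population, after a Redis eviction, or when a key goes unread past its stale window.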
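The pendingRefreshes Map lives in a single Node.js process, so several instances can still refresh the same key at once. One common way to dedupe across instances is a short-lived lock key taken with SET ... NX PX before refreshing. The sketch below shows the acquire semantics with an in-memory stand-in; `RefreshLock`, `InMemoryLock`, and `lockKeyFor` are hypothetical names. With node-redis v4 the acquire step would roughly correspond to `redis.set(lockKey, token, { NX: true, PX: ttlMs })`, which resolves to 'OK' when the lock was taken and null otherwise.

```typescript
// Hypothetical interface: returns true when the caller won the lock.
interface RefreshLock {
  tryAcquire(lockKey: string, ttlMs: number): boolean;
}

// In-memory stand-in that mimics "SET key value NX PX ttl":
// succeed only if the key is absent or its TTL has run out.
class InMemoryLock implements RefreshLock {
  private heldUntil = new Map<string, number>();
  private clock: () => number;

  constructor(clock: () => number = Date.now) {
    this.clock = clock;
  }

  tryAcquire(lockKey: string, ttlMs: number): boolean {
    const now = this.clock();
    const expiry = this.heldUntil.get(lockKey);
    if (expiry !== undefined && expiry > now) return false; // held by someone else
    this.heldUntil.set(lockKey, now + ttlMs);
    return true;
  }
}

const lockKeyFor = (cacheKey: string): string => `swr:lock:${cacheKey}`;

// Two instances race for the same key; a later attempt runs after the TTL.
let fakeNowMs = 0;
const refreshLock = new InMemoryLock(() => fakeNowMs);
const firstAttempt = refreshLock.tryAcquire(lockKeyFor('dashboard:42'), 5_000);
const secondAttempt = refreshLock.tryAcquire(lockKeyFor('dashboard:42'), 5_000);
fakeNowMs = 6_000;
const afterExpiry = refreshLock.tryAcquire(lockKeyFor('dashboard:42'), 5_000);

console.log(firstAttempt, secondAttempt, afterExpiry); // true false true
```

The lock TTL should comfortably exceed the expected refresh duration; because the lock expires on its own, a crashed instance cannot block refreshes forever.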

Picking your TTL and stale window

The two numbers that control SWR behavior are the TTL (how long data is “fresh”) and the stale window (how long after the TTL you accept stale data before forcing a refresh).

A good starting point for most endpoints:

  • TTL: 30 seconds. Short enough that data counts as fresh for at most 30 seconds before a refresh is triggered. Most dashboards and API responses tolerate this.
  • Stale window: 60 seconds. A full minute of stale-serve coverage. The async refresh has 60 seconds to complete before any request would block.

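This means a key is cacheable for 90 seconds total (30 fresh + 60 stale), so the data a user sees is at most about 90 seconds old. The refresh rate is driven by the request rate: the first request that reads stale data schedules a refresh, and the dedup keeps it to one refresh in flight per key per process. For a hot key that works out to roughly one refresh per TTL; a key read less often than once per 90 seconds falls out of the cache and pays a cold miss.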

For slower endpoints (5-second database queries), increase the stale window to 120 seconds to give the refresh plenty of time. For endpoints where data freshness matters (inventory counts, available balances), reduce TTL to 5 seconds and stale window to 10 seconds. At 5/10, the data is at most 15 seconds old and the user never waits for a cache miss unless Redis evicts the key.

// Configuration presets
const swrConfigs = {
  dashboard:    { ttlSec: 30,  staleSec: 60  },  // Typical dashboard
  realtime:     { ttlSec: 5,   staleSec: 10  },  // Near realtime
  reference:    { ttlSec: 300, staleSec: 300 },  // Slow-changing reference data
  coldTolerant: { ttlSec: 10,  staleSec: 120 },  // Slow queries need large stale window
} as const;
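To sanity-check a preset, it helps to compute the worst-case age it allows and whether the stale window leaves room for the refresh. The helpers below are a sketch under stated assumptions: the names are illustrative, and the 2x headroom factor is an arbitrary choice, not a recommendation from this post.

```typescript
interface SwrWindow {
  ttlSec: number;
  staleSec: number;
}

// Oldest age a served value can reach, ignoring any extension of staleAt
// made while a refresh is in flight.
function maxServedAgeSec(w: SwrWindow): number {
  return w.ttlSec + w.staleSec;
}

// True when a refresh with the given p99 duration fits inside the stale
// window with some headroom (safetyFactor is an illustrative default).
function staleWindowCoversRefresh(
  w: SwrWindow,
  refreshP99Sec: number,
  safetyFactor = 2
): boolean {
  return w.staleSec >= refreshP99Sec * safetyFactor;
}

const dashboardWindow: SwrWindow = { ttlSec: 30, staleSec: 60 };
const realtimeWindow: SwrWindow = { ttlSec: 5, staleSec: 10 };

console.log(maxServedAgeSec(dashboardWindow)); // 90
console.log(maxServedAgeSec(realtimeWindow)); // 15
console.log(staleWindowCoversRefresh(realtimeWindow, 5)); // true
console.log(staleWindowCoversRefresh(realtimeWindow, 6)); // false
```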

Metrics that matter

SWR hides latency, which means it can also hide problems. If your database is slow and every refresh takes 5 seconds but the stale window is 10 seconds, users never see the slowness. They see 2ms responses from stale data. Great for the user. Terrible for your ability to notice the database degrading.

You need four metrics exposed from your SWR client (three counters and a histogram):

// Counters for your metrics system (Prometheus, OpenTelemetry, etc.)
const swrHitsFresh = new Counter({ name: 'swr_hits_fresh_total', help: 'Served from fresh cache' });
const swrHitsStale = new Counter({ name: 'swr_hits_stale_total', help: 'Served from stale cache with async refresh' });
const swrMissesCold = new Counter({ name: 'swr_misses_cold_total', help: 'Cache empty, blocked on sync refresh' });
const swrRefreshDuration = new Histogram({ name: 'swr_refresh_duration_seconds', help: 'Time for background refresh' });

Track these and alert on:

  • Rate of swr_misses_cold_total above its warm-up baseline: Every key pays one cold miss on first population, so the raw counter is never 0; alert on the rate instead. If cold misses keep happening after warm-up, your Redis memory is too small or your stale window is too short.
  • swr_refresh_duration_seconds p99 approaching the stale window: The refresh is barely making it. Increase the stale window or investigate the database query.
  • swr_hits_stale_total / (swr_hits_fresh_total + swr_hits_stale_total) > 0.5: More than half of your cache hits are stale. Either your TTL is too short or the data is accessed less frequently than you expected. Consider increasing TTL.

A healthy SWR endpoint in production should show a cold miss rate near 0 after warm-up, < 20% stale hits (most hits should land in the fresh window), and refresh durations safely below the stale window.
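The alert conditions can be expressed as a small pure function over counter deltas for a time window. This is a sketch, not a Prometheus rule: the names are illustrative, and the 80% and 1% thresholds are assumptions chosen for the example.

```typescript
// Counter increases observed over some window (for example the last 5 minutes).
interface SwrCounts {
  fresh: number;
  stale: number;
  cold: number;
}

function staleRatio(c: SwrCounts): number {
  const hits = c.fresh + c.stale;
  return hits === 0 ? 0 : c.stale / hits;
}

function swrWarnings(c: SwrCounts, refreshP99Sec: number, staleSec: number): string[] {
  const warnings: string[] = [];
  if (staleRatio(c) > 0.5) warnings.push('stale ratio above 50%');
  // Assumed threshold: refresh p99 above 80% of the stale window.
  if (refreshP99Sec > 0.8 * staleSec) warnings.push('refresh p99 close to stale window');
  // Assumed threshold: more than 1% of requests are cold misses.
  const total = c.fresh + c.stale + c.cold;
  if (total > 0 && c.cold / total > 0.01) warnings.push('cold miss rate above 1%');
  return warnings;
}

console.log(swrWarnings({ fresh: 900, stale: 95, cold: 5 }, 2, 60)); // []
console.log(swrWarnings({ fresh: 100, stale: 300, cold: 0 }, 55, 60));
// [ 'stale ratio above 50%', 'refresh p99 close to stale window' ]
```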

Edge cases that will bite you

Redis eviction. If Redis runs out of memory and evicts your SWR key, the next request gets a cold miss. This is the one case where SWR cannot help. Mitigate by setting a sensible maxmemory-policy (prefer allkeys-lru over noeviction for a cache), and monitor eviction rates. Also, the expire set on the key ensures Redis does not hold data past the stale window.

Large payloads. Storing the data field as a JSON string in a Redis hash is fine for payloads under 1MB. For larger payloads, consider splitting: store a Redis key for the data separately and use the hash to store expiry metadata and a pointer. Or switch to a dedicated cache that handles large objects natively.

Serialization cost. JSON.parse on every read and JSON.stringify on every write adds up. For hot keys, consider a binary serialization format like MessagePack. The SWR pattern does not care about the encoding, only about the data/expiresAt/staleAt structure.

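Clock skew. The expiresAt and staleAt timestamps are written and compared with Date.now() on the application servers, so skew between instances (one writes the entry, another reads it) shifts the fresh and stale boundaries by the size of the skew. To take local clocks out of the picture, use the Redis server time via the TIME command (or normalize all timestamps to Redis time at write time). In practice, sub-second clock skew is fine for a 30-second TTL.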
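The Redis TIME command replies with two values: the Unix time in seconds and the microseconds elapsed within that second. A minimal sketch of turning that reply into the millisecond timestamps used by expiresAt and staleAt follows; the helper names are illustrative, and how you obtain the raw reply depends on your client.

```typescript
// Redis TIME reply: [unix seconds, microseconds within that second], as strings.
type RedisTimeReply = [string, string];

function redisTimeToMs(reply: RedisTimeReply): number {
  const [sec, usec] = reply;
  return Number(sec) * 1000 + Math.floor(Number(usec) / 1000);
}

// Offset to add to a local Date.now() reading so it lines up with the Redis
// clock. The network round trip adds a small error on top of this.
function clockOffsetMs(reply: RedisTimeReply, localNowMs: number): number {
  return redisTimeToMs(reply) - localNowMs;
}

console.log(redisTimeToMs(['1700000000', '250000'])); // 1700000000250
console.log(clockOffsetMs(['1700000000', '250000'], 1700000000000)); // 250
```

Measuring the offset once at startup and refreshing it occasionally avoids an extra Redis round trip on every read.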

Stale data avalanche. If your database goes down for 5 minutes and every key reaches the stale window limit, every request becomes a cold miss and every cold miss fails. This is the same failure mode as a normal cache stampede, just delayed by the stale window. The fix is the same as any database outage: use graceful degradation to serve the stale data even past the stale window, with a circuit breaker for the database call.
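One way to degrade gracefully is to keep the last successfully loaded value and fall back to it when the loader fails. The sketch below keeps that copy in process memory, which is an assumption made for brevity (a longer-lived second Redis key is another option), and it leaves out the circuit breaker. `loadWithStaleFallback` and `lastKnownGood` are illustrative names.

```typescript
// Process-local copy of the last successfully loaded value per key.
const lastKnownGood = new Map<string, unknown>();

interface LoadResult<T> {
  value: T;
  degraded: boolean; // true when the value came from the fallback copy
}

async function loadWithStaleFallback<T>(
  key: string,
  load: () => Promise<T>
): Promise<LoadResult<T>> {
  try {
    const value = await load();
    lastKnownGood.set(key, value);
    return { value, degraded: false };
  } catch (err) {
    if (lastKnownGood.has(key)) {
      // Database call failed: serve the old value instead of an error.
      return { value: lastKnownGood.get(key) as T, degraded: true };
    }
    throw err; // nothing to fall back to
  }
}

void (async () => {
  const healthy = await loadWithStaleFallback<number>('inventory:7', async () => 12);
  const duringOutage = await loadWithStaleFallback<number>('inventory:7', async () => {
    throw new Error('database unavailable');
  });
  console.log(healthy, duringOutage);
  // { value: 12, degraded: false } { value: 12, degraded: true }
})();
```

Exposing the degraded flag lets the handler add a response header or a metric, so the outage stays visible even though users keep getting answers.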

The complete client

Here is the full SWR cache client in about 100 lines:

import { createClient, RedisClientType } from 'redis';
import { Counter, Histogram } from './metrics'; // Your metrics system

interface SwrEntry<T> {
  data: T;
  expiresAt: number;
  staleAt: number;
}

export class SwrCache {
  private redis: RedisClientType;
  private pendingRefreshes = new Map<string, Promise<void>>();
  private metrics = {
    hitsFresh: new Counter({ name: 'swr_hits_fresh_total', help: 'Served from fresh cache' }),
    hitsStale: new Counter({ name: 'swr_hits_stale_total', help: 'Served from stale cache with async refresh' }),
    missesCold: new Counter({ name: 'swr_misses_cold_total', help: 'Cache empty, blocked on sync refresh' }),
    refreshDuration: new Histogram({ name: 'swr_refresh_duration_seconds', help: 'Time for background refresh' }),
  };

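  constructor(redisUrl: string) {
    this.redis = createClient({ url: redisUrl });
    // Surface connection errors instead of leaving an unhandled rejection.
    this.redis.on('error', (err) => console.error('Redis client error:', err));
    this.redis.connect().catch((err) => console.error('Redis connect failed:', err));
  }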

  async get<T>(
    key: string,
    ttlSec: number,
    staleSec: number,
    fetch: () => Promise<T>
  ): Promise<T> {
    const now = Date.now();
    const raw = await this.redis.hGetAll(key);

    if (!raw || !raw.data) {
      this.metrics.missesCold.inc(1);
      const value = await fetch();
      await this.writeEntry(key, value, ttlSec, staleSec);
      return value;
    }

    const expiresAt = Number(raw.expiresAt);
    const staleAt = Number(raw.staleAt);
    const data: T = JSON.parse(raw.data);

    if (now < expiresAt) {
      this.metrics.hitsFresh.inc(1);
      return data;
    }

    if (now < staleAt) {
      this.metrics.hitsStale.inc(1);
      this.scheduleRefresh(key, ttlSec, staleSec, fetch);
      return data;
    }

    // Cold miss (beyond stale window or evicted)
    this.metrics.missesCold.inc(1);
    const value = await fetch();
    await this.writeEntry(key, value, ttlSec, staleSec);
    return value;
  }

  private async scheduleRefresh<T>(
    key: string,
    ttlSec: number,
    staleSec: number,
    fetch: () => Promise<T>
  ): Promise<void> {
    if (this.pendingRefreshes.has(key)) return;

    const p = (async () => {
      const start = Date.now();
      try {
        // Extend the stale window and the key's Redis TTL so concurrent
        // readers keep getting stale data while the refresh runs.
        await this.redis.hSet(key, 'staleAt', String(Date.now() + staleSec * 1000));
        await this.redis.expire(key, staleSec);
        const value = await fetch();
        await this.writeEntry(key, value, ttlSec, staleSec);
      } catch (err) {
        console.error(`SWR refresh failed for ${key}:`, err);
      } finally {
        this.metrics.refreshDuration.observe((Date.now() - start) / 1000);
        this.pendingRefreshes.delete(key);
      }
    })();

    this.pendingRefreshes.set(key, p);
    p.catch(() => {});
  }

  private async writeEntry<T>(key: string, data: T, ttlSec: number, staleSec: number): Promise<void> {
    const now = Date.now();
    await this.redis.hSet(key, {
      data: JSON.stringify(data),
      expiresAt: String(now + ttlSec * 1000),
      staleAt: String(now + (ttlSec + staleSec) * 1000),
    });
    await this.redis.expire(key, ttlSec + staleSec);
  }

  async disconnect(): Promise<void> {
    await this.redis.quit();
  }
}

Usage in a route handler:

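// process.env.REDIS_URL may be undefined; the localhost fallback is an assumption for local development.
const cache = new SwrCache(process.env.REDIS_URL ?? 'redis://localhost:6379');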

app.get('/api/dashboard/:userId', async (req, res) => {
  const data = await cache.get(
    `dashboard:${req.params.userId}`,
    30,   // TTL: 30 seconds of fresh
    60,   // Stale window: 60 seconds of stale-serve
    () => db.queryDashboard(req.params.userId)
  );
  res.json(data);
});

Every response returns in 2-4ms (Redis read time) except the first request for a key, which pays the database cost once. After that, the cache is self-sustaining for keys that keep getting read: reads trigger refreshes, the stale window absorbs latency, and cold misses only happen if Redis evicts the key or the key goes unread for longer than the TTL plus the stale window.
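The cold-miss path in the client calls the loader directly, so concurrent first requests for the same key each hit the database. A single-flight wrapper around that call closes the gap. The following is an in-memory sketch with illustrative names (`singleFlight`, `coldLoads`, `slowQuery`), not part of the client above:

```typescript
// One shared promise per key while a cold load is running.
const coldLoads = new Map<string, Promise<unknown>>();

function singleFlight<T>(key: string, load: () => Promise<T>): Promise<T> {
  const existing = coldLoads.get(key);
  if (existing) return existing as Promise<T>; // join the load already running

  const p = new Promise<T>((resolve, reject) => {
    load().then(resolve, reject); // a synchronous throw here also rejects p
  });
  coldLoads.set(key, p);

  // Drop the entry once the load settles, whether it succeeded or failed.
  const forget = (): void => {
    coldLoads.delete(key);
  };
  p.then(forget, forget);
  return p;
}

let dbCalls = 0;
const slowQuery = async (): Promise<string> => {
  dbCalls += 1;
  await new Promise((resolve) => setTimeout(resolve, 20));
  return 'dashboard payload';
};

// 100 requests arrive for a key that is not cached yet.
const waiters: Promise<string>[] = [];
for (let i = 0; i < 100; i++) {
  waiters.push(singleFlight('dashboard:42', slowQuery));
}

console.log(dbCalls); // 1
```

In the client, the two `await fetch()` calls on the cold path could be routed through a wrapper like this, keyed by the cache key.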

The takeaway

A normal cache-aside layer turns every cache expiration into a latency cliff. Users wait for the database query. Even with single-flight, the first user in line pays the full cost. Stale-while-revalidate eliminates that cliff by serving expired data immediately and refreshing in the background. The user gets a response in cache-read time. The database sees one quiet, asynchronous query per refresh.

The pattern is not complex. It is two timestamps and a payload in a Redis hash, a dedup Map for pending refreshes, and a conditional read path that serves stale data instead of blocking. It is the same pattern that makes React Query and SWR on the frontend feel instant, adapted for the server side where the data source is a database and the cache is Redis.

Wire it once. Set a 30-second TTL and 60-second stale window. Watch your cold miss counter go flat after warm-up and your p99 latency drop to your Redis round-trip time. Then forget about it, because the cache is no longer something you page about.


A note from Yojji

The difference between a caching layer that feels fast and one that causes 3 a.m. incidents is often just a handful of deliberate patterns: single-flight, early refresh, and the humble stale-while-revalidate window. Production backend engineering is full of these small, high-leverage decisions that separate teams who chase problems from teams who move past them.

Yojji is an international custom software development company founded in 2016, with offices in Europe, the US, and the UK. Their senior engineers specialize in building the Node.js microservices, caching architectures, and database-backed APIs that stay fast and reliable under real-world traffic, whether through dedicated outstaffing or end-to-end product delivery.