Caching Layers in 2026: CDN, Application, Database, Query: What Goes Where
Gabriel Anha · 2026-05-24 · via DEV Community

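Four cache layers sit between a user's request and the rows it asks for. Most teams pick two (usually Redis in front of the database, plus a CDN for static assets) and treat the other two as somebody else's problem.

The result: the Redis layer does work the CDN should be doing, the database quietly leans on its plan cache as a security blanket, and every hot-key expiry turns into a stampede. Each layer answers a different question. Pick the wrong one and you pay for the layer you skipped.

Why cache layers stack instead of choosing one

The pattern matters because the questions stack.

  • The CDN answers: "Can we avoid the origin?"
  • The application cache answers: "Can we avoid a database round trip?"
  • The database cache answers: "Can we avoid recomputing the result?"
  • The query cache answers: "Can we avoid parsing and planning the statement?"

Every "yes" short-circuits all the layers below it; every "no" passes the request down. A request that passes through all four layers looks the same to the user as one that touched none of them; only the cost and the latency differ.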
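
The short-circuit behaviour described above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the `resolve` function, the dict-backed `edge` and `app_cache`, and the always-answering database lambda are all hypothetical stand-ins.

```python
from typing import Callable, Optional

# Each layer is a function that returns a value on a hit or None on a miss.
Layer = Callable[[str], Optional[str]]


def resolve(key: str, layers: list[tuple[str, Layer]]) -> tuple[str, str]:
    """Walk the layers top-down; the first hit short-circuits everything below."""
    for name, lookup in layers:
        value = lookup(key)
        if value is not None:
            return name, value
    raise KeyError(key)


# Hypothetical stand-ins: dicts play the caches, the last layer always answers.
edge = {"/api/products/1": "cached-at-edge"}
app_cache = {"user:42": "cached-in-app"}
layers = [
    ("cdn", edge.get),
    ("app", app_cache.get),
    ("database", lambda key: f"row-for-{key}"),
]
```

Calling `resolve("user:42", layers)` stops at the application layer and never reaches the database function, which is the whole point of stacking.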

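The mistake is collapsing all of this into a single layer. Many teams push everything into Redis because they already run it. Redis is a good application cache. It is a poor CDN, a worse materialized view, and no help at all to the prepared-statement planner.

Layer 1: CDN edge cache

The CDN sits close to the user. It is the cheapest way to answer a request, because the request never reaches your servers. The trick in 2026 is that CDNs cache more than .jpg files. Tell the CDN how, and it will cache API responses too.

Here is a Cloudflare Worker that caches a public product-listing endpoint at the edge.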

// Cloudflare Worker - cache /api/products/* responses for 60s at edge
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);
    const cache = caches.default;

    // only cache GETs for the public catalog
    if (request.method !== "GET" ||
        !url.pathname.startsWith("/api/products/")) {
      return fetch(request);
    }

    // drop per-request query params (here trace_id) so they don't fragment the cache key
    url.searchParams.delete("trace_id");
    const cacheKey = new Request(url.toString(), request);

    let response = await cache.match(cacheKey);
    if (response) {
      return response; // edge hit, never touches origin
    }

    response = await fetch(request);
    if (response.status === 200) {
      const cached = new Response(response.body, response);
      cached.headers.set("Cache-Control", "public, max-age=60, s-maxage=60");
      cached.headers.set("CDN-Cache-Control", "max-age=60");
      ctx.waitUntil(cache.put(cacheKey, cached.clone()));
      return cached;
    }
    return response;
  },
};

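The part people miss is s-maxage, the shared-cache directive. It tells the CDN how long to hold the response, independent of what the browser does. Splitting the two Cache-Control directives lets you give the CDN 60 seconds while telling the browser something different (typically max-age=0, must-revalidate).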
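
To make the split concrete, here is a small sketch of a header builder that gives shared caches and browsers different lifetimes. The function name `cache_headers` and its parameters are illustrative assumptions, not part of any CDN's API; only the directive names come from the HTTP caching standard.

```python
def cache_headers(edge_seconds: int, browser_seconds: int = 0) -> dict[str, str]:
    """Build a Cache-Control header with separate edge and browser lifetimes."""
    if browser_seconds > 0:
        browser_part = f"max-age={browser_seconds}"
    else:
        # max-age=0 plus must-revalidate makes the browser check back every time
        browser_part = "max-age=0, must-revalidate"
    # s-maxage applies to shared caches (the CDN) and overrides max-age there
    return {"Cache-Control": f"public, {browser_part}, s-maxage={edge_seconds}"}
```

With `cache_headers(60)` the edge may hold the response for a minute while every browser request still revalidates, so a fix rolls out to users within one edge TTL.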

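What belongs here: public GETs, anonymous responses, anything that looks the same to every user. Product listings, marketing pages, public pages, unfiltered search, OG image endpoints, sitemap.xml. Anything that varies per user does not belong, unless you make the user's identity part of the cache key.

What does not belong here: anything with Set-Cookie, anything authenticated, anything with write side effects. If your CDN hit rate on /api/* is above 5%, you are already ahead of most teams.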
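
The two lists above can be turned into a conservative predicate. This is a sketch under assumptions: `is_edge_cacheable` is a hypothetical helper, and treating any request cookie as "per-user" is a deliberately strict choice that a real site might relax.

```python
def is_edge_cacheable(method: str, request_headers: dict[str, str],
                      response_headers: dict[str, str]) -> bool:
    """Conservative check: only anonymous, side-effect-free responses qualify."""
    req = {name.lower() for name in request_headers}
    resp = {name.lower() for name in response_headers}
    if method.upper() != "GET":
        return False            # writes have side effects
    if "authorization" in req or "cookie" in req:
        return False            # authenticated or per-user request
    if "set-cookie" in resp:
        return False            # response carries per-user state
    return True
```

Erring on the side of "not cacheable" costs a little hit rate; erring the other way can serve one user's response to another.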

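Layer 2: Application cache (Redis)

When people say "cache", this is the one they mean. It lives in your service process or next to it in Redis. It answers the requests the CDN cannot (because they need authentication or personalization) without another trip to the database.

The pattern that works: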

import json
import redis

r = redis.Redis(host="cache.internal", port=6379, decode_responses=True)
# `db` in the functions below is assumed to be the application's existing database client

def get_user_profile(user_id: str) -> dict:
    key = f"user:profile:{user_id}"

    # try cache first
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    # cache miss - hit the DB
    profile = db.fetch_one(
        "SELECT id, name, plan, created_at FROM users WHERE id = $1",
        user_id,
    )

    # set with a TTL so stale data times out even if invalidation fails
    r.setex(key, 300, json.dumps(profile, default=str))
    return profile

def invalidate_user_profile(user_id: str) -> None:
    # called from any writer that mutates the user row
    r.delete(f"user:profile:{user_id}")

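Two things people skip: the TTL safety net and write-side invalidation. With a TTL alone, you keep serving stale reads for up to five minutes after a write. With invalidation alone, one network blip drops a DEL and you serve stale data forever. You need both. Think belt and suspenders, with a hand on the belt as well.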
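
The failure mode is easy to reproduce without Redis. The sketch below uses a tiny in-memory stand-in (`TTLCache`, a hypothetical class that mimics SETEX, GET and DEL) with a fake clock, and simulates a lost invalidation by simply never calling `delete`.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Tiny in-memory stand-in for Redis SETEX/GET/DEL, with an injectable clock."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: dict[str, tuple[float, Any]] = {}

    def setex(self, key: str, ttl: float, value: Any) -> None:
        self._data[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None or entry[0] <= self._clock():
            self._data.pop(key, None)    # expired entries are dropped lazily
            return None
        return entry[1]

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


now = [0.0]                              # fake clock keeps the demo deterministic
cache = TTLCache(clock=lambda: now[0])

cache.setex("user:profile:1", 300, {"plan": "free"})
# The row changes, but the invalidation is lost (simulated by not calling delete).
stale = cache.get("user:profile:1")      # still the old value
now[0] += 301                            # the TTL bounds how long that can last
after_ttl = cache.get("user:profile:1")  # None, so the next read goes to the database
```

The stale value survives the lost invalidation, but only until the TTL runs out, which is exactly the safety net the paragraph describes.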

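What belongs here: user data, session state, computed aggregates that are expensive to rebuild, anything read on every request but changed rarely. The 80/20 rule applies: find the 20% of queries that drive 80% of database load and cache those first.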
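
One way to apply the 80/20 rule is to rank query shapes by the load they generate and stop once the target share is covered. The helper below is a sketch; `queries_to_cache`, the query names and the load numbers are all made up for illustration, and the load figure could be total execution time per statement from whatever monitoring you already have.

```python
def queries_to_cache(load_by_query: dict[str, float],
                     target_share: float = 0.8) -> list[str]:
    """Return the heaviest queries that together cover target_share of total load."""
    total = sum(load_by_query.values())
    picked: list[str] = []
    covered = 0.0
    # Heaviest first; stop as soon as the target share is covered
    ranked = sorted(load_by_query.items(), key=lambda kv: kv[1], reverse=True)
    for query, load in ranked:
        if covered >= target_share * total:
            break
        picked.append(query)
        covered += load
    return picked


# Hypothetical numbers: total execution time per query shape, in seconds
sample = {"get_profile": 500.0, "list_orders": 350.0, "get_cart": 70.0,
          "search": 50.0, "export": 30.0}
```

For the sample data, two query shapes out of five already cover more than 80% of the load, so those are the first candidates for the application cache.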

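What does not belong here: anything that needs strong consistency (a bank balance that is about to be spent), data that is written more often than it is read, anything where a stale read is worse than 50 ms of extra latency.

Layer 3: Database cache (materialized views)

Materialized views are the layer most teams forget exists. They live inside the database and precompute the results of expensive queries. The database stores the result as if it were a table, so a read costs about as much as an indexed table lookup instead of a seven-way join with window functions.

Take Postgres. This is a per-account daily revenue rollup that would otherwise scan the live table on every dashboard load: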

CREATE MATERIALIZED VIEW account_revenue_daily AS
SELECT
    account_id,
    date_trunc('day', created_at)::date AS day,
    sum(amount_cents) AS revenue_cents,
    count(*) AS txn_count
FROM transactions
WHERE status = 'settled'
GROUP BY account_id, day;

CREATE UNIQUE INDEX ON account_revenue_daily (account_id, day);

-- refresh policy: every 10 minutes via pg_cron
-- CONCURRENTLY needs the unique index above to work
SELECT cron.schedule(
    'refresh-account-revenue',
    '*/10 * * * *',
    $$REFRESH MATERIALIZED VIEW CONCURRENTLY account_revenue_daily$$
);

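REFRESH ... CONCURRENTLY is the part people have not heard of. Without it, the refresh takes an ACCESS EXCLUSIVE lock and blocks reads. With it, Postgres builds the new contents on the side and merges the changes in without blocking readers. You pay some extra disk and refresh time; in return the dashboard stops stalling.

What belongs here: aggregations and joins across four or more tables, anything where the data changes more slowly than it is queried. Daily rollups, leaderboards, search facets, anything an analyst would want to write as a CTE.

What does not belong here: results that must be real-time. A materialized view is stale between refreshes by definition. If users expect their action to show up immediately, this layer is not the answer.

The trap here is a silent one: materialized-view staleness. A team builds the view, hits its dashboard latency target, and moves on. Six months later the underlying table is three times the size, the refresh takes 12 minutes, and the view is stale more often than it is fresh. Watch refresh duration the way you watch query performance.

Layer 4: Query cache (prepared-statement plan cache)

The deepest layer, and the least visible. Every time a driver sends a SQL statement, the database has to parse it, plan it, and execute it. With prepared statements, the first two steps can be cached.

Most ORMs are clumsy here, because each query arrives as fresh statement text (WHERE id = 1, WHERE id = 2), which defeats the cache. The fix is parameter binding: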

# bad - new statement every call, plan cache miss every time
def get_order_bad(order_id: int):
    return db.execute(f"SELECT * FROM orders WHERE id = {order_id}")

# good - same statement text, only parameters change, plan reused
def get_order_good(order_id: int):
    return db.execute(
        "SELECT * FROM orders WHERE id = $1",
        order_id,
    )

In Postgres you can inspect what the server is seeing:

-- requires pg_stat_statements extension
SELECT
    query,
    calls,
    mean_exec_time,
    rows
FROM pg_stat_statements
WHERE query LIKE 'SELECT % FROM orders WHERE id = $1'
ORDER BY calls DESC
LIMIT 10;

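If you see the same logical query showing up over and over with different literal values instead of $1, your ORM is bypassing the plan cache. Fix the ORM configuration (Eloquent's DB::statement and DB::select with bindings, GORM's Raw and Where, and so on) before you tune anything else.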
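
A rough way to spot this from the application side is to group logged statement texts by shape. The sketch below assumes you can capture raw statement texts somewhere (for example in a driver hook or a query log); `shape` and `unparameterized_shapes` are hypothetical helpers, and the regexes only cover simple string and numeric literals.

```python
import re
from collections import defaultdict

_STRING = re.compile(r"'(?:[^']|'')*'")            # single-quoted SQL string literals
_NUMBER = re.compile(r"(?<![\w$])\d+(?:\.\d+)?\b")  # bare numbers, but not $1 placeholders


def shape(sql: str) -> str:
    """Replace literals with '?' so statements that differ only in values match."""
    return _NUMBER.sub("?", _STRING.sub("?", sql))


def unparameterized_shapes(statements: list[str]) -> dict[str, int]:
    """Map each query shape to the number of distinct literal-bearing texts seen."""
    variants: dict[str, set[str]] = defaultdict(set)
    for sql in statements:
        if shape(sql) != sql:            # the text contained at least one literal
            variants[shape(sql)].add(sql)
    # A shape with several distinct texts is being sent with inlined values
    return {s: len(texts) for s, texts in variants.items() if len(texts) > 1}
```

A shape that shows up with many distinct texts is a statement whose values are being interpolated into the SQL rather than bound as parameters.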

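What belongs here: hot OLTP queries. Lookups by ID, order inserts, updates to a customer's last-seen timestamp. The win per query is small (a millisecond or two). At 500,000 queries per second, that win is the difference between two database instances and ten.

What does not belong here: queries whose shape varies. If the structure of your WHERE clause changes with user input, the plan cannot be reused, and that is fine.

Decision table

A shortcut for "which layer does this data belong in":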

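Data shape  CDN  App cache  Materialized view  Plan cache
Static assets, anonymous
Public GET, no auth  maybe
Per-user profile, hot
Hourly analytics, aggregated  no  maybe (TTL)  no
OLTP lookup by ID  no  no
Strongly consistent read  no  no  no
Real-time write feedback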

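The "maybe" cells are where teams argue in design reviews. The honest answer is "measure first, then decide." If your hourly rollup is hit 200 times a minute, a materialized view earns its keep. If it is hit three times a day, a Redis entry with a 30-minute TTL is enough and the materialized view is over-engineering.

Invalidation strategy per layer

Each layer wants a different invalidation strategy.

CDN. TTL is the real strategy for most teams. You can purge specific URLs through the API, but at high request rates a purge takes seconds to propagate, so you cannot rely on it for correctness. Use short TTLs (60 seconds for hot endpoints, 5 minutes for catalog data) and accept the staleness. For assets with long TTLs, version the URL (/static/app.a3f7b2.js) so that a new version is a new key rather than an invalidation.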
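
URL versioning is usually done by hashing the file contents at build time. The sketch below shows the idea; `versioned_path` is a hypothetical helper, and the choice of SHA-256 truncated to six hex characters is an assumption made to match the shape of the example path, not something the article specifies.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_path(path: str, content: bytes, digest_chars: int = 6) -> str:
    """Insert a short content hash before the file extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    # /static/app.js -> /static/app.<digest>.js
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because the name changes whenever the bytes change, the old URL can keep a very long TTL and never needs to be purged.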

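Application cache. Event-driven invalidation. Every write path that mutates a row calls r.delete(key) for every cache entry derived from that row. This works until there are 12 places that write to users, someone adds a 13th, and forgets the invalidation. Centralize it: route every write through one library, and have that library emit the invalidation from a post-commit hook.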
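
A minimal sketch of that centralization follows. Everything here is illustrative: `UserWriter`, the injected `execute` and `cache_delete` callables, and the `user:plan:` key are assumptions, and the "post-commit hook" is approximated by running invalidation only after the write call returns without raising.

```python
from typing import Callable


class UserWriter:
    """Single write path for the users table; invalidation happens after commit."""

    def __init__(self, execute: Callable[[str, tuple], None],
                 cache_delete: Callable[[str], None]) -> None:
        self._execute = execute            # runs and commits one statement
        self._cache_delete = cache_delete  # e.g. a Redis client's delete method

    def _derived_keys(self, user_id: str) -> list[str]:
        # Every cache key built from a users row is listed in exactly one place
        return [f"user:profile:{user_id}", f"user:plan:{user_id}"]

    def update_plan(self, user_id: str, plan: str) -> None:
        self._execute("UPDATE users SET plan = $1 WHERE id = $2", (plan, user_id))
        # Only reached if the write above did not raise, i.e. after commit
        for key in self._derived_keys(user_id):
            self._cache_delete(key)
```

A thirteenth writer then adds a method to this class instead of a new call site, and the key list cannot be forgotten because it is not repeated anywhere.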

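Materialized views. Scheduled refresh. Decide how much staleness is acceptable (10 minutes? an hour?) and set the cron schedule accordingly. To cut staleness further, you can layer an event-driven REFRESH ... CONCURRENTLY trigger on top, but be careful: the refresh itself is expensive, and you do not want it firing a hundred times a minute.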
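
One way to keep an event-driven trigger from firing too often is to throttle it. The sketch below is an assumption about how that could look, not the article's code: `RefreshThrottle` is a hypothetical class, and `refresh` stands in for whatever actually issues the REFRESH statement.

```python
import time
from typing import Callable, Optional


class RefreshThrottle:
    """Collapse bursts of change events into at most one refresh per interval."""

    def __init__(self, refresh: Callable[[], None], min_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._refresh = refresh
        self._min_interval = min_interval
        self._clock = clock
        self._last_run: Optional[float] = None

    def on_change(self) -> bool:
        """Call on every write event; returns True if a refresh actually ran."""
        now = self._clock()
        if self._last_run is not None and now - self._last_run < self._min_interval:
            return False                 # too soon: leave it to the scheduled refresh
        self._last_run = now
        self._refresh()
        return True
```

Events that arrive inside the interval are dropped rather than queued, so the scheduled cron refresh remains the backstop that picks up whatever the throttle skipped.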

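Plan cache. You do not invalidate it. The database manages it. Your job is to write queries that look identical to the planner.

Where it gets scary: the cascading stampede

The fatal flaw of a layered design: when a hot cache entry expires in every layer at the same moment, all the concurrent requests miss every layer and stampede to the origin together.

Example: the CDN entry for /api/homepage expires at 12:00 noon. A thousand requests arrive at once, miss the CDN, hit your application, miss the application cache (which also expired at 12:00, because both TTLs were 60 seconds and both were set at 11:59), hit the database, and all trigger the same materialized-view rebuild. The database falls over.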
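
The lockstep expiry in this example comes from identical TTLs set at the same instant. A common mitigation is to add a little randomness to each TTL. The helper below is a sketch; `ttl_with_jitter` and its 10% default spread are illustrative choices.

```python
import random
from typing import Optional


def ttl_with_jitter(base_seconds: int, spread: float = 0.1,
                    rng: Optional[random.Random] = None) -> int:
    """Return base_seconds plus a random extra of up to spread * base_seconds."""
    source = rng if rng is not None else random
    extra = source.uniform(0, base_seconds * spread)
    return base_seconds + int(round(extra))
```

With a 60-second base, entries written in the same second now expire spread across roughly six seconds instead of all at once.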

Two techniques prevent this. The first is singleflight: collapse N identical concurrent requests into one and broadcast the result.

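import (
    "context"
    "encoding/json"
    "math/rand"
    "time"

    "github.com/redis/go-redis/v9"
    "golang.org/x/sync/singleflight"
)

var sf singleflight.Group

// ttlWithJitter returns the base TTL plus up to 10% random extra, so keys
// written together do not all expire in the same second. The 10% spread is
// an illustrative choice.
func ttlWithJitter(baseSeconds int) time.Duration {
    extra := rand.Intn(baseSeconds/10 + 1)
    return time.Duration(baseSeconds+extra) * time.Second
}

// buildHomepageFromDB is assumed to be defined elsewhere in the service.

func GetHomepage(ctx context.Context, rdb *redis.Client) ([]byte, error) {
    key := "homepage:v3"

    // fast path: serve straight from Redis on a hit
    if cached, err := rdb.Get(ctx, key).Bytes(); err == nil {
        return cached, nil
    }

    // singleflight: only the first concurrent miss recomputes;
    // every other caller for the same key waits and shares its result
    result, err, _ := sf.Do(key, func() (interface{}, error) {
        fresh, err := buildHomepageFromDB(ctx)
        if err != nil {
            return nil, err
        }
        payload, err := json.Marshal(fresh)
        if err != nil {
            return nil, err
        }
        // best-effort write with a jittered TTL; a failed write only costs a later miss
        rdb.SetEx(ctx, key, payload, ttlWithJitter(60))
        return payload, nil
    })
    if err != nil {
        return nil, err
    }
    return result.([]byte), nil
}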

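The second is lock plus serve-stale: keep the outdated value in the cache past its TTL, and return it while a single request rebuilds. Pseudocode: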

import json
import time

# `r` is the Redis client created in the application-cache example above

def get_with_stale(key: str, fetch_fn, fresh_ttl=60, stale_ttl=600):
    payload = r.get(key)
    meta = r.get(f"{key}:meta")

    if payload and meta and meta == "fresh":
        return json.loads(payload)

    # try to grab the rebuild lock; if we get it, rebuild
    lock_key = f"{key}:lock"
    got_lock = r.set(lock_key, "1", nx=True, ex=10)

    if got_lock:
        try:
            fresh = fetch_fn()
            r.setex(key, stale_ttl, json.dumps(fresh))
            r.setex(f"{key}:meta", fresh_ttl, "fresh")
            return fresh
        finally:
            r.delete(lock_key)

    # we didn't get the lock - serve stale if we have it
    if payload:
        return json.loads(payload)

    # no stale, no lock - wait briefly and retry the read
    time.sleep(0.05)
    payload = r.get(key)
    return json.loads(payload) if payload else fetch_fn()

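Now a thousand simultaneous requests trigger one rebuild. The other 999 either get a slightly stale value (acceptable) or wait 50 ms and read the fresh one. The database sees one extra query, not a thousand.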
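
The same "one rebuild, many waiters" behaviour can be shown inside a single Python process with threads. This is a sketch of the coalescing idea only; `SingleFlight` is a hypothetical class written for illustration, not a port of any library, and it coordinates callers within one process rather than across servers.

```python
import threading
from typing import Any, Callable


class SingleFlight:
    """Coalesce concurrent calls for the same key into one execution."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: dict[str, dict[str, Any]] = {}

    def do(self, key: str, fn: Callable[[], Any]) -> Any:
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = {"done": threading.Event(), "value": None,
                        "error": None, "waiters": 0}
                self._inflight[key] = call
            else:
                call["waiters"] += 1     # another caller is already computing
        if leader:
            try:
                call["value"] = fn()
            except Exception as exc:     # followers see the same failure
                call["error"] = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                call["done"].set()
        else:
            call["done"].wait()
        if call["error"] is not None:
            raise call["error"]
        return call["value"]
```

Only the first caller for a key runs `fn`; everyone who arrives while it is still running blocks on the event and receives the same value or the same exception.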

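The same idea applies at every layer. CDNs do it natively through the stale-while-revalidate directive. Application caches need you to wire it up yourself. Materialized views are stale-while-revalidate by design. The plan cache does not have the problem.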
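
The decision a stale-while-revalidate cache makes for each entry fits in one small function. This is a sketch of the semantics with a hypothetical name (`swr_state`) and hypothetical state labels; a real CDN applies the same rule from the max-age and stale-while-revalidate values in Cache-Control.

```python
def swr_state(age_seconds: float, max_age: float,
              stale_while_revalidate: float) -> str:
    """Classify a cached entry by age under stale-while-revalidate semantics."""
    if age_seconds < max_age:
        return "fresh"              # serve from cache, no origin traffic
    if age_seconds < max_age + stale_while_revalidate:
        return "stale-revalidate"   # serve the stale copy, refresh in the background
    return "expired"                # too old: the caller must wait for the origin
```

The middle state is what prevents the stampede: callers keep getting an answer immediately while one background refresh brings the entry up to date.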

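What to do on Monday

Trace one request through your system and mark which layer each piece of data should live in. Most teams find one layer doing the work of three, or three layers caching the same thing with TTLs that disagree.

Pick the layer you are missing the most, add it, measure, repeat. The goal is not to use every layer. The goal is to put each piece of data in the layer that answers its question, and then to make sure nothing can stampede.

In your current system, which layer does the most work, and which one is silently missing?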


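If this was useful

The four-layer cache stack comes up often in system design interviews, and it is the kind of deliverable operations teams are grateful for. System Design Pocket Guide: Fundamentals covers caching in its "performance primitives" chapter, alongside load balancing, queues, and replication patterns, the pieces that decide whether a design holds up at scale. If this post helped, the whole book is written in the same voice, across 250+ pages of building blocks.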

System Design Pocket Guide: Fundamentals — Core Building Blocks for Scalable Systems