Rate limiting in web apps: what to protect before picking a library
Juan Torchia · 2026-05-21 · via DEV Community

The right way to protect a Next.js route from abuse is not to start with the middleware. I know that sounds backwards: everyone reaches for npm install @upstash/ratelimit before they've thought about what they're actually protecting. But that sequence almost guarantees you'll put the wrong limit in the wrong place.

My thesis is simple: rate limiting isn't a dependency; it's an abuse policy. And a policy requires decisions before code.

If you've ever tuned a threshold "by feel" in production because your logs were showing false positives, you've already lived this problem. This post exists so you don't repeat it.


Rate limiting in Next.js: the order almost nobody follows

The typical sequence: read a tutorial, copy the middleware, tweak the number until people stop complaining. That's not a policy — that's trial and error on real users.

The sequence that actually works starts with four questions before you touch any code:

  1. What asset are you protecting? A login endpoint is not the same as a public search API, which is not the same as an incoming webhook.
  2. What abuse are you expecting? Credential stuffing? Scraping? A bot hammering a form? The expected vector determines the shape of the limit.
  3. What does a false positive cost? Over-limit /api/auth/login and you're locking out real users. Under-limit /api/send-email and you're paying for spam.
  4. How are you going to observe whether the limit is working? Without metrics, you don't have a policy — you have hope.

OWASP puts it plainly in their Authentication Cheat Sheet: defensive controls around authentication should include progressive lockout, attempt logging, and a distinction between credential errors and throttle errors. It doesn't say "install a library." It says "define the expected behavior and measure it."


Where people go wrong: the copy-pasted recipe and its hidden cost

The most common pattern I see in Next.js codebases looks something like this:

// middleware.ts: the classic "I copied it from the docs"
import { NextRequest, NextResponse } from "next/server";
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  // 10 requests per 10 seconds. Why 10? "Seemed reasonable"
  limiter: Ratelimit.slidingWindow(10, "10 s"),
});

export async function middleware(request: NextRequest) {
  // Note: request.ip exists in Next.js 14 and earlier; it was removed in Next.js 15
  const ip = request.ip ?? "127.0.0.1";
  const { success } = await ratelimit.limit(ip);

  if (!success) {
    return NextResponse.json({ error: "Too Many Requests" }, { status: 429 });
  }

  // Returning nothing lets the request continue to the matched route
}

The code works. The problem isn't in the code — it's in what isn't written anywhere:

Problem 1 — Global limit with no route distinction. A middleware applied to matcher: ["/((?!_next).*)"] limits /api/auth/login and /api/products/search identically. Those are assets with completely different abuse profiles.

Problem 2 — IP as the only key. In Argentina (and anywhere with CGNAT), many users share the same public IP. With pure IP-based limiting, a neighbor in the same building can burn through the shared quota and get you blocked without either of you doing anything wrong.

Problem 3 — No observability. If success = false returns a 429 with no log, you have no idea whether you're blocking a bot or your own test runner firing integration tests.

Problem 4 — No differential cost. Blocking a product search has low cost. Blocking a legitimate login attempt after an IP change (office → home → VPN) has high cost. The threshold can't be the same number for both.

This isn't theoretical. It's the pattern you find when you search "Next.js rate limiting" on GitHub and look at the first ten implementations. Most share the same middleware with no policy behind it.


The decision matrix: what to look at before writing a single line

Before choosing any implementation — Upstash, express-rate-limit, your own Redis counter, or an external WAF — fill out this matrix for every endpoint you want to protect:

┌─────────────────────────┬────────────────┬──────────────────┬────────────────────┬─────────────────────┐
│ Endpoint                │ Asset          │ Expected abuse   │ FP cost (false+)   │ Key granularity     │
├─────────────────────────┼────────────────┼──────────────────┼────────────────────┼─────────────────────┤
│ /api/auth/login         │ User account   │ Credential stuff │ HIGH — real lockout│ IP + username       │
│ /api/contact            │ Email inbox    │ Mass spam        │ MED — UX damage    │ IP + fingerprint    │
│ /api/search             │ Public DB      │ Scraping         │ LOW — search query │ IP (w/ CGNAT warn)  │
│ /api/webhooks/incoming  │ Data pipeline  │ Replay attack    │ LOW — drop it      │ API key + timestamp │
└─────────────────────────┴────────────────┴──────────────────┴────────────────────┴─────────────────────┘

The most ignored column is FP cost. It tells you which way to err for that specific asset: toward a looser limit that lets some abuse through, or toward a stricter one that blocks some real users.
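
One way to keep the matrix from rotting in a wiki is to encode it as data next to the code. The sketch below is a hypothetical encoding, not part of any library: the names EndpointPolicy, policies, and buildKey are mine, the limits for /api/contact and /api/webhooks/incoming are placeholders, and I'm assuming the webhook timestamp belongs to replay checking rather than to the counter key.

```typescript
// Hypothetical encoding of the decision matrix. Names and numbers are illustrative.
type FpCost = "low" | "medium" | "high";
type KeyPart = "ip" | "username" | "fingerprint" | "apiKey";

interface EndpointPolicy {
  asset: string;
  expectedAbuse: string;
  fpCost: FpCost;
  keyParts: KeyPart[]; // which request attributes form the rate limit key
  limit: number; // max requests per window (starting point, not a validated value)
  windowSeconds: number;
}

const policies: Record<string, EndpointPolicy> = {
  "/api/auth/login": {
    asset: "User account",
    expectedAbuse: "Credential stuffing",
    fpCost: "high",
    keyParts: ["ip", "username"],
    limit: 5,
    windowSeconds: 60,
  },
  "/api/contact": {
    asset: "Email inbox",
    expectedAbuse: "Mass spam",
    fpCost: "medium",
    keyParts: ["ip", "fingerprint"],
    limit: 3, // placeholder
    windowSeconds: 60,
  },
  "/api/search": {
    asset: "Public DB",
    expectedAbuse: "Scraping",
    fpCost: "low",
    keyParts: ["ip"],
    limit: 100,
    windowSeconds: 10,
  },
  "/api/webhooks/incoming": {
    asset: "Data pipeline",
    expectedAbuse: "Replay attack",
    fpCost: "low",
    // Assumption: the timestamp is validated separately and is not part of the key
    keyParts: ["apiKey"],
    limit: 60, // placeholder
    windowSeconds: 60,
  },
};

// Build the rate limit key for a route from whatever attributes the policy asks for.
// Returns null when the route has no policy, so the caller has to decide explicitly.
function buildKey(
  pathname: string,
  values: Partial<Record<KeyPart, string>>
): string | null {
  const policy = policies[pathname];
  if (!policy) return null;
  const parts = policy.keyParts.map((part) => values[part] ?? "unknown");
  return [pathname, ...parts].join(":");
}
```

A route with no row in the table gets null instead of a silent default, which forces the "what am I protecting?" question at the moment someone adds an endpoint.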

For /api/auth/login, OWASP explicitly recommends progressive lockout strategies with user notification — not a silent 429. That requires business logic, not just middleware.
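
To make "progressive lockout" concrete, here is a minimal sketch of one common shape: a few free attempts, then a wait that doubles with each further failure up to a cap. The function name lockoutSeconds and every number in DEFAULT_LOCKOUT are assumptions of mine, not values taken from OWASP or from any library.

```typescript
// Hypothetical progressive lockout schedule. All numbers are starting points.
interface LockoutOptions {
  freeAttempts: number; // failures tolerated before any lockout
  baseSeconds: number; // lockout applied on the first failure past the free ones
  maxSeconds: number; // upper bound so an account is never locked indefinitely
}

const DEFAULT_LOCKOUT: LockoutOptions = {
  freeAttempts: 3,
  baseSeconds: 30,
  maxSeconds: 900,
};

// Returns how long the account/IP pair should wait after `failedAttempts`
// consecutive failures. The wait doubles per extra failure and is capped.
function lockoutSeconds(
  failedAttempts: number,
  opts: LockoutOptions = DEFAULT_LOCKOUT
): number {
  if (failedAttempts <= opts.freeAttempts) return 0;
  const doublings = failedAttempts - opts.freeAttempts - 1;
  return Math.min(opts.maxSeconds, opts.baseSeconds * 2 ** doublings);
}
```

With the defaults, the fourth failure costs 30 seconds, the sixth costs 120, and anything from the ninth onward hits the 900-second cap. The failure counter and the user notification still live in your own business logic.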


Deliberate implementation: what a real policy looks like in Next.js

With the matrix filled out, the middleware changes shape:

// lib/rate-limit.ts — explicit policy per asset
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const redis = Redis.fromEnv();

// Differentiated policy: each constant documents a decision
export const loginRatelimit = new Ratelimit({
  redis,
  // 5 attempts per minute per IP+username — based on OWASP lockout guidance
  // High FP cost: we prefer a false negative over locking out a real user
  limiter: Ratelimit.fixedWindow(5, "60 s"),
  analytics: true, // observability enabled — non-negotiable
});

export const searchRatelimit = new Ratelimit({
  redis,
  // 100 req/10s per IP — low FP cost, wider margin
  limiter: Ratelimit.slidingWindow(100, "10 s"),
  analytics: true,
});

// app/api/auth/login/route.ts — policy applied with context
import { loginRatelimit } from "@/lib/rate-limit";
import { NextRequest, NextResponse } from "next/server";

export async function POST(request: NextRequest) {
  const body = await request.json();
  const username = body?.username ?? "anon";
  const ip = request.ip ?? "unknown";

  // Composite key: IP + username avoids the CGNAT problem
  // One user under shared CGNAT doesn't affect other distinct users
  const identifier = `login:${ip}:${username}`;

  const { success, limit, remaining, reset } = await loginRatelimit.limit(identifier);

  if (!success) {
    // Explicit log: without this there's no policy, just hope
    console.warn(`[rate-limit] LOGIN blocked — identifier: ${identifier}, reset: ${reset}`);

    return NextResponse.json(
      {
        error: "Too many attempts. Please try again in a few minutes.",
        // Keep the body vague. The Retry-After header below still tells clients how long to wait
      },
      {
        status: 429,
        headers: {
          // Clamp to at least 1 second in case the window reset just passed
          "Retry-After": String(Math.max(1, Math.ceil((reset - Date.now()) / 1000))),
        },
      }
    );
  }

  // ... authentication logic
}

Two critical differences from the generic middleware: the key is composite (not just IP) and every rejection generates a log. Without the log, there's no feedback loop to tune the policy.
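
If you want those logs to be queryable later, a structured event helps more than a free-form string. This is a sketch under assumptions: the RateLimitEvent shape and the helper names are hypothetical, the result argument only mirrors the success, remaining, and reset fields used in the route above, and hashing the identifier is my own addition to keep raw usernames out of log storage.

```typescript
import { createHash } from "node:crypto";

// Mirrors the fields destructured from the limiter result in the route handler
interface LimitResult {
  success: boolean;
  remaining: number;
  reset: number; // Unix timestamp in milliseconds
}

// Hypothetical log event shape
interface RateLimitEvent {
  ts: string;
  route: string;
  keyHash: string;
  outcome: "allowed" | "blocked";
  remaining: number;
  resetAt: string;
}

// Short, stable hash so the same identifier can be correlated across log lines
// without storing it in clear text. This is not strong anonymization.
function hashIdentifier(identifier: string): string {
  return createHash("sha256").update(identifier).digest("hex").slice(0, 16);
}

function toRateLimitEvent(
  route: string,
  identifier: string,
  result: LimitResult,
  now: Date = new Date()
): RateLimitEvent {
  return {
    ts: now.toISOString(),
    route,
    keyHash: hashIdentifier(identifier),
    outcome: result.success ? "allowed" : "blocked",
    remaining: result.remaining,
    resetAt: new Date(result.reset).toISOString(),
  };
}
```

In the handler, console.warn(JSON.stringify(toRateLimitEvent(...))) would replace the template string, and each rejection becomes one JSON line you can filter by route or keyHash.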

If you're deploying on Railway — which is my current stack for Next.js projects — the console.warn logs go straight to the Railway dashboard with zero extra config. That's enough to start seeing patterns before you need anything more sophisticated.


What this guide can't tell you: things that require your own data

This matters and I'm not going to bury it at the end: the numbers in this post are starting points, not validated values for your use case.

You don't know whether 5 attempts per minute is the right login threshold until you:

  • Measure the real distribution of attempts from legitimate users in your app (a user who forgot their password might try 3–4 times in 30 seconds)
  • Observe how many 429s the limit generates in the first week
  • Check whether integration tests or health checks are hitting the same endpoint

Without that data, any number you pick — including the ones in this post — is an educated guess. The goal here isn't to hand you the threshold; it's to make sure you know which questions to ask before setting it.
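
Once you have a week of data, turning it into a candidate threshold is simple arithmetic. The sketch below assumes you can extract, for each legitimate user, the highest number of attempts they made in a single window; percentile and suggestThreshold are hypothetical helpers, and the 99th percentile plus one attempt of headroom is an arbitrary starting rule, not a recommendation.

```typescript
// Nearest-rank percentile: the smallest value such that at least p% of the
// samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("percentile() needs at least one value");
  if (p <= 0 || p > 100) throw new Error("p must be in the range (0, 100]");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Candidate limit: cover p% of observed legitimate behavior, plus some headroom.
function suggestThreshold(
  attemptsPerWindow: number[],
  p: number = 99,
  headroom: number = 1
): number {
  return percentile(attemptsPerWindow, p) + headroom;
}

// Example input: peak login attempts per minute for ten legitimate users
const observed = [1, 1, 1, 2, 2, 3, 4, 1, 1, 2];
const candidate = suggestThreshold(observed);
```

For the ten sample users above, the busiest one made 4 attempts in a window, so the candidate lands at 5. Your real distribution will have a longer tail, which is exactly why it's worth measuring.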

Same goes for library choice. Upstash works well with Next.js on Edge Runtime because its Redis client talks to the database over HTTP instead of a TCP socket. But if you already have your own Redis on Railway and your routes run on the Node.js runtime, a simple wrapper with ioredis might be plenty. That decision depends on your infrastructure, not a universal benchmark.

If you're interested in wiring up deeper observability in Next.js, the post on OpenTelemetry in Spring Boot where logs say OK and traces show the real problem runs on the same principle: without a trace, diagnosis is guesswork.


Common mistakes that turn a policy into noise

Mistake 1 — Rate limiting without a Retry-After header. RFC 6585 says a 429 response may include a Retry-After header telling the client how long to wait. Treat it as required: without it, the client (or the browser) may retry immediately and amplify the load. I covered this pattern in the Retry isn't free post: the retry cost doesn't show up in your p95 until it's already too late.
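
A small helper keeps the header value sane in every route. This is a sketch with hypothetical names (retryAfterSeconds, buildThrottleResponse); it assumes the limiter reports its reset as a Unix timestamp in milliseconds, as in the login route above. The optional granularity rounds the wait up to a coarser step, in case you'd rather not reveal the exact reset moment.

```typescript
// Convert a reset timestamp (ms) into a Retry-After value in whole seconds.
// Never returns less than 1, and rounds up to a multiple of granularitySeconds.
function retryAfterSeconds(
  resetMs: number,
  nowMs: number = Date.now(),
  granularitySeconds: number = 1
): string {
  const raw = Math.max(1, Math.ceil((resetMs - nowMs) / 1000));
  const rounded = Math.ceil(raw / granularitySeconds) * granularitySeconds;
  return String(rounded);
}

// Plain description of a 429 response, independent of any framework
interface ThrottleResponse {
  status: 429;
  headers: Record<string, string>;
  body: { error: string };
}

function buildThrottleResponse(
  resetMs: number,
  nowMs: number = Date.now(),
  granularitySeconds: number = 1
): ThrottleResponse {
  return {
    status: 429,
    headers: { "Retry-After": retryAfterSeconds(resetMs, nowMs, granularitySeconds) },
    body: { error: "Too many attempts. Please try again later." },
  };
}
```

In a route handler, the returned object maps directly onto NextResponse.json(body, { status, headers }).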

Mistake 2 — Applying rate limiting on the client. I see this occasionally: throttle on the frontend to "not overload the API." The client is not a security boundary. Anyone with curl bypasses it instantly.

Mistake 3 — Confusing rate limiting with authentication. A 429 limit doesn't replace credential validation, tokens, or authorization. It reduces the attack surface over time, but it doesn't authenticate anything. They're separate layers, not alternatives.

Mistake 4 — Ignoring the CDN/proxy effect. If your Next.js app sits behind Vercel Edge, Cloudflare, or nginx, request.ip might return the proxy's IP, not the real client's. You need to read X-Forwarded-For carefully, and verify that the header can't be forged by the client.

// Extract the real client IP with awareness of your stack
function getClientIp(request: NextRequest): string {
  // X-Forwarded-For can have multiple values: "client, proxy1, proxy2"
  // The first is the real client — but only if you trust the proxy setting it
  const forwarded = request.headers.get("x-forwarded-for");
  if (forwarded) {
    return forwarded.split(",")[0].trim();
  }
  return request.ip ?? "unknown";
}
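
If your proxies append to X-Forwarded-For instead of overwriting it, the leftmost entry is whatever the client chose to send. A more defensive variant counts from the right. This is a sketch with a hypothetical function name, and it assumes you know exactly how many proxies you control sit in front of the app and that each one appends the address it received the request from.

```typescript
// Pick the client address from X-Forwarded-For by counting trusted hops from
// the right. Entries further left were supplied by the client (or by proxies
// you don't control) and can be forged.
function clientIpFromForwardedFor(
  header: string | null,
  trustedProxyCount: number
): string | null {
  if (!header || trustedProxyCount < 1) return null;

  const hops = header
    .split(",")
    .map((hop) => hop.trim())
    .filter((hop) => hop.length > 0);

  // With N trusted proxies, the Nth entry from the right is the address that
  // connected to your outermost trusted proxy.
  const index = hops.length - trustedProxyCount;
  if (index < 0) return null; // fewer hops than expected: don't guess

  return hops[index] ?? null;
}
```

With one trusted proxy, a forged header such as "1.2.3.4, 203.0.113.9" resolves to 203.0.113.9, the address the proxy actually saw. Returning null instead of falling back to the leftmost value keeps a misconfigured chain visible instead of silently keying on attacker-controlled input.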


FAQ: real questions about rate limiting in Next.js

Do I need Redis for rate limiting in Next.js?
Not for simple cases, but yes for any deploy running more than one instance. In-memory doesn't work across multiple replicas because each instance has its own counter. If you're on Railway with a single container, in-memory might be enough to start — but it's visible technical debt.
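
For the single-container case, an in-memory fixed window fits in a few lines. This is a sketch, not a drop-in replacement for a library: InMemoryFixedWindow is a hypothetical class, its return shape only imitates the success, remaining, and reset fields used earlier, and the counters vanish on every restart or redeploy.

```typescript
interface WindowEntry {
  windowStart: number;
  count: number;
}

// Fixed-window counter held in process memory. Only valid with ONE instance:
// each replica would keep its own independent counters.
class InMemoryFixedWindow {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly counters = new Map<string, WindowEntry>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(key: string, nowMs: number = Date.now()) {
    // Align windows to fixed boundaries: [0, windowMs), [windowMs, 2 * windowMs), ...
    const windowStart = Math.floor(nowMs / this.windowMs) * this.windowMs;
    const entry = this.counters.get(key);
    const count = entry && entry.windowStart === windowStart ? entry.count + 1 : 1;
    this.counters.set(key, { windowStart, count });

    return {
      success: count <= this.limit,
      remaining: Math.max(0, this.limit - count),
      reset: windowStart + this.windowMs, // ms timestamp when the window ends
    };
  }

  // Drop entries from past windows so the map doesn't grow without bound
  sweep(nowMs: number = Date.now()): void {
    const currentStart = Math.floor(nowMs / this.windowMs) * this.windowMs;
    for (const [key, entry] of this.counters) {
      if (entry.windowStart < currentStart) this.counters.delete(key);
    }
  }
}
```

Keeping the return shape close to the library's makes the later move to Redis a change in one file instead of in every route.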

What's the difference between rate limiting and throttling?
Rate limiting rejects requests that exceed a threshold (429 Too Many Requests). Throttling queues or slows them down without rejecting. For abuse protection, rate limiting is more predictable. Throttling has its place in processing queues, not in public APIs.

Should I put rate limiting in middleware or in each route handler?
Depends on the granularity you need. Global middleware is convenient but applies the same policy to everything. Route handlers give you fine-grained control per asset. The decision matrix above should guide that call — if all your assets have the same abuse profile, the middleware is fine.

What about bots that rotate IPs?
IP-based rate limiting alone doesn't hold up against sophisticated bots with IP rotation. For that vector, you need client fingerprinting (TLS JA3 hashes, user-agent patterns, behavior analysis) or a dedicated WAF. That's a different scope than this post, and honestly, if you've hit that problem, you need more than a Node library.

Is Upstash Ratelimit the only option for Next.js on Edge Runtime?
No. Upstash works well because its Redis client is HTTP-based and Edge-compatible. But you can also use @vercel/kv if you're on Vercel, or Workers KV if you're on Cloudflare. The technical constraint is that Edge Runtime doesn't support raw TCP sockets, so any solution has to reach its storage over HTTP.

How do I know if my rate limit is calibrated correctly?
Look at the distribution of 429s in the first 7 days after activation. If the 429s come from IPs or identifiers you've never seen before, the limit is probably catching abuse. If they come from recurring IPs that also have successful requests, you're likely looking at false positives. Without analytics enabled in the library and without logs, you can't answer this question at all.
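
That first-week review can be a small script over your logs. The sketch below is a rough heuristic under assumptions: the LimitLogEntry shape is hypothetical, and knownGood stands for whatever set of identifiers you already trust (for example, ones with a history of successful logins), which you would have to build from your own data.

```typescript
// Hypothetical shape of one rate limit log line
interface LimitLogEntry {
  identifier: string;
  outcome: "allowed" | "blocked";
}

interface CalibrationReport {
  likelyAbuse: string[]; // blocked, and never seen as legitimate
  likelyFalsePositive: string[]; // blocked, but known to be legitimate
  falsePositiveShare: number; // fraction of blocked identifiers that look legitimate
}

function classifyBlocked(
  entries: LimitLogEntry[],
  knownGood: Set<string>
): CalibrationReport {
  // Collect each blocked identifier once, no matter how many 429s it produced
  const blocked = new Set<string>();
  for (const entry of entries) {
    if (entry.outcome === "blocked") blocked.add(entry.identifier);
  }

  const likelyAbuse: string[] = [];
  const likelyFalsePositive: string[] = [];
  for (const id of Array.from(blocked).sort()) {
    (knownGood.has(id) ? likelyFalsePositive : likelyAbuse).push(id);
  }

  const falsePositiveShare =
    blocked.size === 0 ? 0 : likelyFalsePositive.length / blocked.size;

  return { likelyAbuse, likelyFalsePositive, falsePositiveShare };
}
```

A rising falsePositiveShare on a high-FP-cost endpoint is the signal to loosen the limit or change the key before touching anything else.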


My take: the library is an implementation detail

Rate limiting in Next.js web apps is a topic where 80% of the work is technical decision-making and 20% is code. Almost everything written about it does the inverse.

I don't buy the argument that "any rate limiting is better than none." A badly calibrated limit on a login endpoint can systematically lock out real users, and that damage stays completely silent if you have no observability.

What I do buy: defining the policy before the code forces you to ask questions the generic middleware never asks you. What asset? What abuse? What's the cost if I get it wrong in the restrictive direction? Those three questions change the threshold, the key granularity, and the rejection behavior.

Concrete next step: take the most critical endpoint in your app (probably login or registration), fill out the matrix row for that asset, and then write the limit. If you want to see the same thinking applied to Server Actions with Prisma, the post on Prisma Server Actions in Next.js and the N+1 that appears when you least expect it follows the same pattern: diagnose before you solve.

And if the endpoint you're protecting handles sensitive data, the post on useEffect and state synchronization in React 19 is a good reminder that abstractions that simplify can also hide unexpected behavior — applies equally to middleware.


Primary source:


This article was originally published on juanchi.dev