Rate limiting in web apps: what to protect before picking a library
The right way to protect a Next.js route from abuse is not to start with the middleware. I know that sounds backwards: everyone reaches for npm install @upstash/ratelimit before they've thought about what they're actually protecting. But that sequence almost guarantees you'll put the wrong limit in the wrong place.
My thesis is simple: rate limiting isn't a dependency; it's an abuse policy. And a policy requires decisions before code.
If you've ever tuned a threshold "by feel" in production because your logs were showing false positives, you've already lived this problem. This post exists so you don't repeat it.
Rate limiting in Next.js: the order almost nobody follows
The typical sequence: read a tutorial, copy the middleware, tweak the number until people stop complaining. That's not a policy — that's trial and error on real users.
The sequence that actually works starts with four questions before you touch any code:
- What asset are you protecting? A login endpoint is not the same as a public search API, which is not the same as an incoming webhook.
- What abuse are you expecting? Credential stuffing? Scraping? A bot hammering a form? The expected vector determines the shape of the limit.
- What does a false positive cost? Over-limit /api/auth/login and you're locking out real users. Under-limit /api/send-email and you're paying for spam.
- How are you going to observe whether the limit is working? Without metrics, you don't have a policy; you have hope.
OWASP puts it plainly in their Authentication Cheat Sheet: defensive controls around authentication should include progressive lockout, attempt logging, and a distinction between credential errors and throttle errors. It doesn't say "install a library." The message is closer to: define the expected behavior and measure it.
Where people go wrong: the copy-pasted recipe and its hidden cost
The most common pattern I see in Next.js codebases looks something like this:
// middleware.ts: the classic "I copied it from the docs"
import { NextRequest, NextResponse } from "next/server";
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  // 10 requests per 10 seconds. Why 10? "Seemed reasonable."
  limiter: Ratelimit.slidingWindow(10, "10 s"),
});

export async function middleware(request: NextRequest) {
  // Every route shares one counter per IP
  const ip = request.ip ?? "127.0.0.1";
  const { success } = await ratelimit.limit(ip);
  if (!success) {
    return NextResponse.json({ error: "Too Many Requests" }, { status: 429 });
  }
}
The code works. The problem isn't in the code — it's in what isn't written anywhere:
Problem 1: Global limit with no route distinction. A middleware applied to matcher: ["/((?!_next).*)"] limits /api/auth/login and /api/products/search identically. Those are assets with completely different abuse profiles.
Problem 2: IP as the only key. In Argentina (and anywhere with CGNAT), multiple users share the same public IP. Pure IP-based limiting means a neighbor in your building can use up the shared quota and get you blocked without either of you doing anything wrong.
Problem 3: No observability. If a failed check returns a 429 with no log, you have no idea whether you're blocking a bot or your own test runner firing integration tests.
Problem 4 — No differential cost. Blocking a product search has low cost. Blocking a legitimate login attempt after an IP change (office → home → VPN) has high cost. The threshold can't be the same number for both.
This isn't theoretical. It's the pattern you find when you search "Next.js rate limiting" on GitHub and look at the first ten implementations. Most share the same middleware with no policy behind it.
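To make Problem 1 concrete, here is a minimal sketch of picking a limit per route instead of applying one number to everything. The RoutePolicy type, the policyFor helper, and all the thresholds are assumptions for illustration, not values from any library.

```typescript
// Hypothetical per-route policy table; names and numbers are illustrative assumptions.
type RoutePolicy = {
  prefix: string;        // path prefix the policy applies to
  maxRequests: number;   // requests allowed per window
  windowSeconds: number; // window length in seconds
};

const policies: RoutePolicy[] = [
  { prefix: "/api/auth/login", maxRequests: 5, windowSeconds: 60 },
  { prefix: "/api/products/search", maxRequests: 100, windowSeconds: 10 },
];

// Fallback for routes that have no explicit decision yet.
const defaultPolicy: RoutePolicy = { prefix: "/", maxRequests: 30, windowSeconds: 10 };

// Return the policy with the longest matching prefix, or the default.
function policyFor(pathname: string): RoutePolicy {
  let best = defaultPolicy;
  for (const p of policies) {
    if (pathname.startsWith(p.prefix) && p.prefix.length > best.prefix.length) {
      best = p;
    }
  }
  return best;
}
```

Even a table this small forces the question the copy-pasted middleware skips: why does this route get this number?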
The decision matrix: what to look at before writing a single line
Before choosing any implementation — Upstash, express-rate-limit, your own Redis counter, or an external WAF — fill out this matrix for every endpoint you want to protect:
┌─────────────────────────┬────────────────┬─────────────────────┬─────────────────────┬─────────────────────┐
│ Endpoint                │ Asset          │ Expected abuse      │ False-positive cost │ Key granularity     │
├─────────────────────────┼────────────────┼─────────────────────┼─────────────────────┼─────────────────────┤
│ /api/auth/login         │ User account   │ Credential stuffing │ HIGH: real lockout  │ IP + username       │
│ /api/contact            │ Email inbox    │ Mass spam           │ MED: UX damage      │ IP + fingerprint    │
│ /api/search             │ Public DB      │ Scraping            │ LOW: retried query  │ IP (w/ CGNAT warn)  │
│ /api/webhooks/incoming  │ Data pipeline  │ Replay attack       │ LOW: drop it        │ API key + timestamp │
└─────────────────────────┴────────────────┴─────────────────────┴─────────────────────┴─────────────────────┘
The most ignored column is false-positive cost. It's the one that tells you which way to err, too permissive or too restrictive, and which of those mistakes is actually more tolerable for that specific asset.
For /api/auth/login, OWASP explicitly recommends progressive lockout strategies with user notification — not a silent 429. That requires business logic, not just middleware.
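One common way to read "progressive" is an exponential lockout, where the wait doubles after each failed attempt. The sketch below assumes three free attempts, a one-second starting delay, and a 15-minute cap; all three constants and the lockoutSeconds name are illustrative, and you would tune them against your own data.

```typescript
// Exponential lockout sketch. All three constants are assumed values.
const FREE_ATTEMPTS = 3;       // failures tolerated before any delay
const BASE_DELAY_SECONDS = 1;  // duration of the first lockout
const MAX_DELAY_SECONDS = 900; // never lock out for more than 15 minutes

// Seconds a key stays locked after `failedAttempts` consecutive failures.
function lockoutSeconds(failedAttempts: number): number {
  if (failedAttempts <= FREE_ATTEMPTS) return 0;
  const doublings = failedAttempts - FREE_ATTEMPTS - 1;
  // Cap the exponent too, so very large counts cannot overflow.
  const delay = BASE_DELAY_SECONDS * 2 ** Math.min(doublings, 30);
  return Math.min(delay, MAX_DELAY_SECONDS);
}
```

With these values the fourth failure costs one second, the fifth two, the sixth four, and so on until the cap. A real user who mistypes a password twice never notices; a script trying thousands of passwords does.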
Deliberate implementation: what a real policy looks like in Next.js
With the matrix filled out, the middleware changes shape:
// lib/rate-limit.ts: explicit policy per asset
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const redis = Redis.fromEnv();

// Differentiated policy: each constant documents a decision
export const loginRatelimit = new Ratelimit({
  redis,
  // 5 attempts per minute per IP+username, based on OWASP lockout guidance
  // High FP cost: we prefer a false negative over locking out a real user
  limiter: Ratelimit.fixedWindow(5, "60 s"),
  analytics: true, // observability enabled: non-negotiable
});

export const searchRatelimit = new Ratelimit({
  redis,
  // 100 req/10s per IP: low FP cost, wider margin
  limiter: Ratelimit.slidingWindow(100, "10 s"),
  analytics: true,
});
// app/api/auth/login/route.ts: policy applied with context
import { loginRatelimit } from "@/lib/rate-limit";
import { NextRequest, NextResponse } from "next/server";

export async function POST(request: NextRequest) {
  // A malformed JSON body should not crash the handler before the limit is checked
  const body = await request.json().catch(() => null);
  const username = body?.username ?? "anon";
  const ip = request.ip ?? "unknown";

  // Composite key: IP + username avoids the CGNAT problem
  // One user under shared CGNAT doesn't affect other distinct users
  const identifier = `login:${ip}:${username}`;
  const { success, reset } = await loginRatelimit.limit(identifier);

  if (!success) {
    // Explicit log: without this there's no policy, just hope
    console.warn(`[rate-limit] LOGIN blocked, identifier: ${identifier}, reset: ${reset}`);
    return NextResponse.json(
      {
        // Keep the body generic in production: limit details are useful info for attackers
        error: "Too many attempts. Please try again in a few minutes.",
      },
      {
        status: 429,
        headers: {
          "Retry-After": String(Math.ceil((reset - Date.now()) / 1000)),
        },
      }
    );
  }

  // ... authentication logic
}
Two critical differences from the generic middleware: the key is composite (not just IP) and every rejection generates a log. Without the log, there's no feedback loop to tune the policy.
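A composite key is only as good as its normalization. The helper below is a sketch (the loginKey name and the trim-and-lowercase rule are my assumptions): without something like it, "Alice", "alice " and "ALICE" each get their own counter, and the limit is multiplied by however many spellings an attacker cares to try.

```typescript
// Build the login limiter key. Normalizing the username is an assumption of
// this sketch; adjust it if your usernames are case-sensitive.
function loginKey(ip: string, username: string): string {
  const normalized = username.trim().toLowerCase();
  // Fall back to a shared per-IP bucket when no username was sent.
  return normalized === "" ? `login:${ip}:anon` : `login:${ip}:${normalized}`;
}
```

One caveat worth weighing: a key that includes the username gives every username its own bucket, so a script cycling through many usernames from one IP may never hit the limit. A second, looser limit keyed on IP alone is one way to cover that case.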
If you're deploying on Railway — which is my current stack for Next.js projects — the console.warn logs go straight to the Railway dashboard with zero extra config. That's enough to start seeing patterns before you need anything more sophisticated.
What this guide can't tell you: things that require your own data
This matters and I'm not going to bury it at the end: the numbers in this post are starting points, not validated values for your use case.
You don't know whether 5 attempts per minute is the right login threshold until you:
- Measure the real distribution of attempts from legitimate users in your app (a user who forgot their password might try 3–4 times in 30 seconds)
- Observe how many 429s the limit generates in the first week
- Check whether integration tests or health checks are hitting the same endpoint
Without that data, any number you pick — including the ones in this post — is an educated guess. The goal here isn't to hand you the threshold; it's to make sure you know which questions to ask before setting it.
Same goes for library choice. Upstash works well with Next.js on Edge Runtime because its Redis client talks to the database over HTTP instead of a TCP connection. But if you already have your own Redis on Railway and your routes run on the Node.js runtime, a simple wrapper with ioredis might be plenty. That decision depends on your infrastructure, not a universal benchmark.
If you're interested in wiring up deeper observability in Next.js, the post on OpenTelemetry in Spring Boot where logs say OK and traces show the real problem runs on the same principle: without a trace, diagnosis is guesswork.
Common mistakes that turn a policy into noise
Mistake 1: Rate limiting without a Retry-After header. RFC 6585 says a 429 response MAY include Retry-After; it's optional in the spec, but treat it as required in practice. Without it, the client (or the browser) may retry immediately and amplify the load. I covered this pattern in the Retry isn't free post: the retry cost doesn't show up in your p95 until it's already too late.
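A small sketch of computing the header value from a reset timestamp in milliseconds. The retryAfterSeconds name is hypothetical; the clamp is the point, because a reset that has just passed would otherwise produce 0 or a negative number.

```typescript
// Convert a reset timestamp (ms since epoch) into a Retry-After value in whole seconds.
function retryAfterSeconds(resetMs: number, nowMs: number = Date.now()): string {
  const seconds = Math.ceil((resetMs - nowMs) / 1000);
  // Clamp to at least 1 so the header never tells the client to retry immediately.
  return String(Math.max(1, seconds));
}
```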
Mistake 2 — Applying rate limiting on the client. I see this occasionally: throttle on the frontend to "not overload the API." The client is not a security boundary. Anyone with curl bypasses it instantly.
Mistake 3: Confusing rate limiting with authentication. A 429 doesn't replace credential validation, tokens, or authorization. It slows an attacker down by capping how many attempts fit in a window, but it doesn't authenticate anything. They're separate layers, not alternatives.
Mistake 4: Ignoring the CDN/proxy effect. If your Next.js app sits behind Vercel Edge, Cloudflare, or nginx, request.ip might return the proxy's IP, not the real client's, and Next.js 15 removed the property from NextRequest altogether. You need to read X-Forwarded-For carefully, and verify that the header can't be forged by the client.
// Extract the real client IP with awareness of your stack
import type { NextRequest } from "next/server";

function getClientIp(request: NextRequest): string {
  // X-Forwarded-For can have multiple values: "client, proxy1, proxy2"
  // The first is the real client, but only if you trust the proxy setting it
  const forwarded = request.headers.get("x-forwarded-for");
  if (forwarded) {
    return forwarded.split(",")[0].trim();
  }
  return request.ip ?? "unknown";
}
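Taking the first entry is only safe when your proxy overwrites the header. If it appends instead, a client can send its own X-Forwarded-For and land in the first slot. A more defensive sketch counts from the right, assuming you know how many proxies you control sit in front of the app. The clientIpFromForwardedFor name and the trustedProxyCount parameter are hypothetical, and the count is something you have to verify against your own deployment.

```typescript
// Pick the client IP from X-Forwarded-For, counting back from the right.
// `trustedProxyCount` is an assumption you must verify for your own stack.
function clientIpFromForwardedFor(
  header: string | null,
  trustedProxyCount: number
): string | null {
  if (!header || trustedProxyCount < 1) return null;
  const hops = header
    .split(",")
    .map((h) => h.trim())
    .filter((h) => h.length > 0);
  // Each trusted proxy appends the address it received the request from, so
  // the entry added by the outermost trusted proxy sits `trustedProxyCount`
  // positions from the right. Anything further left came from the client.
  const index = hops.length - trustedProxyCount;
  return index >= 0 ? hops[index] ?? null : null;
}
```

With one trusted proxy, a forged header like "6.6.6.6, 203.0.113.7" resolves to 203.0.113.7, the address the proxy actually saw.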
FAQ: real questions about rate limiting in Next.js
Do I need Redis for rate limiting in Next.js?
Not for simple cases, but yes for any deploy running more than one instance. In-memory doesn't work across multiple replicas because each instance has its own counter. If you're on Railway with a single container, in-memory might be enough to start — but it's visible technical debt.
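To see why, here is a minimal in-memory fixed-window counter. It's a sketch with a hypothetical class name, not a replacement for a shared store: the Map lives inside one process, so two replicas would each allow the full quota.

```typescript
// In-memory fixed-window counter. State lives in this process only.
// Expired entries are never evicted here, so a real version needs cleanup.
class InMemoryFixedWindow {
  private readonly counters = new Map<string, { windowStart: number; count: number }>();
  private readonly max: number;
  private readonly windowMs: number;

  constructor(max: number, windowMs: number) {
    this.max = max;
    this.windowMs = windowMs;
  }

  // Returns true if the request identified by `key` is allowed at time `nowMs`.
  allow(key: string, nowMs: number = Date.now()): boolean {
    const entry = this.counters.get(key);
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      // First request, or the previous window expired: start a new window.
      this.counters.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (entry.count >= this.max) return false;
    entry.count += 1;
    return true;
  }
}
```

Create two instances with a limit of 2 and send the same key three requests on each: both let the first two through, so the effective limit is 4. That is what happens the day you scale to a second container.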
What's the difference between rate limiting and throttling?
Rate limiting rejects requests that exceed a threshold (429 Too Many Requests). Throttling queues or slows them down without rejecting. For abuse protection, rate limiting is more predictable. Throttling has its place in processing queues, not in public APIs.
Should I put rate limiting in middleware or in each route handler?
Depends on the granularity you need. Global middleware is convenient but applies the same policy to everything. Route handlers give you fine-grained control per asset. The decision matrix above should guide that call — if all your assets have the same abuse profile, the middleware is fine.
What about bots that rotate IPs?
IP-based rate limiting alone doesn't hold up against sophisticated bots with IP rotation. For that vector, you need client fingerprinting (TLS JA3, user-agent patterns, behavior analysis) or a dedicated WAF. That's a different scope than this post, and honestly, if you've hit that problem, you need more than a Node library.
Is Upstash Ratelimit the only option for Next.js on Edge Runtime?
No. Upstash works well because its Redis client is HTTP-based and Edge-compatible. But you can also use @vercel/kv if you're on Vercel, or Workers KV if you're on Cloudflare. The technical constraint is that Edge Runtime doesn't support TCP sockets, so any solution has to speak HTTP for storage.
How do I know if my rate limit is calibrated correctly?
Look at the distribution of 429s in the first 7 days after activation. If the 429s are coming from unique IPs/identifiers you've never seen before → the limit is catching abuse. If 429s are coming from recurring IPs that also have successful requests → likely false positive. Without analytics enabled in the library and without logs, you can't answer this question at all.
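A rough sketch of that check, assuming you can export request logs as records with an identifier and a status code. The RequestLog shape and the classifyBlocked name are hypothetical, and the verdicts are a first-pass heuristic, not a diagnosis.

```typescript
// Hypothetical log record shape; adapt it to whatever your logger emits.
type RequestLog = { identifier: string; status: number };

type Verdict = "likely-abuse" | "possible-false-positive";

// For every identifier that received at least one 429, check whether it also
// had successful (2xx) requests in the same period.
function classifyBlocked(logs: RequestLog[]): Map<string, Verdict> {
  const blocked = new Set<string>();
  const succeeded = new Set<string>();
  for (const { identifier, status } of logs) {
    if (status === 429) blocked.add(identifier);
    else if (status >= 200 && status < 300) succeeded.add(identifier);
  }
  const verdicts = new Map<string, Verdict>();
  blocked.forEach((id) => {
    verdicts.set(id, succeeded.has(id) ? "possible-false-positive" : "likely-abuse");
  });
  return verdicts;
}
```

Identifiers flagged as possible false positives are the ones to read by hand first: they are the users your limit may be hurting.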
My take: the library is an implementation detail
Rate limiting in Next.js web apps is a topic where 80% of the work is technical decision-making and 20% is code. Almost everything written about it does the inverse.
I don't buy the argument that "any rate limiting is better than none." A badly calibrated limit on a login endpoint can systematically lock out real users — and that damage is measurable and completely silent if you have no observability.
What I do buy: defining the policy before the code forces you to ask questions the generic middleware never asks you. What asset? What abuse? What's the cost if I get it wrong in the restrictive direction? Those three questions change the threshold, the key granularity, and the rejection behavior.
Concrete next step: take the most critical endpoint in your app (probably login or registration), fill out the matrix row for that asset, and then write the limit. If you want to see the same thinking applied to Server Actions with Prisma, the post on Prisma Server Actions in Next.js and the N+1 that appears when you least expect it follows the same pattern: diagnose before you solve.
And if the endpoint you're protecting handles sensitive data, the post on useEffect and state synchronization in React 19 is a good reminder that abstractions that simplify can also hide unexpected behavior — applies equally to middleware.
Primary source:
- OWASP Authentication Cheat Sheet (Rate Limiting and Lockout): https://cheatsheetseries.owasp.org/cheatsheets/Authentication_Cheat_Sheet.html
This article was originally published on juanchi.dev