Your Node.js app works fine in development. You hit localhost:3000 directly, everything is snappy, and you never think about reverse proxies.
Then you deploy.
Clients time out. WebSocket connections drop after 60 seconds. Logs show client IPs as 10.0.0.1 instead of the real user IP. And that one endpoint that uploads a 15 MB file gets a 413 error that takes three hours to debug.
Every one of these is an Nginx misconfiguration, not a code bug.
This guide covers the five Nginx patterns that make the difference between a proxy that silently degrades your app and one that actively protects it. Each pattern includes the exact config you can copy, the reasoning behind it, and the failure mode it prevents.
By default, Nginx speaks HTTP/1.0 to the backend and sends Connection: close, so upstream connections are never reused. That means Nginx opens a new TCP connection to your Node.js server for every proxied request, adds a TLS handshake on top if the upstream connection itself is encrypted, and burns through file descriptors and ephemeral ports under load.
The fix is an upstream keepalive pool.
http {
    upstream app {
        server 127.0.0.1:3000;

        # Keep up to 256 idle connections to Node.js per worker process
        keepalive 256;

        # Close an upstream connection once it has been idle this long
        # (the upstream keepalive_timeout directive needs Nginx 1.15.3+)
        keepalive_timeout 120s;

        # Max requests per connection before Nginx recycles it
        keepalive_requests 10000;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://app;

            # Tell Nginx to speak HTTP/1.1 to the backend (required for keepalive)
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}
Three things happen here:
proxy_http_version 1.1 switches Nginx from HTTP/1.0 to HTTP/1.1 when talking to your Node backend. HTTP/1.0 connections are not persistent by default.
proxy_set_header Connection "" clears the Connection: close header that Nginx otherwise sends to the backend, so the backend does not close the socket after the first response.
keepalive 256 keeps a pool of up to 256 idle, reusable connections per worker process. Your Node process handles requests without paying the TCP handshake tax on every one.
The failure mode without this: Under load, Nginx opens and closes a connection to your Node.js process for every single request. ss -s shows thousands of TIME_WAIT sockets. The event loop spends more time on socket lifecycle than on actual request handling.
Benchmark: A simple Express hello-world behind Nginx with default settings handles about 3,000 req/s on a 2-core machine. With upstream keepalive (pool of 64), that same setup hits 8,000+ req/s. The TCP handshake overhead is not free.
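One caveat is worth checking against your own setup. Node's HTTP server closes idle keep-alive sockets on its own schedule (server.keepAliveTimeout, 5 seconds by default), and if Nginx holds idle connections longer than that, it can occasionally send a request down a socket that Node has just closed. A minimal sketch of one way to avoid that race, assuming your Node server still runs with the 5-second default, is to make Nginx retire idle connections first. The 4s value is an illustrative assumption, not a measured recommendation.

```nginx
upstream app {
    server 127.0.0.1:3000;
    keepalive 256;

    # Assumption: Node's server.keepAliveTimeout is at its 5s default.
    # Closing idle upstream connections after 4s means Nginx, not Node,
    # is the side that retires them.
    keepalive_timeout 4s;
}
```

The alternative is to keep the longer Nginx timeout and raise keepAliveTimeout in the Node server above it. Either way, the goal is that the proxy's idle timeout is the shorter of the two.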
Nginx buffers responses from the backend by default. That is good — it lets Nginx read the full response from Node quickly and trickle it to slow clients without tying up your Node process. But the default buffer sizes are too small for API responses, and the client-side buffer settings are too generous for request bodies.
Here is the tuned config:
http {
    proxy_buffering on;

    # Response buffers: one 4k buffer for the response headers,
    # then 8 buffers of 16k each for the body
    proxy_buffer_size 4k;
    proxy_buffers 8 16k;
    proxy_busy_buffers_size 32k;

    # Request body: reject oversized bodies before they reach Node
    client_body_buffer_size 16k;
    client_max_body_size 10m;
    client_body_timeout 30s;
}
The response buffers let Nginx read an entire API response (up to 128 KB across 8 buffers) from Node in one go, then serve it to the client at whatever speed the client can handle. Your Node process goes back to handling other requests instead of sitting idle waiting for a hotel WiFi client to acknowledge each TCP segment.
The critical settings few teams tune are client_body_buffer_size and client_max_body_size.
client_max_body_size 10m rejects request bodies above 10 MB with a 413 response before Nginx sends a single byte to Node.js. The default is 1 MB, which is why larger uploads fail with a 413 until you raise it. Do not set it to 0 just to make the error go away: that disables the check, and a 2 GB upload then consumes memory and I/O on your backend until a timeout fires.
client_body_buffer_size 16k keeps small request bodies in memory (fast path) and spills larger ones to temp files (slow path). If your API handles JSON payloads under 16 KB, all of them stay in RAM.
The failure mode without this: A single slow client with a small receive window forces your Node process to hold the response in memory and keep the socket open for seconds while it waits for the client to drain it. If you have 500 concurrent slow clients, every process in your cluster spends its time managing socket backpressure instead of running your application logic. The proxy_buffers config decouples response generation from response delivery.
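If one endpoint legitimately needs larger bodies, such as the 15 MB upload from the introduction, the limit can be raised for that route alone instead of globally. The following is a sketch under assumptions: the /upload path, the 20m limit, and the 60s timeout are hypothetical values, and app is the upstream defined earlier.

```nginx
server {
    # Default limit for every other route
    client_max_body_size 10m;

    # Hypothetical upload route: allow bigger bodies here only
    location /upload {
        client_max_body_size 20m;

        # Uploads over slow links may need more time between reads
        client_body_timeout 60s;

        proxy_pass http://app;
    }

    location / {
        proxy_pass http://app;
    }
}
```

Scoping the larger limit to one location keeps the rest of the API protected by the stricter default.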
WebSocket connections start as HTTP upgrades. Nginx handles this with the upgrade header dance. But the default proxy timeouts kill idle WebSocket connections after 60 seconds, which breaks any real-time feature that keeps a connection open for longer.
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    location /ws {
        proxy_pass http://app;
        proxy_http_version 1.1;

        # WebSocket upgrade headers
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;

        # Raise the default 60s proxy timeouts
        proxy_read_timeout 86400s;
        proxy_send_timeout 86400s;

        # The upgrade handshake to the backend uses HTTP/1.1 (set above).
        # Note: "http2 on;" is a server-level directive for client
        # connections and is not valid inside a location block.
    }
}
The map block sets the Connection header correctly for both regular and WebSocket requests: upgrade when the client sent an Upgrade header, close otherwise. If you hard-code Connection "upgrade" instead, Nginx sends it on every request, not just WebSocket upgrades.
The timeouts are the important part. Nginx defaults proxy_read_timeout to 60 seconds. If your WebSocket sends no data for 61 seconds (common in stock ticker apps, chat rooms, or dashboard UIs), Nginx closes the connection. Your client library fires a reconnect event, the app feels flaky, and someone files a bug titled “connection drops randomly.”
Setting both timeouts to 86400 seconds (24 hours) effectively removes the timeout. Your application handles disconnect logic instead.
The failure mode without this: WebSocket connections drop at exactly 60 seconds of idle time. You add a ping-pong heartbeat to your client code, but the real fix was the Nginx timeout. Teams waste days on this because they assume the disconnect is in the application layer.
When Nginx proxies requests, req.ip in Express or Fastify shows Nginx’s IP (usually 127.0.0.1), not the client’s real IP. Every log line, rate limiter, and geo-IP middleware returns wrong data.
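The fix is the X-Real-IP and X-Forwarded-For headers, which carry the client address to Node, combined with the realip module, which corrects $remote_addr when Nginx itself sits behind a load balancer or another proxy. Your framework also has to be told to trust those headers (for example, through the trust proxy setting in Express).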
# http context: log_format is not allowed inside a server block.
# Log format that actually helps debugging
log_format detailed '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" '
                    'rt=$request_time uct=$upstream_connect_time '
                    'uht=$upstream_header_time urt=$upstream_response_time '
                    'upstream=$upstream_addr';

server {
    # Trust X-Forwarded-For only when it comes from these proxy addresses
    set_real_ip_from 127.0.0.1;
    set_real_ip_from 10.0.0.0/8;
    set_real_ip_from 172.16.0.0/12;
    set_real_ip_from 192.168.0.0/16;
    real_ip_header X-Forwarded-For;
    real_ip_recursive on;

    # Headers passed on to Node
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    access_log /var/log/nginx/app-access.log detailed;
    error_log /var/log/nginx/app-error.log warn;
}
The set_real_ip_from directives tell Nginx which IPs are trusted proxies. When a request arrives from one of those IPs with an X-Forwarded-For header, Nginx replaces $remote_addr with the client IP taken from that header. The real_ip_recursive on setting handles chains of proxies by walking the header from right to left and taking the last address that is not a trusted proxy (the actual client).
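To make the lookup concrete, here is a worked example written as an annotated config. The addresses are made up for illustration: 203.0.113.7 stands in for a real client, and 10.0.0.5 and 10.0.1.20 for two internal proxies.

```nginx
# A request arrives from TCP peer 10.0.0.5 carrying the header:
#   X-Forwarded-For: 203.0.113.7, 10.0.1.20

set_real_ip_from 10.0.0.0/8;      # 10.0.0.5 and 10.0.1.20 are both trusted
real_ip_header X-Forwarded-For;

# recursive on:  walk the header right to left, skipping trusted addresses
#   10.0.1.20   -> trusted, skip
#   203.0.113.7 -> not trusted, becomes $remote_addr
# recursive off: take the last entry as-is, so $remote_addr = 10.0.1.20
real_ip_recursive on;
```

Because the walk starts from the right and stops at the first untrusted address, an entry that a client forges at the left of the header is never reached.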
The custom log_format includes upstream timing variables:
upstream_connect_time: how long Nginx took to connect to your Node process
upstream_header_time: time until Nginx received the response header from Node
upstream_response_time: total time to receive the full response
These three values tell you exactly where time is spent. If upstream_connect_time is high, your connection pool is too small or your Node process is saturated. If upstream_header_time is high but upstream_connect_time is low, your Node process is slow to generate the first byte (maybe waiting on a database query). If upstream_response_time is high but upstream_header_time is low, the response body is large or Node is producing it slowly.
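On a busy API these timings are easier to act on when slow requests land in their own file. The fragment below is a sketch, not part of the original setup: the $slow_request variable name, the one-second threshold, and the slow.log path are all assumptions, and it relies on the detailed log format defined above.

```nginx
# http context
# $request_time is formatted as seconds.milliseconds (for example 0.250),
# so a value whose first digit is 1-9 took one second or longer
map $request_time $slow_request {
    default   0;
    "~^[1-9]" 1;
}

server {
    # Full log as before
    access_log /var/log/nginx/app-access.log detailed;

    # Hypothetical second log that only records slow requests
    access_log /var/log/nginx/app-slow.log detailed if=$slow_request;
}
```

Reading uct, uht, and urt side by side in the slow log shows which of the three cases described above you are dealing with.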
The failure mode without this: Rate limiters use 127.0.0.1 as the client key, so all traffic shares one bucket: either every user is throttled together or the limit has to be set so high that it protects nothing. Geo-IP middleware returns the datacenter location. Logs show internal IPs, which makes incident debugging far harder.
If you run multiple Node.js processes (via cluster, PM2, or multiple containers), Nginx distributes traffic across them. The default round-robin is rarely the best choice for Node.js.
upstream app {
    # least_conn: send to the backend with the fewest active connections
    # (the balancing method must come before the keepalive directive)
    least_conn;

    # Option A: multiple processes on one machine (cluster mode)
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    server 127.0.0.1:3003;
    server 127.0.0.1:3004;

    # Option B: multiple containers
    # server app1:3000;
    # server app2:3000;
    # server app3:3000;

    keepalive 256;
}

server {
    # Retry logic: when a backend fails, Nginx passes the request
    # to the next server in the upstream group
    location / {
        proxy_pass http://app;

        # Max number of backends to try for a single request
        proxy_next_upstream_tries 3;

        # Max total time spent retrying a single request
        proxy_next_upstream_timeout 10s;

        # Which failures trigger a retry (POST and other non-idempotent
        # requests are not retried unless you add non_idempotent)
        proxy_next_upstream error timeout http_500 http_502 http_503;
    }

    # Active health checks require NGINX Plus or a workaround with a status endpoint.
    # Without NGINX Plus, point Docker/K8s probes at the backend directly
}
least_conn matters because not all requests are equal. A request that hits a slow database query holds a connection for 2 seconds. Round-robin keeps handing that backend its full share of new requests anyway. least_conn sends new requests to the backend with the fewest active connections, which balances by current load, not by request count.
The proxy_next_upstream settings tell Nginx to try another backend if the first one returns a 500, 502, or 503, times out, or drops the connection. By default Nginx retries only on connection errors and timeouts, so a backend that is up but answering with 5xx errors keeps failing every request routed to it until you restart it.
The failure mode without this: Round-robin sends a burst of slow requests to process 1 while processes 2-4 sit idle. The event loop on process 1 lags. Health checks start failing. The orchestration layer restarts the “healthy” processes 2-4 unnecessarily while process 1 struggles. Connection queuing at Nginx builds up and users see 502 errors.
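Passive health checking itself is configured on the server lines of the upstream block, through the max_fails and fail_timeout parameters. A minimal sketch, reusing the four local ports from the example above; the thresholds are illustrative assumptions to tune against your own traffic.

```nginx
upstream app {
    least_conn;

    # After 3 failed attempts within 10s, take the backend out of
    # rotation for the next 10s, then try it again
    server 127.0.0.1:3001 max_fails=3 fail_timeout=10s;
    server 127.0.0.1:3002 max_fails=3 fail_timeout=10s;
    server 127.0.0.1:3003 max_fails=3 fail_timeout=10s;
    server 127.0.0.1:3004 max_fails=3 fail_timeout=10s;

    keepalive 256;
}
```

What counts as a failed attempt is defined by proxy_next_upstream, so the two settings work together. If you set neither parameter, Nginx uses max_fails=1 and fail_timeout=10s.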
Here is a single Nginx config that combines all five patterns:
worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 4096;
    multi_accept on;
    use epoll;
}

http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;

    # Log format with upstream timing
    log_format detailed '$remote_addr - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent" '
                        'rt=$request_time uct=$upstream_connect_time '
                        'uht=$upstream_header_time urt=$upstream_response_time '
                        'upstream=$upstream_addr';
    access_log /var/log/nginx/access.log detailed;
    error_log /var/log/nginx/error.log warn;

    # Upstream keepalive (Pattern 1)
    upstream app {
        least_conn;  # Pattern 5 (only takes effect with more than one server)
        server 127.0.0.1:3000;
        keepalive 256;
        keepalive_timeout 120s;
        keepalive_requests 10000;
    }

    # WebSocket upgrade map (Pattern 3)
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      close;
    }

    server {
        listen 80;
        server_name api.example.com;

        # Real-IP (Pattern 4)
        set_real_ip_from 10.0.0.0/8;
        set_real_ip_from 172.16.0.0/12;
        set_real_ip_from 192.168.0.0/16;
        real_ip_header X-Forwarded-For;
        real_ip_recursive on;

        # Buffer tuning (Pattern 2)
        client_body_buffer_size 16k;
        client_max_body_size 10m;
        client_body_timeout 30s;
        proxy_buffering on;
        proxy_buffer_size 4k;
        proxy_buffers 8 16k;
        proxy_busy_buffers_size 32k;

        # Generic proxy headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Keepalive to backend (Pattern 1)
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Timeouts
        proxy_connect_timeout 5s;
        proxy_send_timeout 10s;
        proxy_read_timeout 30s;

        # Retry logic (Pattern 5)
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 10s;
        proxy_next_upstream error timeout http_500 http_502 http_503;

        # API routes
        location / {
            proxy_pass http://app;
        }

        # WebSocket endpoint (Pattern 3)
        location /ws {
            proxy_pass http://app;
            proxy_http_version 1.1;

            # proxy_set_header is inherited only when a location sets none
            # of its own, so the generic headers are repeated here
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
            proxy_read_timeout 86400s;
            proxy_send_timeout 86400s;
        }

        # Health check endpoint: no retries, no access logging
        location /health {
            proxy_pass http://app;
            proxy_next_upstream off;
            access_log off;
        }
    }
}
Before you push this to production, validate it:
# Syntax check
nginx -t
# Reload without dropping connections
nginx -s reload
# Confirm the proxy answers (HEAD request to the health endpoint)
curl -I http://localhost/health
Then run a load test to confirm the keepalive pool is active:
# Watch upstream connection reuse
ss -tan | grep 3000
# Run a quick benchmark
wrk -t4 -c100 -d30s http://localhost/
Compare TIME_WAIT socket counts before and after the keepalive config. If you see hundreds of TIME_WAIT entries after the config, something is wrong — Nginx is still not using persistent connections to the backend.
Not every setup needs a bespoke Nginx config. If your app runs on Kubernetes with a service mesh (like Istio or Linkerd), the sidecar proxy handles most of these patterns (keepalive, retries, timeouts, load balancing). Running Nginx as an additional layer inside the mesh adds complexity without benefit.
If you run a single-region, single-instance API with fewer than 100 req/s, the default Nginx config in the official Docker image is good enough. Apply these patterns when you see concrete symptoms: slow clients causing event loop lag, WebSocket drops, or upstream connection exhaustion.
Nginx is the most boring part of your stack until it is not. The defaults are optimized for static file serving, not for proxying long-lived Node.js API connections. A few targeted config changes — upstream keepalive, response buffering, WebSocket timeouts, real-IP forwarding, and least-conn load balancing — turn Nginx from a passive pass-through into an active reliability layer.
Your Node.js app can handle more traffic with less memory and fewer timeouts when the proxy in front of it stops working against you.
Teams that build and deploy Node.js applications at scale often invest in this kind of infrastructure hygiene as a baseline, not an afterthought. Yojji, for example, treats reverse proxy configuration as part of the core delivery checklist in backend-heavy projects where every millisecond and every dropped connection counts toward the user experience.