File Descriptor Exhaustion: The Kernel Limit That Silently Drops Node.js Connections

The Practical Developer


The Practica · 2026-05-23 · via The Practical Developer

The 3 a.m. page was blunt: “Customers reporting intermittent connection failures.” The load balancer showed all targets healthy. CPU usage on the Node.js pods was under 15%. Memory was flat. There were no application errors in the logs. Yet every few minutes a burst of requests failed with ECONNREFUSED before they ever reached our HTTP handlers.

We scaled the deployment. We restarted the pods. We blamed the cloud provider’s load balancer. Then one engineer ran lsof -p $(pgrep -f "node server.js") | wc -l on a pod and saw the number: 65,536 open file descriptors. The soft limit was 65,536. The process had hit the ceiling. Every new inbound TCP connection needed a new fd. Every new Postgres connection needed a new fd. The kernel said no. The connections were refused at the syscall layer, before any of our code saw them.

This is file descriptor exhaustion, and it is one of the nastiest silent failures in production backend work. It does not crash your process. It does not log a stack trace. It just drops traffic. This post covers how fds are consumed, how to diagnose the leak, how to raise the limit without creating a new problem, and how to monitor fd usage so you catch it before the pager goes off.

Why file descriptors matter more than you think

In Linux, everything is a file. TCP sockets, Unix domain sockets, pipes, actual files on disk, epoll instances, and eventfd objects all consume a file descriptor. Each is just an integer index into the process’s file descriptor table. When that table is full, any syscall that needs a new fd (socket, open, accept, pipe, epoll_create) returns EMFILE (too many open files for this process) or ENFILE (too many open files in the system).

A Node.js server in a microservices architecture can open fds in surprising quantities:

Inbound client connections: One fd per active HTTP connection. With keepalive, browsers and mobile clients hold these open for seconds or minutes.
Database connection pools: A pg.Pool with a max of 100 holds 100 fds to Postgres. If you have two databases (primary and read replica), that is 200.
Redis connections: One persistent connection per ioredis instance, plus one per BullMQ queue worker, plus sentinel connections if you use Sentinel.
Outbound HTTP agents: Whether you use http.Agent, undici.Pool, or axios, keepalive connections to downstream services hold fds open.
Files and pipes: Log file streams, child process stdio, temporary file uploads, and fs.watch instances all count.
Event loop internals: libuv's event loop holds an epoll fd (created by epoll_create, not epoll_wait, which only waits on it) plus a few internal wakeup fds. Some diagnostic tools add more.

Add these up on a busy pod. If you have 1,000 concurrent inbound clients, a database pool of 200, a Redis connection count of 50, an outbound HTTP agent holding 100 connections to each of three downstreams, and a handful of log streams, you are already north of 1,500 fds. That sounds modest, but the default soft limit on many Linux distributions is 1,024. Ubuntu 22.04 and Debian 12 keep the soft limit at 1,024 with a hard limit of 1,048,576, and container runtimes and older base images often ship a far lower ceiling. If you run inside a container with an unspecified limit, Docker historically defaulted to the host’s soft limit (often 1,024). Kubernetes inherited this behavior for years.

The result: a service that works fine in development, works fine under light integration tests, and collapses under production load when the fd count crosses a threshold that has nothing to do with your code quality.

Diagnosing fd exhaustion in real time

When the incident is happening, you have three tools that matter.

First, lsof gives you the breakdown by fd type:

lsof -p $NODE_PID | awk '{print $5}' | sort | uniq -c | sort -rn

On a Node.js server, you will see IPv4, IPv6, FIFO, REG, and unix. If IPv4 dominates, your connections (inbound, outbound, or both) are the issue. If REG dominates, you are leaking file handles on disk (log rotation without closing streams is a common cause).
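The same breakdown can be approximated from inside the process, without lsof, by classifying the symlink targets under /proc/self/fd. The sketch below is illustrative only: the function names and the sample targets are made up, and it cannot tell TCP sockets from Unix sockets the way lsof's TYPE column does.

```javascript
// Classify one /proc/<pid>/fd symlink target into a coarse fd type.
// On Linux, targets look like "socket:[48211]", "pipe:[3301]",
// "anon_inode:[eventpoll]", or an absolute path for files and devices.
function classifyFdTarget(target) {
  if (target.startsWith('socket:')) return 'socket';
  if (target.startsWith('pipe:')) return 'pipe';
  if (target.startsWith('anon_inode:')) return 'anon_inode'; // epoll, eventfd, ...
  if (target.startsWith('/dev/')) return 'device';
  if (target.startsWith('/')) return 'file';
  return 'other';
}

// Count how many targets fall into each type.
function summarizeFdTargets(targets) {
  const counts = {};
  for (const target of targets) {
    const type = classifyFdTarget(target);
    counts[type] = (counts[type] ?? 0) + 1;
  }
  return counts;
}

// Hypothetical sample targets, standing in for a real /proc/self/fd listing.
const sampleTargets = [
  'socket:[48211]',
  'socket:[48212]',
  'pipe:[3301]',
  'anon_inode:[eventpoll]',
  '/var/log/app/server.log',
  '/dev/null',
];

const fdSummary = summarizeFdTargets(sampleTargets);
console.log(fdSummary);
```

In a live process, the targets would come from fs.readdirSync('/proc/self/fd') followed by fs.readlinkSync on each entry, with a try/catch around the readlink because an fd can close between the two calls.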

Second, the /proc filesystem gives you the raw count instantly without parsing lsof output:

ls /proc/$NODE_PID/fd/ | wc -l

Third, prlimit tells you the exact limits the kernel is enforcing on that process:

prlimit --pid $NODE_PID

Look at the NOFILE row, which shows the soft and hard limits. If the open fd count from /proc is within a few hundred of the soft limit, you have found your smoking gun.

For a programmatic check inside the process (for a health endpoint or metrics), Node.js can read its own fd count from /proc/self/fd:

import fs from 'node:fs';

function getOpenFdCount() {
  return fs.readdirSync('/proc/self/fd').length - 1; // subtract the dir fd itself
}

console.log(`Open fds: ${getOpenFdCount()}`);

This is safe to run every few seconds in production. It is synchronous but only reads a small directory; on Linux, the cost is negligible.

The mathematics of connection pooling

The most common self-inflicted fd spike is connection pool sprawl. Here is a formula every service should document somewhere:

total_fds_estimate = (
  max_inbound_connections +
  sum(database_pool_maxes) +
  sum(redis_connection_counts) +
  sum(outbound_agent_max_sockets_per_host * number_of_downstream_hosts) +
  baseline_files_and_pipes
) * safety_margin

For a typical Node.js API, that might look like:

max_inbound_connections:        4,000    (Node.js http server maxConnections or effective concurrency)
Postgres primary pool:            100
Postgres replica pool:            100
Redis primary:                     10
Redis BullMQ workers:              40
Outbound HTTP to 3 services:      300    (100 per host)
Log streams and misc:              20
Baseline subtotal:              4,570
Safety margin (1.5x):           6,855

Your ulimit -n should be set to at least 7,000 for this service. Prefer 16,384 or 32,768 so you have headroom for traffic spikes, memory pressure-induced connection pileups, or deployment overlap (when old and new pods briefly coexist on the same node).
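To keep that estimate next to the code rather than in a wiki page, the formula can be written as a small function. This is a minimal sketch: estimateFdBudget and its parameter names are hypothetical, and the inputs simply repeat the example figures above.

```javascript
// Estimate an fd budget from the formula above.
// All parameter names are illustrative, not part of any library.
function estimateFdBudget({
  maxInbound,
  dbPoolMaxes,
  redisConnectionCounts,
  outboundAgents, // [{ maxSocketsPerHost, hosts }]
  baseline,
  safetyMargin,
}) {
  const sum = (values) => values.reduce((total, v) => total + v, 0);
  const outbound = sum(
    outboundAgents.map((agent) => agent.maxSocketsPerHost * agent.hosts),
  );
  const subtotal =
    maxInbound +
    sum(dbPoolMaxes) +
    sum(redisConnectionCounts) +
    outbound +
    baseline;
  return { subtotal, withMargin: Math.ceil(subtotal * safetyMargin) };
}

// The example service from the table above.
const fdBudget = estimateFdBudget({
  maxInbound: 4000,
  dbPoolMaxes: [100, 100], // primary, replica
  redisConnectionCounts: [10, 40], // primary, BullMQ workers
  outboundAgents: [{ maxSocketsPerHost: 100, hosts: 3 }],
  baseline: 20,
  safetyMargin: 1.5,
});

console.log(fdBudget); // { subtotal: 4570, withMargin: 6855 }
```

Running it in CI against the service's real configuration is one way to notice when someone bumps a pool max without revisiting the limit.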

The mistake many teams make is raising the database pool max to 200 “just to be safe” without realizing that fds are a finite resource shared by every subsystem. A connection pool is not a performance knob you turn up arbitrarily. It is a congestion-control parameter for the downstream. If you double your pool max, you double the fds consumed, increase memory usage, and increase the risk of Postgres max_connections exhaustion. Size pools from estimates, not hopes.
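The max_connections risk is easy to check with arithmetic, because every pod brings its own pool. The sketch below assumes a hypothetical deployment: the pod counts, pool size, and Postgres settings are invented for illustration and should be replaced with your own.

```javascript
// Compare total pool demand across pods with what Postgres can accept.
// Function and parameter names are illustrative only.
function checkPoolAgainstMaxConnections({
  podCount,
  poolMaxPerPod,
  maxConnections,
  reservedConnections,
}) {
  const demand = podCount * poolMaxPerPod;
  const available = maxConnections - reservedConnections;
  return { demand, available, fits: demand <= available };
}

// Hypothetical steady state: 12 pods, each with a pool max of 40.
const steadyState = checkPoolAgainstMaxConnections({
  podCount: 12,
  poolMaxPerPod: 40,
  maxConnections: 500,
  reservedConnections: 10,
});

// Hypothetical rolling deploy: 3 extra pods exist while old ones drain.
const duringDeploy = checkPoolAgainstMaxConnections({
  podCount: 15,
  poolMaxPerPod: 40,
  maxConnections: 500,
  reservedConnections: 10,
});

console.log(steadyState); // { demand: 480, available: 490, fits: true }
console.log(duringDeploy); // { demand: 600, available: 490, fits: false }
```

The second case is the deployment overlap mentioned earlier: a pool size that fits in steady state can still exceed the database's capacity while old and new pods coexist.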

Raising the limit: sysctl, systemd, Docker, and Kubernetes

There are four layers where the fd limit is set, and you need to understand which one wins.

The shell and systemd

For a Node.js service running under systemd (most modern Linux servers), the limit is controlled by the unit file:

[Service]
Type=simple
ExecStart=/usr/bin/node /opt/app/server.js
LimitNOFILE=65536

After reloading systemd and restarting the service, verify with:

systemctl show your-service.service --property=LimitNOFILE

If you run the process directly from a shell, the shell inherits limits from the user session. You can raise them with ulimit -n 65536 before starting Node, but systemd is the durable fix.

Docker

Docker containers get their default limits from the Docker daemon, and that default varies with the Docker version, the daemon's systemd unit, and any default-ulimits setting in daemon.json. Always specify the limit explicitly:

docker run --ulimit nofile=65536:65536 your-image

Or in docker-compose.yml:

services:
  app:
    ulimits:
      nofile:
        soft: 65536
        hard: 65536

Kubernetes

Kubernetes has no pod-spec field for ulimits. The container securityContext covers capabilities, users, and privilege settings, but not RLIMIT_NOFILE. Kubernetes delegates fd limits to the container runtime (containerd or CRI-O), which usually takes them from its own configuration on the node. An init container cannot fix this either: rlimits are per-process, and the init container has exited before your main container starts. The reliable way to control the limit is to set it in the node's runtime configuration, or to raise the soft limit in the container’s entrypoint, which works for any value up to the hard limit the container was given:

#!/bin/sh
ulimit -n 65536
exec node server.js

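Some guides suggest baking the limit into the image through /etc/security/limits.conf. That file is only read by pam_limits when a PAM session starts (login, su, sshd), and a container entrypoint normally bypasses PAM, so treat it as a supplement for images that do run PAM sessions rather than as the fix. If you add it, use the full domain, type, item, value format:

RUN printf '* soft nofile 65536\n* hard nofile 65536\n' >> /etc/security/limits.conf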

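Then verify against the running Node process rather than a fresh shell, because a kubectl exec session gets the container's default limits and not whatever the entrypoint raised. Assuming Node is PID 1 in the container:

kubectl exec -it pod-name -- /bin/sh -c "grep 'Max open files' /proc/1/limits"

If the soft limit column reads 1024, your containers are still carrying the default, and every connection pool decision you make is walking a tightrope.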
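The same check can run at process startup so that a pod with too low a limit fails loudly instead of degrading later. The following is a sketch under assumptions: parseNofileLimits and checkFdLimit are hypothetical helpers, the required value of 16,384 is just an example, and the sample text imitates the layout of /proc/self/limits on Linux.

```javascript
// Extract the soft and hard "Max open files" values from the text of
// /proc/<pid>/limits. Returns null if the row is missing.
function parseNofileLimits(limitsText) {
  const prefix = 'Max open files';
  const line = limitsText.split('\n').find((l) => l.startsWith(prefix));
  if (!line) return null;
  const [soft, hard] = line.slice(prefix.length).trim().split(/\s+/);
  const toNumber = (v) => (v === 'unlimited' ? Infinity : Number.parseInt(v, 10));
  return { soft: toNumber(soft), hard: toNumber(hard) };
}

// Decide whether the soft limit is high enough for this service.
function checkFdLimit(limitsText, requiredSoft) {
  const limits = parseNofileLimits(limitsText);
  if (!limits) return { ok: false, reason: 'nofile limit not found' };
  if (limits.soft < requiredSoft) {
    return {
      ok: false,
      reason: `soft nofile limit ${limits.soft} is below required ${requiredSoft}`,
    };
  }
  return { ok: true, reason: `soft nofile limit ${limits.soft} is sufficient` };
}

// Sample text in the shape of /proc/self/limits (values are illustrative).
const sampleLimits = [
  'Limit                     Soft Limit           Hard Limit           Units     ',
  'Max cpu time              unlimited            unlimited            seconds   ',
  'Max open files            1024                 1048576              files     ',
].join('\n');

const limitCheck = checkFdLimit(sampleLimits, 16384);
console.log(limitCheck.reason);
```

In a real service the text would come from fs.readFileSync('/proc/self/limits', 'utf8'), and a failed check could log the reason and exit before the server starts listening.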

Application discipline: what to change in Node.js

Raising the limit buys you breathing room. It does not fix a leak. Here are the application-level patterns that keep fd usage honest.

1. Set explicit pool max values.

Do not rely on defaults. pg’s default pool max is 10, which is conservative. undici’s default is more aggressive. Check every library:

import pg from 'pg';

const pool = new pg.Pool({
  connectionString: process.env.DATABASE_URL,
  max: 40, // sized from the formula above
  idleTimeoutMillis: 10_000,
  connectionTimeoutMillis: 5_000,
});

2. Close streams deliberately.

If you open a file for logging, ensure it is closed or rotated by a library that tracks fds. If you spawn child processes, always handle their stdio explicitly:

import { spawn } from 'node:child_process';

const child = spawn('ffmpeg', args, {
  stdio: ['ignore', 'pipe', 'pipe'],
});

child.stdout.on('end', () => child.stdout.destroy());
child.stderr.on('end', () => child.stderr.destroy());
child.on('exit', () => {
  child.stdout?.destroy();
  child.stderr?.destroy();
});

Leaving stdio streams open after a child exits will keep their fds in the parent’s fd table indefinitely.

3. Monitor your HTTP agent sockets.

If you use http.Agent or https.Agent, socket reuse is good, but socket leaks are catastrophic. Log the agent’s current socket count periodically:

import http from 'node:http';

const agent = new http.Agent({ keepAlive: true, maxSockets: 50 });

setInterval(() => {
  const sockets = Object.values(agent.sockets).flat().length;
  const freeSockets = Object.values(agent.freeSockets).flat().length;
  console.log(JSON.stringify({ event: 'agent_socket_gauge', sockets, freeSockets }));
}, 30_000);

If sockets grows monotonically while request rate is flat, you have a leak (often caused by not consuming response bodies, which prevents the socket from returning to the free pool).
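Spotting monotonic growth by eye in logs is unreliable, so the gauge can feed a small detector. This is one possible heuristic rather than an established algorithm: createGrowthDetector, the window size, and the growth threshold are all assumptions chosen for illustration.

```javascript
// Returns a function that records one sample at a time and reports true
// when the last `windowSize` samples never decreased and grew by at
// least `minGrowth` in total.
function createGrowthDetector({ windowSize, minGrowth }) {
  const samples = [];
  return function record(value) {
    samples.push(value);
    if (samples.length > windowSize) samples.shift();
    if (samples.length < windowSize) return false; // not enough data yet
    for (let i = 1; i < samples.length; i++) {
      if (samples[i] < samples[i - 1]) return false; // a dip means sockets were released
    }
    return samples[samples.length - 1] - samples[0] >= minGrowth;
  };
}

// Hypothetical socket counts sampled every 30 seconds.
const leakingDetector = createGrowthDetector({ windowSize: 5, minGrowth: 10 });
const leakingResults = [50, 52, 55, 58, 61].map((n) => leakingDetector(n));

const steadyDetector = createGrowthDetector({ windowSize: 5, minGrowth: 10 });
const steadyResults = [50, 48, 51, 49, 50].map((n) => steadyDetector(n));

console.log(leakingResults); // [ false, false, false, false, true ]
console.log(steadyResults); // [ false, false, false, false, false ]
```

Because real traffic also grows, a detector like this is only meaningful when compared against request rate, as the paragraph above notes.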

4. Use an explicit server connection limit.

Every Node.js http.Server inherits a maxConnections property from net.Server. If you know your architecture cannot handle more than 5,000 concurrent connections due to downstream constraints, enforce it at the server:

const server = http.createServer(app);
server.maxConnections = 5000;

This is not just about fds. It is backpressure. When the server is at capacity, Node closes each new inbound connection as soon as it is accepted, so the client sees a closed or reset socket rather than a queued request. That is faster and cheaper than accepting connections, queuing them, and timing them out in application code.

Monitoring fd usage in production

You need a metric that tracks (open fds / fd limit) and alerts when it crosses 0.7. Here is a minimal Prometheus-style exporter hook you can attach to an existing /metrics endpoint or health check:

import fs from 'node:fs';

// Read the soft "Max open files" limit from /proc/self/limits (Linux only).
// os.getrlimit is not part of Node's os module, so /proc is the source here.
function readSoftFdLimit() {
  try {
    const limits = fs.readFileSync('/proc/self/limits', 'utf8');
    const line = limits.split('\n').find(l => l.startsWith('Max open files'));
    const soft = line ? parseInt(line.trim().split(/\s+/)[3], 10) : NaN;
    return Number.isFinite(soft) ? soft : undefined; // "unlimited" parses to NaN
  } catch {
    return undefined; // /proc is not available (non-Linux host)
  }
}

function getFdMetrics() {
  const open = fs.readdirSync('/proc/self/fd').length - 1; // exclude readdir's own fd
  const effectiveLimit = readSoftFdLimit() ?? 1024; // conservative default if unreadable
  const ratio = open / effectiveLimit;

  return {
    open_file_descriptors: open,
    file_descriptor_limit: effectiveLimit,
    fd_utilization_ratio: Number(ratio.toFixed(4)),
  };
}

Ship fd_utilization_ratio to your metrics pipeline. Alert on > 0.7. Page on > 0.85. The window between those two thresholds is usually the difference between a planned restart and a 3 a.m. incident.
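For a /metrics endpoint, the object returned by getFdMetrics has to be turned into Prometheus text exposition lines. The sketch below shows one way to do that by hand, plus a severity helper for a health endpoint. Both function names are hypothetical, the sample values are invented, and in most setups the 0.7 and 0.85 thresholds would live in alerting rules rather than in the process.

```javascript
// Render a flat metrics object as Prometheus text exposition lines,
// treating every entry as a gauge.
function renderFdMetrics(metrics) {
  const lines = [];
  for (const [name, value] of Object.entries(metrics)) {
    lines.push(`# TYPE ${name} gauge`);
    lines.push(`${name} ${value}`);
  }
  return lines.join('\n') + '\n';
}

// Map a utilization ratio to the severity levels described above.
function classifyFdUtilization(ratio, { alertAt = 0.7, pageAt = 0.85 } = {}) {
  if (ratio > pageAt) return 'page';
  if (ratio > alertAt) return 'alert';
  return 'ok';
}

// Sample values in the shape returned by getFdMetrics (illustrative numbers).
const sampleMetrics = {
  open_file_descriptors: 12288,
  file_descriptor_limit: 16384,
  fd_utilization_ratio: 0.75,
};

const exposition = renderFdMetrics(sampleMetrics);
const severity = classifyFdUtilization(sampleMetrics.fd_utilization_ratio);
console.log(exposition);
console.log(severity); // alert
```

If the service already uses a Prometheus client library, registering three gauges there is the more conventional route; the manual rendering is only meant to show what ends up on the wire.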

The fix checklist

Before you declare fd work done, verify:

ulimit -n inside the running container is at least 16,384 (or 4x your estimated peak fd count, whichever is larger).
Every database, cache, and outbound HTTP pool has an explicit max sized from the formula above.
File streams and child process stdio are explicitly closed or destroyed after use.
You are exporting fd_utilization_ratio as a metric with alerts at 0.7 and 0.85.
Load tests or canary deployments confirm fd count stays flat under sustained traffic.

A note from Yojji

The kind of work this post describes (tracing a silent kernel failure through lsof and /proc, sizing connection pools from first principles, and wiring metrics that catch the problem before it becomes an outage) is the unglamorous infrastructure craft that separates a service that survives real traffic from one that looks fine until it does not.

Yojji is an international custom software development company founded in 2016, with offices in Europe, the US, and the UK. Their teams specialize in the JavaScript ecosystem (React, Node.js, TypeScript), cloud platforms (AWS, Azure, GCP), and the backend operational rigor that keeps production systems honest when load increases and defaults betray you.

