Prisma Performance Patterns for Production Node.js Applications

The Practical Developer

The Practica · 2026-06-06 · via The Practical Developer

The API endpoint that creates an order with line items was taking 12 seconds in staging. Not production. Staging, with 200 users and 50 rows in the order table. The Prisma query in question was a nested create that wrote one order, three line items, and updated an inventory counter. Twelve seconds for four write operations. The team was about to blame the database and ask for a bigger instance. The real problem was that Prisma was opening and closing a database connection for every single SQL statement inside that nested create, and the default connection pool of 10 was fighting itself across concurrent requests.

This post covers six Prisma performance traps that surface in production, not in tutorials, and the exact configuration and code patterns that fix them.

The connection pool is not magical

Prisma keeps a connection pool per PrismaClient instance. With the pg driver adapter, that pool is node-postgres's, which defaults to 10 connections; with the built-in query engine, the default is num_physical_cpus * 2 + 1. Either way the pool is shared across all requests in a process. When a request acquires a connection, runs a query, and releases it, the pool hands the next waiting request the same connection. This works fine until the request holds the connection while doing something else.

Here is the pattern that destroys throughput:

// Risky: slow non-database work sits between two queries
app.get('/orders/:id', async (req, res) => {
  const order = await prisma.order.findUnique({
    where: { id: req.params.id },
    include: { lineItems: true }
  });
  if (!order) return res.status(404).end();

  // await yields to the event loop. Outside a transaction no connection is
  // held here; inside prisma.$transaction this call would keep one pinned.
  const enriched = await enrichFromExternalService(order);

  const result = await prisma.order.update({
    where: { id: order.id },
    data: { enrichedAt: new Date() }
  });

  res.json(result);
});

Outside a transaction, every await prisma.* call acquires a connection from the pool, runs its query, and releases it, so the handler above does not hold a connection during the external call. The trap is subtler: Prisma uses a single connection for the lifetime of an interactive transaction. Wrap the same three steps in prisma.$transaction and that connection sits idle for as long as the external service takes. Once the pool is exhausted, every other query queues up.

The fix is to size the pool correctly and never let non-database work run inside an interactive transaction:

// schema.prisma: connection_limit is not a datasource field
datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

// Set the pool size as a parameter on the connection string instead
// DATABASE_URL=postgresql://user:pass@host:5432/db?connection_limit=20
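How large the pool should be depends on how many connections the database accepts across all app instances. The sketch below shows one way to derive a per-instance limit and append it to the URL. Both helper names (perInstanceLimit, withConnectionLimit) and the example numbers are assumptions for illustration, not part of Prisma.

```typescript
// Hypothetical helpers; the names are illustrative, not part of Prisma.

/** Split the database's connection budget evenly across app instances. */
function perInstanceLimit(
  maxConnections: number, // Postgres max_connections
  reserved: number,       // kept free for migrations, admin sessions, etc.
  instances: number       // app processes that each open their own pool
): number {
  if (instances < 1) throw new Error('instances must be >= 1');
  return Math.max(1, Math.floor((maxConnections - reserved) / instances));
}

/** Append connection_limit; assumes the URL does not already set it. */
function withConnectionLimit(url: string, limit: number): string {
  const separator = url.includes('?') ? '&' : '?';
  return `${url}${separator}connection_limit=${limit}`;
}

// Example numbers only: 100 max connections, 10 reserved, 4 app instances.
const limit = perInstanceLimit(100, 10, 4);
const databaseUrl = withConnectionLimit('postgresql://user:pass@host:5432/db', limit);
```

Treat the result as a starting point and adjust it after watching pool wait times under real traffic.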

In serverless environments (Lambda, Cloudflare Workers), every concurrent instance opens its own pool, so a per-process limit does not cap the total number of connections. Use a driver adapter (the example below uses @prisma/adapter-neon) or an external pooler such as the Data Proxy (now Prisma Accelerate) instead:

// serverless.ts - PrismaClient backed by the Neon serverless driver
import { PrismaClient } from '@prisma/client';
import { PrismaNeon } from '@prisma/adapter-neon';
import { Pool, neonConfig } from '@neondatabase/serverless';

neonConfig.fetchConnectionCache = true;

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Adapter constructor signatures differ across Prisma versions;
// check the one you have installed.
const prisma = new PrismaClient({
  adapter: new PrismaNeon(pool)
});

Do not reuse a single PrismaClient across Lambda warm starts with a connection pool unless you have set the pool size to 1. Each concurrent Lambda instance holds its own pool, so 100 instances with a pool of 10 can open 1,000 connections, ten times Postgres's default max_connections of 100.

Batch operations are not what you expect

Inserting 10,000 rows with a for loop and prisma.model.create() is the most common Prisma performance mistake. Each create is a separate round trip. With 10 concurrent requests each inserting 100 rows, that is 1,000 queries competing for a pool of 20 connections. The queue depth alone adds seconds.

Prisma has createMany and updateMany, but they have sharp edges:

// Slow: 1,000 round trips for 1,000 rows
for (const item of items) {
  await prisma.lineItem.create({
    data: { orderId, sku: item.sku, quantity: item.quantity }
  });
}

// Fast: One round trip
await prisma.lineItem.createMany({
  data: items.map(item => ({
    orderId,
    sku: item.sku,
    quantity: item.quantity
  }))
});

On a local Postgres with default settings, inserting 1,000 rows one by one takes about 1,200 ms. createMany with the same data takes about 45 ms. That is roughly a 27x difference. On a networked database with 5 ms latency per query, the loop spends 5 seconds on round-trip overhead alone (1,000 queries x 5 ms) before any SQL execution time.
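A single createMany call sends every row in one statement, which can get unwieldy for very large arrays. One possible way to bound statement size is to insert in fixed-size chunks. In the sketch below, createManyInChunks, the CreateManyDelegate interface, and the default chunk size of 1,000 are all assumptions for illustration. The interface is a simplified stand-in for a Prisma model delegate, so the types may need adapting in a real project.

```typescript
// Simplified stand-in for a Prisma model delegate (assumption, not Prisma's own type).
interface CreateManyDelegate<Row> {
  createMany(args: { data: Row[] }): Promise<{ count: number }>;
}

/** Split an array into consecutive slices of at most `size` elements. */
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error('size must be >= 1');
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

/** Insert rows with one createMany call per chunk; returns the total inserted. */
async function createManyInChunks<Row>(
  delegate: CreateManyDelegate<Row>,
  rows: Row[],
  chunkSize = 1000 // assumption: tune to your row width and database
): Promise<number> {
  let inserted = 0;
  for (const part of chunk(rows, chunkSize)) {
    const { count } = await delegate.createMany({ data: part });
    inserted += count;
  }
  return inserted;
}
```

Ten thousand rows then cost ten round trips instead of ten thousand. Each chunk commits on its own, so wrap the calls in a transaction if the insert must be all-or-nothing.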

But createMany has limitations:

It does not support nested creates. If your model has relations that must be created together, createMany cannot handle them. You must use create with nested writes or a raw query.
It does not return the created rows. The return value is a count, not the records, and skipDuplicates: true only tells Postgres to ignore conflicts. If you need the IDs, use createManyAndReturn (available from Prisma 5.14 on PostgreSQL, CockroachDB, and SQLite) or run a separate select query afterward.
It does not set @default(now()) timestamps reliably across all database providers. Test this before relying on it.

For updates, the updateMany API is equally constrained:

// Updates all matching rows in one query
await prisma.inventory.updateMany({
  where: { sku: { in: skus } },
  data: { quantity: { decrement: 1 } }
});

If you need bulk operations with different values per row, Prisma currently has no native multi-row INSERT ... ON CONFLICT DO UPDATE (upsert handles one row at a time) or multi-row UPDATE ... FROM VALUES. You must drop to raw SQL for those.

Raw queries are not an escape hatch for emergencies

Prisma’s type-safe query builder is the main reason teams choose it. But for read-heavy analytics queries that join five tables with aggregations, the generated SQL is often suboptimal. By default, Prisma generates a separate SELECT for each include that crosses a relation (the relationJoins preview feature can merge them into JOINs, but it is opt-in). One Prisma query with three include clauses can produce four or five SQL queries.

Consider this:

const orders = await prisma.order.findMany({
  where: { status: 'shipped', createdAt: { gte: lastWeek } },
  include: {
    customer: true,
    lineItems: { include: { product: true } },
    shipments: true
  },
  take: 50
});

This generates roughly five SQL queries: one for orders, one for customers, one for line items, one for products, and one for shipments. Each query runs independently with a WHERE clause that filters by the IDs returned from the previous query. For 50 orders, the customers query fetches up to 50 rows, the line items query fetches 150 rows (three per order), the products query fetches up to 150 rows, and the shipments query fetches 50 rows. Together with the 50 orders, that is up to 450 rows fetched across 5 queries and 5 round trips.

If latency is 3 ms per round trip, that is 15 ms of overhead before any data is processed. More importantly, the database planner cannot optimize across these queries. It cannot choose a hash join or a merge join because each query is isolated.

The fix is to use a raw SQL query with proper JOINs when you need more than two levels of nesting:

const orders = await prisma.$queryRaw<OrderWithRelations[]>`
  SELECT
    o.id, o.status, o.created_at,
    jsonb_build_object('id', c.id, 'name', c.name) AS customer,
    COALESCE(
      jsonb_agg(
        DISTINCT jsonb_build_object(
          'id', li.id,
          'sku', li.sku,
          'product', jsonb_build_object('id', p.id, 'name', p.name)
        )
      ) FILTER (WHERE li.id IS NOT NULL),
      '[]'::jsonb
    ) AS line_items,
    COALESCE(
      jsonb_agg(
        DISTINCT jsonb_build_object('id', s.id, 'tracking', s.tracking_number)
      ) FILTER (WHERE s.id IS NOT NULL),
      '[]'::jsonb
    ) AS shipments
  FROM orders o
  JOIN customers c ON c.id = o.customer_id
  LEFT JOIN line_items li ON li.order_id = o.id
  LEFT JOIN products p ON p.id = li.product_id
  LEFT JOIN shipments s ON s.order_id = o.id
  WHERE o.status = 'shipped' AND o.created_at >= NOW() - INTERVAL '7 days'
  GROUP BY o.id, c.id
  ORDER BY o.created_at DESC
  LIMIT 50
`;

This generates one query, one round trip, and lets Postgres choose the best join strategy. On a dataset of 100,000 orders, the Prisma-generated version took 320 ms at p95. The raw JOIN took 42 ms. That is 7.6x faster.

The trade-off is that $queryRaw returns untyped results by default. You can type them with a generic, but you lose Prisma’s type checking on the output. If the schema changes, the raw query breaks silently. Use this only for read-heavy aggregate queries where the performance difference is measurable. For CRUD operations, Prisma’s builder is fast enough and the type safety is worth the overhead.
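One way to soften the loss of type checking is to validate raw rows at runtime before the rest of the code trusts them. The sketch below is a hand-written type guard. The OrderRow shape is a hypothetical subset of the columns selected above, and it assumes id is a string. A schema library such as Zod would serve the same purpose.

```typescript
// Hypothetical row shape; adjust to the columns your raw query selects.
interface OrderRow {
  id: string; // assumption: string IDs
  status: string;
  created_at: Date;
  line_items: unknown[];
}

/** Runtime check that a value has the fields and types OrderRow promises. */
function isOrderRow(value: unknown): value is OrderRow {
  if (typeof value !== 'object' || value === null) return false;
  const row = value as Record<string, unknown>;
  return (
    typeof row.id === 'string' &&
    typeof row.status === 'string' &&
    row.created_at instanceof Date &&
    Array.isArray(row.line_items)
  );
}

/** Fail loudly on the first row that does not match the expected shape. */
function parseOrderRows(rows: unknown[]): OrderRow[] {
  return rows.map((row, index) => {
    if (!isOrderRow(row)) {
      throw new Error(`Unexpected shape for raw query row ${index}`);
    }
    return row;
  });
}
```

Passing the result of $queryRaw through parseOrderRows turns a silent schema drift into an immediate, descriptive error at the query boundary.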

Interactive transactions and batch processing

By default, each Prisma operation runs in its own implicit transaction. Multi-step workflows need an interactive transaction, but keep it short: a long-running transaction holds its connection and its row locks until commit, which causes contention even at the default isolation level (Read Committed).

Here is the pattern for processing a batch of items where each item needs read, compute, and write:

// Slow: Sequential reads and writes, each in its own implicit transaction
for (const item of batch) {
  const inventory = await prisma.inventory.findUnique({
    where: { sku: item.sku }
  });

  // findUnique returns null for an unknown SKU, so guard before comparing
  if (inventory && inventory.quantity >= item.quantity) {
    await prisma.inventory.update({
      where: { sku: item.sku },
      data: { quantity: { decrement: item.quantity } }
    });
  }
}

Each statement runs in its own implicit transaction. If the batch has 100 items, that is up to 200 transactions and 200 round trips. The database spends most of its time committing.

The fix is to batch the work into a single interactive transaction, and to fall back to $executeRaw for the conditional update if updateMany does not cover it:

// Fast: One transaction, batch logic in application code
await prisma.$transaction(async (tx) => {
  for (const item of batch) {
    // Each query still triggers a round trip within the transaction
    const inventory = await tx.inventory.findUnique({
      where: { sku: item.sku }
    });

    // findUnique returns null for an unknown SKU, so guard before comparing
    if (inventory && inventory.quantity >= item.quantity) {
      await tx.inventory.update({
        where: { sku: item.sku },
        data: { quantity: { decrement: item.quantity } }
      });
    }
  }
});

This is still two round trips per item (200 in total), but they happen inside one transaction, so the database commits once. On our benchmark, this cut total time from 4,200 ms to 650 ms for a batch of 100 items with 3 ms latency. Interactive transactions time out after 5 seconds by default, so pass a larger timeout option to $transaction for big batches.
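The read can often be dropped altogether by moving the stock check into the update's where clause. That halves the round trips and closes the gap between checking and decrementing. The sketch below assumes the same inventory model. reserveStock and the InventoryTx interface are hypothetical names, and the interface is a simplified stand-in for the transaction client.

```typescript
interface StockItem {
  sku: string;
  quantity: number;
}

// Simplified stand-in for a Prisma transaction client (assumption).
interface InventoryTx {
  inventory: {
    updateMany(args: {
      where: { sku: string; quantity: { gte: number } };
      data: { quantity: { decrement: number } };
    }): Promise<{ count: number }>;
  };
}

/** Decrement stock only where enough is left; returns the SKUs left unchanged. */
async function reserveStock(tx: InventoryTx, batch: StockItem[]): Promise<string[]> {
  const unfilled: string[] = [];
  for (const item of batch) {
    // Check and decrement happen in one statement: one round trip per item.
    const { count } = await tx.inventory.updateMany({
      where: { sku: item.sku, quantity: { gte: item.quantity } },
      data: { quantity: { decrement: item.quantity } },
    });
    // count is 0 when the SKU is unknown or has too little stock.
    if (count === 0) unfilled.push(item.sku);
  }
  return unfilled;
}
```

A call might look like prisma.$transaction((tx) => reserveStock(tx, batch)), with the returned SKUs used to decide whether to roll back or report a partial fill.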

For truly bulk operations where latency is critical, use a raw multi-row update:

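import { Prisma } from '@prisma/client';

// Prisma.join and Prisma.sql live on the Prisma namespace, not the client instance.
// batch must be non-empty: Prisma.join rejects an empty array.
await prisma.$executeRaw`
  UPDATE inventory
  SET quantity = inventory.quantity - data.qty
  FROM (VALUES
    ${Prisma.join(
      batch.map(item => Prisma.sql`(${item.sku}::text, ${item.quantity}::int)`),
      ','
    )}
  ) AS data(sku, qty)
  WHERE inventory.sku = data.sku
    AND inventory.quantity >= data.qty
`;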

This is one query, one round trip, regardless of batch size. For 1,000 items, this runs in about 15 ms versus 6,500 ms for the sequential loop. The database does the work in a single statement.
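One caveat with UPDATE ... FROM: when the VALUES list contains the same SKU twice, Postgres applies only one of the matching rows to each inventory row. A small pre-processing step that sums quantities per SKU avoids this. mergeBySku is a hypothetical helper name, and the sketch assumes the batch items carry sku and quantity as in the examples above.

```typescript
interface StockItem {
  sku: string;
  quantity: number;
}

/** Sum quantities per SKU so each SKU appears exactly once in the VALUES list. */
function mergeBySku(batch: StockItem[]): StockItem[] {
  const totals = new Map<string, number>();
  for (const { sku, quantity } of batch) {
    totals.set(sku, (totals.get(sku) ?? 0) + quantity);
  }
  // Map preserves insertion order, so SKUs keep their first-seen order.
  return Array.from(totals, ([sku, quantity]) => ({ sku, quantity }));
}
```

Running the raw update over mergeBySku(batch) instead of batch keeps the single-statement speed while making the decrement match the total requested per SKU.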

Cursor-based pagination is mandatory at scale

Prisma supports both offset pagination (skip/take) and cursor-based pagination. Offset pagination is the default in every tutorial. It is also the one that breaks under load.

// Fragile: Offset pagination, O(n) scan for every page
const page = await prisma.order.findMany({
  where: { customerId },
  orderBy: { createdAt: 'desc' },
  skip: 10000,
  take: 50
});

Every call with skip: 10000 makes Postgres read and discard the first 10,000 rows. As the offset grows, the query gets slower. At page 200 (skip 10,000), this query takes about 150 ms. At page 2,000 (skip 100,000), it takes 800 ms. The time grows with the offset because Postgres must walk every qualifying row before it can return the slice.

Cursor-based pagination avoids this entirely:

// Fast: Cursor-based, O(log n) lookup regardless of page depth
const page = await prisma.order.findMany({
  where: { customerId },
  orderBy: { createdAt: 'desc' },
  take: 51, // Fetch one extra to check if there is a next page
  cursor: lastCursor ? { id: lastCursor } : undefined,
  skip: lastCursor ? 1 : 0 // Skip the cursor row itself
});

const hasMore = page.length === 51;
const results = hasMore ? page.slice(0, 50) : page;
const nextCursor = hasMore ? results[results.length - 1].id : null;

The cursor lookup uses the primary key index, which is O(log n). It does not depend on how many rows came before. At page 200 or page 20,000, the query takes the same time. In our benchmarks, cursor-based pagination stayed flat at 2-5 ms per page regardless of depth. Offset pagination took 800 ms at depth 100,000.
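The fetch-one-extra logic is easy to get subtly wrong when copied between endpoints, so it can help to isolate it. The sketch below wraps it in a generic helper. paginate, Page, and the fetchRows callback are hypothetical names, and the callback is assumed to wrap the findMany call with its cursor and skip arguments.

```typescript
interface Page<T> {
  results: T[];
  nextCursor: string | null;
}

/**
 * Fetch pageSize + 1 rows and use the extra row only to detect a next page.
 * `fetchRows` is a hypothetical callback that wraps the findMany call.
 */
async function paginate<T extends { id: string }>(
  fetchRows: (take: number, cursor: string | null) => Promise<T[]>,
  pageSize: number,
  cursor: string | null
): Promise<Page<T>> {
  const rows = await fetchRows(pageSize + 1, cursor);
  const hasMore = rows.length > pageSize;
  const results = hasMore ? rows.slice(0, pageSize) : rows;
  // The cursor for the next page is the last row actually returned.
  const nextCursor = hasMore ? results[results.length - 1].id : null;
  return { results, nextCursor };
}
```

The callback would pass take straight to findMany and translate cursor into cursor: { id: cursor } with skip: 1, exactly as in the query above.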

Prisma’s cursor pagination requires a unique field for the cursor. id works. For compound cursors (e.g., createdAt + id for tiebreaking), declare a compound unique constraint in the schema with @@unique([createdAt, id]), order by both fields, and pass the compound value:

const page = await prisma.order.findMany({
  where: { customerId },
  // Order by the tiebreaker too, so rows with equal createdAt stay stable
  orderBy: [{ createdAt: 'desc' }, { id: 'desc' }],
  take: 51,
  cursor: lastCursor ? {
    createdAt_id: { createdAt: lastCursor.createdAt, id: lastCursor.id }
  } : undefined,
  skip: lastCursor ? 1 : 0
});

Prisma generates the createdAt_id cursor field only when @@unique([createdAt, id]) is declared in the schema. Because id is already unique on its own, nothing forces you to add the constraint, but without it the compound cursor is rejected by the generated types, or at runtime in plain JavaScript.

Middleware and extension overhead

Prisma middleware ($use) and client extensions ($extends) let you add logging, caching, and authorization checks to every query. $use has been deprecated since Prisma 4.16 in favor of $extends, but the cost model is the same for both: poorly written hooks wrap every query in extra async work that compounds with every query in a chain.

// Bad: Logging middleware that awaits a write on every query
prisma.$use(async (params, next) => {
  const start = performance.now();
  const result = await next(params);
  const elapsed = performance.now() - start;

  // This awaits a file append on every single query
  await fsPromises.appendFile('/var/log/prisma.log',
    `${params.model}.${params.action} ${elapsed}ms\n`);

  return result;
});

If the log write takes 1 ms, every Prisma query takes at least 1 extra millisecond. Middleware runs once per Prisma Client operation, not once per generated SQL statement, so a nested include pays the cost once, but a loop of 1,000 create calls pays it 1,000 times: a full second of logging. Remove awaited I/O from middleware paths. Use a buffered logger or push to a background queue:

// Good: Fire-and-forget, zero blocking
prisma.$use(async (params, next) => {
  const start = performance.now();
  const result = await next(params);
  const elapsed = performance.now() - start;

  setImmediate(() => {
    logger.info({
      model: params.model,
      action: params.action,
      duration: elapsed
    });
  });

  return result;
});

The same applies to caching extensions. Never await a cache write in the hot path. Set it and forget it.
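The buffered logger mentioned above can be very small. The sketch below shows one possible shape. BufferedQueryLog, its sink callback, and the default threshold of 100 entries are assumptions for illustration, and a production logger would also flush on a timer and at shutdown.

```typescript
interface QueryLogEntry {
  model?: string;
  action: string;
  durationMs: number;
}

/** Collects entries in memory and hands them to `sink` in batches. */
class BufferedQueryLog {
  private buffer: QueryLogEntry[] = [];

  constructor(
    private readonly sink: (entries: QueryLogEntry[]) => Promise<void>,
    private readonly maxBuffered = 100 // assumption: tune to your log volume
  ) {}

  /** Synchronous and cheap: safe to call from middleware without awaiting. */
  record(entry: QueryLogEntry): void {
    this.buffer.push(entry);
    if (this.buffer.length >= this.maxBuffered) {
      // Fire-and-forget; swallow sink errors so logging never fails a query.
      void this.flush().catch(() => undefined);
    }
  }

  /** Write out everything buffered so far, e.g. on a timer or at shutdown. */
  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    const entries = this.buffer;
    this.buffer = [];
    await this.sink(entries);
  }
}
```

Inside the middleware, the setImmediate call becomes a plain queryLog.record({ ... }), and the sink performs one file append or network call per batch instead of one per query.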

The practical takeaway

Prisma’s defaults are optimized for developer experience, not production throughput. Every team that deploys Prisma to production hits at least two of the traps above within the first month. Here is the checklist to apply before your next deployment:

Set connection_limit in your datasource URL. Start at 20 for a 4 vCPU instance and tune from there. Monitor pg_stat_activity for idle-in-transaction connections.
Replace loops with createMany and updateMany. If the API does not support your operation, use $executeRaw instead of writing a for loop.
Profile every query that has more than one include. If the query log shows multiple independent SELECTs and the endpoint serves more than 100 requests per second, replace it with a raw JOIN.
Use cursor-based pagination for any endpoint that allows page numbers beyond page 100. Offset pagination is a linear-time trap.
Remove async I/O from middleware. Log asynchronously. Cache writes asynchronously. Never await side effects in query middleware.

The final test: run your top five endpoints through prisma.$on('query', ...) and count the number of SQL queries each one generates. The client only emits these events when it is created with log: [{ emit: 'event', level: 'query' }]. Any endpoint that produces more than three SQL statements per HTTP request is a candidate for raw SQL or a restructured schema.
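Counting can be automated with a small listener. The sketch below is one way to do it. attachQueryCounter and both interfaces are hypothetical names, and QueryEventSource is a simplified stand-in for a PrismaClient that was constructed with query events enabled.

```typescript
// Simplified stand-in for a PrismaClient created with
// log: [{ emit: 'event', level: 'query' }] (assumption).
interface QueryEventSource {
  $on(
    event: 'query',
    listener: (e: { query: string; duration: number }) => void
  ): void;
}

interface QueryCounter {
  snapshot(): { count: number; totalDurationMs: number };
  reset(): void;
}

/** Count SQL statements and sum their reported durations (milliseconds). */
function attachQueryCounter(client: QueryEventSource): QueryCounter {
  let count = 0;
  let totalDurationMs = 0;
  client.$on('query', (e) => {
    count += 1;
    totalDurationMs += e.duration;
  });
  return {
    snapshot: () => ({ count, totalDurationMs }),
    reset: () => {
      count = 0;
      totalDurationMs = 0;
    },
  };
}
```

Call reset() before hitting an endpoint in isolation and snapshot() afterward. The counter is process-wide, so under concurrent traffic the numbers mix requests together; per-request attribution would need something like AsyncLocalStorage.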

A 12-second order creation endpoint dropped to 300 ms with these changes. No database upgrade. No schema redesign. Just the pool, the batch, and the raw query.

A note from Yojji

The kind of database-layer work that digs into connection pool sizing, batch statement planning, and the gap between Prisma’s generated SQL and optimal Postgres execution is the work that separates applications that scale from applications that need emergency rewrites. It is also the exact kind of backend engineering Yojji has been delivering since 2016.

Yojji is an international custom software development company with offices across Europe, the US, and the UK. Their teams build production Node.js systems using Prisma, Postgres, and the full JavaScript ecosystem, and they treat query profiling and connection management as standard practice, not post-launch firefighting.
