惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
G
GRAHAM CLULEY
P
Privacy & Cybersecurity Law Blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
宝玉的分享
宝玉的分享
P
Proofpoint News Feed
H
Help Net Security
V
Visual Studio Blog
阮一峰的网络日志
阮一峰的网络日志
C
Cisco Blogs
人人都是产品经理
人人都是产品经理
Know Your Adversary
Know Your Adversary
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
Recorded Future
Recorded Future
I
Intezer
罗磊的独立博客
T
The Exploit Database - CXSecurity.com
Blog — PlanetScale
Blog — PlanetScale
Malwarebytes
Malwarebytes
Spread Privacy
Spread Privacy
T
Tor Project blog
V
Vulnerabilities – Threatpost
云风的 BLOG
云风的 BLOG
腾讯CDC
B
Blog RSS Feed
Stack Overflow Blog
Stack Overflow Blog
F
Future of Privacy Forum
MyScale Blog
MyScale Blog
Latest news
Latest news
IT之家
IT之家
MongoDB | Blog
MongoDB | Blog
The Hacker News
The Hacker News
S
Securelist
博客园 - 【当耐特】
C
CXSECURITY Database RSS Feed - CXSecurity.com
T
Threat Research - Cisco Blogs
Jina AI
Jina AI
Cisco Talos Blog
Cisco Talos Blog
B
Blog
博客园 - 三生石上(FineUI控件)
Last Week in AI
Last Week in AI
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
M
MIT News - Artificial intelligence
V
V2EX
D
Darknet – Hacking Tools, Hacker News & Cyber Security
The Cloudflare Blog
The GitHub Blog
The GitHub Blog
博客园 - 聂微东
F
Full Disclosure
C
CERT Recently Published Vulnerability Notes

DEV Community

I built a CLI that eliminates README reading forever Measuring AI Gateway Failover: 30 Days of Production Data The Folly of Global AI Platforms: Or How We Built a System That Actually Works in Cameroon Week 9 The 10-Minute Race: Scaling the "Cancel Order" Button to 100K+ Requests Per Second Tutorial: This AI Now Tells You if a Meeting Could Be an Email Why I Got Tired of Class-Heavy UI Code and Started Building Around Attributes GitHub Is No Longer a Place for Serious Work Build an AI-Powered Developer Portal with Backstage and .NET Updates to developer experience on Setapp Node.Js Express CRUD template Lint Your Phishing Templates Like You Lint Your Code From Code to Cloud: 3 Labs for Deploying Your AI Agent I built Voice2Sub: a local AI subtitle generator for video and audio The OCR Rabbit Hole Built a 100k-Document RAG System by Hand. Hermes Read the Architecture in 47 Seconds. I tried monetizing my MCP server with x402 — production needs more than npm install Understanding Tracking Dimensions in Accounting Integrations I Ran My Local, NOT AI, AI Code Auditor on Its Own Source Code Agent Surface Map: Gemma 4 review before you install an MCP Stop Being Nice, Start Being Right": The Day My User Reconfigured My Reward Function Building a Database Performance Testing Tool With AI: The Honest Breakdown Hot To Run LLMs Locally Research blockchain with post-quantum Dilithium and custom zk-STARKs from scratch AI agents do not just need tool access. They need execution control. The CTO’s Blueprint for Governing Multi-Agent AI Systems in the Enterprise I audited our CMS and 86% of our articles were invisible. A Sanity gotcha. Upselling Explained Industry-Specific Tactics for EC Owners 2026 I Keep Hermes Agent's Self-Improvement OFF For the First 14 Days — Here's What Happens When I Don't I Built the Hermes + Claude Code Dual-Stack: Orchestrator Meets Coder — Here's the Full Architecture Stop Using .iterrows(). Here's What Actually Fast Looks Like I Built a SaaS to Stop the Awkward "Hey, Did You Get My Invoice?" Conversation I Renamed a Hot Postgres Table Without Dropping a Request How to Build a Self-Hosted AI Gateway With LiteLLM and Open WebUI What is a Webhook? A Complete Guide for Beginners Headless BI: How a Universal Semantic Layer Replaces Tool-Specific Models Beyond Translation: A Developer's Guide to App Localization (i18n & l10n) Aegis: Designing an Offline Ambient Co-Working Companion for High-Burnout Medical and STEM Grinds Local LLM Code Completion Showdown: Zed AI vs Continue vs Cursor (Honest 2026 Review) The Agentic Payment Protocol Wars Your No-Code AI Agent Has a Memory Problem The Agentic Payment Protocol Wars How to Bypass LinkedIn Commercial Use Limit in 2026 (Without Paying $150/mo) We built a statechart hosting platform where two actors in the same state can migrate to different versions — here's why that matters Playwright vs TWD: A Frontend Developer's Honest Comparison Claude Code's skillListingBudgetFraction: The Undocumented Setting Silently Killing Half Your Skills O GitHub pode mudar sua carreira mais do que você imagina Just redesigned and launched my developer portfolio 🚀 Would genuinely love some honest feedback from the dev community 👨‍💻 Data Virtualization and the Semantic Layer: Query Without Copying Launching opub: donated compute for open-source maintainers Four iteration rounds on a security scanner I run, all of them visible. Here is what the loop actually looks like. Why Good Abstractions Make Debugging Harder Found a Coordinated Inauthentic Network on GitHub: 24 Accounts, Fabricated History, and a Generator That Left Its PID in Three READMEs Cursor Just Released Composer 2.5. Here's What Actually Changed for AI Coding Agents. What Wrong Docs Cost Test Automation Teams Export Your DeepSeek Chats to Word, PDF, Google Docs, Markdown & Notion in One Click When the Docs Lie OpenShift Observability: Built-in vs. Bring-Your-Own If your AI initiative is pending for 6 months, the bottleneck is probably not technology Hermes Agent Under the Hood: The Open-Source Runtime for Autonomous AI Systems Expert Systems -The AI That Existed Before AI Was Cool AI-generated accessibility, an update — frontier models still fail, but skills change the game My HTML Learning Journey 🚀 The Day PayPal Failed and the Rust Rewrite Saved the Product Launch Google Sheets CRM: 4 Ways I've Actually Done It (with Apps Script Code) BrontoScope: AI-Powered Error Investigations The job of an AI engineer inside a 40-person company is not what most CEOs think it is Building a Clinical Speech-Therapy App With a Real SLP: 4 Lessons From PhoenixSteps 7 overlooked .Net features How Stripe Took 48 Hours and 3 API Calls to Break My Freelance Income Stream in Lagos Pretty normal Both Camps in the 'Left Behind' Argument Are Right About Each Other Flutter MCP Toolkit v3 Google Just Shipped Gemini 3.5 Flash. Here's What Developers Actually Need to Know. 🔐 Working with Private Symfony Recipes Rate limiting in web apps: what to protect before picking a library Rate limiting en aplicaciones web: qué proteger antes de elegir una librería What Are Lakehouse Catalogs? The Role of Catalogs in Apache Iceberg What It Really Takes to Become a Senior Software Engineer Microservices Were Never About Technology JS Crime Scene: The Misleading Array Project-as-code for a Directus v9 backend When the API literally burned your database after a typo COOKIES DPRK Hacking Trends 2026: AI‑Powered Supply Chain and Developer Environment Attacks Phone control for AI coding sessions is not a tiny terminal PayPal and Crypto Are Not Equals: How I Built a Gumroad Alternative for Restricted Countries Exploring Tech as a Content Writer I Raised Gemma 4's Token Cap. The Dense Model Stopped Refusing. React Server Components Don't Make Your App Fast by Default Multi-Stage Builds for a Next.js App — Reduce Image Size by 70% I Built a Chrome Extension That Teaches Vocabulary While You Browse Why I Walked Back from Next.js and RSC to a Plain SPA and a Separate Backend NeuralPocket: Private On-Device AI with Gemma 4 — Android & Web Github Speckit: Revolucionando o Desenvolvimento com SDD Cloud Cost Elasticity I Built a Payment System for Bangladesh—Heres Why Stripe Failed Us Polyglot Persistence in Microservices: Choosing the Right Database for Each Service Centralized Authentication for a Multi-Brand Laravel Ecosystem How I made a perfect recording button. Simple yet complex thing.
SQL Performance: Indexing, Query Tuning & Explain Plans (Developer Guide)
Anup Karanjk · 2026-05-22 · via DEV Community

Anup Karanjkar

A missing index on a foreign key column can turn a 5ms query into a 5-second table scan. A correlated subquery in a WHERE clause can multiply your query time by the number of rows. An ORM that generates N+1 queries can bring down a production API under moderate load. SQL performance problems are almost always fixable — the hard part is knowing where to look. This guide is that map.

All examples use PostgreSQL 16 syntax. The concepts apply to MySQL, SQLite, and most relational databases.

How PostgreSQL Picks a Query Plan

Before optimizing, understand what the planner does. PostgreSQL maintains statistics about tables and uses a cost-based optimizer to choose among many possible query plans. It estimates row counts, considers available indexes, and picks the plan with the lowest estimated cost. The planner is usually right — when it is wrong, it is almost always because statistics are stale or misleading.

-- always run ANALYZE before investigating slow queries
ANALYZE orders;

-- basic EXPLAIN (shows plan, no execution)
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- EXPLAIN ANALYZE (runs the query, shows actual vs estimated rows)
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.email
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '7 days';

Enter fullscreen mode Exit fullscreen mode

The output to look for: Seq Scan (table scan — usually bad on large tables), Index Scan (good), Bitmap Index Scan (good for low selectivity), and Hash Join vs Nested Loop vs Merge Join. A large discrepancy between "rows estimated" and "rows actual" means the planner has bad statistics.

B-Tree Indexes: The Foundation

-- basic single-column index
CREATE INDEX idx_orders_customer_id ON orders (customer_id);

-- index on a frequently filtered status column
CREATE INDEX idx_orders_status ON orders (status);

-- partial index — only index the rows you actually query
-- much smaller, faster to maintain
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';

-- unique index (enforces uniqueness + faster lookups)
CREATE UNIQUE INDEX idx_users_email ON users (lower(email));

-- expression index — index a computed value
CREATE INDEX idx_users_email_lower ON users (lower(email));
-- now this query uses the index:
-- SELECT * FROM users WHERE lower(email) = 'user@example.com';

Enter fullscreen mode Exit fullscreen mode

Composite Indexes: Column Order Matters

A composite index on (a, b, c) can satisfy queries on a, a, b, and a, b, c — but NOT on b alone or c alone. The leading column rule is the most commonly misunderstood fact about composite indexes.

-- this index supports:
-- WHERE status = 'pending'                          ✓
-- WHERE status = 'pending' AND created_at > ...    ✓
-- WHERE status = 'pending' AND created_at > ... AND customer_id = 42  ✓
-- WHERE created_at > ...                            ✗ (can't skip status)
CREATE INDEX idx_orders_status_created_customer
  ON orders (status, created_at, customer_id);

-- covering index — includes all columns the query needs, zero heap fetches
CREATE INDEX idx_orders_covering ON orders (customer_id, status)
INCLUDE (total, created_at);

-- query that uses the covering index with no heap access:
SELECT total, created_at
FROM orders
WHERE customer_id = 42 AND status = 'shipped';

Enter fullscreen mode Exit fullscreen mode

Reading EXPLAIN ANALYZE Output

EXPLAIN (ANALYZE, BUFFERS)
SELECT p.name, SUM(oi.quantity * oi.unit_price) AS revenue
FROM order_items oi
JOIN products p ON p.id = oi.product_id
WHERE oi.created_at BETWEEN '2026-01-01' AND '2026-03-31'
GROUP BY p.id, p.name
ORDER BY revenue DESC
LIMIT 20;

/*
 Sample output sections to read:

 -> Hash Join  (cost=1234.56..5678.90 rows=10000 width=48)
              (actual time=45.123..98.456 rows=8734 loops=1)
   Hash Cond: (oi.product_id = p.id)
   Buffers: shared hit=1023 read=234

   -> Bitmap Heap Scan on order_items oi
         (actual time=12.3..34.5 rows=92345 loops=1)
      Recheck Cond: (created_at BETWEEN ...)
      ->  Bitmap Index Scan on idx_oi_created_at
            (actual time=8.9..8.9 rows=92345 loops=1)

 Planning Time: 2.1 ms
 Execution Time: 102.3 ms
*/

Enter fullscreen mode Exit fullscreen mode

Key metrics: actual time is the real wall time. rows discrepancy is your optimization signal. Buffers: read=N means disk I/O — large values indicate missing indexes or cold cache.

Eliminating N+1 Queries

The N+1 problem happens when you fetch a list of records, then run one query per record to fetch related data. In SQL, the fix is always a JOIN.

-- BAD: 1 query for orders + N queries for customers
SELECT * FROM orders WHERE status = 'pending';
-- then for each order:
SELECT * FROM customers WHERE id = ?;

-- GOOD: single query with JOIN
SELECT
  o.id,
  o.total,
  o.created_at,
  c.id         AS customer_id,
  c.email      AS customer_email,
  c.first_name,
  c.last_name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
ORDER BY o.created_at DESC;

Enter fullscreen mode Exit fullscreen mode

-- fetching nested aggregates without N+1
-- BAD: separate query per order for item count
SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '30 days';
-- then: SELECT COUNT(*) FROM order_items WHERE order_id = ?

-- GOOD: aggregate in JOIN
SELECT
  o.id,
  o.total,
  COUNT(oi.id)              AS item_count,
  SUM(oi.quantity)          AS total_units
FROM orders o
LEFT JOIN order_items oi ON oi.order_id = o.id
WHERE o.created_at > NOW() - INTERVAL '30 days'
GROUP BY o.id, o.total
ORDER BY o.created_at DESC;

Enter fullscreen mode Exit fullscreen mode

CTEs vs Subqueries

-- subquery in FROM (derived table) — planner can inline and optimize
SELECT p.name, revenue_data.total_revenue
FROM products p
JOIN (
  SELECT product_id, SUM(quantity * unit_price) AS total_revenue
  FROM order_items
  WHERE created_at > NOW() - INTERVAL '90 days'
  GROUP BY product_id
) revenue_data ON revenue_data.product_id = p.id
ORDER BY revenue_data.total_revenue DESC;

-- CTE — cleaner syntax, same performance in Postgres 12+
-- (CTEs are now inlined by default unless MATERIALIZED is specified)
WITH revenue_by_product AS (
  SELECT
    product_id,
    SUM(quantity * unit_price) AS total_revenue,
    COUNT(DISTINCT order_id)   AS order_count
  FROM order_items
  WHERE created_at > NOW() - INTERVAL '90 days'
  GROUP BY product_id
),
top_products AS (
  SELECT product_id, total_revenue, order_count
  FROM revenue_by_product
  WHERE total_revenue > 1000
)
SELECT p.name, tp.total_revenue, tp.order_count
FROM top_products tp
JOIN products p ON p.id = tp.product_id
ORDER BY tp.total_revenue DESC
LIMIT 50;

-- force materialization when the CTE result is expensive + reused
WITH MATERIALIZED expensive_aggregation AS (
  SELECT customer_id, AVG(total) AS avg_order_value
  FROM orders
  WHERE created_at > NOW() - INTERVAL '1 year'
  GROUP BY customer_id
)
SELECT c.email, ea.avg_order_value
FROM expensive_aggregation ea
JOIN customers c ON c.id = ea.customer_id
WHERE ea.avg_order_value > 200;

Enter fullscreen mode Exit fullscreen mode

Window Functions

Window functions compute a value across a set of rows related to the current row — without collapsing them into a single GROUP BY result. They are one of the most powerful SQL features for analytics.

-- running total
SELECT
  created_at::date      AS date,
  SUM(total)            AS daily_revenue,
  SUM(SUM(total)) OVER (
    ORDER BY created_at::date
    ROWS UNBOUNDED PRECEDING
  )                     AS cumulative_revenue
FROM orders
WHERE created_at >= '2026-01-01'
GROUP BY created_at::date
ORDER BY date;

-- rank products by revenue within each category
SELECT
  p.name,
  p.category,
  SUM(oi.quantity * oi.unit_price)          AS revenue,
  RANK() OVER (
    PARTITION BY p.category
    ORDER BY SUM(oi.quantity * oi.unit_price) DESC
  )                                          AS rank_in_category
FROM products p
JOIN order_items oi ON oi.product_id = p.id
GROUP BY p.id, p.name, p.category
ORDER BY p.category, rank_in_category;

-- compare to previous period (LAG)
SELECT
  month,
  revenue,
  LAG(revenue) OVER (ORDER BY month)  AS prev_month_revenue,
  ROUND(
    (revenue - LAG(revenue) OVER (ORDER BY month))
    / NULLIF(LAG(revenue) OVER (ORDER BY month), 0) * 100,
    2
  )                                     AS pct_change
FROM (
  SELECT
    DATE_TRUNC('month', created_at) AS month,
    SUM(total)                      AS revenue
  FROM orders
  GROUP BY 1
) monthly
ORDER BY month;

-- deduplicate: keep latest record per customer
SELECT DISTINCT ON (customer_id)
  customer_id, id AS order_id, created_at, total
FROM orders
ORDER BY customer_id, created_at DESC;

Enter fullscreen mode Exit fullscreen mode

Index Maintenance

-- find unused indexes (waste space, slow writes)
SELECT
  schemaname,
  tablename,
  indexname,
  idx_scan,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE 'pg_%'
ORDER BY pg_relation_size(indexrelid) DESC;

-- find missing indexes on foreign keys
SELECT
  tc.table_name,
  kcu.column_name,
  ccu.table_name AS references_table
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
  ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage ccu
  ON tc.constraint_name = ccu.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY'
  AND NOT EXISTS (
    SELECT 1 FROM pg_indexes
    WHERE tablename = tc.table_name
      AND indexdef LIKE '%' || kcu.column_name || '%'
  );

-- rebuild bloated indexes concurrently (no table lock)
REINDEX INDEX CONCURRENTLY idx_orders_customer_id;

Enter fullscreen mode Exit fullscreen mode

People Also Ask

When should I use a partial index instead of a full index?

Use a partial index when your queries consistently filter on a specific condition — like WHERE status = 'pending' or WHERE deleted_at IS NULL. A partial index only indexes rows matching the condition, so it is smaller (faster to build, cheaper to maintain) and has higher selectivity (more likely to be used by the planner). The tradeoff is that it only helps queries that include the matching WHERE clause.

Why does adding an index sometimes make queries slower?

For very low-selectivity queries (e.g., WHERE status IN ('a', 'b', 'c') matching 80% of rows), a sequential scan is actually faster than an index scan because the index forces random I/O to fetch each heap page individually. The planner knows this and will choose a seq scan. You can force it with hints in development, but in production you should trust the planner — and investigate its statistics if you think it is wrong.

What is the difference between EXPLAIN and EXPLAIN ANALYZE?

EXPLAIN shows the query plan the planner would use — it does not execute the query. EXPLAIN ANALYZE actually runs the query and shows both the estimated and actual row counts and timings. Always use EXPLAIN ANALYZE with BUFFERS (EXPLAIN (ANALYZE, BUFFERS)) when debugging performance issues so you can see disk I/O. Be careful: EXPLAIN ANALYZE on a DELETE or UPDATE will actually execute those statements, so wrap them in a transaction you roll back if needed.

Originally published at wowhow.cloud