惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

N
News and Events Feed by Topic
Malwarebytes
Malwarebytes
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
C
Cybersecurity and Infrastructure Security Agency CISA
F
Future of Privacy Forum
C
Cisco Blogs
T
The Exploit Database - CXSecurity.com
A
Arctic Wolf
S
Securelist
K
Kaspersky official blog
S
Schneier on Security
T
ThreatConnect
T
Tenable Blog
Spread Privacy
Spread Privacy
T
True Tiger Recordings
AWS News Blog
AWS News Blog
F
Fox-IT International blog
量子位
T
Threatpost
V
Vulnerabilities – Threatpost
C
CERT Recently Published Vulnerability Notes
Cisco Talos Blog
Cisco Talos Blog
GbyAI
GbyAI
宝玉的分享
宝玉的分享
腾讯CDC
G
Google Developers Blog
aimingoo的专栏
aimingoo的专栏
Cyberwarzone
Cyberwarzone
有赞技术团队
有赞技术团队
S
SegmentFault 最新的问题
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
V
Visual Studio Blog
U
Unit 42
雷峰网
雷峰网
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
Simon Willison's Weblog
Simon Willison's Weblog
O
OpenAI News
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
The GitHub Blog
The GitHub Blog
The Register - Security
The Register - Security
MyScale Blog
MyScale Blog
小众软件
小众软件
A
About on SuperTechFans
Last Week in AI
Last Week in AI
Y
Y Combinator Blog
博客园 - 三生石上(FineUI控件)
美团技术团队
Google Online Security Blog
Google Online Security Blog
P
Proofpoint News Feed
MongoDB | Blog
MongoDB | Blog

DEV Community

Principal Components in TypeScript (Part 3) Gemma 4 CAD Orchestrator Live Holographic Editor In Fractal Time Everbench: A document management system with Local Intelligence Instanton in Fractal Time The Hidden Features of Claude How I Built an AI News Brief with Next.js, Supabase, Vercel, and GPT-4o-mini How We Built a Multi-Agent AI Documentation System (And What We Learned) I got tired of writing post-mortems — so I built RCAi for SREs MIA: A Futuristic AI Desktop Assistant Built with Voice, Gestures, and Controlled Chaos Best Programming Language for Backend Web Development: PHP vs Python PayPal Alternatives for Indian Businesses: Best Payment Gateways for International Card Payments (2026) Gemma 4 Made Me Rethink Local AI: Not Just Text, But Images Too Clean Architecture in .NET Explained (The Dependency Rule) I Compiled Rust to WebAssembly and Made My JavaScript 6 Faster Outlook.com Is the Final Boss of 'Just Send an Email' Conditional Statements and Control Flow in Python Insults & Cutlasses, Local LLM Sword Fighting on Melee Island Production Lab: ECS Fargate + Prometheus + Grafana + Loki + Alloy + Node Exporter How 12 AI agent frameworks handle human approval (most badly) The Four-Index Reality: Why AI Search Isn't One Thing I Scanned 1 Million AI Services. Here's What Worries Me More Than the Vulnerabilities Managing multiple docker hub accounts using docker-use System Design Interview: Decentralized Web Crawler Metric Cardinality: High or Low? 4 Steps to Making the Right Choice 로컬 LLM 셋업 가이드 (v23) GEO vs SEO in 2026 — What Google's May Guidance Changed Cursor Review 2026 — Honest 'Not For Me' Take From a VSCode User Hello from rikuq — a practitioner blog for solo AI SaaS founders Why DevOps Engineers Need Practical Tutorials, Not Just Theory AI Agents in CI/CD: Give Them Context, Not Production Authority Now I See Why Translators Are Panicking Over AI—Should Coders Panic Too? Why I Track HRV Every Morning (And How It Actually Changes My Day) Diffusion Language Models: How NVIDIA's Nemotron-Labs DLM Is Killing Token-by-Token Generation Chatbots GPT pour le support client : ce que les équipes françaises ont réellement besoin de savoir I Hit the 1,232-Byte Wall So You Don't Have To Google Just Rebuilt the Search Box (Again) — But This Time It's Different Aether: A local Android assistant built with Gemma 4 BoxAgnts Introduction (1) — Out of the Box mkdev: trusted HTTPS for localhost, mapped by name Just one question, one answer. Why Java Still Rules the Programming World in 2026 Four Architectures for Letting Claude Edit Elementor (and Why We Shipped Clone-and-Mutate) yard-yaml 0.1.1: safer UTF-8 handling for YAML documentation I Built a Mac App That Keeps Your Clipboard in Sync Across All Your Android Devices Stop Using UUIDs: Why B2B SaaS Needs ULIDs in Laravel 🐘 I'm a non-technical founder who built a Slack approval tool. Here's what actually broke first. Open-Sourcing Our Game AI Stack — SDKs, Templates, and CLI Tools for NPC Dialogue I Built an AI System That Makes 1,000 Decisions a Day. Here's Where I Drew the Line. Lets Encrypt DNS Challenge with Traefik and AWS Route 53 Building an agent-ready website: how to make your site readable for ChatGPT, Perplexity and autonomous agents A productivity tool with GitHub as your cloud database How We Built Dynamic NPC Dialogue with LLMs — Lessons from Early Access cmux: The Native macOS Terminal Built for Running AI Coding Agents in Parallel Deep Atlantic Storage: Rewriting in Rust How I Built a Bulk Image Optimizer with $0 Server Costs Using Vanilla JS and Canvas API Humans and Machines read differently, I think I have a fix? Claude Code Deleted 92 Images Without Asking. This Happens More Than You Think. Method Calling Stack in Java I Built Schedule Sensei & Pushed It to GitHub – Here's What's Inside (And I Need Your Help 👀) OIC: From a Working Toast Watcher to a General "Watch It for Me" Agent Memory is two-thirds of what an AI chip costs to build The XState persistence problem is five years old. Here is what we built to finally solve it. i added MCP support to my SaaS in an afternoon. here's the whole thing. Framework: Link Building ☁️ Importing existing S3 buckets into Terraform state made easy with terraform import existing s3 bucket I Built a Token System on Solana (Without Any Backend Code) 터미널 AI 에이전트 구축 (v21) I Built an AI 3D Model Generator — Here's How I Handle Meshes in the Browser 🛡️ PromptGuard: I Built a Local AI Privacy Firewall That Sanitizes Your Prompts Before They Leave Your Machine PostgreSQL WAL Bloat: Why Automatic Management Is Often Insufficient? Seven PRs Before Lunch: Parallel Claude Code Tabs Plus Audit-Before-Bump Deployment using all three Kubernetes probes Qwen 3.6 Has Four Tiers. Here's How to Route Without Burning Cash. RAG 시스템 실전 구축 (v21) How I handle my errors in PHP The Blind Spot in Treasure Hunt Engine Configuration: Long-Term Server Health Run NVIDIA NIM on Your Own GPU — Same API, Different Endpoint Webflow SEO Implementation 로컬 LLM 셋업 가이드 (v21) How Logs Travel From Your EKS Pod to Datadog 𝗦𝘁𝗼𝗽 𝗖𝗿𝗮𝗺𝗺𝗶𝗻𝗴 𝗙𝗼𝗿 𝗘𝘅𝗮𝗺𝘀, 𝗦𝘁𝗮𝗿𝘁 𝗕𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗥𝗲𝗮𝗹 𝗦𝗸𝗶𝗹𝗹𝘀 How to Use EXPLAIN ANALYZE in PostgreSQL: A Visual Guide gRPC Performance: tonic (Rust) vs grpc-go Benchmarked at Scale Hack The Box (HTB): Cap Machine (Full Walkthrough) Visual Search Optimization studygemma: AI study buddy for CS students Architectural Tradeoffs in Webhook Idempotency and SaaS API Versioning One Open Source Project a Day (No. 75): Understand Anything - The AI Engine That Turns Any Codebase Into an Explorable Knowledge Graph From mock-only-works to real-world-works: 48 hours of reCAPTCHA debugging I built a free music tool AI Talking Avatar Pipelines Broke Our Ad CTR by 3.7% 800G to 400G Breakout: How to Scale 400G Networks with 800G Ports 터미널 AI 에이전트 구축 (v20) Topical Authority Architecture Inside Hermes Agent's Session Memory: What X-Hermes-Session-Id Actually Does How Logs Travel From Your EKS Pod to Datadog The Hidden Journey Inside / Kubernetes Is it safe to connect my bank account to AI? No Room — The World of Aying (8/12) Fossils — The World of Aying (10/12)
I built a local Postgres triage co-pilot because HIPAA says I can't paste plans into ChatGPT or Claude
Sireesha Cha · 2026-05-25 · via DEV Community

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

I am a Staff DBA at a top public healthcare company. I look after MS SQL Server and PostgreSQL fleets, and every one of our database servers is on-prem and air-gapped. We do not have the option to use a frontier LLM for production work, because of HIPAA. Query plans, table names, filter literals, none of it is allowed to leave our network.

That means when one of our Postgres instances starts misbehaving at 3 AM, I cannot drop an EXPLAIN plan into frontier models like ChatGPT or Claude and ask what changed. I read the JSON by hand, the way I did in 2015. Most of my colleagues do the same.

I built Plansmith for myself. You feed it two EXPLAIN (ANALYZE, FORMAT JSON) outputs, a baseline from when the query was healthy and the incident plan from now, and it gives you a triage runbook in plain English: what changed in the plan, ranked root cause hypotheses, things you can try in the next five minutes, and the durable fix. The whole pipeline runs locally, including the model. The only outbound network call Plansmith makes is to 127.0.0.1:11434, which is Ollama on the same laptop.

The piece that makes a small local model work for this is that Gemma 4 is never asked to read the raw plan JSON. A deterministic Python pass walks both plan trees first and pulls out a structured diff. Scan method flipped from Index Scan to Seq Scan. Planner row estimate off by 124,000x. Sort spilled to disk, hash join needed 8 batches, lost the parallel workers it had before. Only that compact set of named findings goes to the model, with the numbers already measured. Gemma’s job is the part it is actually good at, which is producing the human explanation and the ranked runbook on top of evidence I have already extracted.

Demo

The CLI run on a real sample that ships with the repo, a parameter-sniffing regression on a customer-history query:

$ plansmith analyze \
    --baseline samples/baseline_q1.json \
    --incident samples/incident_q1.json \
    --query    samples/q1.sql

╭─────────────────── Plansmith ───────────────────╮
│ baseline 15ms · incident 7421ms · 497.7× slower │
╰─────────────────────────────────────────────────╯
                    Structured diff findings
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Severity ┃ Kind               ┃ Summary                           ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ CRITICAL │ runtime_regression │ Execution time grew 497.7×        │
│ CRITICAL │ scan_method_flip   │ orders: Index Scan → Seq Scan     │
│ CRITICAL │ row_estimate_miss  │ Seq Scan on orders off by 124158× │
│ CRITICAL │ nested_loop_blowup │ Nested Loop with 248,317 outer    │
└──────────┴────────────────────┴───────────────────────────────────┘

─────────── Triage report from gemma4:e4b ───────────

Enter fullscreen mode Exit fullscreen mode

And then Gemma 4 streams the runbook in. This is verbatim from a local run on my laptop, no editing:

TL;DR

The planner incorrectly switched to a Sequential Scan on orders because statistics are stale, leading to massive row overestimation and a subsequent join blowup.

Likely root causes (ranked)

  1. Stale statistics on the orders table, causing the planner to drastically underestimate the cardinality of the filtered result set. (supported by finding row_estimate_miss)
  2. The lack of a suitable composite index covering the WHERE clause columns (customer_id, placed_at) or the join key. (supported by finding scan_method_flip)
  3. The planner incorrectly choosing a Nested Loop join due to the highly inflated row estimates. (supported by finding nested_loop_blowup)

Immediate mitigations (minutes)

  • Run VACUUM ANALYZE orders; to force an update of statistics on the affected table.
  • If the index is known to be correct, temporarily force the planner: SET LOCAL enable_seqscan = off;
  • Run EXPLAIN (ANALYZE, BUFFERS) SELECT … to confirm the plan change before deploying any fix.

Permanent fixes (days)

  • Create a composite index on orders covering the filter columns: CREATE INDEX idx_orders_customer_time ON orders (customer_id, placed_at);
  • Increase the default_statistics_target for orders.customer_id and orders.placed_at to prevent recurrence.
  • If the customer base is highly skewed, consider partitioning the orders table by customer_id.

That is roughly what I would have written myself, in maybe two minutes instead of twenty. The second sample in the repo is a different shape of bug, statistics drift on an analytics rollup that ends up 92x slower with a sort spill and a hash spill, and the runbook for that one is just as good.

There is also a small Flask web UI, plansmith serve, that gives you two textareas, a button, and a streamed Markdown panel. It is the same code path, just over Server-Sent Events so the report appears as Gemma generates it. Useful when you are sitting next to a colleague and want to talk through what the model is suggesting.

Code

GitHub: github.com/sireesha-chavvakula/plansmith, Apache-2.0.

plansmith/
├── plansmith/
│   ├── plan_diff.py     # the deterministic EXPLAIN JSON diff
│   ├── triage.py        # Ollama / Gemma 4 prompt and streaming
│   ├── cli.py
│   ├── web.py
│   └── templates/index.html
├── samples/
│   ├── q1.sql, baseline_q1.json, incident_q1.json   # parameter sniffing
│   └── q2.sql, baseline_q2.json, incident_q2.json   # stats drift
├── pyproject.toml
├── LICENSE
└── README.md

Enter fullscreen mode Exit fullscreen mode

About 640 lines of Python and 240 lines of HTML and CSS. Three dependencies, flask, requests, rich. No external services.

The categories of plan regression that plan_diff.py currently detects:

Finding What triggers it
runtime_regression top-line execution time grew 2x or more
scan_method_flip same relation went from Index / Index Only / Bitmap Scan to Seq Scan
row_estimate_miss any node where planner rows vs actual rows is off by 100x
nested_loop_blowup Nested Loop where the outer side returned 10,000+ rows
sort_spilled_to_disk Sort node with method containing “Disk”
hash_spilled_to_disk Hash node with batches > 1
parallelism_lost baseline used workers, incident used zero
join_strategy_change the set of join node types differs between plans

These are the patterns that account for the large majority of plan-shape regressions I have actually triaged in production. The list is short on purpose; I would rather catch the common bugs cleanly than have a long catalog of rules that fire on noise.

How I Used Gemma 4

Why E4B

There are three Gemma 4 variants and the judges want to see why I picked the one I did. For Plansmith the answer is short.

The 31B dense model would write slightly more polished prose. It will not run on the laptop that is sitting next to me when I am on call, and it definitely will not run on the bastion host I have to use to even reach the air-gapped DB network. The 26B MoE is overkill for what is essentially “fill in a six-section template using these eight findings as evidence.” E4B at Q4_K_M is about 8 GB on disk, fits in laptop RAM, streams the runbook fast enough to feel responsive, and the 128K context window means I never have to truncate a plan, even a wide partition-heavy one. It is the smallest Gemma 4 variant that still produces output I would be willing to sign off on, and for an air-gapped DBA tool that is exactly the right tradeoff.

How the prompt is shaped

Gemma never sees raw EXPLAIN JSON. The structured diff produced by plan_diff.py is the entire user message:

{
  "baseline_runtime_ms": 14.91,
  "incident_runtime_ms": 7421.41,
  "slowdown_factor": 497.75,
  "findings": [
    { "kind": "scan_method_flip", "severity": "critical",
      "summary": "Table 'orders' flipped from Index Scan to Seq Scan.",
      "detail": { "relation": "orders", "before": "Index Scan", "after": "Seq Scan" }},
    { "kind": "row_estimate_miss", "severity": "critical",
      "summary": "Seq Scan on orders misestimated rows by 124158×",
      "detail": { "plan_rows": 2, "actual_rows": 248317, "ratio": 124158.5 }}
  ]
}

Enter fullscreen mode Exit fullscreen mode

The system prompt does two things. It tells the model to trust the numbers it was given and not invent new ones, and it locks the output into an exact six-section Markdown skeleton (TL;DR, What the plans say, Likely root causes ranked, Immediate mitigations, Permanent fixes, Verification).

That separation matters more than the prompt wording itself. The deterministic pass does the arithmetic and the pattern recognition. Gemma does the judgment, the ranking, and the prose. A 4B model is plenty for that second job, and it cannot hallucinate row counts because it was never asked to compute any.

Where Plansmith goes next

The two things on my own roadmap:

  1. SQL Server support. Half my day is in MSSQL. The XML showplan format is more verbose than Postgres JSON but the structural diff is the same idea, just a different parser. The Gemma prompt does not need to change.
  2. Multi-query mode. Feed it a directory of pg_stat_statements plan captures, rank queries by regression severity, produce one runbook per query. This is the actual on-call workflow.

If anyone reading this works in a regulated environment and wants to compare notes on local-only DBA tooling, I would love to hear from you. The “we cannot use the cloud LLMs” group of practitioners is larger than it looks and we are all building the same kinds of tools in parallel.