Database WAL Bloat Management: The Core Anatomy for Performance
Mustafa ERBAY · 2026-05-31 · via DEV Community

WAL: PostgreSQL's Logging Mechanism

When it comes to performance and data integrity in PostgreSQL, the WAL (Write-Ahead Logging) mechanism plays a critical role. Essentially, WAL ensures that the database server logs all changes to a log file before writing them to the actual data files on disk. This prevents data loss in case of a crash or power outage. Database operations are first written to the WAL buffer, then to WAL files. The data pages are written to their disk locations later.

This approach enhances the database's reliability. For example, once a transaction's commit record has been flushed to WAL, the transaction can be recovered even if the system crashes immediately afterwards. However, if WAL files accumulate over time without proper management, they can cause serious problems. This accumulation is called "WAL bloat"; in addition to consuming disk space, it can degrade I/O performance and, if the disk holding pg_wal fills completely, force the server to shut down.

The Anatomy of WAL Bloat: Causes and Symptoms

WAL bloat is the condition where WAL files occupy more space on disk than necessary. There are several common causes. First, under heavy write loads WAL can be generated faster than checkpoints recycle old segments. Second, the mechanisms that release WAL segments may be stalled. When archive_mode is on and archive_command keeps failing, PostgreSQL retains every segment that has not yet been archived; an inactive or lagging replication slot has the same effect. With archive_mode off and nothing else holding WAL back, old segments are recycled or removed after each checkpoint. Long-running transactions do not pin WAL by themselves, although they can hold back a logical replication slot.

ℹ️ WAL Segment Lifespan

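WAL segments are 16 MB by default (the size is chosen when the cluster is initialized). PostgreSQL uses these segments in a cyclical manner: as operations are written to WAL, new segments are created or old ones are renamed for reuse. The basic principle is that segments no longer needed after a checkpoint are recycled or removed, once they have been archived if archiving is enabled.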

The most obvious symptom of bloat is the rapid depletion of disk space on the database server. Running df -h shows the filesystem that holds the PostgreSQL data directory (and its pg_wal subdirectory) approaching 100% usage. I/O performance also degrades under heavy WAL write activity, queries slow down, and overall responsiveness drops. To understand the growth rate, sample the current WAL position (LSN, Log Sequence Number) at intervals and compute the difference in bytes with pg_wal_lsn_diff().
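To make the LSN arithmetic concrete, here is a minimal Python sketch of what pg_wal_lsn_diff() computes. An LSN is printed as two hexadecimal numbers separated by a slash (the high and low 32 bits of a 64-bit byte position). The function names and the sample values are my own for illustration; in practice the samples would come from something like pg_current_wal_lsn().

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_growth_rate(lsn_start: str, lsn_end: str, seconds: float) -> float:
    """Bytes of WAL generated per second between two LSN samples."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (lsn_to_bytes(lsn_end) - lsn_to_bytes(lsn_start)) / seconds


# Hypothetical samples taken 60 seconds apart (not real measurements).
rate = wal_growth_rate("0/3000000", "0/7000000", 60.0)
print(f"{rate / (1024 * 1024):.2f} MiB/s")
```

Sampling at a fixed interval and graphing this rate makes it easy to see whether a disk-usage spike comes from a burst of writes or from segments that are simply not being released.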

Performance Impacts: More Than Just Disk Space

WAL bloat doesn't just fill up disk space; the write activity behind it directly affects database performance. WAL itself is written sequentially, but when it shares a disk with the data files, heavy WAL traffic competes with data-file reads and writes for the same I/O capacity. The effect is most pronounced on mechanical disks (HDDs), where interleaving the two workloads forces extra seeks. Even on SSDs, this can lead to performance degradation.

⚠️ WAL Load and Disk I/O

Heavy WAL write activity can create a bottleneck, especially in I/O bound systems. Disk queues fill up, transaction times increase, and this situation endangers overall system stability.

Additionally, a large WAL backlog has downstream costs. A standby or archive that has fallen behind has more WAL to transfer and replay, which shows up as replication lag, and each checkpoint has more old segments to recycle or remove. Consequently, the system's overall read/write efficiency decreases.

Effective WAL Management Strategies

To prevent and manage WAL bloat, several fundamental strategies can be employed. First, if you use WAL archiving (archive_mode = on), archive_command must be configured correctly. PostgreSQL recycles or removes a segment only after it has been archived successfully, so it's critical to ensure that archive_command does not fail and that the archiving destination stays accessible.

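# Example archive_command configuration (archiving to a remote server using rsync)
# %p is the path of the segment to archive, %f is its file name; the command must exit 0 only on success
archive_mode = on
archive_command = 'rsync -a %p user@remote_host:/path/to/wal_archive/%f'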
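For local or mounted destinations, the same contract can be sketched as a small script. This is an illustrative sketch, not a production archiver: the function name, the script name, and the paths are hypothetical. It tries to respect two properties the PostgreSQL documentation asks of an archive command: return a non-zero status on any failure, and refuse to overwrite a file that already exists in the archive.

```python
import os
import shutil
import sys
import tempfile


def archive_segment(src_path: str, archive_dir: str, file_name: str) -> int:
    """Copy one WAL segment into archive_dir; return a shell-style exit code."""
    dest = os.path.join(archive_dir, file_name)
    if os.path.exists(dest):
        # Never overwrite: an existing file may belong to another cluster.
        # (This check is not race-free; it is only a sketch.)
        return 1
    fd, tmp = tempfile.mkstemp(dir=archive_dir, prefix=file_name + ".")
    try:
        with os.fdopen(fd, "wb") as out, open(src_path, "rb") as src:
            shutil.copyfileobj(src, out)
            out.flush()
            os.fsync(out.fileno())  # make sure the copy is on disk
        os.rename(tmp, dest)  # publish under the final name in one step
    except OSError:
        if os.path.exists(tmp):
            os.remove(tmp)
        return 1
    return 0


if __name__ == "__main__" and len(sys.argv) == 4:
    # Expected arguments: <path to segment> <archive directory> <file name>
    sys.exit(archive_segment(sys.argv[1], sys.argv[2], sys.argv[3]))
```

Saved under a hypothetical name such as archive_wal.py, it could be wired in as archive_command = 'python3 /path/to/archive_wal.py %p /path/to/wal_archive %f'. Copying to a temporary name first means a half-written file never appears under the segment's real name.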

Second, set wal_keep_size correctly (wal_keep_segments in PostgreSQL 12 and earlier; it was replaced by wal_keep_size in PostgreSQL 13). This parameter sets the minimum amount of past WAL kept in pg_wal so that standby servers can still fetch it. Setting it too high is itself a source of bloat, so caution is advised.

💡 wal_keep_size vs. wal_keep_segments

wal_keep_size specifies the amount of WAL to retain as a size (in megabytes if no unit is given), while wal_keep_segments specified a number of segments. wal_keep_size replaced wal_keep_segments in PostgreSQL 13, so it is the only option on current versions.
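When carrying an old setting over to a newer version, the conversion is simple arithmetic: the number of segments multiplied by the segment size. A minimal sketch, assuming the default 16 MB segment size (the function name is mine, and the value 64 is only an example):

```python
def wal_keep_segments_to_size_mb(segments: int, segment_size_mb: int = 16) -> int:
    """Translate an old wal_keep_segments value into megabytes for wal_keep_size."""
    if segments < 0 or segment_size_mb <= 0:
        raise ValueError("segments must be >= 0 and segment_size_mb must be > 0")
    return segments * segment_size_mb


# Example: an old setting of wal_keep_segments = 64 with 16 MB segments.
size_mb = wal_keep_segments_to_size_mb(64)
print(f"wal_keep_size = {size_mb}MB")
```

If the cluster was initialized with a non-default segment size, pass that size as the second argument.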

A third strategy is to make sure old WAL segments are actually being cleaned up. PostgreSQL does this itself: after a checkpoint it recycles or removes segments that are no longer needed, and with archiving enabled it waits until each segment has been archived successfully. If archive_command keeps failing, or something else still needs the WAL, the files remain on disk. In such cases it may be tempting to clean up old WAL files manually or via a script. That must be done with extreme care; deleting files that crash recovery, replica servers, or backup operations still need can lead to data loss.

Troubleshooting and Advanced Techniques

To resolve WAL bloat issues, the first step is to identify the source of the problem. Start by monitoring the size of the pg_wal directory. Because every segment has the same size, the number of files and the age of the oldest one tell you more than a sort by size. Commands like du -sh /var/lib/postgresql/data/pg_wal/ and ls -lht /var/lib/postgresql/data/pg_wal/ can be helpful here.
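The same check can be scripted so it feeds a monitoring system. The sketch below counts the segment files in a directory and sums their size. It relies on WAL segment file names being 24 uppercase hexadecimal characters; the function name is mine, and the path is the example path used above, which may differ on your system.

```python
import os
import re

# WAL segment files are named with 24 uppercase hexadecimal characters.
WAL_SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def summarize_wal_dir(wal_dir: str) -> dict:
    """Count WAL segment files in wal_dir and sum their sizes in bytes."""
    count = 0
    total = 0
    with os.scandir(wal_dir) as entries:
        for entry in entries:
            # Skips archive_status/, *.history, *.partial and similar entries.
            if entry.is_file() and WAL_SEGMENT_NAME.match(entry.name):
                count += 1
                total += entry.stat().st_size
    return {"segments": count, "bytes": total}


wal_dir = "/var/lib/postgresql/data/pg_wal"  # example path; adjust for your install
if os.path.isdir(wal_dir):
    print(summarize_wal_dir(wal_dir))
```

The script needs read access to the data directory, which usually means running it as the postgres operating-system user.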

🔥 Danger of Manual Cleanup

Manually deleting WAL files is extremely risky. If you accidentally delete a WAL file needed by a replica server or a backup tool, replication can break, or restoration from backup may become impossible. Such operations should only be performed with full understanding and caution.

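If archive_mode is on, first check whether archive_command is actually succeeding; the pg_stat_archiver view records the last successful and the last failed attempt. If it is failing and WAL files are accumulating, you need to check for issues at the archiving destination (disk full, no network connection, etc.). If archive_mode is off and WAL still accumulates, enabling archiving will not help; look instead at what else retains WAL, such as replication slots (pg_replication_slots) or a large wal_keep_size.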
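The archiver check can be reduced to a comparison of two timestamps from pg_stat_archiver: last_archived_time and last_failed_time. A minimal sketch, assuming the two values have already been fetched by whatever database client you use (the function name and the sample timestamps are hypothetical):

```python
from datetime import datetime
from typing import Optional


def archiver_is_failing(
    last_archived_time: Optional[datetime],
    last_failed_time: Optional[datetime],
) -> bool:
    """True when the most recent archive attempt recorded was a failure."""
    if last_failed_time is None:
        return False  # no failure has been recorded
    if last_archived_time is None:
        return True  # failures recorded, but never a success
    return last_failed_time > last_archived_time


# Hypothetical values, as if read from pg_stat_archiver.
ok_at = datetime(2026, 5, 31, 10, 0)
failed_at = datetime(2026, 5, 31, 10, 5)
print(archiver_is_failing(ok_at, failed_at))
```

A result of True means the newest attempt failed, so segments are piling up until the destination problem is fixed.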

At an advanced level, the max_wal_size parameter sets a soft limit on how much WAL may accumulate between automatic checkpoints. It can be useful in environments with limited disk space or for controlling sudden WAL spikes, but it is not a hard cap: the limit can be exceeded under heavy load, when archive_command fails, or when wal_keep_size is set high. Setting max_wal_size too low triggers frequent checkpoints and leads to performance degradation. A checkpoint flushes dirty data pages to disk, after which older WAL segments can be recycled or removed.
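One rough way to judge whether max_wal_size is too low is to compare scheduled checkpoints with requested ones. As far as I know, these counters are checkpoints_timed and checkpoints_req in pg_stat_bgwriter up to PostgreSQL 16, and num_timed and num_requested in pg_stat_checkpointer from PostgreSQL 17. The sketch below only does the arithmetic; the function name and sample numbers are hypothetical, and requested checkpoints also include manual CHECKPOINT commands, so treat the ratio as a hint rather than proof.

```python
def requested_checkpoint_ratio(timed: int, requested: int) -> float:
    """Fraction of checkpoints that were requested rather than scheduled."""
    total = timed + requested
    if total == 0:
        return 0.0
    return requested / total


# Hypothetical counter values read from the statistics view.
ratio = requested_checkpoint_ratio(timed=90, requested=10)
print(f"{ratio:.0%} of checkpoints were requested")
```

A persistently high ratio suggests WAL volume is reaching max_wal_size before checkpoint_timeout expires, which is the situation where raising max_wal_size is worth testing.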

Conclusion and Recommendations

WAL bloat in PostgreSQL is a condition that, if ignored, can lead to serious performance issues, downtime, and the risk of data loss. Keeping archive_command working wherever archive_mode is enabled and carefully adjusting parameters like wal_keep_size and max_wal_size are the fundamental steps to prevent this problem.

💡 Proactive Monitoring

Regularly monitoring the size of the WAL directory and the WAL write rate allows you to detect bloat issues before they grow. The pg_ls_waldir() function (PostgreSQL 10 and later), the pg_stat_wal view (PostgreSQL 14 and later), and system metrics can be used for this monitoring.

For performance and reliability, WAL management is an integral part of database operations. Understanding how this mechanism works and implementing appropriate maintenance strategies is vital for the health of your PostgreSQL databases. I share my experiences and the problems I've encountered on this topic in the hope that they guide others facing similar situations.

As I mentioned in my previous post on PostgreSQL index optimization, database performance is built from the convergence of many different components. WAL management is one of these components and, if neglected, can diminish the impact of other optimizations.