The Silent Killers of Go Concurrency: Mutexes, Semaphores, and Goroutine Leaks
amir · 2026-05-25 · via DEV Community

Go makes concurrency look simple.

You write:

go func() {
    // do something concurrently
}()

And suddenly your code is running in another goroutine.

That simplicity is one of the reasons I like Go so much. But after working on backend systems, notification pipelines, high-traffic APIs, and production services under real load, I learned something important:

Most concurrency problems in Go do not come from a lack of concurrency.

They come from using concurrency without understanding where the bottleneck actually is.

Sometimes the issue is a missing lock.

But very often, especially in production Go services, the issue is the opposite:

  • too much locking
  • locks held for too long
  • network I/O inside critical sections
  • goroutines that never exit
  • unbounded goroutine creation
  • WaitGroups copied by value
  • channels used without a cancellation strategy

In this article, I want to walk through the concurrency problems I have seen in real systems, how I reason about mutexes and semaphores, and how I usually debug these issues before they become production incidents.


The Real Problem: Concurrency That Accidentally Becomes Sequential

A service can look concurrent from the outside and still behave like a single-threaded application internally.

This usually happens when a large part of the request flow is hidden behind one shared lock.

A pattern like this is more common than many developers admit:

mu.Lock()
user.Name = "Test User"
sendEmail(user)
callDatabase(user)
mu.Unlock()

At first glance, it may look safe.

The developer wanted to protect shared state. That part is reasonable. But the lock is now protecting much more than shared memory. It is protecting the entire flow:

  1. update a field
  2. send an email
  3. call the database
  4. maybe wait on network I/O
  5. maybe retry
  6. maybe block other goroutines for a long time

That is not just a mutex anymore.

That is a traffic jam.

Every goroutine that needs the same lock must wait until the whole flow finishes. So even if your service has hundreds or thousands of goroutines, a big part of the system becomes sequential.

The dangerous part is that CPU usage may still look normal or even low. Memory may also look fine. But latency increases, throughput drops, and p95/p99 response times become unstable.

This is why lock contention is sometimes difficult to notice from basic infrastructure metrics alone.
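
To make the effect measurable, here is a minimal sketch that compares holding the lock during a slow call with releasing it first. The names run and simulateIO are hypothetical, and the 20 ms sleep is an assumed stand-in for an email or database call.

```go
package main

import (
    "fmt"
    "sync"
    "time"
)

// simulateIO stands in for a slow external call (email, HTTP, database).
func simulateIO() { time.Sleep(20 * time.Millisecond) }

// run starts n goroutines that each update shared state and perform one slow
// call. When ioInsideLock is true the slow call happens while the mutex is
// held, so the goroutines effectively run one after another.
// It returns the elapsed time and the final counter value.
func run(n int, ioInsideLock bool) (time.Duration, int) {
    var (
        mu      sync.Mutex
        wg      sync.WaitGroup
        counter int
    )
    start := time.Now()
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            mu.Lock()
            counter++
            if ioInsideLock {
                simulateIO() // every other goroutine waits behind this call
            }
            mu.Unlock()
            if !ioInsideLock {
                simulateIO() // slow work overlaps across goroutines
            }
        }()
    }
    wg.Wait()
    return time.Since(start), counter
}

func main() {
    inside, _ := run(8, true)
    outside, _ := run(8, false)
    fmt.Println("I/O inside lock: ", inside)
    fmt.Println("I/O outside lock:", outside)
}
```

With eight goroutines, the first run cannot finish in less than 8 x 20 ms because the sleeps are serialized, while the second run overlaps them. CPU usage stays low in both cases, which is why this kind of bottleneck is easy to miss.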


A Production-Style Example: Email Inside a Mutex

Imagine we have a service that updates user state and sends notifications.

type Service struct {
    mu    sync.Mutex
    state map[int]string
}

func (s *Service) ProcessUsers(users []User) {
    s.mu.Lock()
    defer s.mu.Unlock()

    for _, user := range users {
        s.state[user.ID] = "processed"
        sendEmail(user) // slow network I/O inside the lock
    }
}

This code is safe from a data race perspective.

But it is dangerous from a performance perspective.

A mutex should protect the smallest possible shared memory operation. It should not protect slow external work like:

  • sending email
  • calling another microservice
  • database queries
  • HTTP requests
  • file uploads
  • logging to a slow external sink
  • waiting on a third-party API

The memory update may take nanoseconds or microseconds. The email call may take milliseconds or seconds.

That difference matters.

If the lock is held while sendEmail runs, every other goroutine that needs s.mu is blocked behind a network call.

A better version separates shared-state mutation from slow work:

func (s *Service) ProcessUsers(users []User) {
    emails := make([]User, 0, len(users))

    s.mu.Lock()
    for _, user := range users {
        s.state[user.ID] = "processed"
        emails = append(emails, user)
    }
    s.mu.Unlock()

    for _, user := range emails {
        sendEmail(user)
    }
}

This is already better because the lock only protects the shared map.

But in a real production system, I usually prefer pushing the slow work to a queue or bounded worker pool:

func (s *Service) ProcessUsers(users []User, jobs chan<- EmailJob) {
    s.mu.Lock()
    for _, user := range users {
        s.state[user.ID] = "processed"
    }
    s.mu.Unlock()

    for _, user := range users {
        jobs <- EmailJob{UserID: user.ID, Email: user.Email}
    }
}

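Now the request path no longer waits on the email provider, as long as the queue has room. When jobs is full, the send blocks, and that blocking is the backpressure signal that the workers are falling behind.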

That is the real fix.

Not just “use goroutines.”

The fix is designing the boundary between shared memory, external I/O, and backpressure.
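
One way to make backpressure explicit is a non-blocking enqueue that tells the caller when the queue is full. This is only a sketch: tryEnqueue is a hypothetical helper, and what to do with a rejected job (retry later, persist it, return an error to the client) is a design decision the article does not prescribe.

```go
package main

import "fmt"

// EmailJob mirrors the job type used in the article.
type EmailJob struct {
    UserID int
    Email  string
}

// tryEnqueue attempts a non-blocking send. It returns false when the queue is
// full so the caller can decide how to shed or defer the job.
func tryEnqueue(jobs chan<- EmailJob, job EmailJob) bool {
    select {
    case jobs <- job:
        return true
    default:
        return false
    }
}

func main() {
    jobs := make(chan EmailJob, 2) // small capacity to make the full case visible
    for id := 1; id <= 3; id++ {
        job := EmailJob{UserID: id, Email: fmt.Sprintf("user%d@example.com", id)}
        fmt.Printf("user_id=%d enqueued=%v\n", id, tryEnqueue(jobs, job))
    }
}
```

With a capacity of two and no consumer, the first two jobs are accepted and the third is rejected instead of blocking the request path.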


Mutexes Are Not Bad. Large Critical Sections Are Bad.

I sometimes see developers become afraid of mutexes.

That is the wrong lesson.

sync.Mutex is simple, fast, and perfectly fine when used correctly. The problem is not the mutex. The problem is the size of the critical section.

This is what I try to keep in mind:

mu.Lock()
// only touch shared memory here
mu.Unlock()

Not this:

mu.Lock()
// shared memory
// database call
// HTTP call
// email call
// JSON encoding
// logging
// metrics push
mu.Unlock()

A good critical section should be boring.

It should usually do one of these:

  • read shared state
  • update shared state
  • copy shared state into a local variable
  • swap a pointer
  • increment a counter
  • append to a protected slice/map

Then unlock.

After that, do the expensive work outside the lock.
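
The "copy shared state into a local variable" item can be sketched as a snapshot method. Registry, Set, and Snapshot are hypothetical names used for illustration; the point is that the lock is held only for the copy, and the caller works on its own data afterwards.

```go
package main

import (
    "fmt"
    "sync"
)

// Registry is a hypothetical type guarding a map with a mutex.
type Registry struct {
    mu    sync.Mutex
    state map[int]string
}

func NewRegistry() *Registry {
    return &Registry{state: make(map[int]string)}
}

// Set updates one entry; the critical section is a single map write.
func (r *Registry) Set(id int, status string) {
    r.mu.Lock()
    r.state[id] = status
    r.mu.Unlock()
}

// Snapshot copies the map while holding the lock and returns the copy.
func (r *Registry) Snapshot() map[int]string {
    r.mu.Lock()
    defer r.mu.Unlock()
    out := make(map[int]string, len(r.state))
    for id, status := range r.state {
        out[id] = status
    }
    return out
}

func main() {
    r := NewRegistry()
    r.Set(1, "processed")
    snap := r.Snapshot()
    r.Set(2, "processed") // does not affect the snapshot taken earlier
    // Slow work (rendering, network calls) can iterate snap without the lock.
    fmt.Println(len(snap)) // 1
}
```

The trade-off is an extra allocation per snapshot, which is usually cheaper than holding the lock across slow work.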


Under the Hood: What a Mutex Gives You

At a high level, a mutex gives you mutual exclusion: only one goroutine can enter a protected section at a time.

But it also gives you memory ordering guarantees.

In Go's memory model, each call to Unlock on a mutex is synchronized before any later call to Lock on the same mutex returns. In practical terms, that means if one goroutine updates shared data and unlocks, another goroutine that later locks the same mutex is guaranteed to observe that update.

That is the part many developers forget.

A mutex is not just about “blocking other goroutines.” It is also about creating a safe visibility boundary between goroutines.

Without that boundary, different goroutines may read and write the same memory at the same time, and now you have a data race. Once you have a data race, your program is no longer something you can reason about confidently.

This is why I do not like “clever” lock-free code unless there is a very strong reason for it.

Most backend services do not need clever concurrency.

They need clear concurrency.
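
As a small illustration of that visibility boundary, the sketch below increments a counter from many goroutines. Counter and countConcurrently are hypothetical names. Because every read and write of n happens between Lock and Unlock, the final value is exact.

```go
package main

import (
    "fmt"
    "sync"
)

// Counter is a hypothetical mutex-protected counter.
type Counter struct {
    mu sync.Mutex
    n  int
}

// Inc adds one; the write is published to other goroutines by Unlock.
func (c *Counter) Inc() {
    c.mu.Lock()
    c.n++
    c.mu.Unlock()
}

// Value reads the counter under the same lock, so it sees every prior Inc.
func (c *Counter) Value() int {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.n
}

// countConcurrently runs `workers` goroutines that each call Inc `perWorker` times.
func countConcurrently(workers, perWorker int) int {
    var c Counter
    var wg sync.WaitGroup
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < perWorker; j++ {
                c.Inc()
            }
        }()
    }
    wg.Wait()
    return c.Value()
}

func main() {
    fmt.Println(countConcurrently(100, 1000)) // 100000
}
```

Removing the lock from Inc would turn this into a data race: the race detector would report it, and the result would no longer be predictable.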


Semaphore: Controlling Capacity, Not Ownership

A mutex is usually about ownership of shared memory.

A semaphore is about capacity.

For example, suppose you want to process 10,000 users, but you do not want to send 10,000 emails at the same time.

A naive version might do this:

for _, user := range users {
    go sendEmail(user)
}

This is dangerous because it creates unbounded concurrency.

If users has 10,000 items, you create 10,000 goroutines. If each goroutine performs network I/O, opens connections, allocates memory, and waits on an external provider, you can overload your own service before you overload the email provider.

A simple semaphore pattern fixes this:

sem := make(chan struct{}, 20) // allow only 20 concurrent email sends
var wg sync.WaitGroup

for _, user := range users {
    user := user

    sem <- struct{}{}
    wg.Add(1)

    go func() {
        defer wg.Done()
        defer func() { <-sem }()

        sendEmail(user)
    }()
}

wg.Wait()

Now the code still uses concurrency, but concurrency is bounded.

That one detail is huge in production.

Unbounded concurrency is not scalability.

It is delayed failure.
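
To check that a channel-based semaphore really caps concurrency, one option is to record the peak number of tasks in flight. The sketch below assumes a 2 ms sleep as a stand-in for sendEmail; runBounded is a hypothetical helper, not part of any library.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
    "time"
)

// runBounded executes `tasks` simulated jobs with at most `limit` running at
// once and returns the highest number of jobs observed in flight.
func runBounded(tasks, limit int) int64 {
    sem := make(chan struct{}, limit)
    var wg sync.WaitGroup
    var inFlight, peak int64

    for i := 0; i < tasks; i++ {
        sem <- struct{}{} // blocks while `limit` jobs are already running
        wg.Add(1)
        go func() {
            defer wg.Done()
            defer func() { <-sem }() // release the slot last

            cur := atomic.AddInt64(&inFlight, 1)
            // Record a new peak if this goroutine observed one.
            for {
                old := atomic.LoadInt64(&peak)
                if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
                    break
                }
            }
            time.Sleep(2 * time.Millisecond) // stand-in for sendEmail
            atomic.AddInt64(&inFlight, -1)
        }()
    }
    wg.Wait()
    return atomic.LoadInt64(&peak)
}

func main() {
    fmt.Println("peak in flight:", runBounded(200, 20))
}
```

The reported peak never exceeds the limit, because a goroutine is only counted as in flight while it holds a semaphore slot.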


A Better Worker Pool for Production Code

The semaphore pattern is useful, but for services that run continuously, I often prefer a worker pool.

type EmailJob struct {
    UserID int
    Email  string
}

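// startEmailWorkers starts workerCount workers and returns a channel that is
// closed once every worker has exited, so callers can wait during shutdown.
func startEmailWorkers(ctx context.Context, workerCount int, jobs <-chan EmailJob) <-chan struct{} {
    var wg sync.WaitGroup

    for i := 0; i < workerCount; i++ {
        wg.Add(1)

        go func(workerID int) {
            defer wg.Done()

            for {
                select {
                case <-ctx.Done():
                    return

                case job, ok := <-jobs:
                    if !ok {
                        return
                    }

                    if err := sendEmailJob(ctx, job); err != nil {
                        // In real systems: log, retry, dead-letter, or expose metrics.
                        fmt.Printf("worker=%d failed to send email user_id=%d err=%v\n", workerID, job.UserID, err)
                    }
                }
            }
        }(i)
    }

    // Without this signal, nothing outside the function could observe wg.Wait().
    done := make(chan struct{})
    go func() {
        wg.Wait()
        close(done)
    }()
    return done
}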

This gives you much better operational control:

  • fixed concurrency
  • easier metrics
  • easier shutdown
  • easier retry strategy
  • easier backpressure
  • easier rate limiting

This is the difference between “I used goroutines” and “I designed a concurrent system.”
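
For a batch-style job, the full lifecycle of a pool (start workers, feed jobs, close the queue, wait for drain) can be sketched in one function. processAll is a hypothetical helper, and the atomic counter is a stand-in for actually sending an email.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

// processAll starts workerCount workers, feeds them every job, closes the
// queue, and waits for the workers to drain it. It returns how many jobs ran.
func processAll(workerCount int, userIDs []int) int64 {
    // Small buffer: the producer blocks when the workers fall behind.
    jobs := make(chan int, workerCount)
    var processed int64
    var wg sync.WaitGroup

    for i := 0; i < workerCount; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for range jobs { // loop ends when jobs is closed and drained
                atomic.AddInt64(&processed, 1) // stand-in for sending the email
            }
        }()
    }

    for _, id := range userIDs {
        jobs <- id
    }
    close(jobs) // only the producer closes the channel
    wg.Wait()   // every worker has exited at this point
    return atomic.LoadInt64(&processed)
}

func main() {
    ids := make([]int, 1000)
    fmt.Println(processAll(8, ids)) // 1000
}
```

Closing the channel from the producer side gives each worker a clear exit path without needing a separate stop signal.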


Goroutine Leak: The Bug That Does Not Explode Immediately

Goroutine leaks are one of the most common production problems in Go.

They are dangerous because the service may not crash immediately. It may slowly become worse over hours or days.

Here is a classic example:

func process() error {
    ch := make(chan result)

    go func() {
        ch <- heavyComputation()
    }()

    select {
    case res := <-ch:
        return handle(res)

    case <-time.After(1 * time.Second):
        return errors.New("timeout")
    }
}

The problem is subtle.

ch is unbuffered.

If the timeout happens first, process returns. After that, there is no receiver waiting on ch.

When heavyComputation() finishes, the goroutine tries to send into ch and blocks forever.

That goroutine is now leaked.

One leaked goroutine may not matter.

Thousands of leaked goroutines matter.

A safer version uses a buffered channel:

func process() error {
    ch := make(chan result, 1)

    go func() {
        ch <- heavyComputation()
    }()

    select {
    case res := <-ch:
        return handle(res)

    case <-time.After(1 * time.Second):
        return errors.New("timeout")
    }
}

This prevents the goroutine from blocking on send after the timeout.
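
The difference between the two versions can be observed with runtime.NumGoroutine. The following is a rough sketch under assumed timings (20 ms of work, 1 ms timeout); timedOutCalls is a hypothetical helper, and the counts can be off by a few if the runtime starts or stops goroutines of its own.

```go
package main

import (
    "fmt"
    "runtime"
    "time"
)

// timedOutCalls issues n calls that all time out, then reports how many extra
// goroutines are still alive after the workers have had time to finish.
// buffer is the capacity of the result channel (0 means unbuffered).
func timedOutCalls(n, buffer int) int {
    before := runtime.NumGoroutine()
    for i := 0; i < n; i++ {
        ch := make(chan int, buffer)
        go func() {
            time.Sleep(20 * time.Millisecond) // stand-in for heavyComputation
            ch <- 42                          // blocks forever if unbuffered and nobody receives
        }()
        select {
        case <-ch:
        case <-time.After(time.Millisecond): // the caller gives up first
        }
    }
    time.Sleep(200 * time.Millisecond) // let every worker reach its send
    return runtime.NumGoroutine() - before
}

func main() {
    fmt.Println("unbuffered, still alive:", timedOutCalls(50, 0))
    fmt.Println("buffered,   still alive:", timedOutCalls(50, 1))
}
```

With the unbuffered channel, roughly all 50 goroutines remain parked on the send. With a buffer of one, each send completes and the goroutine exits, although the computation itself still runs to completion after the caller has given up.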

But in real services, I prefer context-based cancellation:

func process(ctx context.Context) error {
    ctx, cancel := context.WithTimeout(ctx, 1*time.Second)
    defer cancel()

    ch := make(chan result, 1)

    go func() {
        res := heavyComputation(ctx)

        select {
        case ch <- res:
        case <-ctx.Done():
        }
    }()

    select {
    case res := <-ch:
        return handle(res)

    case <-ctx.Done():
        return ctx.Err()
    }
}

The important lesson:

Every goroutine needs an exit path.

If you cannot explain how a goroutine stops, you probably have a leak waiting to happen.
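
For long-running background goroutines, one way to make the exit path explicit is to return a channel that is closed when the goroutine stops. This is a sketch: startPoller and exitsAfterCancel are hypothetical names, and the one-second wait is an arbitrary assumption.

```go
package main

import (
    "context"
    "fmt"
    "time"
)

// startPoller runs a periodic task until ctx is cancelled. The returned
// channel is closed when the goroutine has exited, which makes shutdown
// observable to the caller.
func startPoller(ctx context.Context, interval time.Duration, tick func()) <-chan struct{} {
    done := make(chan struct{})
    go func() {
        defer close(done)
        t := time.NewTicker(interval)
        defer t.Stop()
        for {
            select {
            case <-ctx.Done():
                return // the exit path
            case <-t.C:
                tick()
            }
        }
    }()
    return done
}

// exitsAfterCancel reports whether the poller stops within one second of
// its context being cancelled.
func exitsAfterCancel() bool {
    ctx, cancel := context.WithCancel(context.Background())
    done := startPoller(ctx, time.Millisecond, func() {})
    cancel()
    select {
    case <-done:
        return true
    case <-time.After(time.Second):
        return false
    }
}

func main() {
    fmt.Println(exitsAfterCancel()) // true
}
```

The done channel is also useful in tests and in graceful shutdown code, where the caller wants to wait for the goroutine instead of assuming it has stopped.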


WaitGroup by Value: A Small Mistake With a Big Impact

This mistake is very easy to miss in code review:

func worker(wg sync.WaitGroup) { // wrong: copied by value
    defer wg.Done()

    // do work
}

sync.WaitGroup must not be copied after first use.

When you pass it by value, you copy its internal state. The worker calls Done() on the copy, not on the original WaitGroup that the main goroutine is waiting on.

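The original counter never reaches zero, so wg.Wait() blocks forever. go vet reports this kind of copy through its copylocks check.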

Correct version:

func worker(wg *sync.WaitGroup) {
    defer wg.Done()

    // do work
}

And usage:

var wg sync.WaitGroup

for i := 0; i < 10; i++ {
    wg.Add(1)
    go worker(&wg)
}

wg.Wait()

This rule also applies to other synchronization primitives like sync.Mutex.

Do not copy them after first use.
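
A common way to avoid accidental copies is to keep the primitives inside a struct and give that struct pointer-receiver methods only. Batch below is a hypothetical type that sketches the idea; it is not an API from the standard library.

```go
package main

import (
    "fmt"
    "sync"
)

// Batch is a hypothetical type that owns its synchronization primitives.
// All methods use pointer receivers so the WaitGroup and Mutex are never copied.
type Batch struct {
    wg      sync.WaitGroup
    mu      sync.Mutex
    results []int
}

// Go runs fn in a new goroutine and records its result.
func (b *Batch) Go(fn func() int) {
    b.wg.Add(1)
    go func() {
        defer b.wg.Done()
        v := fn() // run the work outside the lock
        b.mu.Lock()
        b.results = append(b.results, v)
        b.mu.Unlock()
    }()
}

// Wait blocks until every started goroutine has finished and returns the sum.
func (b *Batch) Wait() int {
    b.wg.Wait()
    b.mu.Lock()
    defer b.mu.Unlock()
    sum := 0
    for _, v := range b.results {
        sum += v
    }
    return sum
}

func main() {
    b := &Batch{} // share the pointer, never the value
    for i := 1; i <= 10; i++ {
        i := i // per-iteration copy for Go versions before 1.22
        b.Go(func() int { return i * i })
    }
    fmt.Println(b.Wait()) // 385
}
```

Callers only ever hold a *Batch, so there is no place in the code where the WaitGroup or the Mutex could be passed by value.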


The Loop Variable Trap

This used to be one of the most famous Go concurrency bugs:

for _, user := range users {
    go func() {
        sendEmail(user)
    }()
}

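Before Go 1.22, the loop variable was shared across iterations, so goroutines started inside the loop could all observe a later value, often the last one. Since Go 1.22, each iteration gets its own variable in modules whose go.mod declares go 1.22 or later.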

The defensive pattern is still simple and clear:

for _, user := range users {
    user := user

    go func() {
        sendEmail(user)
    }()
}

Even with improvements in newer Go versions, I still like this style in production code because it makes the ownership of the variable obvious to the reader.

Readable concurrency is maintainable concurrency.
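
Another version-independent option is to pass the value as an argument to the goroutine's function, which copies it at the call site. The sketch below uses a hypothetical User type and a sumIDs helper so the result is easy to check.

```go
package main

import (
    "fmt"
    "sync"
)

// User is a minimal stand-in for the article's user type.
type User struct{ ID int }

// sumIDs starts one goroutine per user and passes the user as an argument,
// so each goroutine works on its own copy regardless of Go version.
func sumIDs(users []User) int {
    var (
        mu  sync.Mutex
        wg  sync.WaitGroup
        sum int
    )
    for _, user := range users {
        wg.Add(1)
        go func(u User) {
            defer wg.Done()
            mu.Lock()
            sum += u.ID
            mu.Unlock()
        }(user) // evaluated now, not when the goroutine runs
    }
    wg.Wait()
    return sum
}

func main() {
    users := []User{{ID: 1}, {ID: 2}, {ID: 3}}
    fmt.Println(sumIDs(users)) // 6
}
```

Both styles are equivalent in effect; the argument form simply keeps the copy visible in the function signature.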


How I Debug Lock Contention in Go

When I suspect a concurrency bottleneck, I do not start by guessing.

I start by measuring.

1. Enable pprof

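import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/ handlers on http.DefaultServeMux
)

func main() {
    go func() {
        // Bind to localhost so the profiling endpoints are not exposed publicly.
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // start application
}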

Then collect profiles:

go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"

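For mutex contention, enable mutex profiling at startup. A fraction of 1 reports every contention event; a larger value n samples roughly 1 in n events, which is cheaper in production: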

runtime.SetMutexProfileFraction(1)

Then inspect:

go tool pprof http://localhost:6060/debug/pprof/mutex

2. Check goroutine count

A rising goroutine count is often a signal of blocked goroutines or leaks.

fmt.Println("goroutines:", runtime.NumGoroutine())

For production, expose it as a metric:

// Register the gauge; an unregistered collector is never exported.
prometheus.MustRegister(prometheus.NewGaugeFunc(
    prometheus.GaugeOpts{
        Name: "go_goroutines_current",
        Help: "Current number of goroutines.",
    },
    func() float64 {
        return float64(runtime.NumGoroutine())
    },
))

3. Dump goroutine stacks

When the service is stuck, goroutine dumps are gold.

curl "http://localhost:6060/debug/pprof/goroutine?debug=2"

Look for many goroutines parked in the same state or stack frame, for example:

sync.(*Mutex).Lock
chan send
chan receive
net/http.(*Transport).RoundTrip

If 5,000 goroutines are blocked on the same lock or channel, you found your bottleneck.
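
When the HTTP endpoint is not reachable, the same dump can be produced from inside the process with the runtime/pprof package. The sketch below wraps it in a hypothetical goroutineDump helper; where the text ends up (log file, signal handler, admin command) is left as an assumption.

```go
package main

import (
    "bytes"
    "fmt"
    "runtime/pprof"
)

// goroutineDump returns the stacks of all current goroutines as text.
func goroutineDump() string {
    var buf bytes.Buffer
    // "goroutine" is one of the predefined profiles; debug=2 prints full stacks.
    if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
        return "dump failed: " + err.Error()
    }
    return buf.String()
}

func main() {
    fmt.Print(goroutineDump())
}
```

Grouping the output by the top stack frame is usually enough to see which lock or channel most goroutines are waiting on.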

4. Use the race detector in tests

go test -race ./...

The race detector is not free, and you usually do not run it in production, but it is extremely useful in CI and local debugging.


My Practical Rules for Production Go Concurrency

These are the rules I try to follow when writing or reviewing concurrent Go code:

1. Keep locks small

Lock only the data that needs protection.

Do not lock the whole request lifecycle.

2. Never put slow I/O inside a mutex

Avoid database calls, HTTP calls, email sending, file uploads, and third-party API calls inside critical sections.

3. Bound concurrency

Do not create unlimited goroutines.

Use worker pools, semaphores, queues, or rate limiters.

4. Every goroutine needs a shutdown path

Use context.Context, channel close, or explicit cancellation.

5. Do not copy synchronization primitives

Pass *sync.WaitGroup, *sync.Mutex, and similar primitives by pointer when sharing them.

6. Measure before optimizing

Use pprof, runtime metrics, traces, logs, and goroutine dumps.

Guessing is not debugging.

7. Prefer boring concurrency

The best concurrent code is usually not clever.

It is clear, measurable, and easy to shut down.


Final Thoughts

Go gives us powerful concurrency tools, but it does not automatically give us good concurrent design.

A goroutine is cheap, but it is not free.

A mutex is fast, but it can destroy throughput if you hold it around slow work.

A channel is elegant, but it can leak goroutines if nobody is receiving.

A WaitGroup is simple, but copying it can break your entire flow.

For me, senior Go engineering is not about using every concurrency primitive. It is about knowing when not to use them, where the real boundary is, and how the system behaves under load.

The next time you write this:

mu.Lock()

Ask one question before moving on:

What exactly am I protecting, and how fast can I release this lock?

That one question can save your service from a silent production bottleneck.


References