Graceful upgrades in Go
Cloudflare Team · 2018-10-11 · via The Cloudflare Blog

Dingle Dangle! by Grant C. (CC-BY 2.0)

The idea behind graceful upgrades is to swap out the configuration and code of a process while it is running, without anyone noticing it. If this sounds error-prone, dangerous, undesirable and in general a bad idea – I’m with you. However, sometimes you really need them. Usually this happens in an environment where there is no load balancing layer. We have these at Cloudflare, which led to us investigating and implementing various solutions to this problem.

Coincidentally, implementing graceful upgrades involves some fun low-level systems programming, which is probably why there are already a bajillion options out there. Read on to learn what trade-offs there are, and why you should really use the Go library we are about to open source. For the impatient, the code is on GitHub and you can read the documentation on godoc.

The basics

So what does it mean for a process to perform a graceful upgrade? Let’s use a web server as an example: we want to be able to fire HTTP requests at it, and never see an error because a graceful upgrade is happening.

We know that HTTP uses TCP under the hood, and that we interface with TCP using the BSD socket API. We have told the OS that we’d like to receive connections on port 80, and the OS has given us a listening socket, on which we call Accept() to wait for new clients.

A new client will be refused if the OS doesn’t know of a listening socket for port 80, or nothing is calling Accept() on it. The trick of a graceful upgrade is to make sure that neither of these two things occurs while we somehow restart our service. Let’s look at all the ways we could achieve this, from simple to complex.
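To make the two moving parts concrete, here is a minimal sketch of a listening socket plus an Accept() loop in Go. It is an illustration rather than anything from our code base: the helper names serveEcho and roundTrip are made up for this example, and it listens on an ephemeral loopback port instead of port 80 so that it runs without privileges.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// serveEcho accepts connections on ln until it is closed and echoes
// back whatever each client sends.
func serveEcho(ln net.Listener) {
	for {
		conn, err := ln.Accept() // blocks until a client connects
		if err != nil {
			return // the listener was closed
		}
		go func(c net.Conn) {
			defer c.Close()
			io.Copy(c, c)
		}(conn)
	}
}

// roundTrip starts a listener on an ephemeral loopback port, sends msg
// through it and returns the echoed reply.
func roundTrip(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go serveEcho(ln)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(conn, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	reply, err := roundTrip("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reply)
}
```

If the listener is closed, or the loop in serveEcho stops calling Accept(), clients start to see errors. Those are exactly the two failure modes a graceful upgrade has to avoid.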

Just Exec()

Ok, how hard can it be. Let’s just Exec() the new binary (without doing a fork first). This does exactly what we want, by replacing the currently running code with the new code from disk.

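// The following is pseudo-Go: isUpgrade, fdNumber, network, address,
// handleRequests and waitForUpgradeRequest are placeholders.

func main() {
	var (
		ln  net.Listener
		err error
	)
	if isUpgrade {
		// Rebuild the listener from the descriptor that survived the exec.
		ln, err = net.FileListener(os.NewFile(uintptr(fdNumber), "listener"))
	} else {
		ln, err = net.Listen(network, address)
	}
	if err != nil {
		log.Fatal(err)
	}

	go handleRequests(ln)

	<-waitForUpgradeRequest

	// Replace this process image with the binary on disk. argv must
	// include the program name, so pass os.Args in full.
	syscall.Exec(os.Args[0], os.Args, os.Environ())
}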

Unfortunately this has a fatal flaw since we can’t “undo” the exec. Imagine a configuration file with too much white space in it or an extra semicolon. The new process would try to read that file, get an error and exit, and by then the old process is gone, so nothing is left to serve requests.

Even if the exec call works, this solution assumes that initialisation of the new process is practically instantaneous. We can get into a situation where the kernel refuses new connections because the listen queue is overflowing.

New connections may be dropped if Accept() is not called regularly enough

Specifically, the new binary is going to spend some time initialising after Exec(), which delays calls to Accept(). This means the backlog of new connections grows until some are dropped. Plain exec is out of the game.

Listen() all the things

Since just using exec is out of the question, we can try the next best thing. Let’s fork and exec a new process which then goes through its usual start up routine. At some point it will create a few sockets by listening on some addresses, except that won’t work out-of-the-box due to errno 48 (EADDRINUSE on macOS and the BSDs; Linux uses 98), otherwise known as Address Already In Use. The kernel is preventing us from listening on the address and port combination used by the old process.

Of course, there is a flag to fix that: SO_REUSEPORT. This tells the kernel to ignore the fact that there is already a listening socket for a given address and port, and just allocate a new one.

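// Pseudo-Go again: ListenWithReusePort stands in for a helper that sets
// SO_REUSEPORT before binding; the net package has no such function.

func main() {
	ln := net.ListenWithReusePort(network, address)

	go handleRequests(ln)

	<-waitForUpgradeRequest

	// Fork and exec the new binary with the same arguments.
	cmd := exec.Command(os.Args[0], os.Args[1:]...)
	cmd.Start()

	<-waitForNewProcess
}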

Now both processes have working listening sockets and the upgrade works. Right?

SO_REUSEPORT is a little bit peculiar in what it does inside the kernel. As systems programmers, we tend to think of a socket as the file descriptor that is returned by the socket call. The kernel however makes a distinction between the data structure of a socket, and one or more file descriptors pointing at it. It creates a separate socket structure if you bind using SO_REUSEPORT, not just another file descriptor. The old and the new process are thus referring to two separate sockets, which happen to share the same address. This leads to an unavoidable race condition: new-but-not-yet-accepted connections on the socket used by the old process will be orphaned and terminated by the kernel. GitHub wrote an excellent blog post about this problem.
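The ListenWithReusePort call in the pseudo-Go above is not a real function. A minimal sketch of how such a helper could look uses net.ListenConfig with a Control hook. Two assumptions are baked in: the code targets Linux on x86 or ARM, where SO_REUSEPORT has the value 0xf (the constant is spelled out because the frozen syscall package does not export it on every Linux architecture), and the names listenReusePort and sharedAddress are invented for this example.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
)

// soReusePort is SO_REUSEPORT on Linux for x86 and ARM. This is an
// assumption of the sketch: other platforms use different values.
const soReusePort = 0xf

// listenReusePort sets SO_REUSEPORT on the socket before it is bound.
func listenReusePort(network, address string) (net.Listener, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			err := c.Control(func(fd uintptr) {
				sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, soReusePort, 1)
			})
			if err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), network, address)
}

// sharedAddress binds two independent sockets to the same address and
// returns the address reported by each of them.
func sharedAddress() (string, string, error) {
	first, err := listenReusePort("tcp", "127.0.0.1:0")
	if err != nil {
		return "", "", err
	}
	defer first.Close()

	// Without SO_REUSEPORT this second bind fails with EADDRINUSE.
	second, err := listenReusePort("tcp", first.Addr().String())
	if err != nil {
		return "", "", err
	}
	defer second.Close()

	return first.Addr().String(), second.Addr().String(), nil
}

func main() {
	a, b, err := sharedAddress()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("first socket: ", a)
	fmt.Println("second socket:", b)
}
```

Both listeners report the same address, yet each has its own accept queue inside the kernel. Closing the first one discards whatever is still waiting in its queue, which is the race described above.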

The engineers at GitHub solved the problems with SO_REUSEPORT by using an obscure feature of the sendmsg syscall called ancillary data. It turns out that ancillary data can include file descriptors. Using this API made sense for GitHub, since it allowed them to integrate elegantly with HAProxy. Since we have the luxury of changing the program, we can use simpler alternatives.
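For the curious, passing a file descriptor as ancillary data looks roughly like the sketch below. This is not GitHub's code, just an illustration built on Go's syscall package. To stay self-contained it sends the read end of a pipe across a socket pair inside a single process; in a real setup the two ends of the UNIX socket would live in different processes. The helper names sendFD, recvFD and passPipe are made up.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// sendFD sends fd over the UNIX socket sock as SCM_RIGHTS ancillary data.
func sendFD(sock, fd int) error {
	rights := syscall.UnixRights(fd)
	// One byte of ordinary data accompanies the control message.
	return syscall.Sendmsg(sock, []byte{0}, rights, nil, 0)
}

// recvFD receives a single file descriptor from sock.
func recvFD(sock int) (int, error) {
	buf := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit descriptor
	_, oobn, _, _, err := syscall.Recvmsg(sock, buf, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	if len(msgs) != 1 {
		return -1, fmt.Errorf("expected 1 control message, got %d", len(msgs))
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return -1, err
	}
	if len(fds) != 1 {
		return -1, fmt.Errorf("expected 1 descriptor, got %d", len(fds))
	}
	return fds[0], nil
}

// passPipe sends the read end of a pipe across a socket pair and reads
// msg back through the received copy of the descriptor.
func passPipe(msg string) (string, error) {
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(pair[0])
	defer syscall.Close(pair[1])

	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()

	if err := sendFD(pair[0], int(r.Fd())); err != nil {
		return "", err
	}
	fd, err := recvFD(pair[1])
	if err != nil {
		return "", err
	}
	received := os.NewFile(uintptr(fd), "received-pipe")
	defer received.Close()

	if _, err := w.WriteString(msg); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(received, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := passPipe("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

The receiver ends up with a new descriptor number that refers to the same open file as the sender's. Applied to a listening socket, this means both sides share one accept queue, so no connection is orphaned.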

NGINX is the tried and trusted workhorse of the Internet, and happens to support graceful upgrades. As a bonus we also use it at Cloudflare, so we were confident in its implementation.

It is written in a process-per-core model, which means that instead of spawning a bunch of threads NGINX runs a process per logical CPU core. Additionally, there is a primary process which orchestrates graceful upgrades.

The primary is responsible for creating all listen sockets used by NGINX and sharing them with the workers. This is fairly straightforward: first, the FD_CLOEXEC bit is cleared on all listen sockets. This means that they are not closed when the exec() syscall is made. The primary then does the customary fork() / exec() dance to spawn the workers, passing the file descriptor numbers as an environment variable.
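In Go, the same idea can be sketched with exec.Cmd.ExtraFiles, which hands descriptors to the child with close-on-exec cleared. The snippet below is an assumption-laden illustration and not how NGINX does it: the environment variable name LISTEN_FD and the helpers childCommand, inheritedListener and sameSocket are invented. It prepares the child command without starting it, then rebuilds a listener from the duplicated descriptor inside the same process to show that both refer to one socket.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"os/exec"
	"strconv"
)

// listenFDEnv is a hypothetical variable name chosen for this sketch.
const listenFDEnv = "LISTEN_FD"

// childCommand prepares, but does not start, a command that inherits ln.
func childCommand(path string, ln *net.TCPListener) (*exec.Cmd, error) {
	// File returns a duplicate of the listening socket's descriptor.
	f, err := ln.File()
	if err != nil {
		return nil, err
	}
	cmd := exec.Command(path)
	// Entry i of ExtraFiles becomes descriptor 3+i in the child and
	// survives the exec.
	cmd.ExtraFiles = []*os.File{f}
	cmd.Env = append(os.Environ(), listenFDEnv+"=3")
	return cmd, nil
}

// inheritedListener is what a child would call with the value of
// LISTEN_FD to turn the descriptor number back into a listener.
func inheritedListener(value string) (net.Listener, error) {
	fd, err := strconv.ParseUint(value, 10, 32)
	if err != nil {
		return nil, err
	}
	f := os.NewFile(uintptr(fd), "inherited-listener")
	if f == nil {
		return nil, fmt.Errorf("invalid descriptor %d", fd)
	}
	defer f.Close() // FileListener works on its own duplicate
	return net.FileListener(f)
}

// sameSocket reports whether a listener rebuilt from the duplicated
// descriptor refers to the same address as the original.
func sameSocket() (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer ln.Close()

	cmd, err := childCommand(os.Args[0], ln.(*net.TCPListener))
	if err != nil {
		return false, err
	}
	defer cmd.ExtraFiles[0].Close()

	// A real upgrade would call cmd.Start() here. As a stand-in for the
	// child, rebuild the listener from the duplicate in this process.
	inherited, err := net.FileListener(cmd.ExtraFiles[0])
	if err != nil {
		return false, err
	}
	defer inherited.Close()

	return inherited.Addr().String() == ln.Addr().String(), nil
}

func main() {
	ok, err := sameSocket()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("same socket:", ok)
}
```

Because parent and child hold descriptors for a single socket structure, connections that arrive during the hand-over wait in one shared queue rather than being split across two.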

Graceful upgrades make use of the same mechanism. We can spawn a new primary process (PID 1176) by following the NGINX documentation. This inherits the existing listeners from the old primary process (PID 1017) just like workers do. The new primary then spawns its own workers:

   CGroup: /system.slice/nginx.service
           ├─1017 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
           ├─1019 nginx: worker process
           ├─1021 nginx: worker process
           ├─1024 nginx: worker process
           ├─1026 nginx: worker process
           ├─1027 nginx: worker process
           ├─1028 nginx: worker process
           ├─1029 nginx: worker process
           ├─1030 nginx: worker process
           ├─1176 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
           ├─1187 nginx: worker process
           ├─1188 nginx: worker process
           ├─1190 nginx: worker process
           ├─1191 nginx: worker process
           ├─1192 nginx: worker process
           ├─1193 nginx: worker process
           ├─1194 nginx: worker process
           └─1195 nginx: worker process

At this point there are two completely independent NGINX processes running. PID 1176 might be a new version of NGINX, or could use an updated config file. When a new connection arrives for port 80, one of the 16 worker processes is chosen by the kernel.

After executing the remaining steps, we end up with a fully replaced NGINX:

   CGroup: /system.slice/nginx.service
           ├─1176 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
           ├─1187 nginx: worker process
           ├─1188 nginx: worker process
           ├─1190 nginx: worker process
           ├─1191 nginx: worker process
           ├─1192 nginx: worker process
           ├─1193 nginx: worker process
           ├─1194 nginx: worker process
           └─1195 nginx: worker process

Now, when a request arrives, the kernel chooses one of the eight remaining worker processes.

This process is rather fickle, so NGINX has a safeguard in place. Try requesting a second upgrade while the first hasn’t finished, and you’ll find the following message in the error log:

[crit] 1176#1176: the changing binary signal is ignored: you should shutdown or terminate before either old or new binary's process

This is very sensible: there is no good reason why there should be more than two processes at any given point in time. Ideally, we want this behaviour from our Go solution as well.

Graceful upgrade wishlist

The way NGINX has implemented graceful upgrades is very nice. There is a clear life cycle which determines valid actions at any point in time.

It also solves the problems we’ve identified with the other approaches. Really, we’d like NGINX-style graceful upgrades as a Go library.

  • No old code keeps running after a successful upgrade

  • The new process can crash during initialisation, without bad effects

  • Only a single upgrade is active at any point in time
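The last item on the list, only one upgrade at a time, is easy to sketch in isolation. The following is a minimal illustration with a mutex-protected flag. It is an assumption about how such a guard could look, not a description of any particular library; upgradeGuard and errUpgradeInProgress are names made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errUpgradeInProgress = errors.New("upgrade already in progress")

// upgradeGuard lets at most one upgrade run at a time.
type upgradeGuard struct {
	mu      sync.Mutex
	running bool
}

// run executes upgrade unless another one is still in flight, in which
// case it returns errUpgradeInProgress without calling upgrade.
func (g *upgradeGuard) run(upgrade func() error) error {
	g.mu.Lock()
	if g.running {
		g.mu.Unlock()
		return errUpgradeInProgress
	}
	g.running = true
	g.mu.Unlock()

	defer func() {
		g.mu.Lock()
		g.running = false
		g.mu.Unlock()
	}()
	return upgrade()
}

func main() {
	var g upgradeGuard
	started := make(chan struct{})
	release := make(chan struct{})
	done := make(chan error)

	go func() {
		done <- g.run(func() error {
			close(started)
			<-release // pretend the new process is still initialising
			return nil
		})
	}()

	<-started
	fmt.Println("second attempt:", g.run(func() error { return nil }))
	close(release)
	fmt.Println("first attempt:", <-done)
	fmt.Println("third attempt:", g.run(func() error { return nil }))
}
```

The second attempt is rejected while the first is still running, much like the NGINX error message above. Once the first attempt has finished, a new upgrade is allowed again.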

Of course, the Go community has produced some fine libraries just for this occasion. We looked at a few of them. Each is different in its implementation and trade-offs, but none of them ticked all of our boxes. The most common problem is that they are designed to gracefully upgrade an HTTP server. This makes their API much nicer, but removes the flexibility that we need to support other socket-based protocols. So really, there was absolutely no choice but to write our own library, called tableflip. Having fun was not part of the equation.

tableflip

tableflip is a Go library for NGINX-style graceful upgrades. Here is what using it looks like:

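package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"

	"github.com/cloudflare/tableflip"
)

func main() {
	upg, err := tableflip.New(tableflip.Options{})
	if err != nil {
		log.Fatal(err)
	}
	defer upg.Stop()

	// Do an upgrade on SIGHUP
	go func() {
		sig := make(chan os.Signal, 1)
		signal.Notify(sig, syscall.SIGHUP)
		for range sig {
			if err := upg.Upgrade(); err != nil {
				log.Println("upgrade failed:", err)
			}
		}
	}()

	// Start a HTTP server
	ln, err := upg.Fds.Listen("tcp", "localhost:8080")
	if err != nil {
		log.Fatal(err)
	}
	server := http.Server{}
	go server.Serve(ln)

	// Tell the parent we are ready
	if err := upg.Ready(); err != nil {
		log.Fatal(err)
	}

	// Wait to be replaced with a new process
	<-upg.Exit()

	// Wait for connections to drain.
	server.Shutdown(context.TODO())
}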

Calling Upgrader.Upgrade spawns a new process with the necessary net.Listeners, and waits for the new process to signal that it has finished initialisation, to die or to time out. Calling it when an upgrade is ongoing returns an error.
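The three outcomes described here (ready, dead, timed out) map naturally onto a select statement. The sketch below shows the general shape under the assumption that readiness and exit are delivered on channels. It is not tableflip's actual implementation, and waitForChild, errChildExited and errTimeout are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var (
	errChildExited = errors.New("child exited before becoming ready")
	errTimeout     = errors.New("child did not become ready in time")
)

// waitForChild blocks until the child reports readiness, exits, or the
// timeout elapses, whichever happens first.
func waitForChild(ready, exited <-chan struct{}, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-ready:
		return nil // the upgrade succeeded, the parent may exit
	case <-exited:
		return errChildExited // keep serving with the old process
	case <-timer.C:
		return errTimeout // the caller should kill the child
	}
}

func main() {
	ready := make(chan struct{})
	exited := make(chan struct{})

	// Simulate a child that finishes initialising after 10ms.
	go func() {
		time.Sleep(10 * time.Millisecond)
		close(ready)
	}()
	fmt.Println("upgrade result:", waitForChild(ready, exited, time.Second))

	// Simulate a child that never reports back.
	fmt.Println("upgrade result:", waitForChild(nil, nil, 20*time.Millisecond))
}
```

Only the first case lets the old process go away. In the other two the old process keeps its listeners and carries on, which is what makes a crash during initialisation harmless.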

Upgrader.Fds.Listen is inspired by facebookgo/grace and allows inheriting net.Listener easily. Behind the scenes, Fds makes sure that unused inherited sockets are cleaned up. This includes UNIX sockets, which are tricky due to UnlinkOnClose. You can also pass straight up *os.File to the new process if you desire.
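To see why UnlinkOnClose matters, consider what happens to the socket file when a UNIX listener is closed. By default, a listener created with net.Listen removes the file on Close, which would pull the path out from under a new process that has inherited the same socket. The sketch below demonstrates the difference using the standard library only; socketSurvivesClose is a made-up helper and the temporary directory is just for the demo.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// socketSurvivesClose creates a UNIX listener, closes it, and reports
// whether the socket file is still present afterwards.
func socketSurvivesClose(unlink bool) (bool, error) {
	dir, err := os.MkdirTemp("", "unlink-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "demo.sock")
	ln, err := net.Listen("unix", path)
	if err != nil {
		return false, err
	}
	ln.(*net.UnixListener).SetUnlinkOnClose(unlink)
	if err := ln.Close(); err != nil {
		return false, err
	}

	_, err = os.Stat(path)
	if os.IsNotExist(err) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	for _, unlink := range []bool{true, false} {
		survives, err := socketSurvivesClose(unlink)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("unlink on close = %v, socket file survives = %v\n", unlink, survives)
	}
}
```

With unlinking enabled the file disappears on Close; with it disabled the file stays. During an upgrade the old process must not remove a path that the new process is still serving, so this setting has to be handled with care.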

Finally, Upgrader.Ready cleans up unused fds and signals the parent process that initialisation is done. The parent can then exit, which completes the graceful upgrade cycle.
