惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

S
Secure Thoughts
S
Securelist
P
Proofpoint News Feed
D
DataBreaches.Net
Cisco Talos Blog
Cisco Talos Blog
C
CXSECURITY Database RSS Feed - CXSecurity.com
Project Zero
Project Zero
A
About on SuperTechFans
罗磊的独立博客
WordPress大学
WordPress大学
月光博客
月光博客
Latest news
Latest news
C
Cyber Attacks, Cyber Crime and Cyber Security
GbyAI
GbyAI
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
博客园 - 三生石上(FineUI控件)
F
Fortinet All Blogs
W
WeLiveSecurity
Attack and Defense Labs
Attack and Defense Labs
V
Visual Studio Blog
Blog — PlanetScale
Blog — PlanetScale
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
P
Privacy International News Feed
AI
AI
博客园 - 司徒正美
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
www.infosecurity-magazine.com
www.infosecurity-magazine.com
Stack Overflow Blog
Stack Overflow Blog
M
MIT News - Artificial intelligence
Help Net Security
Help Net Security
T
Tor Project blog
V
Vulnerabilities – Threatpost
C
Cisco Blogs
I
Intezer
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
MyScale Blog
MyScale Blog
雷峰网
雷峰网
MongoDB | Blog
MongoDB | Blog
Forbes - Security
Forbes - Security
V
V2EX
Apple Machine Learning Research
Apple Machine Learning Research
T
Threat Research - Cisco Blogs
B
Blog RSS Feed
博客园 - 叶小钗
N
News and Events Feed by Topic
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
Simon Willison's Weblog
Simon Willison's Weblog
C
CERT Recently Published Vulnerability Notes
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
N
News and Events Feed by Topic

The latest news from GitHub - The GitHub Blog

GitHub availability report: May 2026 GitHub Universe is back: All together now, in the agentic era GitHub Copilot app: The agent-native desktop experience Still a developer. Just outside. Our latest GitHub Shop collection is here. GitHub availability report: April 2026 GitHub Copilot individual plans: Introducing flex allotments in Pro and Pro+, and a new Max plan GitHub Copilot is moving to usage-based billing Changes to GitHub Copilot Individual plans Bringing more transparency to GitHub’s status page GitHub availability report: March 2026 GitHub Universe is back: We want you to take the stage Updates to GitHub Copilot interaction data usage policy
An update on GitHub availability
Vlad Fedorov · 2026-04-28 · via The latest news from GitHub - The GitHub Blog

I wanted to give an update on GitHub’s availability in light of two recent incidents. Both of those incidents are not acceptable, and we are sorry for the impact they had on you. I wanted to share some details on them, as well as explain what we’ve done and what we’re doing to improve our reliability.

We started executing our plan to increase GitHub’s capacity by 10X in October 2025 with a goal of substantially improving reliability and failover. By February 2026, it was clear that we needed to design for a future that requires 30X today’s scale.

The main driver is a rapid change in how software is being built. Since the second half of December 2025, agentic development workflows have accelerated sharply. By nearly every measure, the direction is already clear: repository creation, pull request activity, API usage, automation, and large-repository workloads are all growing quickly.

Three line graphs showing record acceleration of pull requests merged (peaking at 90M), commits (peaking at 1.4B), and new repos per month (20M).

This exponential growth does not stress one system at a time. A pull request can touch Git storage, mergeability checks, branch protection, GitHub Actions, search, notifications, permissions, webhooks, APIs, background jobs, caches, and databases. At high scale, small inefficiencies compound: queues deepen, cache misses become database load, indexes fall behind, retries amplify traffic, and one slow dependency can affect several product experiences.

Our priorities are clear: availability first, then capacity, then new features. We are reducing unnecessary work, improving caching, isolating critical services, removing single points of failure, and moving performance-sensitive paths into systems designed for these workloads. This is distributed systems work: reducing hidden coupling, limiting blast radius, and making GitHub degrade gracefully when one subsystem is under pressure. We’re making progress quickly, but these incidents are examples of where there’s still work to do.

What we’re doing

Short term, we had to resolve a variety of bottlenecks that appeared faster than expected from moving webhooks to a different backend (out of MySQL), redesigning user session cache to redoing authentication and authorization flows to substantially reduce database load. We also leveraged our migration to Azure to stand up a lot more compute.

Next we focused on isolating critical services like git and GitHub Actions from other workloads and minimizing the blast radius by minimizing single points of failure. This work started with careful analysis of dependencies and different tiers of traffic to understand what needs to be pulled apart and how we can minimize impact on legitimate traffic from various attacks. Then we addressed those in order of risk. Similarly, we accelerated parts of migrating performance or scale sensitive code out of Ruby monolith into Go.

While we were already in progress of migrating out of our smaller custom data centers into public cloud, we started working on path to multi cloud. This longer-term measure is necessary to achieve the level of resilience, low latency, and flexibility that will be needed in the future.

The number of repositories on GitHub is growing faster than ever, but a much harder scaling challenge is the rise of large monorepos. For the last three months, we’ve been investing heavily in response to this trend both within git system and in the pull request experience.

We will have a separate blog post soon describing extensive work we’ve done and the new upcoming API design for greater efficiency and scale. As part of this work, we have invested in optimizing merge queue operations, since that is key for repos that have many thousands of pull requests a day.

Recent incidents

The two recent incidents were different in cause and impact, but both reflect why we are increasing our focus on availability, isolation, and blast-radius reduction.

April 23 merge queue incident

On April 23, pull requests experienced a regression affecting merge queue operations.

Pull requests merged through merge queue using the squash merge method produced incorrect merge commits when a merge group contained more than one pull request. In affected cases, changes from previously merged pull requests and prior commits were inadvertently reverted by subsequent merges.

During the impact window, 658 repositories and 2,092 pull requests were affected. We initially shared slightly higher numbers because our first assessment was intentionally conservative. The issue did not affect pull requests merged outside merge queue, nor did it affect merge queue groups using merge or rebase methods.

There was no data loss: all commits remained stored in Git. However, the state of affected default branches was incorrect, and we could not safely repair every repository automatically. More details are available in the incident root cause analysis.

This incident exposed multiple process failures, and we are changing those processes to prevent this class of issue from recurring.

On April 27, an incident affected our Elasticsearch subsystem, which powers several search-backed experiences across GitHub, including parts of pull requests, issues, and projects.

We are still completing the root cause analysis and will publish it shortly. What we know now is that the cluster became overloaded (likely due to a botnet attack) and stopped returning search results. There was no data loss, and Git operations and APIs were not impacted. However, parts of the UI that depended on search showed no results, which caused a significant disruption.

This is one of the systems we had not yet fully isolated to eliminate as a single point of failure, because other areas had been higher in our risk-prioritized reliability work. That impact is unacceptable, and we are using the same dependency and blast-radius analysis described above to reduce the likelihood and impact of this type of failure in the future.

Increasing transparency

We have also heard clear feedback that customers need greater transparency during incidents.

We recently updated the GitHub status page to include availability numbers. We have also committed to statusing incidents both large and small, so you do not have to guess whether an issue is on your side or ours.

We are continuing to improve how we categorize incidents so that the scale and scope are easier to understand. We are also working on better ways for customers to report incidents and share signals with us during disruptions.

Our commitment

GitHub’s role has always been to support developers on an open and extensible platform.

The team at GitHub is incredibly passionate about our work. We hear the pain you’re experiencing. We read every email, social post, support ticket, and we take it all to heart. We’re sorry.

We are committed to improving availability, increasing resilience, scaling for the future of software development, and communicating more transparently along the way.

Editor’s note: This post was updated on April 28, 2026, to update the number of repos affected during the April 23 incident.

Written by

Vlad Fedorov

Vladimir Fedorov is GitHub's Chief Technology Officer, bringing decades of experience in engineering leadership and innovation. A passionate advocate for developer productivity, Vlad is leading GitHub’s engineering team to shape the future of developer tools and innovation with a developer-first mindset.

Before joining GitHub, Vlad co-founded UserClouds, a startup specializing in data governance and privacy. He spent 12 years at Facebook, now Meta, as Senior Vice President, leading engineering teams of over 2,000 across Privacy, Ads, and Platform. Earlier in his career, Vlad worked at Microsoft and earned both his BS and MS in Computer Science from Caltech. He currently serves on the board of Codepath.org, an organization dedicated to reprogramming higher education to create the first AI-native generation of engineers, CTOs, and founders.

Vlad lives in the Bay Area and when not working enjoys spending time outside and on the water with his family.

Explore more from GitHub

Docs

Docs

Everything you need to master GitHub, all in one place.

Go to Docs

GitHub

GitHub

Build what’s next on GitHub, the place for anyone from anywhere to build anything.

Start building

Customer stories

Customer stories

Meet the companies and engineering teams that build with GitHub.

Learn more

The GitHub Podcast

The GitHub Podcast

Catch up on the GitHub podcast, a show dedicated to the topics, trends, stories and culture in and around the open source developer community on GitHub.

Listen now