When cloud giants neglect resilience
2026-04-17 · via InfoWorld

In a recent article chronicling the history of Microsoft Azure and its intensifying woes, we see a narrative that has been building throughout the industry for years. As cloud computing evolved from a buzzword to the backbone of digital infrastructure, major providers like Microsoft, Amazon, and Google have had to make compromises. Near-perfect uptime has slipped from an expectation to “good enough,” under economic pressures that have led the cloud giants to prioritize cost cuts and staff reductions over service reliability that was once non-negotiable.

Frankly, many of us who follow the cloud space closely have been warning about this situation for some time. Cloud outages are no longer rare, freak events. They are ingrained in the model as the accepted collateral damage of the rapid growth and relentless cost-cutting that define this era of cloud computing. The story of Azure, as discussed in the referenced Register piece, is simply the latest and most prominent example of a much larger, industrywide trend.

This is not to say that cloud computing is inherently unstable or that its advantages—agility, scalability, rapid deployment—are a mirage. Enterprises aren’t abandoning the cloud. Far from it. Adoption continues at pace, even as these high-profile outages occur. The question is not whether the cloud is worth it, but rather, how much unreliability is acceptable for all that innovation and efficiency?

The price of cost optimization

If you trace the decisions of the major public cloud players, a clear theme emerges. Competitive pressure translates into constant cost control: rushing services to market, shaving operational budgets, automating wherever possible, and reducing (or outright eliminating) the teams of deeply experienced engineers who once ensured continuity and held institutional knowledge. The comments from a former Azure engineer clearly illustrate how an exodus of talent, paired with an almost single-minded focus on AI and automation, is having downstream effects on the platform’s stability and support.

The irony is sharp: As cloud providers trumpet their AI prowess and machine-driven automation, the human expertise that built and reliably ran these platforms is no longer considered mission-critical. Automation isn’t a cure-all; companies still need experienced architects and operators who understand system limits, manage dependencies, handle failures, and respond deftly to unpredictable events. Recent major outages reflect the slow but sure loss of that deeply embedded human knowledge. Meanwhile, engineering decisions are increasingly made by people juggling ever-larger portfolios, new feature launches, and cost-reduction mandates, rather than by people with a methodical focus on resilience and craftsmanship.

Azure faces growing pains at scale, with tens of thousands of lines of AI-generated code created, tested, and deployed daily (sometimes by other AI agents), creating a self-reinforcing cycle of complexity and opacity. The resulting “compute crunch” puts even more strain on infrastructure, which, despite its sophistication, now handles heavier loads with fewer people providing oversight.

Outages aren’t driving users away

A natural question emerges: With reliability clearly taking a back seat, why aren’t enterprises reconsidering cloud altogether? I’ve argued for years that the game has changed. The benefits of cloud centralization, automation, and connectivity have become so fundamental to operations that the industry has quietly recalibrated its tolerance for outages. Public cloud is so deeply embedded into the business and digital operations that stepping back would mean undoing years, and often decades, of progress.

Headline-grabbing outages are dramatic but usually survivable. Disaster recovery plans, multi-region deployments, and architectural workarounds are now essentials for all major cloud-based companies. Building with failure in mind is a standard cost, not an avoidable exception. For most CIOs, the persistent risk of downtime is a manageable variable, balanced against the unmatchable benefits of cloud agility and in-house scale.

Providers know this well, and their actions reflect it. Outages may sting a bit in the press, but the real-world consequences have yet to outweigh the benefits to companies that push further into the cloud. As such, the providers’ logic is simple: As long as customers accept outages, however grudgingly, there’s little incentive to switch to costlier, less scalable systems.

How enterprises can adapt

With outages now the price of admission, enterprises should recognize that neither staff cuts nor the blind pursuit of automation will stop anytime soon. Cloud providers may promise improvements, but their incentives will remain focused on cost control over reliability. Organizations must adapt to this new normal, but they can still make choices that reduce their risk.

First, enterprises should prioritize fault-tolerant cloud architecture. Adopting multicloud and hybrid cloud strategies, while complex, reduces the technical risk associated with reliance on a single provider.
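To make the idea of not relying on a single provider concrete, here is a minimal Python sketch of ordered failover. It is an illustration under stated assumptions, not a recommended production design: the function `call_with_failover`, the exception `AllProvidersFailed`, and the provider names in the demo are all hypothetical, and the "providers" are plain callables standing in for real service clients.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


class AllProvidersFailed(Exception):
    """Raised when every configured provider fails (hypothetical name)."""


def call_with_failover(
    providers: Sequence[Tuple[str, Callable[[], T]]],
) -> Tuple[str, T]:
    """Try each (name, call) pair in order and return the first success.

    Returns the name of the provider that answered along with its result,
    so callers can log which path actually served the request.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


if __name__ == "__main__":
    # Hypothetical stand-ins for clients of two different cloud providers.
    def primary_cloud() -> str:
        raise ConnectionError("region unavailable")

    def secondary_cloud() -> str:
        return "served from secondary"

    served_by, result = call_with_failover(
        [("primary", primary_cloud), ("secondary", secondary_cloud)]
    )
    print(served_by, result)
```

The sketch deliberately leaves out the hard parts that make multicloud complex in practice, such as keeping data consistent across providers, health checks, and timeouts.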

Second, it’s crucial to invest in in-house expertise that understands both the workloads and the nuances of cloud service behavior. While the providers may treat their operations talent as expendable, nothing will replace the value of an enterprise’s in-house team to independently monitor, test, and prepare for the unexpected.

Finally, enterprises must enforce strict vendor management. This means holding providers accountable to their service-level agreements, insisting on transparency in communication and incident reporting, and leveraging contracted services to their fullest extent, especially as the cloud market matures and customer influence grows.
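Holding a provider to a service-level agreement starts with knowing how much downtime a given uptime percentage actually allows. The Python sketch below works through that arithmetic. The function names, the 30-day period, and the sample outage durations are illustrative assumptions; real SLAs define their own measurement windows, exclusions, and credit rules, so treat this as a rough check rather than a contractual calculation.

```python
from typing import Iterable

MINUTES_PER_30_DAYS = 30 * 24 * 60  # 43,200 minutes in an assumed 30-day period


def allowed_downtime_minutes(
    sla_percent: float, period_minutes: float = MINUTES_PER_30_DAYS
) -> float:
    """Downtime budget implied by an uptime percentage over the period."""
    return period_minutes * (100.0 - sla_percent) / 100.0


def measured_availability_percent(
    outage_minutes: Iterable[float], period_minutes: float = MINUTES_PER_30_DAYS
) -> float:
    """Availability actually observed, given a list of outage durations."""
    return 100.0 * (1.0 - sum(outage_minutes) / period_minutes)


def sla_breached(
    sla_percent: float,
    outage_minutes: Iterable[float],
    period_minutes: float = MINUTES_PER_30_DAYS,
) -> bool:
    """True when total observed downtime exceeds the downtime budget."""
    total = sum(outage_minutes)
    return total > allowed_downtime_minutes(sla_percent, period_minutes)


if __name__ == "__main__":
    # A 99.9% SLA over 30 days allows roughly 43.2 minutes of downtime.
    print(round(allowed_downtime_minutes(99.9), 1))
    # Two hypothetical incidents of 30 and 20 minutes exceed that budget.
    print(sla_breached(99.9, [30, 20]))
```

Tracking outages independently, rather than relying only on the provider's status page, gives an enterprise its own record to bring to an SLA conversation.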

The era of the infallible cloud is over. As public cloud providers pursue operational efficiency and AI dominance, resilience has taken a hit, and both providers and users must adapt. The challenge for today’s enterprises is to strategically mitigate the most likely consequences before the next outage strikes.