How Fast Can Your Organization Identify and Resolve IT Outages?
2025-11-04 · via NETSCOUT

It has happened again. On October 20, a single IT disruption in a major cloud provider's environment impacted the applications and websites of hundreds of enterprise businesses and government agencies around the world. Just a week later, on October 29, another public cloud provider suffered a major disruption that affected hundreds of other web-based services, customer-facing applications, and online games.

The most recent outage reportedly started around 11:40 a.m. ET. The failure triggered widespread connectivity issues for millions of users and was a reminder of how dependent modern communications are on a small number of major networking providers.

Problem Identification and Resolution

The issue was determined to be an inadvertent configuration change; others labeled it a “problematic configuration change.” To resolve it, the provider deployed the last known good configuration and, by 5:30 p.m. that same day, believed this would fix the problem. Full recovery, however, required additional reloading of configurations and rebalancing of traffic across a large number of nodes. Shortly after 8:00 p.m. that evening, the vast majority of services had been restored and the provider was back to normal operations at scale.

According to the Uptime Institute, configuration or change management failures are the top cause of IT outages, responsible:

  • 45 percent of the time for network outages
  • 64 percent of the time for system/software outages

It bears repeating: IT disruptions are going to happen. If catastrophic outages can occur in the environments of the world’s largest and most advanced technology leaders, then they can happen in any enterprise.

What Organizations Can Learn from These IT Outages

Companies large and small have had to deal with outages at their cloud providers over the last couple of weeks, just as they did with last summer’s major outage. Consider what could happen if the outage were in your corporate environment. We posed the question last week: Are you prepared for an outage in your environment? Let’s minimize the noise and look only at the cause and outage time for each incident.

  • Last summer, a faulty software upgrade resulted in hours to days of disruptions and remediation for affected enterprises
  • Last week, a Domain Name System (DNS) issue took approximately 15 hours from detection to remediation
  • This week, a problematic configuration change caused an eight-hour outage

These are all very common, frequently occurring problems that can easily happen in any enterprise network. Would it mean an 8-hour or 15-hour outage for your business? Would it be shorter? Could it be longer?

How fast can your IT organization go from detection to remediation if one of these common issues occurs in your environment?
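
One way to start answering that question is to measure it from incident timestamps. The sketch below is illustrative only: it uses the times reported above for this week's outage (about 11:40 a.m. to shortly after 8:00 p.m.) and assumes both are in the same time zone and that the year is 2025, inferred from this post's publication date. The function name `outage_duration_hours` is hypothetical.

```python
from datetime import datetime


def outage_duration_hours(detected: datetime, restored: datetime) -> float:
    """Return elapsed hours between detection and restoration."""
    if restored < detected:
        raise ValueError("restored must not be earlier than detected")
    return (restored - detected).total_seconds() / 3600


# Assumed timestamps: the reported start (~11:40 a.m.) and the reported
# restoration of most services (shortly after 8:00 p.m.) on October 29.
detected = datetime(2025, 10, 29, 11, 40)
restored = datetime(2025, 10, 29, 20, 0)

hours = outage_duration_hours(detected, restored)
print(f"Detection to restoration: {hours:.1f} hours")
```

Applied to your own incident records, the same calculation gives a baseline to compare against the eight-hour and 15-hour figures above.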

The Need for Rapid Response to IT Outages in Your Network

In the wake of two major internet outages just a week apart, there is no doubt that IT and executive leadership at corporations and government agencies around the world are having conversations. Some of those conversations may cover network resilience, disaster recovery, and redundancy. Others may lead to new policies and processes for handling outages.

In our last blog, we offered a four-step process to follow if your organization is experiencing a network disruption:

  • Implement true observability—not just monitoring
  • Establish incident readiness processes
  • Understand what you control and don’t control
  • Build collaboration across teams and vendors

These principles have been reinforced by recent events, and the guidance in our previous blog is worth revisiting, because each step plays a meaningful role in improving how your organization responds to disruptions.

How Observability Can Impact MTTR

What did you think of your answer to the question “How fast can your organization identify and recover from an outage?” If the answer was unacceptable, now might be an excellent time to evaluate how observability can reduce mean time to restore (MTTR) services following a disruption.

You’re not alone. In Enterprise Management Associates’ April 2025 study “Enterprise Strategies for Hybrid, Multi-Cloud Networks,” the IT research and consulting firm reported that only 29 percent of survey respondents were fully satisfied with their monitoring solution. Legacy, reactive troubleshooting tools; individual vendor point tools; and gaps in visibility have made many monitoring products obsolete.

Implementing the Best in Observability and DPI at Scale

Leveraging the right observability solution, supported by deep packet inspection (DPI) at scale, can significantly reduce MTTR when issues arise in your network. Much has changed in how networks are architected, and many have evolved into more distributed, hybrid setups. Some critical business services might still run in private cloud environments, while many everyday applications now live in public cloud or colocation facilities, or are delivered via software-as-a-service (SaaS) and unified-communications-as-a-service (UCaaS) providers. Employees may connect via virtual private network (VPN) or virtual desktop infrastructure (VDI) hosted in colocation sites, and internet and wide-area network (WAN) services are delivered by multiple carriers around the world. The path from user to application has become far more complex, and IT organizations no longer have full visibility or control across all potential points of failure.

An observability strategy that uses DPI at scale to overcome visibility gaps enterprisewide, from remote locations through to hybrid/multicloud, has the potential to dramatically reduce MTTR. DPI-based observability reveals the actual traffic flows across the infrastructure, showing the interactions between applications, services, and networks in real time. For instance, when DNS fails, a software update breaks a dependency, or a configuration change impacts service delivery, DPI can help pinpoint where in the ecosystem the problem exists and which community of users it is impacting. DPI reduces the mean time to knowledge (MTTK), the time needed to understand why the problem exists, and in turn lowers the overall MTTR for services in the environment (see Figure 1).

Figure 1: Observability strategy leveraging DPI at scale can provide ecosystemwide analysis to identify and resolve problems such as the ones causing recent internet outages. The top graph illustrates the mean time to resolution (MTTR) problem management lifecycle, which includes four stages: identification, knowledge, fix, and verify. Each stage takes up a varying length of time in the overall process. The bottom graph shows that by reducing the time spent in the identification (MTTI) and knowledge (MTTK) stages, organizations can dramatically reduce overall MTTR.
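
The arithmetic behind Figure 1 can be sketched in a few lines. This is a minimal illustration, assuming MTTR is simply the sum of the four stages shown in the figure. The class name `IncidentStages` and every minute value are hypothetical, chosen only to show the effect of shrinking the identification and knowledge stages; they are not measurements from any real incident.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentStages:
    """Minutes spent in each stage of the lifecycle described in Figure 1."""
    identify_min: float   # time to identify that a problem exists (MTTI)
    knowledge_min: float  # time to understand why it exists (MTTK)
    fix_min: float        # time to apply the fix
    verify_min: float     # time to verify the fix worked

    def mttr_min(self) -> float:
        # Assumption: overall MTTR is the sum of the four stages.
        return (self.identify_min + self.knowledge_min
                + self.fix_min + self.verify_min)


# Hypothetical numbers for illustration only.
baseline = IncidentStages(identify_min=120, knowledge_min=240,
                          fix_min=90, verify_min=30)
# Same fix and verify effort, but faster identification and diagnosis.
improved = IncidentStages(identify_min=30, knowledge_min=60,
                          fix_min=90, verify_min=30)

saved = baseline.mttr_min() - improved.mttr_min()
reduction_pct = 100 * saved / baseline.mttr_min()
print(f"Baseline MTTR: {baseline.mttr_min():.0f} min")
print(f"Improved MTTR: {improved.mttr_min():.0f} min")
print(f"Reduction: {reduction_pct:.2f}%")
```

In this made-up scenario the fix and verify stages are unchanged, yet overall MTTR drops by more than half, because the identification and knowledge stages dominated the baseline.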

Are You Ready to Reduce MTTR in Your Environment?

No one wants to be the latest headline in a news cycle. If resolving incidents in your environment takes longer than it should, delaying action only increases risk. Modern networks are more distributed, complex, and dependent on third-party services than ever, which makes identifying issues and restoring services difficult without the right visibility. NETSCOUT can help you build an observability strategy that restores control, accelerates resolution, and strengthens resilience in your digital environment.

Learn more about NETSCOUT’s observability solutions and how you can use DPI for Smart Data to put control at your fingertips.