Why tech leaders should track service level objectives (SLOs) in load testing campaigns
Gatling.io · 2026-05-20 · via DEV Community

When Canal+ needed to guarantee its streaming platform could handle millions of concurrent viewers during a major live football broadcast, the team didn't simply run a load test and hope for the best.

They ran progressive, iterative load campaigns against explicit performance targets, identified and resolved bottlenecks in caching and licensing APIs, and optimised machine sizing before a single viewer tuned in. The result: zero incidents during the broadcast. Not "fewer incidents than last time." Zero.

That outcome didn't come from running harder tests. It came from running smarter ones — anchored to Service Level Objectives that defined, in user-relevant terms, exactly what "good enough" meant before go-live.

For tech leaders, this is the core argument: load testing without SLOs is activity. Load testing with SLOs is governance.

The framework: SLIs, SLOs, SLAs, and error budgets

Before getting into practice, the terminology needs to be precise — because sloppy definitions lead to sloppy governance.

Google's SRE literature provides the clearest foundation:

  • SLI (Service Level Indicator): A quantitative measure of service behaviour — request latency, error rate, throughput, availability.
  • SLO (Service Level Objective): The target or acceptable range for that SLI. For example: "99.9% of checkout requests complete within 300 ms over a 30-day window."
  • SLA (Service Level Agreement): The external commitment to customers, usually with financial penalties attached.
  • Error budget: The allowable unreliability implied by the SLO. At 99.9%, that's roughly 43 minutes of downtime per month. At 99.99%, it drops to about 4 minutes.
  • Burn rate: How quickly that budget is being consumed, the key signal for operational urgency.
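The downtime figures above follow directly from the SLO percentage and the length of the window. A minimal sketch of that arithmetic, assuming a 30-day window and a simple availability SLO (the function name is illustrative, not taken from any library):

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability implied by an availability SLO."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100.0 - slo_percent) / 100.0


# A 30-day window contains 43,200 minutes.
print(round(error_budget_minutes(99.9), 2))   # 43.2
print(round(error_budget_minutes(99.99), 2))  # 4.32
```

Each extra "nine" divides the budget by ten, which is why the choice of target matters far more than it first appears.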

One leadership principle follows immediately from this structure: your internal SLO should be stricter than your public SLA. Google Cloud's own guidance illustrates this with a 99.95% internal SLO paired with a 99.9% SLA. That gap is a deliberate safety buffer — and running load tests against the internal SLO means you surface contractual risk while there's still time to fix it.

The second principle is equally important: SLOs must be user-centred, not infrastructure-centred. A load test that only reports CPU utilisation and median response time is measuring what's convenient, not what customers experience. The right SLI is the one that, if barely met, still keeps the typical user satisfied.


How SLOs change the design of load tests

Most load testing today still asks the wrong question: "What was the maximum RPS we achieved in the lab?" SLO-driven load testing asks a more useful set of questions:

  • At what request rate do we stop meeting the user-relevant objective?
  • How quickly are we burning error budget when we miss it?
  • What component saturates first and how does the system behave when it does?
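The first of these questions can be answered mechanically from a stepped (ramp) campaign. The sketch below is one possible approach, assuming each step records a target rate, a p99 latency, and a success ratio; the `Step` structure, the thresholds, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    rps: int              # target arrival rate for this step
    p99_ms: float         # measured 99th-percentile latency
    success_ratio: float  # fraction of requests that succeeded


def highest_compliant_rate(steps, max_p99_ms, min_success_ratio):
    """Return the highest rate reached before the first SLO miss, or None."""
    last_ok = None
    for step in sorted(steps, key=lambda s: s.rps):
        if step.p99_ms > max_p99_ms or step.success_ratio < min_success_ratio:
            break  # stop at the first step that violates the objective
        last_ok = step.rps
    return last_ok


# Hypothetical results from a four-step ramp.
steps = [
    Step(200, 180, 0.9999),
    Step(400, 240, 0.9995),
    Step(600, 420, 0.9990),
    Step(800, 1300, 0.9700),
]
print(highest_compliant_rate(steps, max_p99_ms=300, min_success_ratio=0.999))  # 400
```

In this made-up run the system still serves requests at 600 and 800 RPS, but the objective is already missed at 600, so 400 RPS is the figure that matters for release planning.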

That reframing has four concrete effects on how campaigns are designed.

  1. Pass/fail becomes explicit: A load test without SLOs may report that p95 latency was 280 ms and CPU reached 78%, but it doesn't answer whether the system is ready to release. Tools like k6, Gatling, and Azure Load Testing all support encoding user-relevant thresholds directly in test execution, producing a true pass/fail signal rather than a dashboard someone must interpret later.

  2. Load shapes become more realistic: Google Cloud explicitly recommends open-loop load patterns for this reason: production clients don't self-throttle the way closed-loop generators do. Open-loop tests send requests at a steady rate regardless of response times, which better mimics real traffic. A test that passes under artificially polite load can still fail catastrophically when production traffic arrives without courtesy.

  3. Overload behaviour becomes a first-class objective: SLO-driven testing doesn't just ask "what's our capacity?" It asks "what happens when we exceed it?" Does the system shed load cleanly? Does it recover without cascading failures? These are the questions that matter on launch days and during demand spikes, and they're the questions that "peak RPS in the lab" benchmarks never answer.

  4. Short tests connect to long-horizon budgets: A production SLO is measured over days or weeks; a load test runs for minutes or hours. The bridge is burn rate: you don't need to recreate an entire month to show that current error rates would exhaust your monthly budget unacceptably fast. That calculation turns a single test run into a release signal.
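The burn-rate bridge in the last point is a short calculation. A minimal sketch, assuming a 30-day SLO window and treating any request that violates the SLI (an error, or a response slower than the target) as "bad"; the function names and request counts are illustrative:

```python
def burn_rate(total_requests: int, bad_requests: int, slo_percent: float) -> float:
    """Observed bad-request ratio divided by the ratio the SLO allows.

    A value of 1.0 would spend exactly the whole budget over the SLO window.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    observed_ratio = bad_requests / total_requests
    allowed_ratio = (100.0 - slo_percent) / 100.0
    return observed_ratio / allowed_ratio


def days_to_exhaust(rate: float, window_days: int = 30) -> float:
    """Days until the budget is gone if the observed burn rate persisted."""
    return float("inf") if rate == 0 else window_days / rate


# Hypothetical 20-minute test: 200,000 requests, 600 of them bad (0.3%).
rate = burn_rate(200_000, 600, slo_percent=99.9)
print(round(rate, 2))                   # 3.0
print(round(days_to_exhaust(rate), 1))  # 10.0
```

A run lasting minutes is enough to show that, at this error ratio, a 30-day budget would last about ten days, which is the kind of statement a release decision can be based on.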


The technical upside: five benefits engineers should know

Realistic target-setting

SLOs prevent teams from optimising for the wrong number. Lab-only peak throughput figures are internally satisfying but commercially irrelevant. The SLO focuses attention on the tail latency and success rate of the journeys customers actually take.

Better prioritization

Google's error-budget policy explicitly uses budget consumption to redirect effort from features to reliability. When a load test shows your checkout service is burning budget at 3× the sustainable rate, that's a data-driven argument for investing in caching or query optimisation, not a matter of opinion.

Stronger root-cause analysis

When a latency SLO fails during a test, the investigation has a starting point: which resource, dependency, or code path saturated first? Correlating load test output with traces, logs, and server-side metrics compresses the time between "something's wrong" and "here's why."

Protection from average-only blindness

Google's "Tail at Scale" research shows why large systems are dominated by latency tails as scale and utilisation increase. The Home Depot's SLO programme explicitly chose percentile latency over arithmetic averages for exactly this reason. If your release gates use averages while your users feel the p99, you're under-measuring risk.
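A small, made-up sample shows how an average can hide exactly the failures a percentile exposes. The data and the nearest-rank percentile helper below are illustrative assumptions, not measurements from any of the companies mentioned.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


# Hypothetical sample: 98 fast requests and 2 very slow ones (milliseconds).
latencies_ms = [100] * 98 + [3000] * 2

print(statistics.mean(latencies_ms))  # 158
print(percentile(latencies_ms, 50))   # 100
print(percentile(latencies_ms, 99))   # 3000
```

Against a 300 ms target, the 158 ms mean looks comfortable, yet two users in every hundred waited three seconds. A gate on the mean passes this run; a gate on p99 fails it.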

Automation and repeatability

Expressing SLOs as code-based assertions, as Gatling supports, makes performance testing suitable for CI/CD in the same way unit tests are. For instance, LoginRadius moved away from a JMeter-based approach that wasn't integrated into its pipeline, and reported latency dropping from 500 ms to 250 ms alongside an 80%+ reduction in production issues.
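To make the unit-test analogy concrete, here is a tool-agnostic sketch of an SLO gate. It is not Gatling's assertion API; the summary keys, threshold names, and sample values are assumptions chosen for illustration.

```python
def check_slo(summary: dict, thresholds: dict) -> list:
    """Return human-readable violations; an empty list means the run passes."""
    violations = []
    if summary["p95_ms"] > thresholds["max_p95_ms"]:
        violations.append(
            f"p95 {summary['p95_ms']} ms exceeds {thresholds['max_p95_ms']} ms"
        )
    if summary["success_percent"] < thresholds["min_success_percent"]:
        violations.append(
            f"success rate {summary['success_percent']}% is below "
            f"{thresholds['min_success_percent']}%"
        )
    return violations


def exit_code(violations: list) -> int:
    """0 lets the pipeline continue; 1 fails the build."""
    return 1 if violations else 0


# Hypothetical thresholds and test summaries.
thresholds = {"max_p95_ms": 300, "min_success_percent": 99.9}
good_run = {"p95_ms": 280, "success_percent": 99.95}
bad_run = {"p95_ms": 280, "success_percent": 99.2}

print(exit_code(check_slo(good_run, thresholds)))  # 0
print(check_slo(bad_run, thresholds))
print(exit_code(check_slo(bad_run, thresholds)))   # 1
```

In a pipeline, a wrapper script would pass the returned code to `sys.exit` so that a missed objective blocks the release in the same way a failing unit test does.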

The business case: five benefits leaders should own

Customer experience protection

SLOs formalise what "acceptable" means in terms customers feel, not in terms that are easy to instrument. Every load test run against an SLO is a forward-looking commitment to that experience under pressure.

SLA risk reduction

If a service can't pass its internal SLO under expected peak conditions, the risk of breaching its public SLA in production is already real, and the stakes are high: 54% of significant outages cost over $100,000. Load testing against the internal SLO functions as an early-warning system for commercial exposure, before it becomes a legal conversation.

Infrastructure right-sizing

Canal+'s gains included improved machine sizing: not over-provisioning "just in case," but provisioning to the SLO boundary. Google's tail-latency research notes that tail-tolerant techniques can allow higher utilisation without lengthening the tail, meaning SLO-driven testing often surfaces headroom that naive capacity planning leaves on the table.

Release confidence with teeth

Houghton Mifflin Harcourt now runs all 50 of its load simulations together before release, including campaigns at four to five times normal traffic before peak periods. They report fewer performance issues in production as a direct result. That's what release confidence looks like when it's backed by data rather than optimism.

Velocity preservation, not velocity reduction

This is the counterintuitive point that matters most for CTO-level conversations. Google's error-budget guidance is explicit: exhausting budget may temporarily slow release cadence, but the purpose is to restore safe release speed, not to punish teams. DORA's research consistently shows that speed and stability are not structural trade-offs for most organisations. SLO-driven load testing is not anti-delivery; it's what makes delivery sustainable at scale.

Scaling it: the organizational dimension

The most important lesson from The Home Depot's SLO program isn't technical. Before adopting a common SLO framework — covering volume, availability, latency, errors, and tickets — their monitoring was fragmented, root causes were hard to pinpoint, and teams wasted "countless hours" working backwards from user-facing symptoms.

After implementing the framework with training, automation, and executive reporting, they scaled from approximately 50 services reporting SLOs to 800 within a year. Around 50 new services were being onboarded per month. They also integrated SLOs into destructive testing, automatically recording the effect of chaos experiments on service metrics.

That's not a tooling story. It's an operating-model story. SLOs gave engineering, SRE, product, and leadership a shared language — and that language made reliability visible, discussable, and governable at scale.

Evernote's experience also reinforces the cross-team effect. Working with Google's CRE team, they adopted an error-budget approach and within nine months were already on version 3 of their SLO practice. Monthly SLO reviews replaced ad hoc outage conversations, and both Evernote and Google had a common, data-driven way to discuss service quality. SLOs improved supplier management and internal prioritisation simultaneously.

Where to start: a practical roadmap

The highest-confidence starting point is narrow scope and high relevance: pick two or three critical user journeys, define SLIs for them, set internal SLOs that are stricter than your SLAs, and encode them as test thresholds.

Then connect those thresholds to runtime telemetry and attach burn-rate alerts and release-gate policies.

A five-phase performance testing maturity model emerges consistently from the literature:

  1. Define: Identify critical user journeys and existing telemetry. Draft SLIs, internal SLOs, and SLA buffer policy.
  2. Instrument: Add percentile histograms, error counters, and saturation metrics to your services.
  3. Automate: Encode SLO thresholds in load tests and CI/CD pipelines. Connect traces, logs, and server-side metrics.
  4. Operate: Run regular SLO reviews. Add fast-burn and slow-burn alerts. Use SLOs for canary releases and peak-readiness drills.
  5. Expand: Roll out to more services and teams. Build executive dashboards alongside service-owner dashboards.
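The fast-burn and slow-burn alerts in the Operate phase can be derived from the error budget itself. One common way to do this, which is an assumption here rather than something the roadmap specifies, is to alert when a chosen fraction of the budget would be spent within a short window. The 2%-in-1-hour and 5%-in-6-hours values below are illustrative choices, as are the function names.

```python
def burn_rate_threshold(budget_fraction: float, alert_window_hours: float,
                        slo_window_days: int = 30) -> float:
    """Burn rate that spends `budget_fraction` of the budget in the alert window."""
    return budget_fraction * slo_window_days * 24 / alert_window_hours


def burn_alerts(error_ratio_1h: float, error_ratio_6h: float,
                slo_percent: float) -> list:
    """Compare short- and long-window burn rates with their thresholds."""
    allowed_ratio = (100.0 - slo_percent) / 100.0
    alerts = []
    if error_ratio_1h / allowed_ratio >= burn_rate_threshold(0.02, 1):
        alerts.append("fast-burn")
    if error_ratio_6h / allowed_ratio >= burn_rate_threshold(0.05, 6):
        alerts.append("slow-burn")
    return alerts


print(round(burn_rate_threshold(0.02, 1), 1))  # 14.4
print(round(burn_rate_threshold(0.05, 6), 1))  # 6.0
print(burn_alerts(0.02, 0.004, 99.9))          # ['fast-burn']
print(burn_alerts(0.003, 0.008, 99.9))         # ['slow-burn']
```

The two windows catch different failure modes: a sharp spike trips the one-hour check quickly, while a smaller but sustained degradation only shows up over six hours.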

The most common pitfalls are worth naming explicitly: setting 100% SLO targets (which eliminates the error budget entirely), using averages as pass criteria (which hides tail failures), copying another company's thresholds (which produces governance that doesn't fit your architecture or user expectations), and treating SLOs as dashboards without consequences (which fails to change engineering prioritisation).

The strategic call to action

The diagnostic question for any CTO is simple: if your load testing program isn't tied to SLO attainment, error-budget consumption, and release decisions, what decisions is it actually driving?

Canal+ answered that question before a major broadcast and served millions of viewers without a single incident. The Home Depot answered it and scaled reliable service delivery across 800 systems. LoginRadius answered it and halved its production latency.

The technology to do this is mature, well-documented, and largely open-source. The organizational will to tie test outcomes to release decisions and infrastructure investment is the harder part, since four in five serious outages are attributed to preventable process failures, not missing technology.

But that's exactly what separates performance engineering that generates activity from performance engineering that generates governance value.

SLOs don't make load testing more complicated. They make it more useful.