How Aikido secures AI pentesting agents by design
Sooraj Shah · 2026-02-24 · via Aikido Security's Blog

You’ve heard all of the hysteria around AI agents and all the seemingly limitless possibilities. And while those possibilities are all well and good, you’re only really interested in agentic AI capabilities that address your actual problems head on.

And then, when you think of all the productivity gains and ROI benefits, you stop and ask, “Okay, this is great, but what if these agents go outside of their scope?” That question applies whether you’re deploying your own AI agents internally or benefitting from an external vendor’s AI agent capabilities.

And it’s a valid question to ask. Agents, just like other AI capabilities, need constraints. Without them, they can run wild. Agents are curious by design. Like a toddler, they’ll try every door they can reach. In many cases you need them to explore, but you also need to ensure that the doors that shouldn’t open are physically locked. 

When it comes to cybersecurity, this matters even more: the minimum safety requirements for AI agents need to be even more stringent. For Aikido Attack, our AI pentesting capability, we’ve considered every layer to prevent agents from going out of scope. This covers risks such as accidentally testing production or losing control of an agent.

Going out of scope is one of the key topics security leaders and engineers are asking us about, and it’s something we considered when developing our platform at the outset. Naturally, as a cybersecurity company, we wanted to get this right.

It’s worth remembering that agents are expected to attempt unexpected or risky paths, but that guardrails exist to contain that behavior, not prevent it.

Aikido Attack and Infinite work on a layered approach, using both hard boundaries and soft boundaries. Here are the key elements you need to know about:

Layer 1: Hard architectural separation: Control plane vs execution

Aikido’s system is architected with a strict separation between the system that plans and evaluates pentests (the control plane) and the environment that actually executes actions (the isolated execution sandbox). 

All reasoning, orchestration, and access to sensitive data happen in the control plane. Tool execution, browser automation, and network interactions happen in a separate environment.

The separation exists because we assume that execution can misbehave, and therefore any impact must be contained. It’s for this reason that the execution environment has no access to orchestration secrets, internal infrastructure, or control-plane systems.

Layer 2: Runtime scope enforcement

Production is never assumed as in-scope

Our system never assumes production is in scope to be attacked. Pentesting is expected to run against staging and test environments only. Production has to be explicitly configured as in-scope, and even then, this would be reviewed and acknowledged before anything runs. 
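To make the opt-in rule concrete, here is a minimal sketch in Python. It is not Aikido’s implementation: the `Target` fields and the `may_start` function are hypothetical names chosen for illustration, and the only rule it encodes is the one described above, namely that production needs both an explicit in-scope flag and a recorded acknowledgment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    """Hypothetical description of one pentest target."""
    url: str
    environment: str                        # e.g. "staging", "test", "production"
    production_in_scope: bool = False       # must be set explicitly, never inferred
    acknowledged_by: Optional[str] = None   # who reviewed and accepted a production run


def may_start(target: Target) -> bool:
    """Non-production targets may start; production needs opt-in plus acknowledgment."""
    if target.environment != "production":
        return True
    return target.production_in_scope and target.acknowledged_by is not None
```

The defaults matter here: a production target that nobody has touched stays out of scope, so a forgotten setting fails closed rather than open.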

We’ve seen our guardrails work in practice. In one case, an agent followed application behavior that would have led it to production infrastructure. The hard boundary we have in place blocked the request at the network layer. However, we could see that the agent tried. This blocked attempt is proof that our guardrails work. 

Only domains that are allowed can be accessed 

Our agents can only interact with explicitly configured domains. If a domain is not allow-listed, it’s blocked at the network level. This is something you can set up yourself, specifying which domains are attackable or accessible. Put simply, we block domains by default to prevent an agent from interacting with servers it’s not supposed to interact with.

This means we don’t rely on prompts or humans for scope enforcement. Aikido enforces it technically, at the network level.
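A default-deny check of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual enforcement code: the hostnames in `ALLOWED_HOSTS` and the function name `is_in_scope` are made up, and a real deployment would enforce the same decision in the network path rather than in application code.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; in practice it would come from the test's scope configuration.
ALLOWED_HOSTS = frozenset({"staging.example.test", "auth.example.test"})


def is_in_scope(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Default-deny: only exact, explicitly configured hostnames pass."""
    host = urlsplit(url).hostname  # lower-cased by urlsplit; None if the URL has no host
    if host is None:
        return False
    # Strip a trailing dot so "host." and "host" are treated the same.
    return host.rstrip(".") in allowed_hosts
```

Matching on the parsed hostname, rather than on a substring of the URL, is what keeps look-alikes such as `staging.example.test.evil.com` or `staging.example.test@evil.com` out of scope.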

Accidental scope drift is blocked

Back to our toddler analogy. Even though most other safety controls mean that the agents won’t drift from scope, there’s a limited number of agents that, well, just do. Especially when you have 250 agents running at the same time. 

A classic example of this is if an agent is redirected to an external application through a link, they assume they’re still on the same page, but they’re actually on another website. So suddenly they’re on X or Reddit, and they assume this is part of the scope. 

This is why you need hard checks in place to protect the agents from, well, themselves. As Phillippe Dourassov, AI Pentest Lead at Aikido Security, puts it:

“There’s going to be five percent of agents that aren’t always sensible, and that’s why we make sure that we deal with this five percent”.
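The redirect case can be illustrated with a small Python sketch. It assumes a hypothetical helper, `resolve_redirect`, that is called for every redirect before the next request is sent; the name and the choice of `PermissionError` are ours, not part of Aikido’s platform.

```python
from urllib.parse import urljoin, urlsplit


def resolve_redirect(current_url: str, location: str, allowed_hosts: set) -> str:
    """Return the absolute redirect target, or raise if it leaves the allow-list."""
    # urljoin handles relative, scheme-relative, and absolute Location values.
    target = urljoin(current_url, location)
    host = urlsplit(target).hostname
    if host not in allowed_hosts:
        raise PermissionError(f"redirect to out-of-scope host blocked: {host}")
    return target
```

Checking every hop, instead of only the first URL, is the point: the agent may believe it is still on the same application, but the check looks at where the request would actually go.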

Layer 3: Prompt injection and data exfiltration 

We know that prompt injection is a key risk in autonomous AI systems, whereby an attacker inserts malicious instructions in content that the agent reads. The agent interprets those instructions as legitimate guidance and follows them.

That could mean content that urges agents to send the source code or internal data to somewhere where it shouldn’t be. This vulnerability comes from being exposed to untrusted content and then acting on it. Aikido removes both of those options.

First, Aikido’s agents do not have open internet access. This means agents cannot go on a Google search to find out how a type of technology works, or go on Reddit and take on instructions to do something unsafe. The only content they process is what exists inside the scoped application itself. 

Second, even if malicious instructions were somehow planted inside the target application, the agents are still not permitted to exfiltrate data. Network-level restrictions prevent outbound connections to random destinations, so the agent cannot upload source code to Google Drive, or post to an external endpoint, or send data to an attacker-controlled domain.

We enforce this at the network layer by intercepting and controlling both HTTP and DNS traffic from agents, preventing them from resolving or communicating with domains that aren’t explicitly approved.

So, in the worst-case scenario, where a model misinterprets instructions, it’ll still be unable to send anything outwards.

One edge case is worth mentioning: if a customer deliberately injects malicious instructions into their own environment (although we’re not sure why this would be the case?!), the agent may well process them. But even then, the only impact will be on the customer’s own test. There’s no cross-tenant risk, infrastructure exposure or data leakage beyond what they already control.

Layer 4: Isolated sandboxes for each agent

Each of our agents has its own little isolated sandbox (think: toddler in playpen). That means each agent is separated from Aikido’s internal network, infrastructure and databases, and from other agents running at the same time, so it cannot interfere with or influence other active sessions.

If something behaves unexpectedly during a test, the impact is contained in that single sandbox - preventing both impact across agents and cross-tenant exposure.

Layer 5: Operational safeguards

All requests are rate-limited and load-aware, ensuring that tests do not overwhelm target systems or trigger a barrage of alerts.
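One common way to get rate-limited, load-aware behavior is a token bucket whose refill rate backs off when the target shows signs of strain. The Python sketch below shows that technique; it is not a description of how Aikido implements it. The class name, the overload signals (HTTP 429 or 503, or latency above two seconds), and the back-off factors are all assumptions.

```python
import time
from typing import Callable


class AdaptiveRateLimiter:
    """Token bucket whose refill rate backs off when the target looks overloaded."""

    def __init__(self, max_rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_rate = max_rate      # upper bound, requests per second
        self.rate = max_rate          # current refill rate
        self.burst = burst            # bucket capacity
        self.tokens = float(burst)
        self.clock = clock            # injectable so the limiter can be tested
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; the caller should wait and retry on False."""
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def record_response(self, status: int, latency_s: float) -> None:
        """Halve the rate on overload signals, otherwise recover by 10% up to max_rate."""
        overloaded = status in (429, 503) or latency_s > 2.0  # thresholds are assumptions
        if overloaded:
            self.rate = max(self.rate / 2, 0.1)
        else:
            self.rate = min(self.rate * 1.1, self.max_rate)
```

An agent would call `try_acquire()` before each request and `record_response()` after it, so a struggling staging server slows the test down instead of being pushed over.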

In addition, tests can be paused or terminated immediately at any time. Customers can see what agents are doing in real-time. Every request and action is visible. This means teams can intervene if they deem it necessary.

Setup validation

Configuration mistakes are more likely than malicious behaviour. It’s for this reason that, before tests begin, Aikido uses pre-flight checks to validate authentication and reachability. If something appears misconfigured or resembles a production environment, warnings are surfaced early. This means safeguards are designed to catch human error before execution starts, rather than relying on runtime controls to fix avoidable setup mistakes.
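As a rough sketch of what such pre-flight checks could look like, the Python function below collects warnings before anything runs. Everything in it is hypothetical: the `preflight` name, the two injected probe callables, and the deliberately crude hostname heuristic in `NON_PROD_MARKERS` are stand-ins for whatever checks a real platform performs.

```python
from typing import Callable, List
from urllib.parse import urlsplit

# Hypothetical hints that a host is NOT production; real heuristics would be richer.
NON_PROD_MARKERS = ("staging", "stage", "test", "dev", "qa", "sandbox", "localhost")


def preflight(url: str,
              can_reach: Callable[[str], bool],
              can_authenticate: Callable[[str], bool]) -> List[str]:
    """Return human-readable warnings; an empty list means the checks passed."""
    host = urlsplit(url).hostname or ""
    if not host:
        return [f"{url!r} has no hostname"]
    warnings: List[str] = []
    if not any(marker in host for marker in NON_PROD_MARKERS):
        warnings.append(f"{host} does not look like a staging or test host; is this production?")
    if not can_reach(url):
        warnings.append(f"{host} is not reachable from the test environment")
    elif not can_authenticate(url):
        warnings.append(f"configured credentials were rejected by {host}")
    return warnings
```

Passing the reachability and authentication probes in as functions keeps the sketch free of network calls and makes the checks easy to exercise in isolation.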

Soft boundaries

Our layered approach means we also have soft boundaries. These apply where you need a domain to be reachable so that the agents can use it, but you don’t want the agents to attack it.

For example, if you had an authentication portal, then within that portal, you may want the agents to use the authentication to log into the application, but you don’t want the agents to attack the portal itself.

The soft boundary means the agents can still reach the authentication portal, but are specifically instructed not to attack it. 
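The difference between the two kinds of boundary can be sketched as a small scope configuration in Python. The `Scope` class and its method names are hypothetical, as is the wording of the generated instruction; the sketch only assumes, as described earlier, that domains can be marked as either attackable or merely accessible.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Scope:
    """Hypothetical scope configuration for one test."""
    attackable: Set[str] = field(default_factory=set)   # may be reached and attacked
    accessible: Set[str] = field(default_factory=set)   # may be reached, must not be attacked

    def network_allow_list(self) -> Set[str]:
        """Hard boundary: every host the network layer lets through."""
        return self.attackable | self.accessible

    def agent_instructions(self) -> str:
        """Soft boundary: guidance handed to the agent for use-only hosts."""
        lines = [f"Do not attack {host}; use it only to reach the target."
                 for host in sorted(self.accessible - self.attackable)]
        return "\n".join(lines)
```

With `Scope(attackable={"app.example.test"}, accessible={"auth.example.test"})`, both hosts pass the network allow-list, while only the authentication portal appears in the instructions telling the agent what not to attack.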

How scope is enforced: Human vs AI pentesting

In a traditional pentest, scope is enforced through documentation, contracts, and professional judgment. Testers are briefed on which environments are in scope. This works well in practice, but staying within the boundaries depends on the tester’s discipline and experience. 

For instance, if a tester follows a redirect into the wrong environment or misidentifies a system, the issue is typically discovered later through logs or review.

With AI pentesting, scope is enforced through technical controls. If a domain is not allow-listed, the connection is blocked. If production is not explicitly selected, it’s not reachable, and if a redirect leads outside the scope, then the request fails automatically. 

Both approaches are effective. The advantage of technical enforcement is that it reduces reliance on documentation and interpretation. 

To benefit from AI pentesting, which has already shown better results than manual pentesting in terms of finding critical and high-severity issues, try Aikido Attack now.