
























A first-of-its-kind security analysis of 12 widely deployed agentic offensive-security tools reveals critical architectural flaws that allow adversaries to steal LLM API keys, establish persistent footholds, and achieve full host compromise even inside sandboxed containers.
Security researchers from Cracken have published the first in-depth security analysis of agentic red-team systems, AI-powered tools designed to autonomously conduct penetration testing and offensive security operations.
The study exposes a sweeping set of shared design flaws that enable an active adversary to exfiltrate sensitive credentials, weaponize the victim’s own infrastructure, and fully compromise the operator’s machine, even when the agent runs inside a sandboxed Docker container.
Agentic red-team systems are fully autonomous LLM-driven platforms built to simulate offensive security operations, including black-box penetration testing.
The researchers analyzed 12 popular open-source tools, including PentestGPT, RedAmon, DarkMoon, AIRecon, CAI, PentAGI, STRIX, Artemis, METATRON, and others, all of which pair a large-language-model orchestrator with a Kali Linux worker container capable of executing arbitrary shell commands against targets.
.webp)
These tools are rapidly entering production security workflows, with adoption accelerating across enterprise security teams and growing interest from military cyber forces, making their attack surface an urgent area of concern.
The researchers introduce a tailored cyber kill chain modeled specifically for agentic red-team systems, progressing through five stages:
settings.json, allowing hook injection that triggered RCE on the orchestrator at every subsequent session start.
A particularly alarming finding is the novel agent-phishing attack, a prompt-injection-free manipulation technique that achieved 97.8% success across all tested agents and LLMs.
The attacker stages a fully functional binary (e.g., a password vault decryptor called pwcrypt) on an adversary-controlled honeypot, complete with a convincing README and fabricated CI pipeline logs.
The agent downloads and executes the binary, believing it is a critical artifact. The binary contains a self-planted memory corruption vulnerability, not malicious code, which is triggered upon execution and hijacks control flow to achieve arbitrary code execution.
This defeats model-based inspection entirely, since there is no shellcode, encoded payload, or suspicious syscall pattern. The attack was effective against Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro, DeepSeek V4 Pro, GLM-5.1, and Kimi K2.6.
Once a foothold is established, secret exfiltration is trivially achievable in 11 out of 12 analyzed tools. Most frameworks co-locate the worker and orchestrator in the same container, directly exposing LLM API keys and cross-session memory.
In RedAmon, a shared internal API key and unauthenticated Docker bridge network allowed the attacker to enumerate and exfiltrate conversation history, including target hosts and credentials for every user across all sessions.
Seven of twelve tools implement guardrails, regex filters, or LLM-as-judge validators to block sensitive targets. The study finds none are effective.
Guardrails are enforced only at the orchestration layer, validating LLM decisions but never monitoring actual worker network activity. An attacker with a worker shell bypasses every guardrail invisibly, a problem echoed in Microsoft’s recent disclosure of CVE-2026-25592 and CVE-2026-26030 in Semantic Kernel, where researchers confirmed that LLMs are not security boundaries.
The researchers propose a secure architecture grounded in one invariant: treat the LLM worker as an untrusted environment. Key principles include strict worker-orchestrator separation with no writable shared mounts, authenticated network segmentation, secrets isolation (API keys must never reach the worker), worker-layer guardrail enforcement via network egress filtering, and immutable worker filesystems rebuilt between operations.
Follow us on Google News, LinkedIn, and X to Get More Instant Updates.
Guru Baranhttps://cybersecuritynews.com
Gurubaran KS is a cybersecurity analyst, and Journalist with a strong focus on emerging threats and digital defense strategies. He is the Co-Founder and Editor-in-Chief of Cyber Security News, where he leads editorial coverage on global cybersecurity developments.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。