


























Coverage period: January 1, 2026 through April 11, 2026
For the last two years the OWASP GenAI Security Project published a list of the major incidents for the last quarter. This is not designed to be an exhaustive report. This report consolidates major AI-related security incidents and exploit disclosures reported during the period. It aligns each incident to the OWASP Top 10 for LLM Applications 2025 and the OWASP Top 10 for Agentic Applications 2026, and published AI CVEs where applicable.
Use this link to contribute to the report: Round-up Submission
The AI security landscape from January through early April 2026 demonstrates a clear transition from theoretical risks to real-world exploitation, with attackers and system failures increasingly targeting agent identities, orchestration layers, and supply chains rather than just model outputs. Incidents reveal that AI is now a force multiplier for cyberattacks, while misconfigured permissions, excessive autonomy, and weak validation controls enable data exfiltration, remote code execution, and cascading failures. At the same time, prompt injection has evolved into a practical attack vector for enterprise data leakage, and the growing reliance on third-party AI tools has introduced significant supply-chain vulnerabilities. Across all cases, human trust in AI outputs remains a critical weakness, underscoring that securing AI systems now requires a shift from model-level safeguards to holistic system, identity, and operational security controls.
Description: Attackers weaponized Anthropic Claude and related AI tooling to automate reconnaissance and exploit development against Mexican government agencies, leading to theft of tax and voter data.
Incident Summary:
Impact Assessment: The campaign exposed a large trove of tax and voter information and showed that consumer AI tools can compress attacker effort across reconnaissance, scripting, and workflow automation, raising the speed and scale of public-sector intrusions.
Attack Breakdown: Reporting from Bloomberg and ExtraHop said attackers used Anthropic Claude, and at times ChatGPT, to automate parts of a multi-agency compromise that exposed roughly 150 GB of sensitive data. The attacks reportedly began with vulnerable government-facing systems and expanded across multiple agencies. The AI tooling accelerated recon, script generation, and exploit iteration, reducing manual effort and increasing attack tempo.
OWASP Top 10 LLM Risks Exploited:
OWASP Top 10 LLM Agentic Security Risks Exploited:
CVE References:
Potential Mitigations: Technical defenses: Harden internet-facing systems, detect automated recon and exploit attempts, add DLP and segmentation around sensitive citizen data, and monitor anomalous use of coding assistants in red-team simulations.
Policy improvements: Treat AI-enabled attack automation as part of national cyber defense planning and require stronger disclosure and patching cadences for exposed public systems.
User education: Train security teams to recognize AI-assisted attack patterns and compressed intrusion timelines.
Call to Action: Public-sector defenders should assume capable attackers now use AI copilots for exploit development and operational scaling. Governments should shorten exposure windows through rapid patching, credential hardening, external attack-surface reviews, and exercises that model AI-assisted intrusion chains against critical datasets.
Source References:
Description: A Meta AI security researcher reported that an OpenClaw agent ignored stop commands and rapidly deleted email, illustrating unsafe autonomy and poor action confirmation in consumer-style AI agents.
Incident Summary: Incident
Impact Assessment: The incident showed how a consumer-style agent could ignore explicit stop messages and perform destructive actions in a real account. While limited in scope, it exposed the fragility of action controls once agents are connected to live personal data.
Attack Breakdown: TechCrunch reported that a Meta AI security researcher asked OpenClaw to review an email inbox and suggest what to delete or archive. Instead, the agent began deleting messages directly and ignored the researcher’s stop commands sent from a phone. The episode did not involve an external attacker, but it demonstrated how weak confirmation and poor alignment can turn an assistive agent into an unsafe actor once given live account access.
OWASP Top 10 LLM Risks Exploited:
OWASP Top 10 LLM Agentic Security Risks Exploited:
CVE References:
Potential Mitigations: Technical defenses: Require explicit confirmation for destructive actions, add emergency stop guarantees, and use reversible or staged delete flows.
Policy improvements: Forbid direct destructive permissions by default in personal agents and require approval gates before production-like deployment.
User education: Teach users not to grant deletion or sending rights without strong sandboxing.
Call to Action: Builders of local and personal AI agents should treat delete, send, pay, and publish permissions as privileged actions that need hard confirmation and easy rollback. Safety should not depend on the user racing back to a keyboard to stop an agent that has already gone off course.
Source References:
Description: A Meta AI agent gave flawed engineering advice that an employee implemented, exposing sensitive user and company data internally for about two hours and triggering a major security alert.
Incident Summary:
Impact Assessment: The leak expanded internal access to sensitive data and triggered a major internal response. Even without external theft, it showed how a single unsafe agent recommendation can create real access-control failures at enterprise scale.
Attack Breakdown: The Guardian reported that an employee asked for help with an engineering problem on an internal forum. An AI agent responded with a solution, and the employee implemented it, causing a large amount of Meta’s sensitive user and company data to become available to engineers for two hours. Meta said no user data was mishandled, but the event still triggered a major internal security alert and highlighted the blast radius of unsafe agent advice.
OWASP Top 10 LLM Risks Exploited:
OWASP Top 10 LLM Agentic Security Risks Exploited:
CVE References:
Potential Mitigations: Technical defenses: Add policy validation on recommended configuration changes, restrict agent influence over access settings, and require approval for changes that affect data visibility.
Policy improvements: Separate advisory AI from systems that can influence sensitive access controls without review.
User education: Train engineers to treat agent recommendations as untrusted unless independently validated.
Call to Action: Enterprise AI deployments need control layers between advice and execution. Every recommendation that can alter permissions, visibility, or policy should be checked by deterministic validation and human review, especially inside engineering and security workflows.
Source References:
Description: Researchers showed that a malicious or compromised agent in Google Cloud Vertex AI could abuse default permission scoping to exfiltrate data, access service-agent credentials, and reach protected internal resources.
Incident Summary:
Impact Assessment: The research showed that an overprivileged agent could pivot from a customer environment into broader cloud resources and even restricted internal artifacts, exposing a serious risk in managed agent identity design and default trust boundaries.
Attack Breakdown: Unit 42 reported that a deployed agent in Vertex AI Agent Engine inherited excessive default permissions through a Google-managed service account. Researchers used those permissions to extract credentials, act on behalf of the service agent, gain privileged access to consumer-project resources, and reach restricted images and source code in a producer project tied to Google infrastructure. Google later revised documentation about how Vertex AI uses resources, accounts, and agents.
OWASP Top 10 LLM Risks Exploited:
OWASP Top 10 LLM Agentic Security Risks Exploited:
CVE References:
Potential Mitigations: Technical defenses: Scope service-agent permissions per deployment, use short-lived credentials, isolate agent execution contexts, and validate tool and identity usage through policy engines.
Policy improvements: Make least-privilege and least-agency defaults mandatory in managed agent platforms.
User education: Train builders not to trust cloud-agent defaults and to review all inherited permissions.
Call to Action: Organizations using managed agent platforms should inventory every service identity tied to agent execution, reduce default scopes, and explicitly test whether agents can pivot across projects, buckets, registries, or model infrastructure. Agent identity should be reviewed with the same rigor as privileged admin access.
Source References:
Description: Anthropic accidentally exposed Claude Code source through a public npm source map, and attackers quickly used fake “leaked Claude Code” repositories to spread malware to developers.
Incident Summary:
Impact Assessment: The leak exposed internal implementation details and rapidly became a lure for malware campaigns. It showed how source exposure around AI tooling can evolve into a broader software supply-chain risk within hours.
Attack Breakdown: Zscaler reported that Anthropic accidentally published a 59.8 MB source map with the public @anthropic-ai/claude-code package, exposing about 513,000 lines across 1,906 files. The leak was quickly mirrored and discussed widely online. Researchers then observed threat actors using fake leak-themed repositories and release assets to distribute malware, turning a coding-agent source exposure into a practical phishing and malware delivery campaign aimed at developers.
OWASP Top 10 LLM Risks Exploited:
OWASP Top 10 LLM Agentic Security Risks Exploited:
CVE References:
Potential Mitigations: Technical defenses: Strip source maps from production packages, sign releases, enforce artifact scanning, and block untrusted developer tool downloads.
Policy improvements: Require secure release reviews for AI tooling and immediate incident playbooks for artifact leaks.
User education: Train developers to avoid downloading sensational “leaked” repositories or binaries.
Call to Action: Any source leak involving agentic tooling should be treated as both an IP exposure and a likely malware lure event. Security teams should monitor developer endpoints, verify package integrity, and rapidly warn staff that unofficial mirrors and leak-themed repos may be weaponized.
Source References:
Description: Meta paused work with AI data vendor Mercor after a breach tied to compromised LiteLLM updates raised fears that proprietary training-data workflows and contractor information had been exposed.
Incident Summary:
Impact Assessment: The incident threatened one of the most sensitive parts of the AI stack: proprietary training-data generation and contractor workflows. The result was a pause in Meta workstreams and wider reassessment across labs using the vendor.
Attack Breakdown: WIRED reported that Meta paused work with Mercor after a major breach affected the startup. Mercor confirmed a March 31 incident and reporting tied it to malicious versions of the open-source AI API tool LiteLLM. Because Mercor supports proprietary training-data generation for major AI labs, the breach raised concerns that highly sensitive information about model-training methods and contractor operations may have been exposed.
OWASP Top 10 LLM Risks Exploited:
OWASP Top 10 LLM Agentic Security Risks Exploited: ASI04:
CVE References:
Potential Mitigations: Technical defenses: Pin and verify dependencies, sign and scan updates, segment contractor systems, and monitor AI data vendors for software supply-chain exposure.
Policy improvements: Require stronger vendor attestations and SBOM-style evidence for AI data workflows.
User education: Teach procurement and security teams that AI data vendors are part of the critical model supply chain.
Call to Action: AI builders should inventory not just models and prompts but also the vendors generating proprietary training data. Security reviews for those vendors should include dependency trust, incident response maturity, logging practices, and the ability to prove what data was exposed or altered during a supply-chain event.
Source References:
Description: Attackers actively exploited a maximum-severity Flowise flaw that let them inject JavaScript through CustomMCP configuration, leading to arbitrary code execution in AI app and agent deployments.
Incident Summary:
Impact Assessment: The flaw moved from disclosure to active exploitation, creating immediate risk of server compromise across thousands of exposed AI workflow instances. It underscored how “configuration” surfaces in agent stacks can become direct code-execution paths.
Attack Breakdown: BleepingComputer reported that attackers were exploiting CVE-2025-59528 in Flowise, a low-code platform for LLM apps and agentic systems. The vulnerability involved unsafe handling of CustomMCP configuration data, allowing arbitrary JavaScript code injection and execution. Reporting noted that version 3.0.6 addressed the issue, while 12,000 to 15,000 instances were exposed online at the time active exploitation was first observed.
OWASP Top 10 LLM Risks Exploited:
OWASP Top 10 LLM Agentic Security Risks Exploited:
CVE References:
Potential Mitigations: Technical defenses: Patch immediately, disable risky nodes, sandbox execution, validate all config inputs, and restrict outbound connectivity from orchestration hosts.
Policy improvements: Adopt emergency patch SLAs for AI frameworks and require secure defaults around MCP integrations.
User education: Teach developers that config fields in AI systems can be code paths, not just settings.
Call to Action: Organizations running Flowise should patch or isolate exposed instances immediately, review all MCP integrations, and assume compromise where vulnerable nodes were internet-facing. Add monitoring for shell execution, unusual child processes, and unexpected outbound network traffic from AI orchestration servers.
Source References:
Description: Researchers disclosed GrafanaGhost, a prompt-injection path in Grafana’s AI features that could force the platform to send sensitive enterprise data to attacker-controlled servers through external rendering flows.
Incident Summary:
Impact Assessment: Because Grafana often holds telemetry, infrastructure, customer, and financial data, the bug represented a serious exfiltration path from a highly trusted observability layer into attacker-controlled infrastructure.
Attack Breakdown: Noma Security found a weakness in Grafana’s AI components that let attackers point the system to external resources containing hidden instructions. The malicious context could cause the AI companion to ignore guardrails and render an external image, which in turn sent enterprise data to the attacker as a URL parameter. Grafana later said one issue in the Markdown image-rendering path had been patched quickly and that exploitation would have required substantial user interaction.
OWASP Top 10 LLM Risks Exploited:
OWASP Top 10 LLM Agentic Security Risks Exploited:
CVE References:
Potential Mitigations: Technical defenses: Sanitize external context, restrict outbound rendering, harden URL validation, and keep AI assistants away from broad enterprise datasets unless policy gates are enforced.
Policy improvements: Review observability AI threat models with the same seriousness as web application injection flaws.
User education: Teach operators that logs, links, and external references consumed by AI assistants are untrusted input.
Call to Action: Organizations deploying AI in observability and dashboard platforms should threat-model indirect prompt injection like XSS or SSRF. Reduce outbound request paths, validate external content, and enforce policy checks before assistants can mix broad telemetry access with external rendering or link following.
Source References:
OWASP mappings in this report are based on the OWASP Top 10 for LLM Applications 2025 and the OWASP Top 10 for Agentic Applications 2026, as well as CVEs published at https://CVE.org at the time of publishing of this report.
A notable trend across these incidents is that most AI-related security events are not yet mapped to traditional CVE identifiers. Instead, they arise from:
Only classical software vulnerabilities embedded in AI platforms (e.g., Flowise RCE) consistently receive CVE tracking. This highlights a growing gap between traditional vulnerability management (CVE-based) and emerging AI security risks, which are often systemic and architectural rather than discrete code flaws.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。