
You are the Blackboard - AI Agent Assisted Bug Hunting
2025-12-10 · via Vectra AI Blog

In vulnerability and CVE hunting, you have a search space problem.  Your attack surface is seemingly unlimited, so much so that target selection is often considered the most important skill of a bug bounty hunter.  

In this article, I’ll recount the process I used to narrow the search space from millions of lines of code down to three flawed lines. Aiding me were, of course, the latest LLM agents, as well as years of experience in application security, which helped me avoid false positives, a common occurrence when agents are tuned towards reward hacking.

The Search Space 

In October 2025, Wiz announced a new hacking contest dubbed Zeroday Cloud in partnership with three major cloud providers: Google Cloud, AWS, and Microsoft. Inspired by hacking competitions like Pwn2Own, Zeroday Cloud included 20 open-source software targets: libraries, applications, and toolkits that cloud providers widely use to build and power cloud services. The goal of the competition was simple to state, if a high bar to clear: demonstrate unauthenticated Remote Code Execution (RCE) on the target.

Setting the Boundaries 

From a massive search space, the initial target slate was narrowed to just twenty repositories. Setting aside any possible authentication bypasses, we can further narrow the code paths for consideration to only those accessible by an unauthenticated user. Additionally, most targets' rules specify that exploits are to be delivered over the network, typically via a local HTTP API server.

A Place to Start

To get some initial threads to pull on, we have two choices.  Traditionally, one might manually inspect source code and application logic to look for suspicious functionality.  File upload, scripting engines, or document renderers are all areas that could assist in arbitrary code execution.

I chose a different approach, given the still ample search space and limited time. Since the targets are publicly available code repositories, I pointed a static code analysis tool at the code base to generate leads.

As you can see in the screenshot below, dozens or even hundreds of code findings were generated per target.  These served as the starting point for my LLM-assisted vulnerability hunting.
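To give a feel for what such findings look like, here is a minimal sketch of a static check built on Python's standard `ast` module. It is not the tool used for the contest (which is not named here); the function name `scan_source`, the rule names, and the sample code are all assumptions made for illustration. It flags only two patterns: direct `eval`/`exec` calls and any call that passes `shell=True`.

```python
import ast


def scan_source(source: str, filename: str = "<example>") -> list[dict]:
    """Return findings for eval/exec calls and for calls passing shell=True."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        # Direct calls to the eval/exec builtins.
        if isinstance(node.func, ast.Name) and node.func.id in {"eval", "exec"}:
            findings.append(
                {"file": filename, "line": node.lineno, "rule": f"{node.func.id}-detected"}
            )
        # Any call with a literal shell=True keyword, e.g. subprocess.run(cmd, shell=True).
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append(
                    {"file": filename, "line": node.lineno, "rule": "shell-true"}
                )
    return findings


# Hypothetical code under test, not taken from any contest target.
SAMPLE = '''
import subprocess
def run(cmd):
    subprocess.run(cmd, shell=True)
def calc(expr):
    return eval(expr)
'''

findings = sorted(scan_source(SAMPLE), key=lambda f: f["line"])
for f in findings:
    print(f'{f["file"]}:{f["line"]} {f["rule"]}')
```

A real scanner ships hundreds of rules and some data-flow awareness; the point is only that each finding is a file, a line, and a rule name, which is all the later triage steps need.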

Taint Tracing with Claude

Not all code findings are equally interesting. Only those with the potential to further the contest goal, Remote Code Execution, were investigated. These include issues like:

  • “Eval detected”
  • “Shell=True in Subprocess call”
  • “Pickles deserialization in Pytorch”
  • “Non-static Command in Exec”
  • “User input in path.join”
  • “Dangerous Write Command”
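Filtering findings down to this shortlist is easy to automate. The sketch below assumes each finding is a dictionary with a `rule` field; the rule identifiers are hypothetical slugs derived from the issue names above, and the sample findings are invented for illustration, not real contest data.

```python
# Hypothetical rule identifiers mirroring the issue names listed above.
RCE_RULES = {
    "eval-detected",
    "subprocess-shell-true",
    "pickle-deserialization",
    "non-static-exec-command",
    "user-input-in-path-join",
    "dangerous-write-command",
}


def select_rce_leads(findings: list[dict]) -> list[dict]:
    """Keep only findings whose rule could plausibly lead to code execution."""
    return [f for f in findings if f["rule"] in RCE_RULES]


# Invented findings; real scanner output will use different field names.
raw_findings = [
    {"file": "server/api.py", "line": 88, "rule": "eval-detected"},
    {"file": "server/api.py", "line": 120, "rule": "hardcoded-password"},
    {"file": "utils/io.py", "line": 14, "rule": "user-input-in-path-join"},
    {"file": "web/views.py", "line": 7, "rule": "missing-csrf-token"},
]

leads = select_rce_leads(raw_findings)
for lead in leads:
    print(f'{lead["file"]}:{lead["line"]} {lead["rule"]}')
```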

My prompts varied depending on the line of code and flagged issue, but they still had a general theme.  

Example Prompt:  

I’m using a static analysis tool to identify vulnerabilities in my code.  It has identified this line of code as a potential code injection and execution point.  Your task is to trace back the origin of the input executed on this line to see if it could be user-controlled or influenced at any point. Please respond with a detailed analysis tracing the input executed back to its source.
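Because the prompts share a theme, they can be generated from a template. This is a minimal sketch under stated assumptions: the finding fields (`issue`, `file`, `line`, `code`) and the function name `build_triage_prompt` are hypothetical, and the step of sending the prompt to a model is deliberately left out.

```python
# Template based on the example prompt above; {issue} varies per finding.
PROMPT_TEMPLATE = (
    "I'm using a static analysis tool to identify vulnerabilities in my code. "
    "It has identified this line of code as a potential {issue} point. "
    "Your task is to trace back the origin of the input executed on this line "
    "to see if it could be user-controlled or influenced at any point. "
    "Please respond with a detailed analysis tracing the input executed "
    "back to its source.\n\n"
    "File: {file}\n"
    "Line {line}: {code}"
)


def build_triage_prompt(finding: dict) -> str:
    """Fill the template from one finding (hypothetical field names)."""
    return PROMPT_TEMPLATE.format(**finding)


# Invented finding used only to show the output shape.
example_finding = {
    "issue": "code injection and execution",
    "file": "server/api.py",
    "line": 88,
    "code": "result = eval(expression)",
}

prompt = build_triage_prompt(example_finding)
print(prompt)
```

Sending the same generated prompt to each model keeps their answers comparable.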

The Competing LLMs

Both Gemini 2.5 and Claude Sonnet 4.5 fared decently at tracing input in suspicious lines of code back to their source, methodically tracing the injection point back and describing the transformations and manipulations the input underwent along the way.  

The differences between the two models begin to show in their analysis of exploitability. While one adopted a skeptical, conservative stance, the other was more eager to find potential risks and explore tangential vulnerabilities. Let’s look at how these two performed during my initial triage of static code analysis findings.

The Conservative Architect vs The Eager Intern

The Gemini persona could be described as a grey-bearded, skeptical architect.  When asked to triage a line of code for the potential for arbitrary code execution, its response is both conservative and somewhat constrained to an unimaginative view of the exploitation path. It is decidedly not overly enthusiastic, nor does it embody an ‘out of the box’ mindset. 

Here, Gemini 2.5 attempts to convince me that all is well with a particular line of code (see Figure A: Gemini's Conservative Triage). It is rigid in its belief that because the executed code originates from a configuration file, it cannot be exploitable. The model attempts to close all intellectual doors for further investigation.

Figure A: Gemini's Conservative Triage

Claude, on the other hand, resembles your most enthusiastic intern.  Brilliant but squirrelly.  What it lacked in perspective, it made up for in its desire to clear every foxhole.  Its response to the same prompt strayed significantly from the original goal of performing a taint analysis to detect potential arbitrary code injection, instead making optimistic claims about other potential security risks. 

Here, you see Claude eager to propose possible next steps (see Figure B: Claude's Eagerness). In practice, I’ve never seen Claude Sonnet respond without providing a glimmer of hope for a potential vulnerability. As you can see below, even when it outlines mitigations, it frames them as potential risks if not implemented correctly.

Figure B: Claude's Eagerness

You Are The Blackboard Architecture

The workflow naturally introduces you—the human in the loop—as the essential expert. I found myself playing devil’s advocate, challenging the conservative architect's closed thinking and acting as the sober one to the enthusiastic intern's suggestions. Playing one model off the other, picking from the better of the two suggestions, is the Blackboard Architecture in practice.

The Blackboard architecture is essentially a design pattern that enables multiple, specialized Large Language Model (LLM) agents to collaborate to solve complex, messy problems. It’s effective in a multi-LLM setup because it provides agents with a central, shared workspace—the 'blackboard'—where they can communicate and incrementally build a solution without being locked into a rigid, predefined workflow.

This concept is best imagined as a team collaboration. Each team member brings unique skills to the table, and while you cannot speak to each other directly, you communicate and build the solution by writing on a shared blackboard or whiteboard.

Sophisticated multi-agent systems have an ‘overlord’ or Agent Manager that selects the best solutions, helping the team of agents navigate difficult situations.  My ad hoc workflow naturally evolved into me serving as the blackboard, the agent manager, and negotiator between strong personalities.
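The pattern described above can be sketched in a few lines. This is an illustration, not a description of any particular framework: the two agents are plain functions standing in for LLM calls, their canned replies are invented, and the names `Blackboard`, `run_round`, and `manager_select` are hypothetical. Agents never call each other; they only read and write the shared board, and a manager (here, the human) decides which notes to keep.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Note:
    author: str
    text: str


@dataclass
class Blackboard:
    """Shared workspace: the problem statement plus every note posted so far."""
    problem: str
    notes: list[Note] = field(default_factory=list)

    def post(self, author: str, text: str) -> None:
        self.notes.append(Note(author, text))


def skeptic(board: Blackboard) -> str:
    # Stand-in for a conservative model; the reply is canned for illustration.
    return "Input comes from a config file; likely not exploitable."


def enthusiast(board: Blackboard) -> str:
    # Stand-in for an eager model that reacts to what is already on the board.
    if any(note.author == "skeptic" for note in board.notes):
        return "Who can write that config file? Check the upload endpoint."
    return "Possible injection; needs tracing."


def run_round(board: Blackboard, agents: dict[str, Callable[[Blackboard], str]]) -> None:
    """Let each agent read the board and post one note, in order."""
    for name, agent in agents.items():
        board.post(name, agent(board))


def manager_select(board: Blackboard, keep: Callable[[Note], bool]) -> list[Note]:
    """The manager (a human, in this workflow) picks the notes worth pursuing."""
    return [note for note in board.notes if keep(note)]


board = Blackboard("Is the flagged line reachable with user-controlled input?")
run_round(board, {"skeptic": skeptic, "enthusiast": enthusiast})

# Keep only notes that propose a concrete follow-up.
shortlist = manager_select(board, lambda note: "Check" in note.text)
for note in shortlist:
    print(f"{note.author}: {note.text}")
```

In my ad hoc version, the `keep` decision was made by hand.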

A Broader Definition of Success

And did this workflow bear fruit?  Not of the kind I had initially hoped for.  In the couple of weeks I spent triaging potential software vulnerabilities in open-source code, I failed to meet the strict goal of unauthenticated Remote Code Execution.  However, thanks to Claude’s curiosity and my willingness to explore rabbit holes, I uncovered some interesting issues in the codebase that had not been identified before.  

I suspect most vulnerability researchers using AI to search for bugs are trying to optimize their over/under: identify the most ‘in-scope’, highest-severity (CVSS 10.0) vulnerabilities in the fewest cycles possible. This has left an opening for human intuition to continue to play a part in vulnerability discovery. For the time being, we remain the essential expert human in the loop.

Stay tuned (specifically in 90 days) for the continued discussion of AI-assisted bug hunting to learn the specifics of the vulnerabilities I uncovered with the assistance of multiple AI agents.