Breaking NVIDIA Triton: CVE-2025-23319 - A Vulnerability Chain Leading to AI Server Takeover
Ronen Shustin, Nir Ohfeld · 2025-08-04 · via Wiz Blog | RSS feed

The Wiz Research team has discovered a chain of critical vulnerabilities in NVIDIA's Triton Inference Server, a popular open-source platform for running AI models at scale. When chained together, these flaws can potentially allow a remote, unauthenticated attacker to gain complete control of the server, achieving remote code execution (RCE).

This attack path originates in the server's Python backend and starts with a minor information leak that escalates into a full system compromise. This poses a critical risk to organizations using Triton for AI/ML, as a successful attack could lead to the theft of valuable AI models, the exposure of sensitive data, manipulation of the AI model's responses, and a foothold for attackers to move deeper into a network.

Wiz Research responsibly disclosed these findings to NVIDIA, and a patch has been released. We would like to thank the NVIDIA security team for their excellent collaboration and swift response. NVIDIA has assigned the following identifiers to this vulnerability chain: CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334. We strongly recommend all Triton Inference Server users update to the latest version. This post provides a high-level overview of these new vulnerabilities and their potential impact. 

This work is the latest in a series of NVIDIA vulnerabilities we’ve disclosed, including two container escapes: CVE-2025-23266 and CVE-2024-0132.

Mitigations

  1. Update Immediately: The primary mitigation is to upgrade both the NVIDIA Triton Inference Server and the Python backend to version 25.07, as advised in the NVIDIA security bulletin.
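As a rough aid for inventorying deployments, the version threshold above can be checked with a small helper. This is a minimal sketch that assumes you already know each deployment's release tag in `YY.MM` form (for example `25.07-py3`). The names `parse_release` and `is_patched` are hypothetical and are not part of any NVIDIA tooling.

```python
import re

# Release named in the mitigation above (25.07).
PATCHED_RELEASE = (25, 7)


def parse_release(tag: str) -> tuple[int, int]:
    """Parse a 'YY.MM' release tag, optionally followed by a suffix such as '-py3'."""
    match = re.match(r"^(\d{2})\.(\d{2})(?:\D.*)?$", tag.strip())
    if match is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def is_patched(tag: str) -> bool:
    """Return True if the release tag is at or above the patched release."""
    return parse_release(tag) >= PATCHED_RELEASE
```

For example, `is_patched("25.06-py3")` is false and `is_patched("25.07-py3")` is true. Tags that do not follow the assumed format raise `ValueError` rather than being silently treated as safe.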

Wiz customers can use the following to detect vulnerable instances in their cloud environment:

The Inner Workings of Triton

To understand the vulnerability, it helps to know a little about Triton's architecture. Triton is designed to be a universal inference server, capable of deploying models from any major AI framework (PyTorch, TensorFlow, etc.). It achieves this flexibility through a modular system of backends, where each backend is responsible for executing models from a specific framework. When an inference request for a specific model arrives, Triton automatically routes the request to the necessary backend for execution.

Our research focused on the Python backend, as it is one of the most popular and versatile backends in the Triton ecosystem. It not only serves models written directly in Python but also acts as a dependency for several other backends. This means that even models configured to run under a different backend might still internally use the Python backend for other phases of the inference process. Given its widespread usage, we decided to focus our security research on this component.

Python backend internals

The Triton Python backend's core logic is implemented in C++ and is designed to handle inference requests for Python models. When a request arrives, this C++ component communicates with a separate "stub" process, which is responsible for loading and executing the model code. To facilitate communication between its own C++ logic and this stub process, the backend relies on a sophisticated Inter-Process Communication (IPC) mechanism for both inference and internal operations. This IPC is built on named shared memory (/dev/shm), creating a memory region accessible via a unique system path. This design allows for high-speed data exchange, but it also introduces a critical dependency: the security and privacy of the shared memory names.
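To illustrate why the region name matters, the sketch below uses Python's standard `multiprocessing.shared_memory` module. It is not Triton's code, and the region name is a made-up example. It only shows the general property of named shared memory: any party that knows the name can attach to the region and read or modify its contents.

```python
import uuid
from multiprocessing import shared_memory

# Hypothetical region name, for illustration only.
region_name = f"demo_shm_{uuid.uuid4().hex[:8]}"

# "Owner" side: create a named region and store some internal state in it.
owner = shared_memory.SharedMemory(name=region_name, create=True, size=64)
owner.buf[:5] = b"state"

# Any other party that learns the name can attach to the same memory.
peer = shared_memory.SharedMemory(name=region_name)
seen = bytes(peer.buf[:5])    # reads what the owner wrote
peer.buf[:5] = b"OWNED"       # and can overwrite it in place
after = bytes(owner.buf[:5])  # the owner now observes the modified bytes

# Clean up: detach both handles, then remove the region.
peer.close()
owner.close()
owner.unlink()
```

Here both handles live in one process for brevity. The same attach-by-name behavior applies across processes, which is why the name itself acts as a capability.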

Vulnerabilities Overview

Step 1: Information Disclosure of the Backend's Shared Memory Name

During our audit of the Python backend, we discovered a flaw in its error handling mechanism. By sending a crafted, large remote request, an attacker can trigger an exception that results in a crucial information disclosure. The resulting error message, returned to the user, improperly includes the full, unique name of the backend's internal IPC shared memory region.

The returned error message appears as follows: {"error":"Failed to increase the shared memory pool size for key 'triton_python_backend_shm_region_4f50c226-b3d0-46e8-ac59-d4690b28b859'..."}

The disclosure of this name is the first critical step in the exploit chain, as it exposes an internal component that should remain private.
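Defenders may want to check whether server responses or logs contain such a name. The following is a minimal sketch of one way to do that. The regular expression is inferred from the single example message above, so the exact naming scheme is an assumption, and `find_internal_shm_keys` is a hypothetical helper name.

```python
import json
import re

# Pattern inferred from the one example above: a fixed prefix followed by a UUID.
INTERNAL_KEY_RE = re.compile(
    r"triton_python_backend_shm_region_"
    r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"
)


def find_internal_shm_keys(response_body: str) -> list[str]:
    """Return any internal shared memory keys that appear in a response body."""
    try:
        # Prefer the JSON "error" field when the body is a JSON object.
        text = str(json.loads(response_body)["error"])
    except (ValueError, KeyError, TypeError):
        # Not JSON, or no "error" field: scan the raw body instead.
        text = response_body
    return INTERNAL_KEY_RE.findall(text)
```

Run against the error message quoted above, the helper returns the full region name. An empty list means no matching name was found under the assumed pattern.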

Step 2: Abusing the Shared Memory API for Arbitrary Read/Write

Triton offers a user-facing shared memory feature for performance. A client can use this feature to have Triton read input tensors from, and write output tensors to, a pre-existing shared memory region. This process avoids the costly transfer of large amounts of data over the network and is a documented, powerful tool for optimizing inference workloads.

Figure: Example of a user-created shared memory region's location.

With the leaked name of the Python backend's internal IPC shared memory, an attacker can turn this public-facing API against itself. The vulnerability lies in the API's lack of validation; it does not check whether a provided shared memory key corresponds to a legitimate user-owned region or a private, internal one.

Figure: Forcing Triton to use the Python backend's internal shared memory by providing it with the key triton_python_backend_shm_region_4f50c226-b3d0-46e8-ac59-d4690b28b859.

An attacker can therefore call the registration endpoint with the leaked internal key. Once the server accepts it, they can craft subsequent inference requests that use this region for input or output. This provides the attacker with powerful read and write primitives into the Python backend's private memory, which also contains internal data and control structures related to its IPC mechanism, all performed through standard, legitimate API calls.
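The missing check described above can be sketched as a simple guard on client-supplied keys. This is an illustration of the general idea only, not NVIDIA's actual fix. The prefix is taken from the leaked name shown earlier, and `is_client_key_allowed` is a hypothetical function name.

```python
# Prefix observed in the leaked region name; treating it as reserved is an assumption.
INTERNAL_KEY_PREFIX = "triton_python_backend_shm_region_"


def is_client_key_allowed(
    key: str, reserved_prefixes: tuple[str, ...] = (INTERNAL_KEY_PREFIX,)
) -> bool:
    """Reject empty keys and keys that fall in a reserved, internal namespace."""
    # Shared memory keys are often written with a leading slash; ignore it.
    name = key.lstrip("/")
    if not name:
        return False
    return not any(name.startswith(prefix) for prefix in reserved_prefixes)
```

A prefix check like this is only one possible design. A stricter alternative would be for the server to track the exact set of regions it created for internal use and refuse any registration request that names one of them.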

Step 3: Paths to Remote Code Execution

Since an attacker can now alter the Python backend's shared memory, they can cause unexpected behavior in the server. This capability can be leveraged to gain full control of the server. There are multiple exploitation avenues an attacker might use to achieve this:

  • One method involves corrupting existing data structures within the backend's shared memory region. In particular, an attacker can target structures that contain pointers (e.g., MemoryShm, SendMessageBase), which enables out-of-bounds memory access beyond the backend's shared memory.

  • Crafting malicious IPC messages and manipulating the IPC message queue to process them opens a new attack surface for exploitation, ranging from native memory corruption to logical exploits.
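The first avenue relies on a reader trusting location fields stored inside memory that an attacker can now write. The sketch below shows the generic defensive pattern of validating such a field before using it. The header layout is hypothetical and is not the layout of Triton's MemoryShm or SendMessageBase structures.

```python
import struct

# Hypothetical header layout: (payload offset, payload length), both unsigned 64-bit.
HEADER = struct.Struct("<QQ")


def read_payload(region, header_at: int = 0) -> bytes:
    """Read a payload described by a header in `region`, with bounds checks."""
    offset, length = HEADER.unpack_from(region, header_at)
    end = offset + length
    # Treat the header as untrusted: the payload must lie fully inside the region.
    if end > len(region):
        raise ValueError("payload descriptor points outside the shared region")
    return bytes(region[offset:end])
```

Python slicing would not read out of bounds even without the check. In native code, however, an unvalidated pointer or offset taken from writable shared memory is dereferenced as-is, which is what makes this class of corruption dangerous.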

For now, we will not be publishing further technical details regarding the exploitation of this vulnerability.

Impact

The chain could allow a remote, unauthenticated attacker to take over an NVIDIA Triton Inference Server. This can lead to several critical outcomes, including:

  • Model Theft: Stealing proprietary and expensive AI models.

  • Data Breach: Intercepting sensitive data being processed by the models, such as user information or financial data.

  • Response Manipulation: Manipulating the AI model's output to produce incorrect, biased, or malicious responses.

  • Pivoting: Using the compromised server as a beachhead to attack other systems within the organization's network.

Conclusion

This research demonstrates how a series of seemingly minor flaws can be chained together to create a significant exploit. A verbose error message in a single component and a feature that can be misused in the main server were all it took to create a path to potential system compromise. As companies deploy AI and ML more widely, securing the underlying infrastructure is paramount. This discovery highlights the importance of defense-in-depth, where security is considered at every layer of an application.

Responsible Disclosure Timeline

  • May 15, 2025: Wiz Research reported the vulnerability chain to NVIDIA.

  • May 16, 2025: NVIDIA acknowledged the report.

  • August 4, 2025: NVIDIA published the security bulletin, patches, and assigned CVEs: CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334.

  • August 4, 2025: Wiz Research published this blog post.

Stay in touch!

Hi there! We are Ronen Shustin (@ronenshh), Nir Ohfeld (@nirohfeld), Sagi Tzadik (@sagitz_), Hillai Ben-Sasson (@hillai), Andres Riancho (@andresriancho) and Yuval Avrahami (@yuvalavra) from the Wiz Research Team (@wiz_io). We are a group of veteran white-hat hackers with a single goal: to make the cloud a safer place for everyone. We primarily focus on finding new attack vectors in the cloud and uncovering isolation issues in cloud vendors and service providers. We would love to hear from you! Feel free to contact us on X (Twitter) or via email: research@wiz.io. 
