System Prompt Leakage | OWASP LLM07:2025 Explained
Richard Tuma · 2026-05-27 · via A10 Networks

System prompt leakage refers to the risk that the system prompts or hidden instructions used to steer an LLM’s behavior may contain sensitive information that was not intended to be disclosed.

System prompts guide model behavior according to application requirements. However, they are not secure storage mechanisms and must not be treated as secrets or security controls.

Importantly, the real risk lies in what the system prompt contains and what the application improperly delegates to it.

Attackers can often infer guardrails and formatting rules simply by interacting with the model, even without directly extracting the exact wording of the system prompt.

Key Takeaways

  • System prompt leakage occurs when the instructions used to steer an LLM's behavior are exposed to users, potentially revealing sensitive information like API keys, credentials, internal rules, or permission structures
  • The system prompt itself should never be treated as a security control or used to store secrets; the real risk lies not in its disclosure, but in what it reveals about underlying security weaknesses and privileged access
  • Even when the exact system prompt wording isn't disclosed, attackers can infer guardrails and behavioral restrictions simply by sending inputs and observing how the model responds over time
  • Security controls such as privilege separation and authorization checks must never be delegated to the LLM via system prompts; these require deterministic, auditable enforcement in external systems
  • Prevention requires keeping sensitive data entirely out of system prompts, implementing external guardrails independent of the LLM, and avoiding over-reliance on prompt instructions to enforce critical application behavior

Why System Prompts Are Not Security Controls

System prompts can be influenced by prompt injection, and they may be extracted through meta-prompt techniques. They are not deterministic enforcement mechanisms and should not contain sensitive operational data.

If sensitive information (credentials, API keys, role definitions, connection strings) is embedded in system prompts, its exposure is a design failure, not merely a leakage event.

Additionally, if authorization rules or privilege logic are implemented inside the system prompt rather than in deterministic back-end systems, the architecture itself is insecure.
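One way to catch embedded secrets before they ship is a pre-deployment check on the prompt text. The following is a minimal sketch, not a complete secret scanner: the pattern names and regular expressions are illustrative assumptions and would need tuning for a real stack.

```python
import re

# Hypothetical patterns for secret-like strings; tune these for your own stack.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._\-]{20,}"),
    "connection_string": re.compile(
        r"\b[a-z][a-z0-9+]*://[^\s:/@]+:[^\s@]+@[^\s]+", re.IGNORECASE
    ),
    "key_assignment": re.compile(
        r"\b(?:api[_-]?key|password|secret|token)\s*[:=]\s*\S+", re.IGNORECASE
    ),
}


def find_secrets_in_prompt(prompt: str) -> list[str]:
    """Return the names of every pattern that matches the prompt text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(prompt)]
```

A build step could call `find_secrets_in_prompt` on each prompt template and fail when the returned list is non-empty. A clean result does not prove the prompt is safe; it only rules out the patterns listed.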

Common Risk Patterns

Exposure of Sensitive Functionality

System prompts may reveal API keys, database credentials, user tokens, system architecture details, tool configuration or back-end technologies. For example, if a prompt reveals the type of database being used, attackers may tailor injection attacks accordingly.

Exposure of Internal Rules

System prompts may describe internal thresholds or decision rules. For example:

  • “Transaction limit is $5,000 per day.”
  • “Total loan amount per user is $10,000.”

Attackers can use this knowledge to manipulate workflows, bypass controls or target logic weaknesses.
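A rule like the daily transaction limit is safer when the back end enforces it, so that knowing or overriding the prompt text changes nothing. The sketch below assumes a hypothetical in-memory `DailyLedger` standing in for a real database; the class and method names are illustrative only.

```python
from dataclasses import dataclass, field
from decimal import Decimal

DAILY_TRANSACTION_LIMIT = Decimal("5000")  # figure taken from the example above


@dataclass
class DailyLedger:
    """Tracks how much each user has moved today (in-memory stand-in for a database)."""

    spent: dict[str, Decimal] = field(default_factory=dict)

    def try_transfer(self, user_id: str, amount: Decimal) -> bool:
        """Apply the transfer only if it keeps the user within the daily limit."""
        if amount <= 0:
            return False
        total = self.spent.get(user_id, Decimal("0")) + amount
        if total > DAILY_TRANSACTION_LIMIT:
            return False  # rejected regardless of what the model was told or said
        self.spent[user_id] = total
        return True
```

With this split, the model can still explain the limit to users, but the decision to accept or reject a transfer never depends on model output.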

Revealing Filtering Criteria

A system prompt may instruct: “If a user requests information about another user, respond with, ‘Sorry, I cannot assist.’”

Knowing this rule allows attackers to craft bypass strategies.

Disclosure of Permissions and Role Structures

System prompts may reveal role definitions such as: “Admin users have full access to modify records.”

Attackers may then attempt privilege escalation.

Example Attack Scenarios

Scenario 1 – Credential Exposure

A system prompt contains credentials for accessing a tool. An attacker extracts the prompt and uses those credentials independently to access back-end systems.

Scenario 2 – Guardrail Bypass

A system prompt prohibits offensive content, external links and code execution. An attacker extracts these rules and then uses prompt injection techniques to override or bypass them, potentially enabling remote code execution.

The Core Insight

Disclosure of the system prompt is not the core vulnerability. The core vulnerability is storing sensitive data where it does not belong, delegating authorization logic to an LLM and relying on prompt text for enforcement of critical controls. Even if the exact wording of the system prompt remains hidden, attackers can often reverse-engineer guardrails through interaction.

Prevention and Mitigation Strategies

Separate Sensitive Data from System Prompts

Never embed API keys, authentication tokens, database names, role structures or permission mappings in system prompts. Sensitive information must reside in secure back-end systems inaccessible to the model.
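As a rough illustration of that separation, the tool handler below resolves its credential on the server at call time, so the prompt carries only behavioral instructions. The variable name `ORDERS_API_KEY`, the `lookup_order` tool, and the stubbed response are all assumptions made for this sketch.

```python
import os

# The prompt describes behavior only: no keys, hostnames, or role mappings.
SYSTEM_PROMPT = (
    "You are an order-status assistant. "
    "Call the lookup_order tool when the user asks about an order."
)


def lookup_order(order_id: str, environ=os.environ) -> dict:
    """Tool handler executed by the application, not by the model."""
    api_key = environ.get("ORDERS_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("ORDERS_API_KEY is not configured")
    # A real handler would call the orders service here using api_key.
    # Only the business result goes back to the model; the key never does.
    return {"order_id": order_id, "status": "shipped"}  # stubbed result
```

Even if an attacker extracts `SYSTEM_PROMPT` in full, it contains nothing that grants access to the orders service.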

Avoid Relying on System Prompts for Strict Behavior Control

LLMs are vulnerable to prompt injection. Critical controls (e.g., content filtering, policy enforcement) must be implemented outside the LLM in deterministic systems.

Implement External Guardrails

Use independent systems to inspect model outputs, validate compliance and enforce content restrictions. Model training alone is not sufficient.
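A simple external check of this kind might compare each response against the system prompt and block verbatim echoes. This is a minimal sketch under stated assumptions: the 40-character window and the refusal text are arbitrary choices, and the check catches only literal copies, not paraphrased or translated leaks.

```python
REFUSAL = "Sorry, I cannot help with that."


def contains_prompt_fragment(output: str, system_prompt: str, window: int = 40) -> bool:
    """True if any `window`-character slice of the prompt appears verbatim in the output."""
    # Collapse whitespace and case so trivial reformatting does not hide a copy.
    normalized_output = " ".join(output.split()).lower()
    normalized_prompt = " ".join(system_prompt.split()).lower()
    if not normalized_prompt:
        return False
    if len(normalized_prompt) <= window:
        return normalized_prompt in normalized_output
    return any(
        normalized_prompt[i:i + window] in normalized_output
        for i in range(len(normalized_prompt) - window + 1)
    )


def guard_output(output: str, system_prompt: str) -> str:
    """Replace the response with a refusal when it echoes the system prompt."""
    if contains_prompt_fragment(output, system_prompt):
        return REFUSAL
    return output
```

Because the filter runs outside the model, a successful prompt injection cannot switch it off. It should still be treated as one layer among several, since attackers can infer rules without extracting the exact wording.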

Enforce Security Controls Independently

Authorization, privilege separation, and access control must occur outside the LLM. These controls must be deterministic and auditable, and must not rely on model reasoning. If agents require different privilege levels, use separate agents configured with least privilege.
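The sketch below shows one possible shape for such a check: the application decides whether a model-requested tool call may run, based on the authenticated session rather than anything the model says, and records every decision. The role table, `Session` type, and tool names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission table kept in application code or a policy store.
ROLE_PERMISSIONS = {
    "viewer": {"read_record"},
    "admin": {"read_record", "modify_record"},
}


@dataclass(frozen=True)
class Session:
    user_id: str
    role: str  # set by the authentication layer, never by model output


class PermissionDenied(Exception):
    pass


def execute_tool_call(session: Session, tool_name: str, audit_log: list) -> str:
    """Run a model-requested tool only if the authenticated session permits it."""
    allowed = tool_name in ROLE_PERMISSIONS.get(session.role, set())
    # Record every decision so that allow and deny outcomes are auditable.
    audit_log.append((session.user_id, tool_name, "allow" if allowed else "deny"))
    if not allowed:
        raise PermissionDenied(f"{session.role} may not call {tool_name}")
    return f"{tool_name} executed for {session.user_id}"
```

A prompt that persuades the model to claim admin rights has no effect here, because the role comes from the session and the lookup is a plain set membership test.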

Architectural Principle

Treat system prompts as configuration hints, not security boundaries. Security must be enforced at the application layer in back-end systems through deterministic access control mechanisms. LLMs are probabilistic systems. Authorization and security enforcement must not be probabilistic.

The Key Takeaway

System prompt leakage highlights a deeper issue. If leaking the system prompt breaks your security model, the architecture is already flawed. Do not store secrets in prompts. Do not rely on prompts for access control. Do not treat hidden instructions as security mechanisms.

Security must exist outside the model.
