AI Security Is a Systems Problem: Building a Four-Layer Runtime Defense
Otto Plane · 2026-05-24 · via DEV Community

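Discussions of LLM security often collapse into debates about semantic prompting or generic out-of-band compliance. But if your guardrail strategy relies entirely on asynchronous post-inference API calls or naive string matching, you are not building a security boundary; you are putting up a facade.

If you run agents against systems that mutate state, security is not a post-processing feature; it is a foundational constraint.

Real control has to go beyond passive monitoring and embed explicit enforcement layers directly in the execution path. Here is how I break down a deterministic, systems-first engineering architecture.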


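The Core Thesis: Unvalidated Intent Is Remote Code Execution

Once you grant LLM infrastructure authority through function calling, native tool integrations, or database connections, your application's threat posture changes fundamentally. You move from static data retrieval to dynamic, non-deterministic execution generation.

If an agent can dynamically construct its own downstream execution path, prompt injection is no longer an ordinary text-processing bug. It becomes functionally unauthorized remote code execution (RCE), or an unvalidated, destructive database write.

To handle this, you must isolate the AI's non-deterministic output inside strict, deterministic system boundaries. That requires a four-layer runtime stack mapped directly onto the data path.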


┌────────────────────────────────────────────────────────┐
│ 1. INGRESS SURFACE   (Payload Parsing, Input Gating)   │
├────────────────────────────────────────────────────────┤
│ 2. OUTPUT BOUNDARY   (Type Enforcement, Token Slicing) │
├────────────────────────────────────────────────────────┤
│ 3. EXECUTION GATE    (Tool Interception, Scope Blocks) │
├────────────────────────────────────────────────────────┤
│ 4. POLICY TRACE      (Deterministic State Auditing)    │
└────────────────────────────────────────────────────────┘



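1. The Ingress Surface

Guards must be in place before a single token reaches the inference endpoint. The ingress surface is a strict border and also a filter for payloads.

Instead of passing unparsed user input straight to the orchestration core, the ingress surface acts as an inline interception layer that handles:

  • Structural input validation: validate incoming telemetry, context payloads, and user strings against strict type expectations before they enter the orchestration pipeline.
  • Active payload sanitization: scan text data streams for indirect injection vectors, escape malicious characters, and strip structural delimiters so they cannot manipulate the underlying system prompt.
  • Pre-flight policy evaluation: resolve logical policy conflicts and abort the request before it starts an expensive, non-deterministic model inference loop.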
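
The three ingress steps above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a reference implementation: the field names (`user_id`, `action`, `text`), the size limit, the delimiter patterns, and the `BLOCKED_ACTIONS` policy are all assumptions invented for the example.

```python
import re
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the article.
MAX_INPUT_CHARS = 4000
ALLOWED_FIELDS = ("user_id", "action", "text")
BLOCKED_ACTIONS = {"delete_all"}
# Hypothetical delimiters that a system prompt template might rely on.
STRUCTURAL_DELIMITERS = re.compile(r"(<\|[^|>]*\|>|```|</?system>)", re.IGNORECASE)
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


class IngressRejected(Exception):
    """Raised when a payload is stopped before any model inference runs."""


@dataclass(frozen=True)
class IngressRequest:
    user_id: str
    action: str
    text: str


def validate_structure(payload: dict) -> IngressRequest:
    """Structural input validation: exact field set, string types, bounded size."""
    if set(payload) != set(ALLOWED_FIELDS):
        raise IngressRejected("unexpected or missing fields")
    for key in ALLOWED_FIELDS:
        if not isinstance(payload[key], str):
            raise IngressRejected(f"field {key!r} must be a string")
    if len(payload["text"]) > MAX_INPUT_CHARS:
        raise IngressRejected("text exceeds size limit")
    return IngressRequest(payload["user_id"], payload["action"], payload["text"])


def sanitize(text: str) -> str:
    """Payload sanitization: drop control characters and structural delimiters."""
    text = CONTROL_CHARS.sub("", text)
    return STRUCTURAL_DELIMITERS.sub(" ", text).strip()


def preflight(request: IngressRequest) -> None:
    """Pre-flight policy evaluation: reject before paying for inference."""
    if request.action in BLOCKED_ACTIONS:
        raise IngressRejected(f"action {request.action!r} is not permitted")


def ingress(payload: dict) -> IngressRequest:
    request = validate_structure(payload)
    preflight(request)
    return IngressRequest(request.user_id, request.action, sanitize(request.text))
```

A request that passes all three checks comes back as a frozen `IngressRequest` with its text cleaned; anything else raises `IngressRejected` before the model is ever called. Regex-based delimiter stripping is only one possible sanitization strategy and will not catch every injection vector.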

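2. The Output Boundary

Never trust raw model output. Even fine-tuned, specialized models can hallucinate structural syntax, lose type consistency under heavy load, or leak internal system context.

The output boundary is an explicit egress validation proxy.

  • Typing and schema enforcement: mechanically parse and match generated model responses against JSON Schema, Zod types, or Protobuf definitions. If a response violates the compiled schema, the proxy catches it immediately.
  • Deterministic output slicing: programmatically truncate, redact, or block data streams that would exceed application bounds, leak unintended personally identifiable information (PII), or expose system configuration data, before the frame reaches a downstream service or client interface.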
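
As a rough sketch of the two egress checks above, the following uses plain Python type checks in place of a real schema library such as JSON Schema, Zod, or Protobuf. The expected fields (`answer`, `confidence`), the length cap, and the email-only redaction rule are assumptions made for the example; a production boundary would cover far more PII patterns.

```python
import json
import re

# Hypothetical response schema and limits, chosen only for illustration.
EXPECTED_FIELDS = {"answer": str, "confidence": (int, float)}
MAX_ANSWER_CHARS = 500
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class OutputRejected(Exception):
    """Raised when a model response violates the egress contract."""


def enforce_schema(raw: str) -> dict:
    """Parse the raw model output and check it against the expected shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise OutputRejected("response is not valid JSON") from exc
    if not isinstance(data, dict) or set(data) != set(EXPECTED_FIELDS):
        raise OutputRejected("response has unexpected or missing fields")
    for name, expected in EXPECTED_FIELDS.items():
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise OutputRejected(f"field {name!r} has the wrong type")
    return data


def slice_output(data: dict) -> dict:
    """Redact email addresses first, then truncate to the allowed length."""
    answer = EMAIL.sub("[REDACTED_EMAIL]", data["answer"])
    return {**data, "answer": answer[:MAX_ANSWER_CHARS]}


def output_boundary(raw: str) -> dict:
    return slice_output(enforce_schema(raw))
```

Redaction runs before truncation so that a cut cannot split an address in half and let a fragment slip through.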

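3. The Execution Gate

This is the critical enforcement core of any agentic system that uses function calling or tool calling. The agent must never have direct visibility into your underlying data execution layer.

Instead, the agent emits an execution intent (a tool call request), which the execution gate intercepts and inspects:

  • Strict parameter gating: enforce a hard allowlist of function names and validate arguments against explicit compile-time boundary constraints. If the agent tries to supply an unapproved parameter or invoke an out-of-scope method, the execution thread is terminated immediately.
  • Stateful authorization loops: hold high-impact or destructive operations (such as data mutations or outbound API webhooks) until human-in-the-loop verification or independent cryptographic validation clears them, and only then dispatch the command.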
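
One way the gate described above might look, as a minimal sketch. The tool names (`get_order`, `refund_order`), their parameters, and the `approved` flag standing in for a human or cryptographic sign-off are all hypothetical. Python has no compile-time parameter checking, so this version checks at call time against statically declared specs.

```python
from dataclasses import dataclass
from typing import Any, Callable


class GateDenied(Exception):
    """The call is outside the allowlist or its arguments are invalid."""


class ApprovalRequired(Exception):
    """The call is destructive and has not been approved yet."""


@dataclass(frozen=True)
class ToolSpec:
    handler: Callable[..., Any]
    params: dict          # parameter name -> required type
    destructive: bool = False


# Hypothetical tools, defined only so the example runs.
def get_order(order_id: int) -> dict:
    return {"order_id": order_id, "status": "shipped"}


def refund_order(order_id: int, amount: int) -> dict:
    return {"order_id": order_id, "refunded": amount}


ALLOWLIST = {
    "get_order": ToolSpec(get_order, {"order_id": int}),
    "refund_order": ToolSpec(refund_order, {"order_id": int, "amount": int}, destructive=True),
}


def execute(tool_name: str, args: dict, approved: bool = False) -> Any:
    """Run a tool call only if it passes every gate check."""
    spec = ALLOWLIST.get(tool_name)
    if spec is None:
        raise GateDenied(f"tool {tool_name!r} is not allowlisted")
    if set(args) != set(spec.params):
        raise GateDenied("unexpected or missing parameters")
    for name, expected in spec.params.items():
        value = args[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise GateDenied(f"parameter {name!r} has the wrong type")
    if spec.destructive and not approved:
        # Hold the command until an out-of-band approval arrives.
        raise ApprovalRequired(f"{tool_name!r} needs explicit approval")
    return spec.handler(**args)
```

The agent only ever submits a tool name and an argument dictionary; the handlers themselves stay behind the gate. In a real system the `approved` flag would be replaced by a verifiable approval record rather than a boolean supplied by the caller.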

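4. The Policy Trace

When a non-deterministic pipeline breaks application state, standard unstructured syslog files or unstructured text blobs are useless for debugging. You need deterministic, highly structured diagnostic observability.

The policy trace is an immutable, step-by-step audit record of the full execution lifecycle.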

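  • State and token capture: capture the exact state of the system prompt, the raw token input structure, the matched policy triggers, the intermediate function payloads, and the exact response of the execution gate.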
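  • Deterministic reproducibility: organize execution logs into deterministic replay graphs, so engineers can feed the exact parameters of a failure back into a development environment, isolate the architectural gap, and patch the policy configuration.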
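
A trace with these properties could be approximated with a hash-chained, append-only log, sketched below. The `PolicyTrace` class and its step names are hypothetical, and hash chaining is a technique chosen for this example rather than something the article prescribes. It makes tampering detectable, not impossible, and the replay is a simple ordered list rather than a full replay graph.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(step: str, payload: dict, prev: str) -> str:
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps({"step": step, "payload": payload, "prev": prev}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class PolicyTrace:
    """Append-only, hash-chained record of each step in one execution."""

    def __init__(self) -> None:
        self._entries = []

    def record(self, step: str, payload: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        # Round-trip through JSON to store a detached, serializable copy.
        frozen = json.loads(json.dumps(payload))
        entry = {"step": step, "payload": frozen, "prev": prev,
                 "hash": _digest(step, frozen, prev)}
        self._entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = GENESIS
        for entry in self._entries:
            expected = _digest(entry["step"], entry["payload"], prev)
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True

    def replay(self) -> list:
        """Return (step, payload) pairs in order, as copies safe to mutate."""
        return [(e["step"], json.loads(json.dumps(e["payload"]))) for e in self._entries]
```

Feeding `replay()` back into a development environment gives the exact inputs each layer saw, in the order it saw them, and `verify()` confirms the record was not altered in between.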

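From Theory to Codebase

Moving from passive validation to active runtime enforcement means putting your security logic directly in the data path. Instead of running asynchronous cron checks or out-of-band evaluations, you need to build low-latency infrastructure:

  • Inline network proxies: intercept traffic before it reaches the orchestration layer, capture raw HTTP/gRPC requests, strip malicious payloads, or abort non-compliant calls.
  • Decoupled policy engines: offload validation logic to an isolated engine (such as Open Policy Agent or a dedicated WASM module) so policies can change without redeploying the core application.
  • Runtime interceptors: inject deterministic hooks into your agent's tool-calling SDK to intercept, inspect, and mutate function arguments before the execution core fires.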
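
The runtime interceptor idea from the last bullet can be illustrated with a plain decorator. This is a sketch under assumptions: the `intercepted` decorator, the two hooks, and the `search_users` tool are invented for the example and do not correspond to any particular agent SDK.

```python
import functools
from typing import Any, Callable


class CallBlocked(Exception):
    """Raised by a hook to stop a tool call before it executes."""


def intercepted(*hooks: Callable[[str, dict], dict]) -> Callable:
    """Wrap a tool so every hook sees, and may rewrite, its arguments first."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(tool)
        def wrapper(**kwargs: Any) -> Any:
            args = dict(kwargs)
            for hook in hooks:
                # Each hook returns the (possibly mutated) arguments or raises.
                args = hook(tool.__name__, args)
            return tool(**args)
        return wrapper
    return decorator


def block_wildcards(tool_name: str, args: dict) -> dict:
    """Inspect: refuse string arguments containing a wildcard character."""
    for value in args.values():
        if isinstance(value, str) and "*" in value:
            raise CallBlocked(f"wildcard not allowed in call to {tool_name}")
    return args


def clamp_limit(tool_name: str, args: dict) -> dict:
    """Mutate: cap the requested page size at an assumed maximum of 100."""
    if "limit" in args:
        return {**args, "limit": min(int(args["limit"]), 100)}
    return args


@intercepted(block_wildcards, clamp_limit)
def search_users(query: str, limit: int = 10) -> dict:
    # Hypothetical tool body; a real one would query a data store.
    return {"query": query, "limit": limit}
```

Calling `search_users(query="alice", limit=5000)` reaches the tool with `limit` clamped to 100, while a query containing `*` raises `CallBlocked` and the tool body never runs. Because the hooks are ordinary functions, they run in-process with no extra network hop, which keeps the added latency small.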

I am currently building out the technical architecture, core proxy, and SDK integrations for this deterministic runtime stack at 开AI之规绳.

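If you are currently writing runtime validation middleware, compiling compliance rules into code, or building deterministic isolation boundaries into agentic workflows, I would love to hear how you weigh the latency trade-offs. Share your implementation details in the comments below.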