Add Guardrails So Your AI App Doesn't Make Things Up: A Two-Layer Approach with NVIDIA NIM
Torkian · 2026-05-24 · via DEV Community

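In Part 1, you built an assistant. In Part 2, you gave it the ability to retrieve relevant context. Both posts ended on the same observation: when someone asked for the WiFi password, the assistant refused. That refusal worked because we told it to refuse. Change the wording of the prompt, and it would just as happily make one up.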

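This post is about making that refusal robust instead of a matter of luck. It adds two guardrails, both short enough to read in one sitting, and neither needs a framework. The first tightens the prompt so the assistant knows what it is allowed to talk about. The second makes a second LLM call that re-reads the answer alongside the context and decides whether the answer can ship.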

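I'm B. Torkian, NVIDIA Developer Champion at the University of Southern California (USC). This is the layer that turns a demo into something usable.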


What we're adding

User question
  → retrieve top-k context (from Part 2)
  → scoped prompt: model answers OR returns the exact fallback line
  → grounding check: a second NIM call asks "is the answer supported by the context?"
  → ship the answer, or replace it with the fallback line


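The chat call and the embedding setup carry over from the previous two parts. The code that is new in this part is under 40 lines.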


Why guardrails aren't optional

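The retrieval step from Part 2 already narrows what the model sees. But nothing stops the model from getting creative with the data it is given, and nothing stops it from drifting into topics outside the assistant's job.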

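Two real failure modes I have seen in student demos:

  1. Scope creep. Someone asks, "Can you write my breakup text?" and the model happily obliges. The retriever returned three USC chunks (top-k cosine similarity always returns something, however weak the match), and the prompt did not forbid relationship advice, so the model wrote the text.
  2. Confident hallucination. The retrieved chunk says "Monday to Friday from 10 AM to 6 PM." The user asks about Saturday hours. The model decides to be helpful and answers "Saturday hours are 11 AM to 4 PM", which is invented. It reads like a reasonable inference.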
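To make the parenthetical in the first failure concrete, here is a minimal sketch of why top-k retrieval cannot act as a scope filter by itself. The three-dimensional vectors and chunk names are hypothetical stand-ins for real embeddings; only the ranking logic mirrors the retriever used in this series.

```python
import numpy as np

def cosine_similarity(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(np.dot(a, b) / denom)

# Hypothetical toy "embeddings"; real ones come from the embedding model.
chunks = {
    "club meeting": np.array([1.0, 0.1, 0.0]),
    "gpu lab hours": np.array([0.9, 0.3, 0.0]),
    "tutoring": np.array([0.8, 0.0, 0.2]),
}

# A query vector pointing in a nearly unrelated direction (an off-topic question).
off_topic_query = np.array([0.0, 0.1, 1.0])

def top_k(query, k=2):
    # Rank every chunk by similarity and keep the best k, with no minimum score.
    scored = sorted(
        ((cosine_similarity(query, vec), name) for name, vec in chunks.items()),
        reverse=True,
    )
    return scored[:k]

results = top_k(off_topic_query)
for score, name in results:
    print(f"{score:.3f}  {name}")
```

Even though every score is low, the function still hands back k chunks, so the model always receives some context to improvise around.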

The fix for the first failure is prompt scoping. The fix for the second is what the grounding check is for.


Step 1: Setup (self-contained)

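If you already ran Workshops 1 and 2 in the same Colab session, you can skip this cell. If you are starting fresh, paste this block: it bundles the client, the embedding model, the USC knowledge base, and the retriever from Parts 1 and 2, so everything that follows is self-contained.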

%pip install -q openai numpy

import os, getpass
from openai import OpenAI
import numpy as np

if not os.getenv("NVIDIA_API_KEY"):
    os.environ["NVIDIA_API_KEY"] = getpass.getpass("Paste your NVIDIA API key (starts with nvapi-): ")

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)

MODEL = "meta/llama-3.1-8b-instruct"
EMBED_MODEL = "nvidia/nv-embedqa-e5-v5"

def ask(system_prompt, user_message):
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user",   "content": user_message},
        ],
        temperature=0.3,
        max_tokens=400,
    )
    return response.choices[0].message.content

knowledge_base = [
    {"title": "USC AI Club meeting",
     "text": "The USC AI Club meets every Thursday at 5 PM in the engineering building, room 204."},
    {"title": "USC GPU lab hours",
     "text": "The USC GPU computing lab is open Monday to Friday from 10 AM to 6 PM."},
    {"title": "NVIDIA Developer Program",
     "text": "USC students can join the NVIDIA Developer Program for free."},
    {"title": "Next USC workshop",
     "text": "The next USC AI Club workshop will cover Retrieval Augmented Generation (RAG)."},
    {"title": "USC AI/ML office hours",
     "text": "Office hours for the USC AI/ML faculty are Tuesdays 2-4 PM."},
    {"title": "USC robotics lab",
     "text": "The USC robotics lab requires safety training before students can use the soldering station."},
    {"title": "USC tutoring",
     "text": "Peer tutoring for introductory Python at USC is available Wednesdays from 1 PM to 3 PM."},
]

def embed_texts(texts, input_type="passage"):
    response = client.embeddings.create(
        model=EMBED_MODEL,
        input=texts,
        extra_body={"input_type": input_type},
    )
    return [np.array(item.embedding, dtype=np.float32) for item in response.data]

def cosine_similarity(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0
    return float(np.dot(a, b) / denom)

def retrieve_context(question, k=3):
    q_emb = embed_texts([question], input_type="query")[0]
    scored = [(cosine_similarity(q_emb, item["embedding"]), item) for item in knowledge_base]
    scored.sort(key=lambda p: p[0], reverse=True)
    return "\n".join(f"- {item['text']}" for _, item in scored[:k])

for item, emb in zip(knowledge_base, embed_texts([i["text"] for i in knowledge_base], "passage")):
    item["embedding"] = emb

print(f"Ready. Embedded {len(knowledge_base)} chunks.")


This cell defines everything that Workshops 1 and 2 produced. The Part 3 code below builds on ask, retrieve_context, and the embedded knowledge_base.


Step 2: Layer 1, a scoped prompt with a fixed fallback line

FALLBACK = "I don't have that information — check with the USC AI Club."

SCOPED_SYSTEM_PROMPT_TEMPLATE = """You are a USC campus assistant for AI Club,
GPU lab, NVIDIA program, workshop, office hour, robotics lab, and tutoring
questions only.

Rules:
- Answer ONLY using the CONTEXT below.
- If the user asks about anything outside this scope (e.g. weather, jokes,
  personal advice, code generation, general world knowledge), reply with
  exactly: "{fallback}"
- If the answer is not present in the context, reply with exactly: "{fallback}"
- Do not invent names, dates, room numbers, links, passwords, schedules,
  policies, or instructions that are not in the context.

CONTEXT:
{context}
"""


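Three things are doing the work in this prompt:

  • A finite topic list. The assistant has a job description. "Anything outside this scope" gives the model a clear path, so it does not have to guess what is in bounds.
  • One exact fallback line. The wording is identical every time. This matters most in Step 3: the grounding check returns the same line whenever it rejects an answer, so downstream code only has to recognize one shape.
  • An explicit do-not-invent list. When you name the risky categories (room numbers, passwords, policies), the model hallucinates far less, and it costs no extra call.

This layer alone catches most off-topic questions and most "not in the context" cases.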
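Because the fallback wording never varies, downstream code can detect a refusal with a plain string comparison. The sketch below assumes a helper named is_fallback, which is not part of the workshop code, and also assumes the model may occasionally wrap the line in quotes or add whitespace.

```python
FALLBACK = "I don't have that information — check with the USC AI Club."

def is_fallback(answer: str) -> bool:
    """Return True when the answer is the fixed fallback line.

    Hypothetical helper: tolerates surrounding whitespace and a pair of
    wrapping double quotes, but otherwise requires an exact match.
    """
    cleaned = answer.strip().strip('"').strip()
    return cleaned == FALLBACK

print(is_fallback(FALLBACK))
print(is_fallback("The club meets every Thursday at 5 PM."))
```

A check like this lets an application log refusals or show a different UI state without asking a model to classify its own output.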


Step 3: Layer 2, a grounding check on every answer

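A scoped prompt is a request; the model can still ignore it. Layer 2 is a separate, narrow NIM call whose only job is to look at the context and the answer and decide whether the answer is supported.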

def answer_is_grounded(question: str, context: str, answer: str) -> bool:
    verdict = ask(
        system_prompt=(
            "You are a strict grounding verifier. Read the CONTEXT and the "
            "ANSWER. Respond with only 'yes' or 'no'. Say 'yes' if every "
            "factual claim in the ANSWER is directly supported by the CONTEXT. "
            "Say 'no' otherwise — including if the ANSWER adds information not "
            "in the CONTEXT, even if that information sounds plausible."
        ),
        user_message=(
            f"CONTEXT:\n{context}\n\n"
            f"QUESTION:\n{question}\n\n"
            f"ANSWER:\n{answer}\n\n"
            "Is every factual claim in the ANSWER supported by the CONTEXT?"
        ),
    )
    return verdict.strip().lower().startswith("yes")


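Three things to notice:

  • It is just another ask() call. Same client, same hosted model, no new infrastructure. Layer 2 costs one extra call per question.
  • Yes or no only. Constraining the output shape makes parsing dependable. The code only checks that the reply starts with "yes" and treats everything else as a failure. A hedged "yes, but..." still starts with "yes", so it passes this check.
  • It can be wrong too. The verifier is also an LLM. That is safe enough for a workshop; for production, add deterministic checks (a regex for room numbers, an exact string match for the fallback).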
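The last two points can be tightened with a few lines of deterministic code. This is a sketch under stated assumptions, not part of the workshop repo: parse_verdict and room_numbers_supported are hypothetical names, and the room pattern only covers the "room 204" style used in the knowledge base.

```python
import re

def parse_verdict(raw: str) -> bool:
    """Accept only a bare 'yes' (optional trailing '.' or '!').

    Stricter than startswith("yes"): a hedged "yes, but..." is rejected.
    """
    return re.fullmatch(r"yes[.!]?", raw.strip().lower()) is not None

# Matches "room 204", "Room 12B", and similar; an assumed format.
ROOM_PATTERN = re.compile(r"\broom\s+(\d+\w*)", re.IGNORECASE)

def room_numbers_supported(context: str, answer: str) -> bool:
    """Every room number mentioned in the answer must appear in the context."""
    context_rooms = {m.lower() for m in ROOM_PATTERN.findall(context)}
    answer_rooms = {m.lower() for m in ROOM_PATTERN.findall(answer)}
    return answer_rooms <= context_rooms

context = "The USC AI Club meets every Thursday at 5 PM in the engineering building, room 204."
print(parse_verdict("Yes."))
print(parse_verdict("Yes, but the hours are not stated."))
print(room_numbers_supported(context, "They meet in room 204."))
print(room_numbers_supported(context, "They meet in room 310."))
```

Checks like these run in microseconds and never disagree with themselves, which makes them a reasonable complement to the LLM verifier and not a replacement for it.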

Step 4: Wire both layers into ask_guarded()

def ask_guarded(question: str) -> str:
    context = retrieve_context(question)              # from Part 2
    system_prompt = SCOPED_SYSTEM_PROMPT_TEMPLATE.format(
        fallback=FALLBACK, context=context,
    )
    answer = ask(system_prompt, question)             # Layer 1
    if not answer_is_grounded(question, context, answer):
        return FALLBACK                               # Layer 2 override
    return answer

for question in [
    "When does the USC AI Club meet?",        # in scope, in context
    "Can you write my breakup text?",         # OUT of scope
    "What is the wifi password?",             # in scope, NOT in context
    "What are the USC GPU lab Saturday hours?",   # invites a hallucination
]:
    print(f"Q: {question}")
    print(f"A: {ask_guarded(question)}\n")


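Read the output carefully:

  • The AI Club question gets a real answer from the context; both layers pass.
  • The breakup-text question is caught by Layer 1: the scope rule covers it.
  • The WiFi question is also caught by Layer 1: the context contains no password, and the scoped prompt forbids inventing one.
  • The Saturday question is where Layer 2 earns its keep. The context says "Monday to Friday." A helpful model is tempted to guess "closed on Saturday." Layer 2 receives that answer, sees that the Saturday claim is not supported by the context, and returns the fallback line.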
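When reading output like this, it helps to see which layer made each decision. The sketch below is a hypothetical variant of ask_guarded(), not the version above: it takes the retriever, answer function, and verifier as arguments so it can run offline with stand-in functions, and it skips the verifier call when Layer 1 has already returned the fallback (a design choice the original function does not make).

```python
from typing import Callable, Tuple

FALLBACK = "I don't have that information — check with the USC AI Club."

def ask_guarded_traced(
    question: str,
    retrieve: Callable[[str], str],
    answer_fn: Callable[[str, str], str],
    verify: Callable[[str, str, str], bool],
) -> Tuple[str, str]:
    """Return (final_answer, decided_by), where decided_by names the layer."""
    context = retrieve(question)
    answer = answer_fn(context, question)
    if answer.strip() == FALLBACK:
        return FALLBACK, "layer1"   # the scoped prompt already refused
    if not verify(question, context, answer):
        return FALLBACK, "layer2"   # the verifier rejected the draft
    return answer, "passed"

# Offline stand-ins for the hosted calls (hypothetical, for illustration only).
def fake_retrieve(question: str) -> str:
    return "- The USC GPU computing lab is open Monday to Friday from 10 AM to 6 PM."

def fake_answer(context: str, question: str) -> str:
    q = question.lower()
    if "wifi" in q:
        return FALLBACK
    if "saturday" in q:
        return "Saturday hours are 11 AM to 4 PM."
    return "The lab is open Monday to Friday from 10 AM to 6 PM."

def fake_verify(question: str, context: str, answer: str) -> bool:
    return "Saturday" not in answer

for q in ["What is the wifi password?",
          "What are the Saturday hours?",
          "When is the GPU lab open?"]:
    print(q, "->", ask_guarded_traced(q, fake_retrieve, fake_answer, fake_verify))
```

In a real session you would pass retrieve_context, a wrapper around ask() with the scoped prompt, and answer_is_grounded in place of the three stand-ins.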

Step 5: What you actually built

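You took the retriever from Part 2 and wrapped it in two cheap, inspectable guardrails. It is still a single Python file, still a single hosted NIM endpoint, and still has no vector database. The mental model:

  • Retrieval decides what the model sees.
  • The scoped prompt decides what the model is allowed to say.
  • The grounding check decides whether what the model wrote ships.

Real production systems extend this in several directions: deterministic checks, structured output, confidence thresholds, dedicated safety models, human review queues. The shape stays the same. Each added layer is a yes/no gate between the user's question and the final answer.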


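Get the code

Repo: github.com/torkian/nvidia-nim-workshop
One-click Colab for Part 3: part3_guardrails.ipynb
Local Python: part3_guardrails.py in the repo (run pip install -r requirements.txt, then python3 part3_guardrails.py).

MIT licensed. I run this at USC; fork it, swap the knowledge base for your school, your club, or your project, and run it anywhere.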


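Previous / Next

  • Part 1: Build your first AI app in 30 minutes with NVIDIA NIM
  • Part 2: From copy-paste RAG to real retrieval: embedding-based RAG with NVIDIA NIM
  • Part 4 (next): Run NIM on your own GPU: the same OpenAI-compatible API, a different endpoint. Especially useful when you want data locality, predictable latency, or a self-hosted dev loop.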