PolicyAware vs Guardrails vs AI Gateways vs Model Routers: The Comparison Every AI Engineer Needs to Read
KRISHNA KISH · 2026-05-23 · via DEV Community

I've been building AI-powered features for a while now, and the hardest conversations I have with my team are never about which model to use. They're always about the same thing: what is this system actually allowed to do, and how do we prove it?

That question pushed me to build PolicyAware - an open-source Python control plane that sits in front of your models, tools, and retrieval systems. Before I explain what it does, I want to walk through why the tools most teams reach for first - guardrails, AI gateways, and model routers - are genuinely useful but still leave a critical gap open.


The landscape right now

If you search for "AI safety" or "LLM governance", you will find three categories of tools coming up again and again:

  • Guardrail libraries - validate prompts and outputs against safety rules
  • AI gateways - proxy your requests to model providers, centralize API keys
  • Model routers - pick the cheapest or fastest model for each request

All three are useful. None of them alone answers the governance question.

Here is the mental model I use: a guardrail checks what the model says. A gateway manages where the request goes. A router decides which model handles it. But none of them ask the most important question first: should this request be allowed to run at all, under this user's role, for this tenant, in this region, given this risk level?

That is the gap PolicyAware fills.
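
To make that question concrete, here is a minimal sketch of a deny-by-default check keyed on role, tenant, region, and risk. The `Rule` table, the 0-3 risk scale, and the `is_allowed` helper are illustrative assumptions, not PolicyAware's actual API.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    role: str
    tenant: str
    region: str
    max_risk: int  # highest risk level this rule permits (hypothetical 0-3 scale)

# Hypothetical allow-list: any context not matched here is denied.
RULES = [
    Rule("support_agent", "acme-corp", "us-east", max_risk=1),
    Rule("finance_manager", "acme-corp", "us-east", max_risk=3),
]

def is_allowed(role: str, tenant: str, region: str, risk: int) -> bool:
    """Return True only if an explicit rule covers this exact context."""
    for rule in RULES:
        if (rule.role, rule.tenant, rule.region) == (role, tenant, region):
            return risk <= rule.max_risk
    return False  # deny by default: no matching rule means no execution

print(is_allowed("support_agent", "acme-corp", "us-east", risk=1))  # True
print(is_allowed("support_agent", "acme-corp", "us-east", risk=3))  # False
print(is_allowed("support_agent", "acme-corp", "eu-west", risk=0))  # False
```

The last call is the interesting one: the request is low risk, but no rule covers that region, so it never runs.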


Side-by-side comparison

| Capability                                   | Guardrails | AI Gateway | Model Router | PolicyAware |
|----------------------------------------------|------------|------------|--------------|-------------|
| Block unsafe prompts before execution        | Sometimes  | Sometimes  | No           | Yes         |
| Redact PII / PHI / secrets pre-execution     | Sometimes  | Sometimes  | No           | Yes         |
| Decisions using role, tenant, region, risk   | Limited    | Limited    | Limited      | Yes         |
| Deny-by-default posture                      | Usually no | Usually no | No           | Yes         |
| Govern MCP / agent tool calls                | Usually no | Sometimes  | No           | Yes         |
| Require human approval for risky actions     | Usually no | Sometimes  | No           | Yes         |
| Route across providers after policy approval | No         | Yes        | Yes          | Yes         |
| Evaluate RAG citation, grounding, leakage    | Sometimes  | Limited    | No           | Yes         |
| Emit audit traces with reason codes          | Limited    | Sometimes  | Limited      | Yes         |
| Generate compliance evidence artifacts       | Usually no | Usually no | No           | Yes         |

The right column is not a flex. It is a description of what enterprise AI systems actually need once they move beyond read-only chat and start touching real data, real tools, and real business workflows.


When each tool is the right call

Use a guardrails library when your only need is response formatting, toxicity filtering, or structured output validation. If you do not need RBAC, tenant rules, approval flows, or audit evidence, a guardrail is lighter and faster.

Use an AI gateway when your main problem is juggling provider keys, rate limits, and fallback routing. Gateways are great infrastructure. They are just not governance.

Use a model router when you are optimizing for cost, latency, or quality tradeoffs across providers. A router does not decide whether a request should run - only which model would run it.

Use PolicyAware when your AI system touches sensitive data, calls external tools, operates under regional compliance rules, or takes actions with financial or operational consequences. If you need to explain a decision to a security team six months from now, you need a control plane, not just a proxy.


How the architecture fits together

Here is the pattern I use in production. The key rule is: nothing reaches a model, retriever, or tool until the control plane has made an explicit decision.

+-----------------------------+
| Application Layer           |
| (web app / API / workflow)  |
+-------------+---------------+
              |
              v
+-----------------------------+
| PolicyAware Control Plane   |
|                             |
|  1. Identity + context      |
|  2. Deny-by-default check   |
|  3. PII / PHI detection     |
|  4. Risk classification     |
|  5. Approval gate (if high) |
|  6. Provider routing        |
+--------+----------+---------+
         |          |
         v          v
  +----------+  +----------+
  | RAG Layer|  | Tools /  |
  | retrieval|  | MCP      |
  | citation |  | payments |
  +----+-----+  +----+-----+
       |              |
       +------+-------+
              |
              v
       +-------------+
       | Model Layer |
       | (local/SaaS)|
       +------+------+
              |
              v
       +-------------+
       | Evaluations |
       | leakage     |
       | grounding   |
       | audit trace |
       +-------------+


Every arrow in that diagram has a policy decision attached to it. That is the entire point.
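
One way to read the control-plane box is as an ordered chain of checks that stops at the first objection. The sketch below assumes that reading; the stage functions, the dictionary-shaped request, and the `run_control_plane` name are hypothetical and cover only three of the six steps.

```python
from typing import Callable, Dict, List, Optional

# Each stage inspects the request and returns a reason code, or None to continue.
Stage = Callable[[Dict], Optional[str]]

def check_identity(req: Dict) -> Optional[str]:
    return None if req.get("user_id") and req.get("role") else "missing_identity"

def check_permissions(req: Dict) -> Optional[str]:
    # Hypothetical allow-list; unknown roles get an empty set (deny by default).
    allowed = {"support_agent": {"draft_email"}}.get(req["role"], set())
    blocked = [t for t in req.get("tools", []) if t not in allowed]
    return f"tool_not_permitted:{blocked[0]}" if blocked else None

def check_risk(req: Dict) -> Optional[str]:
    return "approval_required" if "refund" in req["prompt"].lower() else None

PIPELINE: List[Stage] = [check_identity, check_permissions, check_risk]

def run_control_plane(req: Dict) -> str:
    """Run stages in order and stop at the first one that objects."""
    for stage in PIPELINE:
        reason = stage(req)
        if reason is not None:
            return reason
    return "allow"

request = {
    "user_id": "u-1001",
    "role": "support_agent",
    "prompt": "Email the customer and refund $500.",
    "tools": ["draft_email", "issue_refund"],
}
print(run_control_plane(request))  # tool_not_permitted:issue_refund
```

Because the permission stage runs before the risk stage, the request is rejected on the tool check and the risk classifier never sees it.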


A real example: the $500 refund prompt

Let us make this concrete. A customer-support copilot gets this message:

Email jane@example.com and refund the customer $500.


Here is what different tools do with it:

  • A guardrail might check whether the output looks safe
  • A gateway forwards the request to your provider of choice
  • A router picks GPT-4.1 because it is the best model for support tasks
  • PolicyAware stops and works through the full decision tree before any of that happens

Code: policy-first middleware

from dataclasses import dataclass, field
from enum import Enum
import re
from typing import List, Optional

class Decision(str, Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"

@dataclass
class RequestContext:
    user_id: str
    role: str
    tenant: str
    region: str
    task_type: str
    prompt: str
    tools: List[str] = field(default_factory=list)

@dataclass
class PolicyResult:
    decision: Decision
    risk_tier: str
    redacted_prompt: str
    reason_codes: List[str]
    required_approver: Optional[str] = None

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

ALLOWED_TOOLS = {
    "support_agent": {"knowledge_search", "draft_email"},
    "finance_manager": {"knowledge_search", "draft_email", "issue_refund"},
}

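def evaluate_policy(ctx: RequestContext) -> PolicyResult:
    """Deny unpermitted tools, redact emails, and gate refunds behind approval."""
    reason_codes = []
    allowed = ALLOWED_TOOLS.get(ctx.role, set())  # unknown roles get no tools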

    for tool in ctx.tools:
        if tool not in allowed:
            return PolicyResult(
                decision=Decision.DENY,
                risk_tier="high",
                redacted_prompt=EMAIL_RE.sub("[REDACTED]", ctx.prompt),
                reason_codes=[f"tool_not_permitted:{tool}"],
            )

    redacted = EMAIL_RE.sub("[REDACTED]", ctx.prompt)
    if EMAIL_RE.search(ctx.prompt):
        reason_codes.append("pii_detected")

    is_high_risk = "refund" in ctx.prompt.lower() or "issue_refund" in ctx.tools
    if is_high_risk:
        reason_codes.append("high_risk_financial_action")
        return PolicyResult(
            decision=Decision.REQUIRE_APPROVAL,
            risk_tier="high",
            redacted_prompt=redacted,
            reason_codes=reason_codes,
            required_approver="finance_supervisor",
        )

    reason_codes.append("policy_allow")
    return PolicyResult(
        decision=Decision.ALLOW,
        risk_tier="low",
        redacted_prompt=redacted,
        reason_codes=reason_codes,
    )

# Try the refund prompt
ctx = RequestContext(
    user_id="u-1001",
    role="support_agent",
    tenant="acme-corp",
    region="us-east",
    task_type="customer_support",
    prompt="Email jane@example.com and refund the customer $500.",
    tools=["draft_email", "issue_refund"],
)

result = evaluate_policy(ctx)
print(result.decision)      # Decision.DENY
print(result.reason_codes)  # ['tool_not_permitted:issue_refund']


The support agent gets denied before the prompt ever reaches a model. The reason code is logged. The redacted prompt is stored. That is the audit trail your security team will ask for.
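
The same function returns REQUIRE_APPROVAL, with `finance_supervisor` as the approver, when a role that does hold `issue_refund` asks for the refund. The article does not show what happens after that, so the following is only one possible shape for the approval gate. `PendingAction` and `ApprovalQueue` are hypothetical names, and the in-memory dictionary stands in for whatever durable store a real system would use.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class PendingAction:
    description: str
    required_approver: str
    run: Callable[[], str]  # the deferred side effect, executed only on approval

class ApprovalQueue:
    """Hypothetical in-memory queue; a real system would persist this."""

    def __init__(self) -> None:
        self._pending: Dict[str, PendingAction] = {}

    def submit(self, action: PendingAction) -> str:
        ticket = uuid.uuid4().hex
        self._pending[ticket] = action
        return ticket

    def approve(self, ticket: str, approver_role: str) -> str:
        action = self._pending[ticket]
        if approver_role != action.required_approver:
            raise PermissionError(f"{approver_role} cannot approve this action")
        del self._pending[ticket]  # each ticket can be approved only once
        return action.run()

queue = ApprovalQueue()
ticket = queue.submit(
    PendingAction("refund $500", "finance_supervisor", lambda: "refund issued")
)
print(queue.approve(ticket, "finance_supervisor"))  # refund issued
```

The side effect lives inside the queued action, so nothing executes until someone with the required role signs off.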


Code: compliant model routing

Routing still matters - but it should only happen after policy approves the request.

@dataclass
class RouteDecision:
    provider: str
    model: str
    reason: str

COMPLIANT_MODELS = {
    "us-east": [("azure_openai", "gpt-4.1"), ("local_vllm", "llama-3.1-70b")],
    "eu-west": [("azure_openai_eu", "gpt-4.1"), ("local_vllm_eu", "llama-3.1-70b")],
}

def route_after_policy(result: PolicyResult, ctx: RequestContext) -> RouteDecision:
    if result.decision != Decision.ALLOW:
        raise PermissionError(f"Cannot route - decision is {result.decision.value}")

    options = COMPLIANT_MODELS.get(ctx.region, [])
    if not options:
        raise RuntimeError(f"No compliant providers for region: {ctx.region}")

    provider, model = options[0]
    return RouteDecision(provider, model, "policy-approved compliant route")


A traditional router asks: which model is fastest? This asks: which model is allowed? The order of those questions changes everything about your compliance posture.
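
The two questions can still be combined, as long as the compliance filter runs first and the optimization happens only inside the allowed set. The sketch below assumes that ordering; the latency figures and the `pick_model` helper are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical compliance table and latency figures, for illustration only.
COMPLIANT: Dict[str, List[Tuple[str, str]]] = {
    "eu-west": [("azure_openai_eu", "gpt-4.1"), ("local_vllm_eu", "llama-3.1-70b")],
}
LATENCY_MS: Dict[Tuple[str, str], int] = {
    ("azure_openai", "gpt-4.1"): 300,       # fastest overall, not allowed in eu-west
    ("azure_openai_eu", "gpt-4.1"): 450,
    ("local_vllm_eu", "llama-3.1-70b"): 700,
}

def pick_model(region: str) -> Tuple[str, str]:
    """Filter to compliant options first, then optimize within that set."""
    allowed = COMPLIANT.get(region, [])
    if not allowed:
        raise RuntimeError(f"No compliant providers for region: {region}")
    return min(allowed, key=lambda option: LATENCY_MS[option])

print(pick_model("eu-west"))  # ('azure_openai_eu', 'gpt-4.1')
```

The globally fastest endpoint is never considered for an eu-west request, because it is filtered out before latency is compared.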


Code: audit trace

This is the piece most teams skip - and regret during their first security review.

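from datetime import datetime, timezone
import json

def emit_audit_trace(ctx: RequestContext, result: PolicyResult, route=None):
    """Print one JSON audit record for a policy decision."""
    trace = {
        # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC time.
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),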
        "user_id": ctx.user_id,
        "tenant": ctx.tenant,
        "region": ctx.region,
        "task_type": ctx.task_type,
        "decision": result.decision.value,
        "risk_tier": result.risk_tier,
        "reason_codes": result.reason_codes,
        "tools_requested": ctx.tools,
        "route": None if route is None else {
            "provider": route.provider,
            "model": route.model,
        },
        "prompt_preview": result.redacted_prompt[:200],
    }
    print(json.dumps(trace, indent=2))


Sample output for the denied refund request:

{
  "timestamp": "2026-05-23T15:00:00Z",
  "user_id": "u-1001",
  "tenant": "acme-corp",
  "region": "us-east",
  "task_type": "customer_support",
  "decision": "deny",
  "risk_tier": "high",
  "reason_codes": ["tool_not_permitted:issue_refund"],
  "tools_requested": ["draft_email", "issue_refund"],
  "route": null,
  "prompt_preview": "Email [REDACTED] and refund the customer $500."
}


Every denied request, every approval gate, every route choice - all replayable. That is the evidence layer.
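
Printing is enough for a demo, but replay needs the traces stored somewhere. A minimal sketch, assuming an append-only JSON Lines file is acceptable as the store; `append_trace` and `replay` are hypothetical helpers, not part of PolicyAware.

```python
import json
import os
import tempfile
from typing import Dict, Iterator

def append_trace(path: str, trace: Dict) -> None:
    """Append one trace as a single JSON line so the log stays append-only."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(trace, sort_keys=True) + "\n")

def replay(path: str, decision: str) -> Iterator[Dict]:
    """Yield every stored trace whose decision matches."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            trace = json.loads(line)
            if trace["decision"] == decision:
                yield trace

log_path = os.path.join(tempfile.mkdtemp(), "audit.jsonl")
append_trace(log_path, {"user_id": "u-1001", "decision": "deny",
                        "reason_codes": ["tool_not_permitted:issue_refund"]})
append_trace(log_path, {"user_id": "u-2002", "decision": "allow",
                        "reason_codes": ["policy_allow"]})

denied = list(replay(log_path, "deny"))
print(len(denied), denied[0]["user_id"])  # 1 u-1001
```

One record per line keeps writes cheap and lets a reviewer filter the log with ordinary text tools.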


Start using PolicyAware today

PolicyAware is open source, MIT licensed, and published as a Python package. You do not need a SaaS contract. You do not need to rip out your existing stack. Drop it in as a middleware layer in front of your LLM calls.

pip install policyaware


Simplest integration pattern:

from policyaware import evaluate_policy, RequestContext

def handle_message(current_user, user_message, requested_tools):
    # call_your_llm and request_human_approval are placeholders for your own code.
    ctx = RequestContext(
        user_id=current_user.id,
        role=current_user.role,
        tenant=current_user.tenant,
        region=current_user.region,
        task_type="customer_support",
        prompt=user_message,
        tools=requested_tools,
    )

    result = evaluate_policy(ctx)

    if result.decision == "allow":
        # Only the redacted prompt is ever sent to the model.
        return call_your_llm(result.redacted_prompt)
    if result.decision == "require_approval":
        return request_human_approval(ctx, result)
    return {"error": "Request denied", "reason": result.reason_codes}


One function call between your application and your model. Policy first. Everything else second.


The bottom line

Guardrails make your outputs safer. Gateways make your infrastructure cleaner. Routers make your model spend smarter. But none of them govern the full execution path.

If your AI system is making decisions that touch real people, real money, or real compliance boundaries - you need a control plane that runs policy before execution and produces evidence after it.

That is exactly what PolicyAware is built for. Star the repo, install the package, and let me know what governance problems you are running into - I am actively building this out in the open.

GitHub: https://github.com/ktirupati/policyaware

pip install policyaware
