Guardrails for Agent Output: Pluggable Validation Before and After LLM Calls
mgd43b · 2026-05-25 · via DEV Community

One of the harder problems in agent systems is constraining output quality without turning every prompt into a wall of instructions. You can ask the LLM to stay under 3000 characters, or to always include a conclusion section, or to never mention competitor products. But prompt-based constraints are probabilistic. The LLM might follow them. It might not.

Guardrails are the deterministic layer. They run as Java code before and after the LLM call, and they enforce rules that prompts cannot guarantee.

The Model

AgentEnsemble implements guardrails as two functional interfaces: InputGuardrail and OutputGuardrail. Both return a GuardrailResult -- either success or failure with a reason.

Input guardrails run before the LLM is contacted. If any fails, execution stops immediately and the agent's LLM is never called. Output guardrails run after the agent produces a response (and after structured output parsing, if configured).

InputGuardrail piiGuardrail = input -> {
    String desc = input.taskDescription().toLowerCase();
    if (desc.contains("ssn") || desc.contains("credit card")) {
        return GuardrailResult.failure(
            "Task description may contain personally identifiable information");
    }
    return GuardrailResult.success();
};

OutputGuardrail lengthGuardrail = output -> {
    if (output.rawResponse().length() > 3000) {
        return GuardrailResult.failure(
            "Response is " + output.rawResponse().length()
            + " chars, exceeds limit of 3000");
    }
    return GuardrailResult.success();
};


Both are configured per-task:

var task = Task.builder()
    .description("Write an executive summary")
    .expectedOutput("A concise summary")
    .agent(writer)
    .inputGuardrails(List.of(piiGuardrail))
    .outputGuardrails(List.of(lengthGuardrail))
    .build();
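
To make the shape of the two interfaces concrete, here is a minimal, self-contained sketch. Everything in it is a stand-in written for illustration: the records, the interface declarations, and the `PII` and `MAX_LENGTH` constants are assumptions that mirror the snippets above, not the actual AgentEnsemble classes, which may differ in names and signatures.

```java
// Stand-in types for illustration only; the real AgentEnsemble classes may differ.
final class GuardrailModelSketch {

    // Result of a check: success, or failure with a reason.
    record GuardrailResult(boolean isSuccess, String message) {
        static GuardrailResult success() { return new GuardrailResult(true, ""); }
        static GuardrailResult failure(String reason) { return new GuardrailResult(false, reason); }
    }

    // Hypothetical, reduced views of what each guardrail receives.
    record GuardrailInput(String taskDescription) {}
    record GuardrailOutput(String rawResponse) {}

    @FunctionalInterface
    interface InputGuardrail { GuardrailResult validate(GuardrailInput input); }

    @FunctionalInterface
    interface OutputGuardrail { GuardrailResult validate(GuardrailOutput output); }

    // Runs before the LLM call: rejects descriptions that hint at PII.
    static final InputGuardrail PII = input -> {
        String desc = input.taskDescription().toLowerCase();
        return (desc.contains("ssn") || desc.contains("credit card"))
            ? GuardrailResult.failure("Task description may contain personally identifiable information")
            : GuardrailResult.success();
    };

    // Runs after the LLM call: rejects responses longer than 3000 characters.
    static final OutputGuardrail MAX_LENGTH = output -> output.rawResponse().length() > 3000
        ? GuardrailResult.failure("Response is " + output.rawResponse().length()
            + " chars, exceeds limit of 3000")
        : GuardrailResult.success();

    public static void main(String[] args) {
        System.out.println(PII.validate(new GuardrailInput("Summarize Q3 revenue")));
        System.out.println(MAX_LENGTH.validate(new GuardrailOutput("x".repeat(3001))));
    }
}
```

Because both interfaces have a single abstract method, a lambda is enough to implement either one, which is what keeps the per-task configuration above so short.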


Why Functional Interfaces

The choice to make guardrails functional interfaces rather than annotation-based or configuration-driven has a few practical consequences.

First, guardrails are composable. You can build them from lambdas, combine them, or wrap them in utility methods. A guardrail that checks for PII can be reused across every task in the ensemble without any framework-specific wiring.

Second, they are testable in isolation. A guardrail is a pure function from input to result. You can unit test it without standing up an ensemble or mocking an LLM.

Third, they are stateless by default. Since guardrails may run concurrently (in parallel workflows), stateless lambdas are inherently thread-safe. If you need stateful validation, thread safety is your responsibility.
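
The three properties above can be sketched in one place. The example below uses stand-in types rather than the real AgentEnsemble classes, and the helpers `maxLength`, `forbidTerm`, and `CountingGuardrail` are hypothetical names introduced only for illustration. It shows a reusable factory (composability), direct invocation without an ensemble or an LLM (testability), and one way to keep a stateful guardrail safe under concurrent calls.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Stand-in types for illustration only; the real AgentEnsemble classes may differ.
final class GuardrailPropertiesSketch {

    record GuardrailResult(boolean isSuccess, String message) {
        static GuardrailResult success() { return new GuardrailResult(true, ""); }
        static GuardrailResult failure(String reason) { return new GuardrailResult(false, reason); }
    }

    record GuardrailOutput(String rawResponse) {}

    @FunctionalInterface
    interface OutputGuardrail { GuardrailResult validate(GuardrailOutput output); }

    // Composable: a factory that returns a stateless guardrail, reusable across tasks.
    static OutputGuardrail maxLength(int limit) {
        return output -> output.rawResponse().length() > limit
            ? GuardrailResult.failure("Response exceeds limit of " + limit + " chars")
            : GuardrailResult.success();
    }

    // Composable: a second factory, here for a simple policy check.
    static OutputGuardrail forbidTerm(String term) {
        return output -> output.rawResponse().toLowerCase().contains(term.toLowerCase())
            ? GuardrailResult.failure("Response mentions forbidden term: " + term)
            : GuardrailResult.success();
    }

    // Stateful: counts violations across calls. AtomicInteger keeps the counter
    // correct when several threads call validate() at the same time.
    static final class CountingGuardrail implements OutputGuardrail {
        private final OutputGuardrail delegate;
        private final AtomicInteger violations = new AtomicInteger();

        CountingGuardrail(OutputGuardrail delegate) { this.delegate = delegate; }

        @Override
        public GuardrailResult validate(GuardrailOutput output) {
            GuardrailResult result = delegate.validate(output);
            if (!result.isSuccess()) violations.incrementAndGet();
            return result;
        }

        int violations() { return violations.get(); }
    }

    public static void main(String[] args) {
        // Testable in isolation: call the guardrail directly, no ensemble required.
        System.out.println(maxLength(10).validate(new GuardrailOutput("short")).isSuccess());

        // Concurrent use of the stateful wrapper.
        CountingGuardrail counted = new CountingGuardrail(maxLength(10));
        IntStream.range(0, 1000).parallel()
            .forEach(i -> counted.validate(new GuardrailOutput("definitely too long")));
        System.out.println(counted.violations());
    }
}
```

A plain `int` counter in place of the `AtomicInteger` would compile and usually appear to work, but it could lose increments under parallel execution, which is the kind of bug the "thread safety is your responsibility" caveat points at.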

What Input Guardrails See

The GuardrailInput record carries everything you need to make a pre-execution decision:

  • taskDescription() -- the task description text
  • expectedOutput() -- the expected output specification
  • contextOutputs() -- outputs from prior context tasks (immutable)
  • agentRole() -- the role of the agent about to execute

This means you can write guardrails that check not just the current task, but the outputs of upstream tasks. For example, a guardrail that rejects a writing task if the research task upstream produced no findings:

InputGuardrail requireResearch = input -> {
    boolean hasResearch = input.contextOutputs().stream()
        .anyMatch(o -> o.getRaw().length() > 100);
    if (!hasResearch) {
        return GuardrailResult.failure("No substantive research output found");
    }
    return GuardrailResult.success();
};
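
The other fields can drive decisions in the same way. As a sketch, here is a role-based check built on `agentRole()`. The `GuardrailInput` record below is a simplified stand-in (it models `contextOutputs()` as plain strings, which is an assumption), and `allowRoles` is a hypothetical helper name, not part of the framework.

```java
import java.util.List;
import java.util.Set;

// Stand-in types for illustration only; the real AgentEnsemble classes may differ.
final class RoleGuardrailSketch {

    record GuardrailResult(boolean isSuccess, String message) {
        static GuardrailResult success() { return new GuardrailResult(true, ""); }
        static GuardrailResult failure(String reason) { return new GuardrailResult(false, reason); }
    }

    // Simplified stand-in carrying the four accessors listed above.
    record GuardrailInput(String taskDescription,
                          String expectedOutput,
                          List<String> contextOutputs,
                          String agentRole) {}

    @FunctionalInterface
    interface InputGuardrail { GuardrailResult validate(GuardrailInput input); }

    // Hypothetical helper: only agents with an allowed role may run the task.
    static InputGuardrail allowRoles(Set<String> allowed) {
        return input -> {
            String role = input.agentRole();
            // Check for null first: immutable sets from Set.of reject null lookups.
            if (role != null && allowed.contains(role)) {
                return GuardrailResult.success();
            }
            return GuardrailResult.failure(
                "Agent role '" + role + "' is not permitted for this task");
        };
    }

    public static void main(String[] args) {
        InputGuardrail roleGuardrail = allowRoles(Set.of("Writer", "Editor"));
        GuardrailInput input = new GuardrailInput(
            "Write an article", "An article", List.of(), "Intern");
        System.out.println(roleGuardrail.validate(input).message());
    }
}
```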


Output Guardrails and Typed Output

When a task uses outputType for structured output, the execution order is:

  1. Input guardrails run (before LLM)
  2. LLM executes and produces raw text
  3. Structured output parsing (JSON extraction + deserialization)
  4. Output guardrails run (with both rawResponse() and parsedOutput() available)

This means output guardrails can inspect the typed Java object directly:

record ResearchReport(String title, List<String> findings, String conclusion) {}

OutputGuardrail findingsGuardrail = output -> {
    if (output.parsedOutput() instanceof ResearchReport report) {
        if (report.findings() == null || report.findings().isEmpty()) {
            return GuardrailResult.failure(
                "Report must include at least one finding");
        }
    }
    return GuardrailResult.success();
};


This is where guardrails and typed outputs reinforce each other. The type system gives you a parsed object; the guardrail gives you a place to enforce business rules on that object.
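
The four-step order can be sketched as a small pipeline. This is a simulation under stated assumptions, not the framework's executor: the LLM and the structured-output parser are modeled as plain `Function` arguments, the types are stand-ins, and an `IllegalStateException` takes the place of the framework's own exception.

```java
import java.util.List;
import java.util.function.Function;

// Stand-in types for illustration only; the real AgentEnsemble classes may differ.
final class ExecutionOrderSketch {

    record GuardrailResult(boolean isSuccess, String message) {
        static GuardrailResult success() { return new GuardrailResult(true, ""); }
        static GuardrailResult failure(String reason) { return new GuardrailResult(false, reason); }
    }

    record GuardrailInput(String taskDescription) {}
    record GuardrailOutput(String rawResponse, Object parsedOutput) {}

    @FunctionalInterface
    interface InputGuardrail { GuardrailResult validate(GuardrailInput input); }

    @FunctionalInterface
    interface OutputGuardrail { GuardrailResult validate(GuardrailOutput output); }

    static Object run(String taskDescription,
                      List<InputGuardrail> inputGuardrails,
                      Function<String, String> llm,      // stand-in for the LLM call
                      Function<String, Object> parser,   // stand-in for JSON parsing
                      List<OutputGuardrail> outputGuardrails) {
        // 1. Input guardrails run first; a failure means the LLM is never called.
        GuardrailInput input = new GuardrailInput(taskDescription);
        for (InputGuardrail g : inputGuardrails) {
            GuardrailResult r = g.validate(input);
            if (!r.isSuccess()) throw new IllegalStateException("INPUT: " + r.message());
        }
        // 2. The LLM produces raw text.
        String raw = llm.apply(taskDescription);
        // 3. Structured output parsing turns the raw text into a typed object.
        Object parsed = parser.apply(raw);
        // 4. Output guardrails see both the raw text and the parsed object.
        GuardrailOutput output = new GuardrailOutput(raw, parsed);
        for (OutputGuardrail g : outputGuardrails) {
            GuardrailResult r = g.validate(output);
            if (!r.isSuccess()) throw new IllegalStateException("OUTPUT: " + r.message());
        }
        return parsed;
    }

    public static void main(String[] args) {
        InputGuardrail nonEmpty = in -> in.taskDescription().isBlank()
            ? GuardrailResult.failure("Empty task description")
            : GuardrailResult.success();
        OutputGuardrail positive = out -> out.parsedOutput() instanceof Integer n && n > 0
            ? GuardrailResult.success()
            : GuardrailResult.failure("Expected a positive count");
        Object result = run("Count findings", List.of(nonEmpty),
            description -> "3", Integer::valueOf, List.of(positive));
        System.out.println(result);
    }
}
```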

Multiple Guardrails and Evaluation Order

Multiple guardrails per task are evaluated in order. The first failure stops evaluation -- subsequent guardrails are not called.

var task = Task.builder()
    .description("Write an article")
    .expectedOutput("An article")
    .agent(writer)
    .inputGuardrails(List.of(piiGuardrail, roleGuardrail, domainGuardrail))
    .outputGuardrails(List.of(lengthGuardrail, conclusionGuardrail))
    .build();
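
The short-circuit behavior can be illustrated with a small evaluation loop. This is a sketch of the described semantics using stand-in types, not the framework's internal code. The `traced` wrapper is a hypothetical helper that records which guardrails were actually invoked.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types for illustration only; the real AgentEnsemble classes may differ.
final class ShortCircuitSketch {

    record GuardrailResult(boolean isSuccess, String message) {
        static GuardrailResult success() { return new GuardrailResult(true, ""); }
        static GuardrailResult failure(String reason) { return new GuardrailResult(false, reason); }
    }

    record GuardrailInput(String taskDescription) {}

    @FunctionalInterface
    interface InputGuardrail { GuardrailResult validate(GuardrailInput input); }

    // Evaluates guardrails in list order and returns the first failure.
    // Guardrails after the failing one are never called.
    static GuardrailResult evaluate(List<InputGuardrail> guardrails, GuardrailInput input) {
        for (InputGuardrail g : guardrails) {
            GuardrailResult r = g.validate(input);
            if (!r.isSuccess()) return r;
        }
        return GuardrailResult.success();
    }

    // Hypothetical helper: records the guardrail's name each time it is invoked.
    static InputGuardrail traced(String name, List<String> trace, InputGuardrail delegate) {
        return input -> {
            trace.add(name);
            return delegate.validate(input);
        };
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        List<InputGuardrail> guardrails = List.of(
            traced("pii", trace, in -> GuardrailResult.success()),
            traced("role", trace, in -> GuardrailResult.failure("Role not allowed")),
            traced("domain", trace, in -> GuardrailResult.success()));

        GuardrailResult result = evaluate(guardrails, new GuardrailInput("Write an article"));
        System.out.println(result.message());
        System.out.println(trace); // "domain" does not appear: it was never called
    }
}
```

One practical consequence is that ordering matters for cost: cheap checks are best placed first so that expensive ones only run when the cheap ones pass.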


If you want to collect all failures rather than short-circuit, compose them into a single guardrail:

InputGuardrail compositeGuardrail = input -> {
    List<String> failures = new ArrayList<>();
    for (InputGuardrail g : List.of(piiGuardrail, roleGuardrail)) {
        GuardrailResult r = g.validate(input);
        if (!r.isSuccess()) failures.add(r.getMessage());
    }
    return failures.isEmpty()
        ? GuardrailResult.success()
        : GuardrailResult.failure(String.join("; ", failures));
};


Exception Propagation

When a guardrail fails, a GuardrailViolationException is thrown. It propagates through the workflow executor and is wrapped in a TaskExecutionException, following the same pattern as other task failures.

The exception carries structured information -- guardrail type (INPUT or OUTPUT), violation message, task description, and agent role -- so you can route failures to metrics or alerting without parsing error strings.

try {
    ensemble.run();
} catch (TaskExecutionException ex) {
    if (ex.getCause() instanceof GuardrailViolationException gve) {
        metrics.increment("guardrail.violation." + gve.getGuardrailType());
        log.warn("Guardrail blocked task '{}': {}",
            gve.getTaskDescription(), gve.getViolationMessage());
    }
}
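
To show why the structured fields are useful, here is a self-contained sketch of the wrap-and-unwrap pattern. Both exception classes below are simplified stand-ins written for this example; only the accessor names used in the snippet above come from the article, and the real classes may carry more state or use different constructors. `metricKey` is a hypothetical helper.

```java
import java.util.Optional;

// Stand-in types for illustration only; the real AgentEnsemble classes may differ.
final class GuardrailExceptionSketch {

    enum GuardrailType { INPUT, OUTPUT }

    static final class GuardrailViolationException extends RuntimeException {
        private final GuardrailType guardrailType;
        private final String violationMessage;
        private final String taskDescription;
        private final String agentRole;

        GuardrailViolationException(GuardrailType guardrailType, String violationMessage,
                                    String taskDescription, String agentRole) {
            super(guardrailType + " guardrail violated: " + violationMessage);
            this.guardrailType = guardrailType;
            this.violationMessage = violationMessage;
            this.taskDescription = taskDescription;
            this.agentRole = agentRole;
        }

        GuardrailType getGuardrailType() { return guardrailType; }
        String getViolationMessage() { return violationMessage; }
        String getTaskDescription() { return taskDescription; }
        String getAgentRole() { return agentRole; }
    }

    static final class TaskExecutionException extends RuntimeException {
        TaskExecutionException(String message, Throwable cause) { super(message, cause); }
    }

    // Simulates the executor: a guardrail violation surfaces wrapped in a task failure.
    static void runBlockedTask() {
        GuardrailViolationException violation = new GuardrailViolationException(
            GuardrailType.INPUT, "Task description may contain PII",
            "Write an executive summary", "Writer");
        throw new TaskExecutionException("Task failed", violation);
    }

    // Hypothetical helper: derives a metric name from the structured fields,
    // without parsing the exception message.
    static Optional<String> metricKey(Throwable ex) {
        if (ex.getCause() instanceof GuardrailViolationException gve) {
            return Optional.of("guardrail.violation." + gve.getGuardrailType());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        try {
            runBlockedTask();
        } catch (TaskExecutionException ex) {
            System.out.println(metricKey(ex).orElse("task.failure.other"));
        }
    }
}
```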


The Tradeoff

Guardrails are deterministic checks, not semantic analysis. A length limit is easy to enforce. A toxicity check is harder -- you would need to call an external classifier inside the guardrail, which adds latency and its own failure modes.
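
As a sketch of what such a check might look like, the guardrail below blocks on an asynchronous classifier with a timeout. The classifier is modeled as a plain `Function` returning a `CompletableFuture<Double>`; that signature, the score range, and the `toxicity` helper name are all assumptions made for this example, and the surrounding types are stand-ins. Treating a timeout or classifier error as a failure (failing closed) is one possible policy, not something the framework prescribes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Stand-in types for illustration only; the real AgentEnsemble classes may differ.
final class ClassifierGuardrailSketch {

    record GuardrailResult(boolean isSuccess, String message) {
        static GuardrailResult success() { return new GuardrailResult(true, ""); }
        static GuardrailResult failure(String reason) { return new GuardrailResult(false, reason); }
    }

    record GuardrailOutput(String rawResponse) {}

    @FunctionalInterface
    interface OutputGuardrail { GuardrailResult validate(GuardrailOutput output); }

    // classifier: hypothetical async call returning a toxicity score in [0, 1].
    static OutputGuardrail toxicity(Function<String, CompletableFuture<Double>> classifier,
                                    double threshold, long timeoutMillis) {
        return output -> {
            try {
                // Block until the classifier answers or the timeout expires.
                double score = classifier.apply(output.rawResponse())
                    .get(timeoutMillis, TimeUnit.MILLISECONDS);
                return score > threshold
                    ? GuardrailResult.failure(
                        "Toxicity score " + score + " exceeds threshold " + threshold)
                    : GuardrailResult.success();
            } catch (TimeoutException e) {
                // Fail closed: an unanswered check is treated as a violation.
                return GuardrailResult.failure(
                    "Toxicity classifier timed out after " + timeoutMillis + " ms");
            } catch (ExecutionException e) {
                return GuardrailResult.failure("Toxicity classifier failed: " + e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return GuardrailResult.failure("Interrupted while waiting for toxicity classifier");
            }
        };
    }

    public static void main(String[] args) {
        OutputGuardrail guardrail =
            toxicity(text -> CompletableFuture.completedFuture(0.9), 0.5, 200);
        System.out.println(guardrail.validate(new GuardrailOutput("some response")).message());
    }
}
```

The three catch branches are the "own failure modes" in practice: each one is a decision about what the task should do when the checker, rather than the response, is the thing that broke.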

The design intentionally keeps guardrails as simple synchronous functions. If you need async validation, external API calls, or retry logic, you implement that inside the guardrail function, which still has to return its result synchronously. The framework does not impose an opinion on how complex your validation should be.

This means guardrails are most useful for structural and policy checks -- length limits, required sections, PII filters, role-based access, schema validation on typed outputs. For semantic quality checks, the phase review and task reflection mechanisms (covered in earlier posts) are a better fit.
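
A required-sections check is a typical example of the structural category. The sketch below uses stand-in types, and `requireSections` is a hypothetical helper. Matching headings by case-insensitive substring is a deliberate simplification that a real check would likely tighten.

```java
import java.util.List;
import java.util.Locale;

// Stand-in types for illustration only; the real AgentEnsemble classes may differ.
final class RequiredSectionsSketch {

    record GuardrailResult(boolean isSuccess, String message) {
        static GuardrailResult success() { return new GuardrailResult(true, ""); }
        static GuardrailResult failure(String reason) { return new GuardrailResult(false, reason); }
    }

    record GuardrailOutput(String rawResponse) {}

    @FunctionalInterface
    interface OutputGuardrail { GuardrailResult validate(GuardrailOutput output); }

    // Hypothetical helper: fails if any required heading is absent from the response.
    static OutputGuardrail requireSections(List<String> headings) {
        return output -> {
            String text = output.rawResponse().toLowerCase(Locale.ROOT);
            // Collect every missing heading so the failure reason is complete.
            List<String> missing = headings.stream()
                .filter(h -> !text.contains(h.toLowerCase(Locale.ROOT)))
                .toList();
            return missing.isEmpty()
                ? GuardrailResult.success()
                : GuardrailResult.failure(
                    "Missing required sections: " + String.join(", ", missing));
        };
    }

    public static void main(String[] args) {
        OutputGuardrail conclusionGuardrail = requireSections(List.of("Conclusion"));
        GuardrailOutput draft = new GuardrailOutput("Introduction\n...\nFindings\n...");
        System.out.println(conclusionGuardrail.validate(draft).message());
    }
}
```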


The full guardrails guide is in the AgentEnsemble documentation.

I'd be interested in whether the input/output split feels like the right abstraction, or whether you have seen validation needs that do not fit cleanly into either category.