惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

Google DeepMind News
Google DeepMind News
Martin Fowler
Martin Fowler
T
Threatpost
云风的 BLOG
云风的 BLOG
博客园 - 司徒正美
C
CERT Recently Published Vulnerability Notes
V
Vulnerabilities – Threatpost
Help Net Security
Help Net Security
Project Zero
Project Zero
博客园 - 聂微东
博客园_首页
T
Tor Project blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
V
Visual Studio Blog
人人都是产品经理
人人都是产品经理
The Register - Security
The Register - Security
Latest news
Latest news
K
Kaspersky official blog
L
LINUX DO - 热门话题
P
Proofpoint News Feed
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
美团技术团队
C
Cyber Attacks, Cyber Crime and Cyber Security
A
Arctic Wolf
aimingoo的专栏
aimingoo的专栏
J
Java Code Geeks
F
Full Disclosure
Recent Announcements
Recent Announcements
SecWiki News
SecWiki News
C
Cybersecurity and Infrastructure Security Agency CISA
F
Fortinet All Blogs
The Hacker News
The Hacker News
Apple Machine Learning Research
Apple Machine Learning Research
NISL@THU
NISL@THU
The GitHub Blog
The GitHub Blog
量子位
Hugging Face - Blog
Hugging Face - Blog
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
P
Palo Alto Networks Blog
T
Troy Hunt's Blog
O
OpenAI News
T
Threat Research - Cisco Blogs
博客园 - Franky
Hacker News - Newest:
Hacker News - Newest: "LLM"
A
About on SuperTechFans
C
Check Point Blog
Hacker News: Ask HN
Hacker News: Ask HN
AWS News Blog
AWS News Blog
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
T
Tenable Blog

Redis

Real-Time Fraud Detection: Latency, Features & Scale Context window in AI: why every token is a budget decision Connecting to Redis Cloud with AWS PrivateLink vs. VPC peering | Redis Redis Data Integration in Redis Cloud is now GA in AWS | Redis Why AI Misses Business Context & How Teams Fix It AI Reasoning Explained: Why Context Matters Semantic Layer vs Context Layer: Key Differences Redis array data type: How it works and when to use it Context Graphs vs. Vector Search: When RAG Falls Short What’s new in two – May 2026 edition Redis 8.8 performance improvements: Faster string, hash, streams, SCAN & more Redis 8.8: New array data structure & open source features How Conflict-free Replicated Data Types power active-active database replication Context Orchestration: What It Is & How It Works Context Compaction for AI Agents: A Complete Guide Prompt Bloat: Causes, Costs & Fixes for LLM Apps Agentic Retrieval Techniques: A Complete Guide Single-shot reliable consumers with XREADGROUP CLAIM in Redis 8.4 | Redis Long-Horizon AI Agents: Memory & State Infrastructure What is a context engine? What Is a Context Layer? AI Agent Infrastructure Context Retrieval for AI Agents: What It Is & Why It Matters Context Poisoning: How Bad Data Breaks Agent Reasoning Context is all you need: Introducing Redis Iris | Redis Context Engineering for AI: What It Is & How to Build It Dynamic endpoints: Migrate databases without changing your endpoint | Redis AI Shopping Assistants: How They Work & What to Build Endless Aisle Retail: Infrastructure & Real-Time Data LLM Speed Benchmarks: Metrics & Infrastructure Guide Context Pruning: Cut LLM Tokens Without Losing Quality What’s new in two – April 2026 edition Agentic AI Architecture: 5 Patterns Explained AI Agent vs Chatbot: Key Differences Explained Advantages of Building a Vector Search Solution API Latency in LLM Apps: Causes & How to Fix It Security advisory: [CVE‑2026‑23479] [CVE‑2026‑25243] [CVE-2026-25588] [CVE‑2026‑25589] [CVE-2026-23631] | Redis Edge Computing Latency: Causes & How to Reduce It Streaming LLM Responses: Make Your AI App Feel Fast Active-Active vs Active-Passive Database Architecture Prefill vs Decode: LLM Inference Phases Explained Long-Term Memory Architectures for AI Agents Time to First Byte Test: Tools, Causes & Fixes Speculative decoding: how it works & when to use it P95 Latency: What It Is & Why It Matters Why Multi-Agent LLM Systems Fail & How to Fix Them AI Human in the Loop: Production Oversight Patterns Native OpenTelemetry metrics for Redis client libraries | Redis Client-side geographic failover for Redis Active-Active | Redis Use Redis with SQL | Redis Introducing Redis Feature Form Build Google ADK Agents with persistent, real-time memory on Redis | Redis Startup Spotlight: Neuron Systems API Throttling: Algorithms, Patterns & Mistakes Agentic AI Examples Across 6 Industries Best Chunking Strategies for RAG Pipelines Agentic AI Guardrails: Controls That Work Redis joins AWS at GDC to support the next generation of gaming | Redis Designing a semantic routing system: From static rules to dynamic intelligence with Redis and Java | Redis Real-Time Dispatch System: A Complete Guide P99 Latency: What It Means & How to Fix It Tokenization in LLMs: What AI App Devs Need to Know TTFT Meaning: What is Time to First Token? Atomic slot migration with Redis 8.4 Hybrid search benefits: Why your RAG system needs both keyword & vector search What’s new in two: March 2026 edition Vector embedding generators: How they work & how to use them Throughput-optimizing Redis for L2 KV Cache Reuse What is a data pipeline? Building AI agent pipelines that don't forget, fail, or fall apart Redis achieves Google Cloud Ready, Distributed Cloud status ahead of Google Cloud Next ‘26 | Redis Real-time network monitoring: what your data platform needs to keep up AI agent API: How agents connect to the real world What is multicloud infrastructure? A guide for 2026 What is a transaction monitoring system & how does it work? Why your AI agent fails in production & how tracing helps AI agent benchmarks: Where they fall short & why your infrastructure matters What is a JSON database (and when should you use one)? Introducing the Redis Partner Network: A new foundation for real-time innovation How real-time customer segmentation works in retail Payment orchestration & vault architecture in retail Agentic systems vs. GenAI: when generation isn't enough What is fuzzy matching? Semantic caching & routing: two powerful patterns for vector classification Redis alternatives: Why there are no exact substitutes Connect to Azure Managed Redis with Redis Insight 3.2.0 How to tame the thundering herd problem Redis to Manage Storage Replication | Redis How hierarchical navigable small world (HNSW) algorithms can improve search | Redis How leading financial institutions use Redis to drive growth | Redis What’s new in two: May 2025 | Redis Introducing Model Context Protocol (MCP) for Redis | Redis Redis vs. Elasticsearch: What’s faster for GenAI & vector search? | Redis Build fast, production-worthy AI apps with Spring AI and Redis | Redis Azure Managed Redis is GA today | Redis Redis then & now: Adapting with developers through every era | Redis Supercharge Your AI with OpenShift AI and Redis: Unleash speed and scalability | Redis What’s new in two: April 2025 | Redis Redis 8 is now GA, loaded with new features and more than 30 performance improvements | Redis What is a data strategy? 6 key components explained Data replication explained: types, examples & use cases
AI Agents vs Workflows: When to Use Each
Redis · 2026-04-30 · via Redis

Everyone building with LLMs right now is bumping into the same question: should you wire up a predictable, step-by-step workflow, or let an AI agent figure things out on its own? The answer shapes your system's reliability, cost, latency, and how many 3 AM pages you'll field.

The good news: you don't have to pick one forever. But it helps to understand what each approach is actually good at before you start combining them. This guide covers what AI workflows and agents are, when each one makes sense, why most production systems use both, and what infrastructure you need underneath to keep everything running.

Apps

Build fast, accurate AI apps that scale

Get started with Redis for real-time AI context and retrieval

What are AI workflows?

An AI workflow is a system where LLMs and tools are orchestrated through predefined code paths. You, the developer, decide the execution order before the system runs. The LLM handles the reasoning within each step, but it doesn't get to choose what step comes next: your code does. You can add conditional logic, but every possible path is something you designed and can test ahead of time.

A few workflow patterns show up across production LLM systems:

  • Prompt chaining: Break a task into sequential steps where each LLM call processes the previous one's output. Write an outline, check it, then write the full document.
  • Routing: Classify an input and send it to a specialized handler. Easy questions go to a smaller, cheaper model; hard questions go to a more capable one.
  • Parallelization: Run multiple LLM calls at the same time and combine the results. Useful for running several checks against the same input.
  • Orchestrator-workers: A central LLM breaks down a task, delegates to workers, and synthesizes the results. Unlike parallelization, the subtasks are determined at runtime based on the input.
  • Evaluator-optimizer: One LLM generates a response while a separate LLM evaluates it, looping until quality criteria are met.

Those patterns differ in complexity, but they all keep the control flow largely in code. Taken together, they often resemble Directed Acyclic Graphs (DAGs), or graphs with explicitly controlled cycles. That structure is what gives workflows their biggest advantage: every path is testable. You can trace exactly what happened, reproduce bugs, and predict costs because you control how many LLM calls each run makes.

What are AI agents?

If workflows keep control in code, agents move that control into the model. The LLM directs its own execution, decides what to do next at runtime, recognizes when the task is done, and uses tools to interact with external systems.

Agents run in a loop: reason about the current state, pick a tool, observe the result, and decide what to do next, repeating until an exit condition is met.

A few things that matter in practice:

  • Autonomy is a spectrum. An agent with minimal autonomy does exactly what you ask; a highly autonomous agent makes its own decisions about what to do and how. Autonomy isn't a fixed model property. It's shaped by your deployment design.
  • Tool design is important. Tools form a contract between deterministic systems and non-deterministic agents. Too many tools, or overlapping tools, can distract agents from efficient strategies.
  • Errors compound. In long-running agents, minor failures can snowball into catastrophic ones. You can't just restart from the beginning because restarts are expensive and frustrating for users.

In other words, agents buy flexibility by moving more decisions into runtime behavior. Multi-agent patterns add another dimension, whether a manager coordinating specialists or peers handing off tasks, but the core principle stays the same: the LLM, not your code, determines the execution path.

When workflows beat agents (& vice versa)

Once that control difference is clear, the next question is when each pattern wins. A practical test helps: can you draw a flowchart of the task before the LLM runs? If yes, use a workflow. If the flowchart depends on what the LLM discovers at runtime, you likely need an agent.

Workflows win when you need predictability

Workflows are the better fit when steps are known, repeatable, and low-ambiguity. You get a fixed token budget per run, so costs are predictable. Debugging is localized to explicit code paths. And for teams operating under SOC 2, GDPR, or internal model governance, repeatable execution is often a practical requirement.

A few categories show up repeatedly:

  • Order exception triage: Same classification and routing logic each time.
  • Content generation pipelines: A fixed sequence: generate, review, translate, publish.
  • Multi-step approval processes: Each step has a defined input, output, and handoff.

The path is known before execution starts.

Memory

AI is only as good as its memory

Power real-time context and retrieval with Redis for AI.

Agents win when flexibility matters more than predictability

Agents excel when the steps are unclear or evolve during execution. A debugging agent might gather context, classify team owners, apply fixes, run validation, and create PRs: steps that depend on what the agent discovers as it goes.

The trade-offs are real. Errors compound, so one step failing can send the agent down an entirely different trajectory. Agents can hallucinate, loop on failed actions, overflow their context window, or misuse tools. Runtime behavior is hard to predict until you run it in production, hurting testing and observability. And without explicit turn limits and cost caps, looping agents can accumulate unbounded token spend.

Start with a workflow. Add agent behavior only where the task actually demands it.

Why production systems often combine both

That trade-off is why many real systems land in the middle. Most production agentic systems combine workflows and agents.

Pure agent chains have a compounding reliability problem. Even at 99% per-step reliability, a 10-step process only succeeds 90% of the time, and that degradation accelerates as chain length grows. Pure workflows have the opposite problem: stuffing branching logic, state tracking, and error handling into prompts becomes unmaintainable at any real scale.

The solution is a hybrid: deterministic boundaries where you need reliability, agent autonomy where you need flexibility. That split should drive architectural decisions before any agent is built.

Deterministic routing with autonomous specialists

One common split puts a deterministic supervisor at the top and lets agents reason freely inside bounded scopes. Routing stays predictable; specialists get autonomy only within their assigned domain.

For example, one Vodafone/Fastweb deployment uses a deterministic supervisor for intent routing and lets specialized sub-graphs evolve independently. Open-ended queries route to a combined RAG pipeline using both a vector store and a knowledge graph.

LLM reasoning with deterministic code execution

Flip the arrangement and you get the other common split: LLM-driven planning with deterministic execution. The model decides the plan; code does the doing.

For example, one HR and payroll onboarding system uses tool-calling to decide what steps to take, then writes and runs real Python code to transform the data. The LLM handles the "what," deterministic code handles the "how." Because the transform logic runs as code, it's repeatable and auditable, which matters for sensitive employment data across jurisdictions.

The missing ingredient: memory, state & coordination

Once you've decided where those deterministic boundaries belong, the next problem is infrastructure: the memory and state layer that keeps everything connected.

LLMs are stateless, so every memory tier has to be externalized and managed by infrastructure: short-term state for the current task, long-term memory for past interactions, and semantic knowledge for facts and learned patterns.

Short-term memory fills up fast

Long-context limits mean irrelevant history drags down performance, so retrieval-augmented generation remains important for focusing on task-relevant state. Without retention policies (summarize, forget, prune), unbounded context growth can cause agents to forget their original instructions. And retrieval itself can become the bottleneck when all your other pipelines run at millisecond latency but your memory lookup doesn't.

Long-term memory needs durable storage

LangGraph's architecture splits memory into thread-scoped checkpointers for short-term state and cross-thread stores for long-term state. Thread-scoped checkpointers default to in-process implementations that aren't durable: teams that ship with InMemorySaver in production lose state on restart or deployment. Checkpoint collections can also grow unbounded without TTL, so teams need a durable backend and explicit retention policies.

Coordination ties it all together

Multi-agent systems need real-time coordination: pub/sub messaging for event-driven orchestration, durable task queuing for work distribution, and suspension mechanisms for human-in-the-loop approvals that can span hours across systems like Slack or Jira.

Most teams stitch this together from a vector database, a cache, a message broker, and a task queue. Redis handles all four in one platform: in-memory data structures for hot session and conversational state, vector search for long-term memory with metadata filtering, pub/sub for event-driven coordination, and streams for durable task queuing. Redis' open-source Agent Memory Server implements both memory tiers, so you start from a working reference rather than stitching the stack from scratch.

How agents & workflows show up in your stack

Once those memory and coordination requirements are clear, the next question is where they sit in your architecture. The single-model call pattern has given way to coordinated systems with distinct infrastructure layers.

Orchestration

Your workflow and agent logic lives at the orchestration layer. LangGraph, Pydantic AI, and Google Agent Development Kit (ADK) are the current standouts, with CrewAI and AutoGen seeing active use too.

Memory & state

This tier holds the short-term checkpoints and long-term stores covered above. For teams on LangGraph, Redis integrates through the RedisSaver checkpointer for thread-scoped state and the Store interface for cross-thread long-term memory, with TTL-based retention for collections that would otherwise grow unbounded.

Retrieval & caching

Vector databases handle long-term memory retrieval and RAG pipelines. Semantic caching reduces LLM costs by recognizing when queries mean the same thing despite different phrasing. "Tell me about our Q3 revenue" and "What was our revenue in the third quarter?" should hit the same cache entry. In Redis benchmarks, LangCache reported cache hits up to 15x faster than live inference. In benchmarks on high-repetition workloads, LangCache reported up to 73% lower inference costs without code changes.

Tool protocols & observability

The Model Context Protocol (MCP) and Agent-to-Agent (A2A) protocol are standardizing how agents connect to tools and to each other, analogous to how HTTP standardized web communication. Redis added A2A integrations in its Fall 2025 release, alongside new AutoGen and Cognee integrations. For observability, Langfuse, LangSmith, and Arize Phoenix provide the tracing you need to debug non-deterministic agent behavior in production.

Redis Iris

You've made it this far

Now see how this actually runs in Redis. Power AI apps with real-time context, retrieval, and semantic caching.

Redis handles the memory tier so your agents ship

Agents and workflows aren't competing philosophies. Workflows give you predictability, auditability, and cost control. Agents give you flexibility for open-ended tasks. The best production systems combine both and use deterministic boundaries to contain agent autonomy where it matters.

What separates demos from production is the layer underneath: durable memory, real-time coordination, and fast retrieval. Redis covers that tier in one platform, which is why it shows up so often in agent stacks.

If you're building agentic systems and want to see how the memory and state layer works in practice, try Redis free or talk to our team about your architecture.