How to Stop Your LLM Agent From Looping Itself Into Oblivion
Alan West · 2026-05-22 · via DEV Community

You build a shiny new agent. It works great in the demo. Then you deploy it, and the next morning you wake up to find it called the same search function 47 times in a row before finally giving up. Sound familiar?

I hit this exact problem last week on a client project. The agent was supposed to research a topic, summarize findings, and write a report. Instead, it kept fetching the same URL, getting the same content, "reflecting" on whether it had enough information, deciding no, and fetching it again. Beautiful infinite loop. Expensive infinite loop.

This is one of those problems that doesn't show up in tutorials. Every "build an agent in 50 lines" post conveniently skips it. So let's actually dig into why it happens and how to fix it.

Why Agents Loop

There are three main reasons your agent gets stuck repeating itself.

Fuzzy completion criteria. Most agent loops look something like: "keep calling tools until the model says it's done." That works fine when the task is clear. It falls apart when the model isn't sure whether it has enough information. Without a hard stopping rule, "I'll just check one more time" can repeat indefinitely.

Context degradation. As tool results pile up in the context window, the model starts losing track of what it has already done. By turn 20, the system prompt and original task are buried under JSON blobs. The model essentially forgets that it already searched for "user authentication patterns" and searches again.

No structured memory of past tool calls. Many agent loops naively dump tool results back into context with no separate tracking. The model has no easy way to ask "have I already called search('X')?" because that information lives somewhere in 30k tokens of half-remembered chat history.

Step One: Add a Hard Iteration Cap

This sounds obvious, but you'd be amazed how many production agent loops don't have one. Always set a hard upper bound:

MAX_ITERATIONS = 15

class AgentLoopExceeded(RuntimeError):
    """Raised when the agent hits MAX_ITERATIONS without finishing."""

# SYSTEM_PROMPT, call_model, and execute_tools are assumed to be defined
# elsewhere in your codebase; they are placeholders, not library functions.
def run_agent(task, tools):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": task},
    ]

    for _ in range(MAX_ITERATIONS):
        response = call_model(messages, tools)

        # Model decided it's done: return the final answer
        if response.stop_reason == "end_turn":
            return response.content

        # Execute tool calls and feed results back into the conversation
        tool_results = execute_tools(response.tool_calls)
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

    # Fail loudly instead of silently burning more tokens
    raise AgentLoopExceeded("Hit max iterations without completing task")

Yes, you'll occasionally truncate a legitimate long task. That's fine. Failing loudly is much better than racking up a $400 inference bill at 3am.
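
To see the cap do its job without a real model, here is a minimal, self-contained sketch that drives the same kind of loop with a stub that never finishes. The names `run_with_cap` and `never_done` are hypothetical stand-ins for illustration, not part of any framework. The point is only what the caller does when the cap trips: catch the exception and degrade gracefully instead of retrying.

```python
class AgentLoopExceeded(RuntimeError):
    """Raised when the loop hits its iteration cap without finishing."""


def run_with_cap(step, max_iterations=15):
    """Call step(i) until it reports completion or the cap is reached.

    step is a hypothetical callable returning (done, value).
    """
    for i in range(max_iterations):
        done, value = step(i)
        if done:
            return value
    raise AgentLoopExceeded(
        f"Hit {max_iterations} iterations without completing task"
    )


calls = []

def never_done(i):
    # Stub "model" that always wants one more tool call
    calls.append(i)
    return False, None


try:
    answer = run_with_cap(never_done, max_iterations=5)
except AgentLoopExceeded as exc:
    # Degrade gracefully: report the failure instead of retrying
    answer = f"Agent gave up: {exc}"

print(answer)
```

The stub runs exactly five times and then the caller gets a readable failure message, which is the behavior you want at 3am.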

Step Two: Deduplicate Tool Calls

Track what the agent has already called. If it tries to call the same tool with the same arguments, intercept it:

import hashlib
import json

class ToolCallTracker:
    def __init__(self):
        self.seen = {}  # fingerprint -> cached result

    def fingerprint(self, name, args):
        # Stable hash of the call signature (sort keys for determinism)
        canonical = json.dumps({"name": name, "args": args}, sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()

    def get_or_execute(self, name, args, executor):
        fp = self.fingerprint(name, args)
        if fp in self.seen:
            # Return the cached result plus a nudge for the model
            return {
                "result": self.seen[fp],
                "warning": "You already called this. Try something different.",
            }
        result = executor(name, args)
        self.seen[fp] = result
        return {"result": result}

The warning field is the secret sauce. It tells the model "hey, you've been here before." In my testing this alone reduced loops by something like 70 percent. I haven't run a rigorous benchmark — that's just from eyeballing trace logs across maybe 200 runs.
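
To check the behavior end to end, here is a compact variant of the same idea written as a wrapper function, plus a fake tool that records how often it really runs. This is a sketch, not the only way to wire it up: `make_deduping_executor` and `fake_search` are hypothetical names, and the executor signature `executor(name, args)` is assumed to match the one used above.

```python
import hashlib
import json


def make_deduping_executor(executor):
    """Wrap executor(name, args) so repeated identical calls hit a cache."""
    seen = {}  # fingerprint -> cached result

    def wrapped(name, args):
        # Same fingerprinting scheme: canonical JSON with sorted keys
        canonical = json.dumps({"name": name, "args": args}, sort_keys=True)
        fp = hashlib.sha256(canonical.encode()).hexdigest()
        if fp in seen:
            return {
                "result": seen[fp],
                "warning": "You already called this. Try something different.",
            }
        seen[fp] = executor(name, args)
        return {"result": seen[fp]}

    return wrapped


executions = []

def fake_search(name, args):
    # Hypothetical tool: records each real execution
    executions.append((name, args))
    return f"results for {args['query']}"


run_tool = make_deduping_executor(fake_search)
first = run_tool("search", {"query": "python async"})
second = run_tool("search", {"query": "python async"})  # identical: served from cache
third = run_tool("search", {"query": "rust async"})     # different args: executed
```

Only two real executions happen here. The repeated call comes back with the cached result and the warning attached, so the model sees both the data and the nudge.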

Step Three: Detect Semantic Loops

Sometimes the agent doesn't repeat the exact same call. It does search("python async"), then search("async in python"), then search("python asyncio"). Same intent, different arguments.

For this you need fuzzy matching. The cheap version uses embeddings — sentence-transformers is fine for this:

from sentence_transformers import SentenceTransformer
import numpy as np
import json

model = SentenceTransformer("all-MiniLM-L6-v2")

class SemanticLoopDetector:
    def __init__(self, threshold=0.92):
        self.threshold = threshold
        self.history = []  # list of (embedding, call_repr)

    def check(self, name, args):
        repr_str = f"{name}({json.dumps(args, sort_keys=True)})"
        emb = model.encode(repr_str)

        for prev_emb, prev_repr in self.history:
            # Cosine similarity between the new call and each prior one
            sim = float(np.dot(emb, prev_emb) / (
                np.linalg.norm(emb) * np.linalg.norm(prev_emb)
            ))
            if sim > self.threshold:
                return prev_repr  # Loop detected — return what matched

        self.history.append((emb, repr_str))
        return None

Tune the threshold to taste. Around 0.92 is roughly where you catch real loops without flagging genuinely different queries. Higher and you miss loops; lower and you start blocking useful exploration.

Step Four: Force a Decision

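If you detect three near-duplicate calls in a row, stop being polite. Inject a message telling the agent to either commit to an answer or stop. The snippet assumes your detector keeps a consecutive_duplicates counter, which the class above does not do on its own: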

if detector.consecutive_duplicates >= 3:
    messages.append({
        "role": "user",
        "content": (
            "You have repeated similar tool calls three times. "
            "Based on what you already know, either provide your "
            "best answer now or explicitly say you cannot complete "
            "this task. Do not call any more tools."
        ),
    })

This is brutal and it works. Most stuck agents will produce a reasonable answer once you take the option to keep looping off the table.
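
The check above reads `detector.consecutive_duplicates`, so something has to maintain that count. One way to do it, as a sketch: a small counter that wraps any duplicate check and resets whenever the agent makes a novel call. `ConsecutiveDuplicateCounter`, `record`, and the exact-match checker used for the demo are hypothetical names, not part of any library.

```python
FORCE_DECISION_PROMPT = (
    "You have repeated similar tool calls three times. "
    "Based on what you already know, either provide your "
    "best answer now or explicitly say you cannot complete "
    "this task. Do not call any more tools."
)


class ConsecutiveDuplicateCounter:
    """Counts how many duplicate tool calls have occurred in a row."""

    def __init__(self, is_duplicate, limit=3):
        self.is_duplicate = is_duplicate  # callable(name, args) -> bool
        self.limit = limit
        self.consecutive_duplicates = 0

    def record(self, name, args):
        """Update the streak; return True once the limit is reached."""
        if self.is_duplicate(name, args):
            self.consecutive_duplicates += 1
        else:
            self.consecutive_duplicates = 0  # a novel call breaks the streak
        return self.consecutive_duplicates >= self.limit


# Exact-match checker for the demo; a semantic check slots in the same way
seen = set()

def exact_duplicate(name, args):
    key = (name, tuple(sorted(args.items())))
    if key in seen:
        return True
    seen.add(key)
    return False


counter = ConsecutiveDuplicateCounter(exact_duplicate)
messages = []
for _ in range(4):  # one original call followed by three repeats
    if counter.record("search", {"q": "python async"}):
        messages.append({"role": "user", "content": FORCE_DECISION_PROMPT})
```

To drive it from the semantic detector instead, pass something like `lambda name, args: detector.check(name, args) is not None` as `is_duplicate`.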

Prevention: Habits to Bake In From Day One

A few things that save real pain later:

  • Log every tool call with timestamps. When something goes wrong you want to read the trace, not guess.
  • Set a token budget per task, not just an iteration count. A run that finishes in 5 iterations but pulls a 50k-token document on each one is just as bad.
  • Write completion criteria into the system prompt. "Stop after you have at least 3 sources" beats "stop when you're done."
  • Test with adversarial inputs. Give the agent a task with no good answer. Make sure it gives up gracefully instead of looping forever.
  • Make tool errors visible to the model. If a tool failed, say so plainly in the result. Silent failures push the model into retry-storm territory.
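
Two of these habits, the token budget and visible tool errors, fit in a few lines each. The sketch below is illustrative only: `TokenBudget`, `call_tool_visibly`, and `flaky_fetch` are hypothetical names, and the 4-characters-per-token estimate is a rough assumption you should replace with your model's real tokenizer or reported usage.

```python
class TokenBudgetExceeded(RuntimeError):
    """Raised when a task spends more tokens than it was allotted."""


class TokenBudget:
    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, text):
        # Rough heuristic (assumption): about 4 characters per token
        self.used += max(1, len(text) // 4)
        if self.used > self.max_tokens:
            raise TokenBudgetExceeded(
                f"Used ~{self.used} tokens, budget is {self.max_tokens}"
            )


def call_tool_visibly(tool, **kwargs):
    """Run a tool and always return a result the model can read."""
    try:
        return {"ok": True, "result": tool(**kwargs)}
    except Exception as exc:
        # Say plainly that the call failed, and why
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


def flaky_fetch(url):
    # Hypothetical tool that always fails
    raise TimeoutError(f"timed out fetching {url}")


budget = TokenBudget(max_tokens=1000)
budget.charge("x" * 2000)  # ~500 tokens, within budget
failure = call_tool_visibly(flaky_fetch, url="https://example.com")

try:
    budget.charge("x" * 4000)  # ~1000 more tokens, over budget
    over_budget = False
except TokenBudgetExceeded:
    over_budget = True
```

Charging the budget on every tool result, not just on model output, is what catches the "5 iterations, 50k tokens each" case that an iteration cap misses.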

The agent ecosystem is still figuring out best practices. Most of the tooling we'd actually want — proper tracing, deterministic replay, structured tool-call memory — is being reinvented in every framework. Until things settle, defensive coding is the price of admission.

If you've already shipped an agent to production without these guards, I'd suggest checking your billing dashboard before you check anything else. Ask me how I know.