One of the fastest ways for a LangChain agent to become unstable in production has nothing to do with model quality.
It’s recursive tool loops.
A workflow starts normally:
- search
- retrieve
- summarize
Then suddenly:
- the same tool gets called repeatedly
- retries compound
- context grows
- token usage spikes
- execution drifts indefinitely
The agent technically remains “alive.”
Operationally, it stopped making progress a long time ago.
This article shows a simple way to detect and interrupt recursive tool loops in LangChain agents using TypeScript.
# The Problem
A basic agent workflow often looks harmless:
```ts
const result = await agentExecutor.invoke({
  input: userPrompt,
});
```
But production agents can drift into patterns like:
```txt
search_documents
→ search_documents
→ search_documents
→ search_documents
```
or:
```txt
search
→ summarize
→ retry
→ search
→ summarize
→ retry
```
This usually happens because:
* the model fails to converge
* tool outputs are ambiguous
* retries reinforce uncertainty
* the agent misinterprets partial progress
The result is:
runaway execution.
# Why This Is Dangerous
Most AI workflows behave normally most of the time.
The problem comes from tail events:
* recursive retries
* unstable recovery behavior
* escalating context windows
* repeated tool invocation
A tiny percentage of unstable runs can consume a disproportionate amount of:
* inference cost
* latency
* compute
* operational attention
This is not just an observability issue.
It’s a runtime governance issue.
---
# Basic Strategy
We want to:
* track recent tool usage
* detect repetition patterns
* interrupt execution safely
before the workflow spirals.
The simplest version:
```txt
“If the same tool is called too many times consecutively, stop execution.”
```
Simple.
Effective.
Easy to implement.
# Step 1 — Track Tool History
We’ll maintain lightweight runtime state:
```ts
type ExecutionState = {
  toolHistory: string[];
};
```
Initialize it:
```ts
const state: ExecutionState = {
  toolHistory: [],
};
```
# Step 2 — Detect Recursive Patterns
Now create a helper:
```ts
function detectRecursiveLoop(
  toolHistory: string[],
  threshold = 3
): boolean {
  // Not enough calls yet to reach the threshold.
  if (toolHistory.length < threshold) {
    return false;
  }
  // Look only at the most recent `threshold` calls.
  const recent = toolHistory.slice(-threshold);
  return recent.every(
    tool => tool === recent[0]
  );
}
```
This checks:
```txt
Did the same tool run 3 times in a row?
```
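To make the check concrete, here is a small worked example that runs the helper against a few hand-written histories. The histories are invented for illustration, and the function body is repeated only so the snippet runs on its own.

```ts
// Same helper as above, repeated so this snippet is self-contained.
function detectRecursiveLoop(toolHistory: string[], threshold = 3): boolean {
  if (toolHistory.length < threshold) return false;
  const recent = toolHistory.slice(-threshold);
  return recent.every((tool) => tool === recent[0]);
}

// Hypothetical histories, made up for illustration.
const converging = ["search_documents", "retrieve", "summarize"];
const stuck = ["retrieve", "search_documents", "search_documents", "search_documents"];
const alternating = ["search", "summarize", "retry", "search", "summarize", "retry"];

const convergingFlag = detectRecursiveLoop(converging); // false: three different tools
const stuckFlag = detectRecursiveLoop(stuck); // true: last three calls are identical
const alternatingFlag = detectRecursiveLoop(alternating); // false: a multi-tool cycle goes unnoticed
const strictFlag = detectRecursiveLoop(stuck, 4); // false: only three consecutive repeats
```

The third case is the important caveat: a consecutive-repeat check only catches single-tool loops.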
# Step 3 — Wrap Tool Execution
Now intercept tool calls:
```ts
async function guardedToolCall<T>(
  toolName: string,
  execute: () => Promise<T>
): Promise<T> {
  // Record the call first so the current invocation counts toward the threshold.
  state.toolHistory.push(toolName);
  if (detectRecursiveLoop(state.toolHistory)) {
    // Stop before the tool runs again.
    throw new Error(
      `Recursive loop detected for tool: ${toolName}`
    );
  }
  return execute();
}
```
---
# Step 4 — Use Inside LangChain Tools
Example:
```ts
const result = await guardedToolCall(
  "search_documents",
  async () => {
    return searchTool.invoke(query);
  }
);
```
That’s it.
Now your workflow can:
- detect runaway repetition
- interrupt unstable execution
- prevent unnecessary cost escalation
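One practical detail is where the history lives and how the caller reacts to the interrupt. The sketch below assumes you want one history per agent run, so runs never share state, and a dedicated error type so a loop interrupt can be told apart from other failures. `RecursiveLoopError`, `createLoopGuard`, and `simulateStuckRun` are hypothetical names introduced here for illustration, and the run itself is simulated rather than driven by a real agent.

```ts
// Hypothetical error type so callers can distinguish a loop interrupt from other failures.
class RecursiveLoopError extends Error {
  readonly toolName: string;
  constructor(toolName: string, repeats: number) {
    super(`Recursive loop detected for tool: ${toolName} (${repeats} consecutive calls)`);
    this.name = "RecursiveLoopError";
    this.toolName = toolName;
    // Keeps `instanceof` working when compiling to older targets.
    Object.setPrototypeOf(this, RecursiveLoopError.prototype);
  }
}

// Create one guard per agent run so histories never leak between requests.
function createLoopGuard(threshold = 3) {
  const toolHistory: string[] = [];
  return {
    // Record a call, then throw if the last `threshold` calls all used this tool.
    record(toolName: string): void {
      toolHistory.push(toolName);
      const recent = toolHistory.slice(-threshold);
      if (recent.length === threshold && recent.every((t) => t === toolName)) {
        throw new RecursiveLoopError(toolName, threshold);
      }
    },
    history(): readonly string[] {
      return toolHistory;
    },
  };
}

// Stand-in for an agent run that keeps requesting the same tool.
function simulateStuckRun(): string {
  const guard = createLoopGuard();
  try {
    for (let step = 0; step < 10; step++) {
      guard.record("search_documents"); // a real run would execute the tool after this check
    }
    return "completed";
  } catch (err) {
    if (err instanceof RecursiveLoopError) {
      return `stopped after ${guard.history().length} calls to ${err.toolName}`;
    }
    throw err; // unrelated failures still propagate
  }
}

const outcome = simulateStuckRun(); // "stopped after 3 calls to search_documents"
```

Catching the typed error at the call site lets the application return a partial answer or a clear failure message instead of an unhandled exception.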
# Why Simple Detection Works Surprisingly Well
A lot of teams initially assume they need:
- anomaly detection
- reinforcement learning
- advanced telemetry pipelines
But simple operational heuristics already eliminate many expensive failures.
Especially:
- recursive retries
- retry storms
- repeated tool churn
You do not need perfect intelligence initially.
You need:
bounded execution.
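One way to read “bounded execution” concretely is a per-run budget that caps both the number of tool calls and the elapsed time. The following is a minimal sketch under that assumption, not a definitive implementation: `ExecutionBudget`, `RunLimits`, and `charge` are hypothetical names, and the limits are placeholders rather than recommendations.

```ts
// Hypothetical limits for a single agent run.
type RunLimits = { maxToolCalls: number; maxRuntimeMs: number };

class ExecutionBudget {
  private calls = 0;
  private readonly startedAt: number;
  private readonly limits: RunLimits;
  private readonly now: () => number;

  // `now` is injectable so the runtime ceiling can be tested without waiting.
  constructor(limits: RunLimits, now: () => number = Date.now) {
    this.limits = limits;
    this.now = now;
    this.startedAt = now();
  }

  // Call before each tool invocation; throws once either limit is exceeded.
  charge(toolName: string): void {
    this.calls += 1;
    if (this.calls > this.limits.maxToolCalls) {
      throw new Error(
        `Tool-call budget exceeded at ${toolName} (${this.calls} > ${this.limits.maxToolCalls})`
      );
    }
    const elapsed = this.now() - this.startedAt;
    if (elapsed > this.limits.maxRuntimeMs) {
      throw new Error(`Runtime ceiling exceeded at ${toolName} (${elapsed}ms)`);
    }
  }
}

// Usage: the third call exceeds a budget of two.
const budget = new ExecutionBudget({ maxToolCalls: 2, maxRuntimeMs: 30000 });
budget.charge("search");
budget.charge("summarize");
let budgetError = "";
try {
  budget.charge("search");
} catch (err) {
  budgetError = (err as Error).message; // "Tool-call budget exceeded at search (3 > 2)"
}
```

Note that this check only runs between tool calls, so it cannot abort a tool that hangs; that case needs a timeout around the call itself. If you are using `AgentExecutor`, its `maxIterations` option provides a framework-level cap on agent steps, and a budget like this one complements it by counting tool calls and elapsed time.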
# Production Improvements
The minimal approach above works surprisingly well, but production systems usually add:
- semantic similarity detection
- token velocity monitoring
- execution depth limits
- tool-call budgets
- runtime ceilings
- timeout policies
- adaptive thresholds
For example, this pattern:
```txt
search
→ search
→ search
```
is easy to detect.
More advanced loops look like:
```txt
search
→ summarize
→ retry
→ search
→ summarize
→ retry
```
These require broader trajectory analysis.
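A first step toward that kind of analysis is to look for a repeating cycle at the end of the history rather than a single repeated tool. The sketch below is one simple way to do that, assuming exact tool-name matching is good enough. `detectRepeatingCycle`, `minRepeats`, and `maxCycleLength` are hypothetical names and defaults chosen for illustration.

```ts
// Returns the length of the shortest multi-tool cycle that repeats at least
// `minRepeats` times at the end of the history, or 0 if none is found.
// Cycle length 1 (a single repeated tool) is left to the simpler check.
function detectRepeatingCycle(
  toolHistory: string[],
  minRepeats = 2,
  maxCycleLength = 5
): number {
  for (let cycleLength = 2; cycleLength <= maxCycleLength; cycleLength++) {
    const windowSize = cycleLength * minRepeats;
    if (toolHistory.length < windowSize) break; // longer cycles cannot fit either
    const tail = toolHistory.slice(-windowSize);
    // Periodic if every entry matches the entry one full cycle earlier.
    const periodic = tail.every(
      (tool, i) => i < cycleLength || tool === tail[i - cycleLength]
    );
    if (periodic) return cycleLength;
  }
  return 0;
}

const multiToolLoop = ["search", "summarize", "retry", "search", "summarize", "retry"];
const healthyRun = ["search", "retrieve", "summarize"];

const loopCycle = detectRepeatingCycle(multiToolLoop); // 3
const healthyCycle = detectRepeatingCycle(healthyRun); // 0
```

Legitimate workflows can repeat a short sequence too, for example when paging through results, so `minRepeats` likely needs tuning per workflow. Comparing tool names alone also ignores arguments, which is where semantic similarity checks come in.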
# The Distributed Systems Parallel
Distributed systems eventually evolved:
- retry limits
- circuit breakers
- bounded failure domains
- timeout controls
because unconstrained retries became dangerous at scale.
Autonomous agent systems are beginning to encounter similar operational realities.
As agents become:
- more autonomous
- more persistent
- more deeply integrated
runtime governance becomes increasingly important.
# Final Thoughts
Most teams focus heavily on:
- prompts
- model quality
- orchestration frameworks
But production AI systems also need:
- bounded execution
- runtime constraints
- operational safeguards
- economic stability
Because eventually:
the challenge is not just building autonomous agents.
It is building governable autonomous agents.