Stop Burning Tokens on Chat / Agent Loops — Here's What Actually Works
lilili · 2026-05-31 · via DEV Community
1. You’re Overpaying Every Day: You Just Can’t See It

Think about the last time you asked an AI to clean up your meeting notes.

You probably opened a new chat, pasted in the transcript — maybe 1,500 words — then pasted your usual notes template on top of that, then said something like “format this, bold the action items.”

It worked. Useful, even.

But here’s what actually happened: the model just read ~3,000 words to produce ~300 words of output.

Do that five times a week. Every week. And now think about what’s riding along in that context every single time — your template, your formatting preferences, all the background you’ve already explained before. The model doesn’t remember any of it. It reads it fresh on every call.

Every repeat. Every charge.

This isn’t a flaw in ChatGPT. It’s the fundamental nature of chat as a paradigm.

2. Chat Is Great, But It Has a Structural Bug
Chat is the most natural way to start with AI. Unclear what you want? Talk it out. Need to change direction? Just say so. The feedback loop is instant, the barrier is zero.

That’s why everyone starts there.

But chat has a structural problem: every single turn carries the entire history.

This follows from how context windows work: the model has no memory between calls, so the whole conversation history has to be placed in its context and read again every single time. Every API call packages up your full history and sends it to the model, and you pay for every token the model reads. Ten rounds in, round ten doesn’t cost the price of one message. It costs the price of all ten, stacked.
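To make the stacking concrete, here is a minimal sketch of the arithmetic. The per-turn sizes (200 tokens per user message, 300 per reply) are made-up illustration values, not measurements, and `tokens_read_per_turn` is a hypothetical helper name.

```python
def tokens_read_per_turn(turns):
    """Return how many tokens the model reads on each turn.

    turns is a list of (user_tokens, reply_tokens) pairs. On every turn the
    model reads the full prior history plus the new user message.
    """
    history = 0
    read = []
    for user_tokens, reply_tokens in turns:
        read.append(history + user_tokens)      # history is re-read in full
        history += user_tokens + reply_tokens   # both sides join the history
    return read


# Assumed sizes for illustration only: ten identical rounds.
turns = [(200, 300)] * 10
read = tokens_read_per_turn(turns)

print(read[0])    # 200: the first turn reads only the first message
print(read[-1])   # 4700: the tenth turn re-reads nine full rounds first
print(sum(read))  # 24500 tokens read in total across the conversation
```

If each turn read only its own 200-token message, the total would be 2,000 tokens. With the history resent every time, the same ten rounds read 24,500.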

Here’s a concrete version of this. Say you use AI to write your weekly status update. You paste in your bullet points from the week, say “turn this into a proper update,” tweak the tone, go back and forth a couple times. Feels efficient.

But those bullets, plus the AI’s draft, plus your follow-up messages, plus the format you’re implicitly re-explaining each time — the real token cost of one weekly update is probably 5 to 8x what you’d guess.

You’re paying for repeated context. The bill just isn’t obvious enough to feel.

3. Agents Are Smarter, and Much Harder to Budget

So if chat gets expensive, agents sound like the upgrade. Let the AI break down the task itself, call its own tools, decide its own next steps.

The demos are genuinely impressive. Hand it a goal, walk away, come back to a finished report.

Then you try to ship it.

Agents typically run on a loop: think about the next step → call a tool → observe the result → think again. This is commonly called a ReAct loop, and each pass through it is one LLM call. The model controls how many passes it takes. You don’t.
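The loop can be sketched in a few lines. This is an illustration under stated assumptions, not any particular agent framework: `run_agent`, `stub_model`, and the tool names are all hypothetical, and the stub stands in for a real LLM so the example runs offline.

```python
def run_agent(task, model, tools):
    """Run think -> act -> observe until the model decides it is finished.

    Returns the final answer and the number of model calls made.
    """
    observations = []
    calls = 0
    while True:
        calls += 1
        decision = model(task, observations)            # think: one LLM call
        if decision["action"] == "finish":
            return decision["answer"], calls
        tool = tools[decision["action"]]
        observations.append(tool(decision["input"]))    # act, then observe


TOOL_NAMES = ["scan_sent", "check_thread", "open_attachment"]


def stub_model(task, observations):
    """Hypothetical stand-in for an LLM: keeps digging while the task is vague."""
    if "proposal" in task and len(observations) < len(TOOL_NAMES):
        return {"action": TOOL_NAMES[len(observations)], "input": task}
    return {"action": "finish", "answer": "draft reply"}


# Each stub tool just returns a label instead of touching a real mailbox.
tools = {name: (lambda query, name=name: f"{name} result") for name in TOOL_NAMES}

_, routine_calls = run_agent("Invoice received, thanks", stub_model, tools)
_, vague_calls = run_agent("On that proposal from last week...", stub_model, tools)

print(routine_calls)  # 1
print(vague_calls)    # 4
```

Nothing in `run_agent` fixes the number of passes. The caller supplies the task, and the model's own decisions determine whether it costs one call or four.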

Picture this: you’ve set up an agent to triage your inbox and draft replies to routine emails. Most days it’s smooth — it reads the email, matches a template, sends a reply, cost is stable.

Then one day someone sends you a vague message: “Hey, on that proposal from last week — we’re still thinking it over.” The agent isn’t sure which proposal. So it decides to scan your sent folder. Finds two candidates. Still not certain. Checks the thread. Then the attachment.

You didn’t ask it to do any of that. It didn’t tell you it was doing it. You just saw your bill at the end of the month and noticed that one day cost three times the usual.

The cost depends on how the model reasoned in that moment. You have no control over that.

That’s not a bug — it’s by design. Agents are built for exploration. When the path isn’t clear, they walk further. For research tasks, deep one-off problems, open-ended investigation: that’s exactly the right tool. But for daily recurring work where you need predictable costs and reliable output, that same autonomy becomes a surprise expense waiting to happen.

An agent is like a brilliant contractor who never writes down what they did or why. You trust the output. You can’t audit the process.

4. Workflow: Only Use AI Where AI Is Actually Needed
At this point you might be thinking: okay, so workflow is just a bunch of if-else logic?

Not quite.

Workflow isn’t “avoid AI.” It’s “only use AI where it genuinely earns its place.”

Here’s an example you can map directly to your work.

Say you produce a weekly sales report: pull last week’s numbers, calculate week-over-week changes, identify top and bottom performers, write a natural-language summary, email it to your manager.

In ChatGPT, every Friday looks the same: paste the data, re-explain what you want, tweak the output, copy it into an email. You’re re-describing your own job to the model every single week.

In a workflow on MorphMind, the same task looks like this:

4 steps. 1 LLM call. Set it once. It runs every week on its own.

This is the core idea behind workflow: split the task into deterministic steps (rules handle it, no model needed) and LLM nodes (where language understanding actually matters), and only pay for the second category.

Step 3’s input is clean, structured numbers — no extra context, no history, no re-explanation. The token cost is minimal. The output is consistent. And Monday morning, the report is already in your manager’s inbox before they sit down.
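As a rough sketch of that split, the weekly report might look like the code below. The step breakdown is an assumption reconstructed from the description above (pull, calculate, summarize, email), not MorphMind's actual API, and every function name, the sample data, and the `CountingLLM` stub are hypothetical.

```python
def pull_numbers():
    """Step 1 (deterministic): stand-in for a database or spreadsheet query."""
    return {
        "Alice": {"last_week": 100, "this_week": 120},
        "Bob": {"last_week": 80, "this_week": 60},
        "Chen": {"last_week": 50, "this_week": 55},
    }


def compute_changes(sales):
    """Step 2 (deterministic): week-over-week change and top/bottom performers."""
    changes = {
        rep: round((v["this_week"] - v["last_week"]) * 100 / v["last_week"], 1)
        for rep, v in sales.items()
    }
    ranked = sorted(changes, key=changes.get, reverse=True)
    return {"changes_pct": changes, "top": ranked[0], "bottom": ranked[-1]}


def write_summary(stats, llm):
    """Step 3 (LLM node): the only step that calls a model."""
    prompt = f"Summarize these weekly sales figures in two sentences: {stats}"
    return llm(prompt)


def send_email(body, outbox):
    """Step 4 (deterministic): stand-in for an email API call."""
    outbox.append({"to": "manager@example.com", "body": body})


class CountingLLM:
    """Hypothetical stub that counts calls instead of contacting a model."""

    def __init__(self):
        self.calls = 0

    def __call__(self, prompt):
        self.calls += 1
        return f"[summary of {len(prompt)} prompt characters]"


llm = CountingLLM()
outbox = []
stats = compute_changes(pull_numbers())
send_email(write_summary(stats, llm), outbox)

print(stats["top"], stats["bottom"])  # Alice Bob
print(llm.calls)                      # 1
```

The prompt in step 3 contains only the computed statistics, so its size depends on the data for that week and not on any conversation history.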

You didn’t open ChatGPT. You didn’t paste anything. You didn’t say a word.

5. Three Paradigms. Three Jobs. Know Which Is Which.

The question isn’t “which is best?” That’s the wrong frame. Each paradigm has a job.

Good AI practitioners don’t pick one and stay there. They know which tool fits which situation.

Notion AI polishing your doc is workflow logic: fixed input, fixed format, predictable cost. Asking it to “brainstorm a completely new direction” is chat logic. Both are right. Different jobs.

6. Why Serious Products Keep Landing on Workflow

“Shippable” isn’t just “it runs.”

It means cost-controlled — you know what each execution costs, and there are no surprises.
It means reproducible — the same input produces reliably similar output.
It means debuggable — when something breaks, you know exactly which step broke, and you can fix just that step.

Chat and agents struggle with all three. You can’t explain to a stakeholder why costs vary 3x day to day. But you can draw a workflow diagram — here’s every step, here’s what it does, here’s how many times it runs per week, here’s the cost per run.

That’s not technical sophistication. That’s delivery discipline.

There’s also an engineering reason workflow wins in production: replay. Every step has logged inputs and outputs. When something goes wrong, you don’t restart the whole pipeline — you re-run the broken step with the same inputs and fix it in isolation. Chat and agents don’t give you that. Every conversation is a black box that starts over from scratch.
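A minimal sketch of step-level replay, assuming step inputs and outputs are JSON-serializable. `StepLog` and the two ranking functions are hypothetical names used for illustration; a production system would persist the log rather than hold it in memory.

```python
import json


class StepLog:
    """Records each step's input and output so one step can be re-run alone."""

    def __init__(self):
        self.records = {}

    def run(self, name, func, step_input):
        # Log the input before running, so it survives even if the step fails.
        record = {"input": json.dumps(step_input), "output": None}
        self.records[name] = record
        output = func(step_input)
        record["output"] = json.dumps(output)
        return output

    def replay(self, name, func):
        """Re-run one step on its logged input, typically with a fixed function."""
        step_input = json.loads(self.records[name]["input"])
        return self.run(name, func, step_input)


def buggy_rank(changes):
    return min(changes, key=changes.get)   # bug: picks the worst performer


def fixed_rank(changes):
    return max(changes, key=changes.get)   # corrected: picks the best performer


log = StepLog()
changes = log.run(
    "compute",
    lambda sales: {rep: new - old for rep, (old, new) in sales.items()},
    {"Alice": [100, 120], "Bob": [80, 60]},
)
first = log.run("rank_top", buggy_rank, changes)
fixed = log.replay("rank_top", fixed_rank)   # "compute" is not re-run

print(first)  # Bob
print(fixed)  # Alice
```

Only the `rank_top` step runs again. The upstream `compute` step keeps its logged output, which is what makes fixing one step in isolation possible.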

7. One Thing You Can Do Right Now

If you’ve read this far, you probably fit one of these:

You’re a heavy ChatGPT user — handling a lot of recurring work through chat, and it mostly works, but you’ve never actually looked at what it’s costing you or how much repeated context you’re sending every single time. This article is describing your situation.

You’ve tried an agent tool — Manus, OpenAI Operator, Devin, something like that. The demos blew you away. But the reliability and cost unpredictability make you nervous about actually depending on it for anything critical. You want the capability without the chaos.

You’re burning tokens in your dev workflow — using AI for code, and you’ve noticed it goes back and forth more than you expected. Fixing one function turns into four or five rounds. The context balloons fast.

You’ve hit the ceiling on Zapier or Make — your automations work great until they don’t. Every time a step requires actual judgment — a non-standard email, a field that’s missing, a case that doesn’t fit the template — the workflow chokes and you’re back to doing it manually.

In every one of these cases, the underlying problem is the same: you haven’t found a way to use AI without letting it freelance on your dime.

That way exists.

Pick the repetitive task you do most often. Break it apart: which steps follow fixed rules? Which single step actually needs language understanding? Build that as a workflow. Set it once. Let it run.

That’s what MorphMind is built for. It lets you call the LLM only where you need it, while everything else runs automatically. Traditional automation tools like Zapier hit a wall the moment they need AI judgment — there’s no LLM node to slot in. Chat and agents have the intelligence but no structure and no cost control. MorphMind combines both. Every step’s inputs and outputs are logged, so if something breaks you can replay just that step — no starting over from scratch.

Free to try. No credit card required. Go in, find your most familiar recurring task, and build your first workflow.

You’ll do the same work. Spend a fraction of the tokens. And never have to think about it again.

👉 morphmind.ai