Your AI Coding Agent Wastes 80% of Its Context. Fixed That with Graph Theory.
Dhrupo Nil · 2026-05-25 · via DEV Community

The problem nobody admits

When you give Claude Code, Cursor, or Codex a task like "fix the login validation bug", here's what they usually do:

  1. Run grep -rl login src/ → 17 files
  2. Read all 17 files top-to-bottom (because context is "free")
  3. Spend 80% of the model's context window on irrelevant imports, type aliases, and helper functions the bug doesn't touch
  4. Generate a fix using whatever 20% of attention is left

This works. Sort of. But it's wasteful — and on big codebases, it's wrong: the agent runs out of context before it sees the actual buggy function.

The instinct is to throw a bigger model at it. Bigger context window, fancier RAG, vector embeddings. All of which trade real cost for diminishing returns.

There's a better answer that's been sitting in classical CS the whole time: treat the repo as a graph.

The idea, in one paragraph

Your codebase already is a graph. Functions call functions. Modules import modules. Classes extend classes. Pick a node (the symbol your task is about), and the structurally-closest neighborhood is almost certainly what an agent needs to see.

So I built mincut-context — an npm package that:

  1. Parses your repo into a symbol graph (tree-sitter, supports TS/JS/Vue/Python/PHP)
  2. Derives seed nodes from your task description (keyword IDF on symbol names + file paths)
  3. Runs personalized PageRank with the seeds as the restart vector
  4. Picks the minimum-cut subgraph that fits a token budget you choose

The output: a list of files + line ranges that an agent should look at. Nothing more, nothing less.
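To make step 3 concrete, here is a minimal sketch of personalized PageRank by power iteration. It is not the package's implementation: the `Graph` type, the function name, the damping factor of 0.85, and the toy symbols are all assumptions chosen for illustration. The sketch also follows edges in the call direction only, whereas a real tool might walk reverse edges too.

```typescript
// Adjacency list: each symbol maps to the symbols it depends on.
// Every edge target must also appear as a key. (Hypothetical toy format.)
type Graph = Record<string, string[]>;

/**
 * Personalized PageRank by power iteration.
 * Seeds receive all of the restart probability; `damping` is the chance of
 * following an edge instead of jumping back to a seed.
 */
function personalizedPageRank(
  graph: Graph,
  seeds: string[],
  damping = 0.85,
  iterations = 50,
): Record<string, number> {
  const nodes = Object.keys(graph);
  const restart: Record<string, number> = {};
  nodes.forEach((n) => {
    restart[n] = seeds.indexOf(n) >= 0 ? 1 / seeds.length : 0;
  });

  let rank: Record<string, number> = restart;
  for (let i = 0; i < iterations; i++) {
    const next: Record<string, number> = {};
    nodes.forEach((n) => {
      next[n] = (1 - damping) * restart[n];
    });
    nodes.forEach((n) => {
      const out = graph[n];
      if (out.length === 0) {
        // Dangling node: hand its mass back to the seeds.
        nodes.forEach((m) => {
          next[m] += damping * rank[n] * restart[m];
        });
      } else {
        out.forEach((m) => {
          next[m] += (damping * rank[n]) / out.length;
        });
      }
    });
    rank = next;
  }
  return rank;
}

// Hypothetical symbol graph: the login path and an unrelated footer path.
const demoGraph: Graph = {
  validateLogin: ['hashPassword', 'findUser'],
  hashPassword: [],
  findUser: ['dbQuery'],
  dbQuery: [],
  renderFooter: ['formatDate'],
  formatDate: [],
};
const ranks = personalizedPageRank(demoGraph, ['validateLogin']);
```

With `validateLogin` as the only seed, the footer symbols are unreachable from the restart vector and keep a score of exactly zero, which is the property that lets the later selection step ignore them.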

Show me the numbers

I built an evaluation suite into the repo itself. 28 hand-labeled tasks across 3 real codebases at a 4,000-token budget:

| strategy | precision | recall | F1 | token-efficiency |
|---|---|---|---|---|
| mincut | 0.27 | 0.83 | 0.39 | 0.270 |
| mincut + --embed (semantic) | 0.27 | 0.83 | 0.39 | 0.270 |
| grep keyword baseline | 0.11 | 0.42 | 0.16 | 0.105 |
| random selection (control) | 0.01 | 0.04 | 0.01 | 0.009 |

Per-repo breakdown:

| repo | tasks | mincut recall | grep recall | mincut F1 | grep F1 |
|---|---|---|---|---|---|
| mincut-context (self) | 12 | 0.97 | 0.56 | 0.44 | 0.30 |
| FluentForm (PHP+Vue+JS) | 8 | 0.88 | 0.13 | 0.43 | 0.04 |
| Fluent Player (TS/JSX) | 8 | 0.63 | 0.56 | 0.31 | 0.13 |

mincut catches ~2× more of the correct files than grep, at ~2.5× better token efficiency. Reproducible with npm run eval. Add your own labeled tasks under eval/fixtures/ to score against your own codebase.
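For readers who want to check how a single task is scored, here is a minimal sketch of file-level precision, recall, and F1. The function name, the `Score` shape, and the file paths are hypothetical; the eval suite's own scoring code may differ.

```typescript
interface Score {
  precision: number;
  recall: number;
  f1: number;
}

// Compare the files a strategy selected against the hand-labeled relevant files.
// Assumes neither list contains duplicates.
function scoreSelection(selected: string[], relevant: string[]): Score {
  const hits = selected.filter((f) => relevant.indexOf(f) >= 0).length;
  const precision = selected.length > 0 ? hits / selected.length : 0;
  const recall = relevant.length > 0 ? hits / relevant.length : 0;
  const f1 =
    precision + recall > 0 ? (2 * precision * recall) / (precision + recall) : 0;
  return { precision, recall, f1 };
}

// Hypothetical task: 4 files selected, 3 files labeled relevant, 2 in common.
const demoScore = scoreSelection(
  ['src/auth/login.ts', 'src/auth/session.ts', 'src/ui/footer.ts', 'src/db/pool.ts'],
  ['src/auth/login.ts', 'src/auth/session.ts', 'src/auth/token.ts'],
);
// precision = 2/4, recall = 2/3, f1 = 4/7
```

If the suite averages these scores per task (an assumption on my part), the averaged F1 need not equal the F1 computed from the averaged precision and recall, which would explain why 0.39 is not exactly the harmonic mean of 0.27 and 0.83.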

The math, briefly

Given a symbol graph $G = (V, E, w)$ where:

  • $V$ are code units (functions, classes, methods)
  • $E$ are dependency edges (imports, calls, references)
  • $w(v)$ is the token cost of including symbol $v$
  • $B$ is your token budget
  • $S \subseteq V$ are seed nodes derived from the task

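Find $T \supseteq S$ with $\sum_{v \in T} w(v) \le B$ minimizing the boundary cut cost, where $w(e)$ overloads $w$ to denote the weight of a dependency edge $e$:

$$\text{cut}(T, V \setminus T) = \sum_{\substack{e = (u, v) \in E \\ |\{u, v\} \cap T| = 1}} w(e)$$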

In plain English: pick a connected, low-token region that has few "loose ends" pointing outside it. The inside of the cut is what the agent needs; the outside is safely ignorable.

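The cut function is submodular, which is what motivates a greedy algorithm: for monotone submodular maximization under a cardinality constraint, greedy carries the classical $(1 - 1/e) \approx 0.63$ approximation guarantee. With a token budget and the connectivity rule described later, treat that figure as a guide rather than a strict bound. The full pseudocode is in the README; the implementation is ~200 lines in src/core/select.ts.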
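As a rough illustration of the greedy idea, here is a sketch that grows a selection from the seeds, always adding the neighbor that removes the most cut weight per token while staying inside the budget. It is not the code in src/core/select.ts: the `SymbolGraph` shape, the gain-per-token rule, and the toy graph are assumptions.

```typescript
interface SymbolGraph {
  tokens: Record<string, number>; // w(v): token cost per symbol
  edges: Array<[string, string, number]>; // [u, v, w(e)], treated as undirected
}

// Total weight of edges with exactly one endpoint inside the selection.
function cutCost(g: SymbolGraph, selected: string[]): number {
  return g.edges.reduce((sum, [u, v, w]) => {
    const uIn = selected.indexOf(u) >= 0;
    const vIn = selected.indexOf(v) >= 0;
    return uIn !== vIn ? sum + w : sum;
  }, 0);
}

function greedySelect(g: SymbolGraph, seeds: string[], budget: number): string[] {
  const selected = seeds.slice();
  let used = selected.reduce((s, v) => s + g.tokens[v], 0);
  while (true) {
    const current = cutCost(g, selected);
    let best: string | null = null;
    let bestGain = 0;
    for (const v of Object.keys(g.tokens)) {
      if (selected.indexOf(v) >= 0 || used + g.tokens[v] > budget) continue;
      // Only consider candidates with at least one edge into the selection.
      const touches = g.edges.some(
        ([a, b]) =>
          (a === v && selected.indexOf(b) >= 0) ||
          (b === v && selected.indexOf(a) >= 0),
      );
      if (!touches) continue;
      // Cut weight removed per token spent.
      const gain = (current - cutCost(g, selected.concat(v))) / g.tokens[v];
      if (gain > bestGain) {
        best = v;
        bestGain = gain;
      }
    }
    if (best === null) break; // nothing affordable reduces the cut
    selected.push(best);
    used += g.tokens[best];
  }
  return selected;
}

// Hypothetical graph: a login cluster plus an unrelated footer cluster.
const demoSymbols: SymbolGraph = {
  tokens: { login: 600, validate: 400, session: 500, store: 900, regex: 200, footer: 300, date: 100 },
  edges: [
    ['login', 'validate', 3],
    ['login', 'session', 2],
    ['session', 'store', 1],
    ['validate', 'regex', 1],
    ['footer', 'date', 1],
  ],
};
const picked = greedySelect(demoSymbols, ['login'], 1700);
// picked is ['login', 'validate', 'regex', 'session'] with a remaining cut of 1
```

The one loose end left is the `session`/`store` edge: `store` costs 900 tokens and would blow the 1,700-token budget, so it stays outside the cut.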

Three ways to use it

1. As an MCP server — recommended for agents

Drop this block into your Claude Code / Codex / Cursor settings:

```json
{
  "mcpServers": {
    "mincut-context": {
      "command": "npx",
      "args": ["-y", "mincut-context", "mcp"]
    }
  }
}
```

Your agent now has six new tools: pack_context, expand_node, find_callers, find_callees, search_symbols, explain_selection. They operate on the cached graph from the most recent pack_context call — effectively free traversal after the first pack.

2. As a CLI

```shell
npm install -g mincut-context

mcx pack "fix the login validation bug" --budget 4000             # plain output
mcx pack "..." --format tree                                       # directory-grouped
mcx pack "..." --format json | jq                                  # pipe to anything
mcx pack "..." --interactive                                       # Ink TUI: vim keys + preview
mcx pack "..." --embed                                             # semantic seeding
mcx pack "..." --cache                                             # 5× warm-run speedup
mcx watch "..." --debounce 300                                     # re-pack on file change
mcx doctor                                                         # environment self-check
```

mcx doctor is my favorite — it tells you in 6 lines what's installed and what isn't:

[Screenshot: mcx doctor output]

3. As a library

```ts
import { pack } from 'mincut-context';

const result = await pack({
  task: 'fix the login validation bug',
  repo: process.cwd(),
  budget: 4000,
  cache: true,
  parallel: 4,
  chunk: { enabled: true, maxTokens: 400 },
});

for (const f of result.files) {
  console.log(f.path, f.score.toFixed(3), f.tokens, '·', f.reasons[0]);
}
// → src/auth/login.ts        0.541  612 · seed — matched directly by task
// → src/auth/session.ts      0.408  483 · attached (60%)
```

What I learned by building this

1. Embeddings are oversold for this problem

Adding semantic embeddings (--embed flag, via @xenova/transformers running locally) did not improve recall on any of my three eval task sets. Why? Because the task descriptions share vocabulary with the code they point to. When a task says "stripe payment processor" and the labeled file is StripeProcessor.php, the keyword match catches it without help. Embeddings only earn their keep when your task vocabulary diverges from the code's, as in "centrality and ranking" → PageRank.

I left --embed in because it doesn't hurt, and there are real users whose mental model doesn't match the code. But the marketing-friendly "AI-powered" framing for this stuff is mostly noise.
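To show why plain keywords are enough in the StripeProcessor.php case, here is a minimal sketch of IDF-weighted seed scoring over file paths. The tokenizer, the smoothed `log(1 + N / df)` weighting, and every path except StripeProcessor.php are assumptions for illustration; the package's real seeding may work differently.

```typescript
// Split camelCase and punctuation:
// 'StripeProcessor.php' -> ['stripe', 'processor', 'php']
function tokenize(text: string): string[] {
  return text
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2')
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

function unique(xs: string[]): string[] {
  return xs.filter((x, i) => xs.indexOf(x) === i);
}

// Score each symbol path by the summed IDF of the task terms it contains.
function idfSeedScores(task: string, symbols: string[]): Record<string, number> {
  const docs = symbols.map((s) => unique(tokenize(s)));
  const df: Record<string, number> = {};
  docs.forEach((terms) => {
    terms.forEach((t) => {
      df[t] = (df[t] || 0) + 1;
    });
  });
  const query = unique(tokenize(task));
  const scores: Record<string, number> = {};
  symbols.forEach((sym, i) => {
    scores[sym] = query.reduce(
      (sum, t) =>
        docs[i].indexOf(t) >= 0 ? sum + Math.log(1 + symbols.length / df[t]) : sum,
      0,
    );
  });
  return scores;
}

// Hypothetical repo paths; only StripeProcessor.php comes from the article.
const seedScores = idfSeedScores('stripe payment processor', [
  'app/Payments/StripeProcessor.php',
  'app/Payments/PaypalProcessor.php',
  'app/Forms/FormRenderer.php',
  'resources/js/PaymentSettings.vue',
]);
```

The rare term `stripe` appears in one path and carries the most weight, so StripeProcessor.php wins without any embedding model. A task phrased as "centrality and ranking" would match none of these tokens, and that is the gap where semantic seeding can help.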

2. Greedy beats CELF for this objective

I implemented CELF (Cost-Effective Lazy Forward selection, Leskovec et al., 2007) hoping for a free speedup over the naive greedy. It backfired: it was not just slower (8× slower on FluentForm) but wrong, producing smaller, structurally weaker selections.

Why: our "no isolated nodes" acceptance rule (a candidate must have at least one edge into the current selection) breaks CELF's submodular-monotone assumption. A candidate's eligibility flips discontinuously when a node with an edge to it joins T. The lazy cache becomes unreliable.
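The failure mode fits in a few lines. The sketch below uses a toy objective (a fixed relevance score per node, gated by the eligibility rule), which is an assumption and not the package's real scoring. It shows a candidate's marginal gain going up as the selection grows, which is exactly what lazy evaluation assumes can never happen.

```typescript
type Edge = [string, string];

// "No isolated nodes": a candidate needs at least one edge into the selection.
function isEligible(edges: Edge[], selected: string[], v: string): boolean {
  return edges.some(
    ([a, b]) =>
      (a === v && selected.indexOf(b) >= 0) ||
      (b === v && selected.indexOf(a) >= 0),
  );
}

// Toy marginal gain: the node's score if eligible, otherwise nothing.
function gatedGain(
  edges: Edge[],
  score: Record<string, number>,
  selected: string[],
  v: string,
): number {
  return isEligible(edges, selected, v) ? score[v] : 0;
}

// Path graph a - b - c, where c is the most valuable node.
const pathEdges: Edge[] = [
  ['a', 'b'],
  ['b', 'c'],
];
const nodeScore: Record<string, number> = { a: 1, b: 1, c: 5 };

const gainBefore = gatedGain(pathEdges, nodeScore, ['a'], 'c'); // c is not adjacent to {a}
const gainAfter = gatedGain(pathEdges, nodeScore, ['a', 'b'], 'c'); // c is adjacent to b
```

Here `gainBefore` is 0 and `gainAfter` is 5. CELF keeps the stale 0 as an upper bound on the gain of `c` and may never re-evaluate it, so the lazy queue ends up ranking candidates on numbers that are no longer true.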

I wrote the dead end up in eval/ALGORITHM-RESEARCH.md so nobody re-treads it. Honest negative results are worth shipping.

3. Sub-symbol chunking matters more than I expected

Big legacy codebases have huge functions. A 500-line function is one symbol in the graph, and if it gets selected, the whole thing eats your budget. So --chunk splits big functions at statement boundaries — each chunk becomes its own sub-symbol, individually selectable.

On FluentForm: indexing without chunking → 4,333 symbols. With --chunk → 4,878 symbols (+545 chunks). Same budget, much finer-grained selection. The greedy can pick just the relevant if/for/try block instead of all-or-nothing.
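Here is a minimal sketch of the chunking idea, assuming the statement boundaries and per-statement token counts have already been extracted (in practice they would come from the parser). The `Statement` and `Chunk` shapes and the sample numbers are hypothetical; only the 400-token cap mirrors the `maxTokens: 400` setting used in the library example.

```typescript
interface Statement {
  startLine: number;
  endLine: number;
  tokens: number;
}

interface Chunk {
  startLine: number;
  endLine: number;
  tokens: number;
}

// Group consecutive statements into chunks of at most `maxTokens`.
// A single statement larger than the cap becomes its own oversized chunk.
function chunkStatements(statements: Statement[], maxTokens: number): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk | null = null;
  for (const s of statements) {
    if (current !== null && current.tokens + s.tokens <= maxTokens) {
      // Statement fits: extend the open chunk.
      current.endLine = s.endLine;
      current.tokens += s.tokens;
    } else {
      // Close the open chunk (if any) and start a new one at this statement.
      if (current !== null) chunks.push(current);
      current = { startLine: s.startLine, endLine: s.endLine, tokens: s.tokens };
    }
  }
  if (current !== null) chunks.push(current);
  return chunks;
}

// Hypothetical 90-line function made of five top-level statements.
const bigFunction: Statement[] = [
  { startLine: 1, endLine: 10, tokens: 120 },
  { startLine: 11, endLine: 30, tokens: 200 },
  { startLine: 31, endLine: 45, tokens: 150 },
  { startLine: 46, endLine: 80, tokens: 300 },
  { startLine: 81, endLine: 90, tokens: 90 },
];
const functionChunks = chunkStatements(bigFunction, 400);
// Three chunks: lines 1-30 (320 tokens), 31-45 (150), 46-90 (390)
```

Each chunk can then enter the graph as its own sub-symbol, so the selection step can take lines 31-45 alone for 150 tokens instead of paying 860 for the whole function.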

4. Test coverage of 88% isn't the whole story

The CI gates on 85% statements / 80% branches / 90% functions / 85% lines. But the genuinely-untestable files — worker scripts, lazy-loaded LSP clients — are excluded from the calc. Honest reporting means saying what is tested, not just the headline number.

The honest tradeoffs

| Honest tradeoff | What we do |
|---|---|
| True optimal min-cut is NP-hard | Greedy submodular selection, (1−1/e) bound |
| Tree-sitter symbols are syntactic, not type-aware | --lsp refines TS/JS via typescript-language-server |
| Embedding model adds ~22 MB on first run | Opt-in behind --embed flag |
| LSP startup is slow (~1–5s) | Opt-in; cached after init |
| Cold start parses whole repo | --cache (5× speedup) + --parallel n (2.7× speedup) |

What I'd build next if you asked

The roadmap that's not checked off yet:

  • Pyright / Intelephense LSP adapters — type-aware calls for Python and PHP (~1–2 days each on the existing LSP infrastructure)
  • Svelte / Rust / Go parsers — one file each on the parser template
  • Incremental neighborhood caching in the greedy — keep attach(v, T) cached and update only when a node with an edge to v is added. Expected 3–5× speedup on graphs with bounded degree.

Each is bounded effort and additive. The core is done.
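The incremental caching item in the roadmap can be sketched as follows. This assumes attach(v, T) means "number of edges between v and the current selection", which the article does not define, and the factory function and toy graph are hypothetical. The point is only that adding a node touches its own neighbors and nothing else.

```typescript
// Undirected neighbor lists. (Hypothetical toy format.)
type Adjacency = Record<string, string[]>;

// Reference implementation: recount edges into the selection every time.
function attachFromScratch(graph: Adjacency, selected: string[], v: string): number {
  return graph[v].filter((n) => selected.indexOf(n) >= 0).length;
}

// Incremental version: keep counts cached and update only the neighbors
// of the node that was just added.
function createAttachCache(graph: Adjacency) {
  const selected: Record<string, boolean> = {};
  const attach: Record<string, number> = {};
  return {
    add(u: string): void {
      selected[u] = true;
      delete attach[u]; // u is inside the selection now
      for (const n of graph[u]) {
        if (!selected[n]) attach[n] = (attach[n] || 0) + 1;
      }
    },
    get(v: string): number {
      return attach[v] || 0;
    },
  };
}

const neighborGraph: Adjacency = {
  a: ['b', 'c'],
  b: ['a', 'c', 'd'],
  c: ['a', 'b'],
  d: ['b'],
};
const attachCache = createAttachCache(neighborGraph);
attachCache.add('a');
attachCache.add('b');
// attachCache.get('c') is 2 and attachCache.get('d') is 1,
// matching attachFromScratch(neighborGraph, ['a', 'b'], ...)
```

Each `add` costs time proportional to the degree of the added node, which is where a speedup on bounded-degree graphs would come from.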

Stop building, start using

The hardest lesson: a tool's value comes from someone actually using it on real work, not from feature count. mincut-context is at v1.7.0 — 261 tests, 88.6% coverage, CI green on Ubuntu + macOS × Node 18/20/22. There's no honest "but it's not ready" excuse left.

If you've watched an AI agent burn 80% of its 200k-token context on imports it doesn't care about, install it now and tell me what breaks:

```shell
npm install -g mincut-context
```

🔗 GitHub: github.com/dhrupo/mincut-context
📦 npm: npmjs.com/package/mincut-context
📊 Reproducible benchmarks: eval/CROSS-REPO-RESULTS.md

I'd love feedback — especially "your numbers don't replicate on my codebase" feedback. That's literally what the eval suite is for.


If you got value from this, ⭐ the repo or drop a comment about a tooling problem you're solving. mincut-context is open-source MIT; the eval suite welcomes new fixtures.