Vectr — Code Intelligence AI Tool
Swapnanil Sa · 2026-05-27 · via DEV Community

You log off for the day after two hours of research. You know the entry point is EvaluateSegments in targeting/segment/evaluator.go. You know the nil visitor_id case is unhandled. You know bidder/auction.go calls this function and can't have its interface changed.

Next morning, Claude Code knows none of that. It starts fresh. It greps, reads files, consumes 8,000 tokens rediscovering what you already found. Every session is day one.

This is the actual friction in AI-assisted development — not the quality of code generation, but the complete absence of working memory across session boundaries.

The problem with how AI assistants use context

On a codebase with 40,000 files, the AI runs rg -l "authenticate", gets 200 results, reads 8 complete files — 12,000 tokens gone for one query. And the next session, it starts over from zero: no memory of what it found, no record of what's still missing.

A 200,000-token context window sounds vast, but a 40,000-file codebase is vastly larger. Assistants compensate by running grep-style searches, finding matching files, then reading entire files to locate the relevant function. Within a session, experienced users manage this. The real problem is across sessions. Every conversation starts empty. Research done Monday is redone Thursday.

Humans solve this differently. A developer who worked on a feature last week doesn't remember every line — but they remember that targeting code lives in targeting/, that segment evaluation has an edge case around nil visitor IDs, and that the auction pipeline calls EvaluateSegments. They remember at different levels of fidelity, and they can re-read the details in seconds when needed. They can afford to forget, because retrieval is fast.

What Vectr does

Vectr is a local codebase indexer that gives an AI assistant the same layered recall. It provides three layers of codebase knowledge, plus a memory system for working state.

Layer 1: Codebase map. At startup, Vectr makes one LLM call over the directory structure and README to build a ~300-token plain-English passport. It captures module purposes, tech stack, entry points, and domain vocabulary. Every session, the AI gets this for free via vectr_map — no file reading required.

vectr_map() →
"Go DSP ad server. Main modules: targeting/ (audience matching),
bidder/ (bid logic), tracker/ (event recording).
Entry: bidder/pipeline.go:RunBidPipeline
Domain terms: segment, visitor_id, bid_request, floor_price"

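As a rough illustration of what could feed that single LLM call, the sketch below gathers a shallow directory listing and the start of the README into one prompt string. The function name `build_passport_prompt`, the depth and length limits, and the prompt wording are all assumptions made for illustration, not Vectr's actual code, and the LLM call itself is left out.

```python
from pathlib import Path


def build_passport_prompt(root: str, max_depth: int = 2, readme_chars: int = 2000) -> str:
    """Collect a shallow directory tree plus the README head into one prompt.

    Hypothetical helper: the name, limits, and wording are illustrative only.
    """
    root_path = Path(root)
    entries = []
    for path in sorted(root_path.rglob("*")):
        rel = path.relative_to(root_path)
        # Skip hidden entries and anything deeper than max_depth.
        if any(part.startswith(".") for part in rel.parts) or len(rel.parts) > max_depth:
            continue
        entries.append(f"{rel.as_posix()}{'/' if path.is_dir() else ''}")

    readme = root_path / "README.md"
    readme_text = readme.read_text(encoding="utf-8")[:readme_chars] if readme.exists() else ""

    return (
        "Summarise this repository in about 300 tokens: module purposes, "
        "tech stack, entry points, domain vocabulary.\n\n"
        "Directory tree:\n" + "\n".join(entries) + "\n\nREADME:\n" + readme_text
    )
```

Keeping the listing shallow is what makes a ~300-token summary plausible: the model sees module names and the README, not file contents.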

Layer 2: Symbol graph. Vectr uses tree-sitter to extract every function, class, and method into a persistent SQLite-backed graph with call relationships. vectr_locate finds where a symbol is defined — file, line number, kind — without returning any code content. vectr_trace follows the call graph in either direction.

vectr_locate("EvaluateSegments") →
[function] EvaluateSegments  targeting/segment/evaluator.go:45

vectr_trace("EvaluateSegments", direction="callers") →
Called by (2):
  RunBidPipeline  in bidder/pipeline.go:88
  RequestBid      in bidder/auction.go:134

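To make the idea of a SQLite-backed symbol graph concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table layout and the `open_graph`, `locate`, and `trace` helpers are assumptions for illustration; Vectr's real schema is not described in this post, and the rows are entered by hand rather than extracted with tree-sitter.

```python
import sqlite3


def open_graph(path: str = ":memory:") -> sqlite3.Connection:
    """Create two hypothetical tables: symbol definitions and call edges."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS symbols (name TEXT, kind TEXT, file TEXT, line INTEGER);
        CREATE TABLE IF NOT EXISTS calls (caller TEXT, callee TEXT, file TEXT, line INTEGER);
        CREATE INDEX IF NOT EXISTS idx_symbols_name ON symbols(name);
        CREATE INDEX IF NOT EXISTS idx_calls_callee ON calls(callee);
    """)
    return db


def locate(db, name):
    """Return (kind, file, line) for each definition of `name`, with no code content."""
    return db.execute(
        "SELECT kind, file, line FROM symbols WHERE name = ? ORDER BY file, line",
        (name,),
    ).fetchall()


def trace(db, name, direction="callers"):
    """Follow one hop of the call graph; file and line refer to the call site."""
    if direction == "callers":
        sql = "SELECT caller, file, line FROM calls WHERE callee = ? ORDER BY file, line"
    elif direction == "callees":
        sql = "SELECT callee, file, line FROM calls WHERE caller = ? ORDER BY file, line"
    else:
        raise ValueError("direction must be 'callers' or 'callees'")
    return db.execute(sql, (name,)).fetchall()


# Rows mirror the example output above; a real indexer would extract them from source.
db = open_graph()
db.execute(
    "INSERT INTO symbols VALUES (?, ?, ?, ?)",
    ("EvaluateSegments", "function", "targeting/segment/evaluator.go", 45),
)
db.executemany(
    "INSERT INTO calls VALUES (?, ?, ?, ?)",
    [
        ("RunBidPipeline", "EvaluateSegments", "bidder/pipeline.go", 88),
        ("RequestBid", "EvaluateSegments", "bidder/auction.go", 134),
    ],
)
print(locate(db, "EvaluateSegments"))
print(trace(db, "EvaluateSegments", direction="callers"))
```

Both lookups are indexed point queries that return only names and positions, which is why a locate or trace call costs a handful of tokens instead of a file read.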

Layer 3: Content search. AST-aware chunks, split at function and class boundaries and never mid-logic, are embedded with Snowflake/snowflake-arctic-embed-m-v1.5 (runs locally, no API key, one-time download of ~440MB). Search is an adaptive hybrid of vector similarity and BM25 keyword matching, with weights tuned per codebase fingerprint: small repos lean on BM25, large ones on semantic similarity, and statically typed monorepos try graph traversal first. To use any other sentence-transformers-compatible model, set VECTR_EMBED_MODEL=<hf-model-id>.

vectr_search("nil visitor_id handling segment evaluation") →
[1] targeting/segment/evaluator.go  lines 45-89  score 0.934
    symbol: EvaluateSegments
    ...

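The blending of keyword and vector scores can be sketched in a few lines of plain Python. This is a toy version under stated assumptions: the `hybrid_rank` function, the max-normalisation of BM25, and the single `vector_weight` knob are illustrative choices, and the two-dimensional vectors stand in for real embeddings. How Vectr actually normalises and weights the two signals is not documented here.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenised doc against the tokenised query."""
    if not docs:
        return []
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(d) / avg_len))
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(query_tokens, query_vec, docs, doc_vecs, vector_weight):
    """Blend max-normalised BM25 with cosine similarity; return doc indices, best first."""
    keyword = bm25_scores(query_tokens, docs)
    top = max(keyword, default=0.0) or 1.0  # avoid dividing by zero when nothing matches
    blended = [
        vector_weight * cosine(query_vec, vec) + (1 - vector_weight) * (score / top)
        for score, vec in zip(keyword, doc_vecs)
    ]
    return sorted(range(len(docs)), key=lambda i: blended[i], reverse=True)


docs = [["nil", "visitor_id", "guard"], ["bid", "floor_price"]]
doc_vecs = [[0.1, 0.9], [0.9, 0.1]]  # toy embeddings, not real model output
print(hybrid_rank(["visitor_id"], [0.9, 0.1], docs, doc_vecs, vector_weight=0.0))
print(hybrid_rank(["visitor_id"], [0.9, 0.1], docs, doc_vecs, vector_weight=1.0))
```

A `vector_weight` near 0 corresponds to the "small repos lean on BM25" end of the range, and a value near 1 to the "large ones on semantic similarity" end.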

The part that's actually new: working memory

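The layer that sets Vectr apart from other code search tools is the bidirectional protocol between the AI and the memory store: the AI writes working notes into the store, and Vectr tells the AI which retrieved content it can safely drop.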

vectr_remember lets the AI offload a working note:

vectr_remember(
  "Implementing segment targeting. Entry: EvaluateSegments() in evaluator.go:45.
   Need to add nil guard for visitor_id before line 61.
   bidder/auction.go calls this — cannot change its interface.
   Missing: integration test for multi-segment visitor with expired segments.",
  tags=["segment-targeting", "wip"],
  priority="high"
)
→ "Stored note #4. You can safely drop related code chunks from context."


The AI can now discard the code chunks from its context window. Vectr has them and will return them in under 50ms.

vectr_evict_hint makes this explicit. When the AI has accumulated a session's worth of retrieved content, Vectr proactively tells it what to drop:

vectr_evict_hint() →
"Vectr has 6 chunks (~3,840 tokens) indexed and instantly retrievable.
You can safely drop these from your context window:
  targeting/segment/evaluator.go  [lines 40-110 (EvaluateSegments)]
  bidder/auction.go  [lines 88-134 (RequestBid)]
Recall latency: <50ms. Nothing will be lost."

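A hint like this only needs bookkeeping over what has already been served in the session. The sketch below assumes a hypothetical `Chunk` record and per-chunk token counts; it shows the shape of the computation, not Vectr's implementation.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One retrieved chunk, as a hypothetical server might track it per session."""
    file: str
    start: int
    end: int
    symbol: str
    tokens: int


def evict_hint(chunks):
    """Summarise which already-indexed chunks the assistant can drop from context."""
    total = sum(c.tokens for c in chunks)
    lines = [
        f"Vectr has {len(chunks)} chunks (~{total:,} tokens) indexed and instantly retrievable.",
        "You can safely drop these from your context window:",
    ]
    lines += [f"  {c.file}  [lines {c.start}-{c.end} ({c.symbol})]" for c in chunks]
    return "\n".join(lines)


# Token counts below are made up for the example.
served = [
    Chunk("targeting/segment/evaluator.go", 40, 110, "EvaluateSegments", 2000),
    Chunk("bidder/auction.go", 88, 134, "RequestBid", 1840),
]
print(evict_hint(served))
```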

Next morning:

vectr_recall("segment targeting") →
[HIGH] [seg, wip] (14h ago)
  Implementing segment targeting. Entry: EvaluateSegments() in evaluator.go:45.
  Need to add nil guard for visitor_id before line 61.
  bidder/auction.go calls this — cannot change its interface.
  Missing: integration test for multi-segment visitor with expired segments.

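The cross-session part comes down to notes living on disk instead of in the context window. Here is a minimal sketch of that idea with `sqlite3`. The `NoteStore` class, its schema, and the keyword matching in `recall` are assumptions for illustration; the real tool may match notes differently (for example, by embedding similarity).

```python
import sqlite3
import time

PRIORITY_RANK = {"high": 0, "normal": 1, "low": 2}


class NoteStore:
    """Hypothetical on-disk note store: notes survive after the process exits."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS notes ("
            "id INTEGER PRIMARY KEY, body TEXT, tags TEXT, priority TEXT, created REAL)"
        )
        self.db.commit()

    def remember(self, body, tags=(), priority="normal"):
        """Persist a working note and return its id."""
        cur = self.db.execute(
            "INSERT INTO notes (body, tags, priority, created) VALUES (?, ?, ?, ?)",
            (body, ",".join(tags), priority, time.time()),
        )
        self.db.commit()
        return cur.lastrowid

    def recall(self, query=""):
        """Return notes whose body or tags contain every query word.

        Results are ordered by priority, then newest first. An empty query
        returns all notes.
        """
        words = query.lower().split()
        rows = self.db.execute("SELECT body, tags, priority, created FROM notes").fetchall()
        hits = [r for r in rows if all(w in f"{r[0]} {r[1]}".lower() for w in words)]
        return sorted(hits, key=lambda r: (PRIORITY_RANK.get(r[2], 1), -r[3]))
```

Opening a second `NoteStore` on the same file in a new process returns the notes written by the first, which is the property the "next morning" example relies on.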

Three MCP calls (vectr_map, vectr_recall, vectr_locate), roughly five seconds, and the AI is fully context-loaded without re-reading any code.

How to run it

Two install options depending on your environment.

Option A — pip (recommended for individual developers):

pip install git+https://github.com/swapnanil/vectr

cd /path/to/your/project
vectr start


Option B — Docker (for servers and CI pipelines):

git clone https://github.com/swapnanil/vectr
cd vectr
docker-compose up api


On first run, Vectr downloads the embedding model (~440MB), indexes the workspace, builds the symbol graph, and writes MCP configuration files for Cursor and Claude Code. No configuration files to write, no environment variables required for local-only use.

Other CLI commands:

# Stop and restart on a different workspace
vectr restart --path /path/to/other/project

# Write CLAUDE.md + .mcp.json without starting the server
vectr init

# Stop the server
vectr stop

# Search from the terminal
vectr search "JWT token validation"


If you set ANTHROPIC_API_KEY (or OPENAI_API_KEY + LLM_MODEL), Vectr also builds the codebase passport on startup — one LLM call, ~$0.005, cached permanently.

Once running, Claude Code and Cursor automatically use the ten MCP tools (vectr_map, vectr_locate, vectr_trace, vectr_search, vectr_remember, vectr_recall, vectr_evict_hint, vectr_snapshot, vectr_snapshot_list, vectr_status) without any manual configuration. The MCP server runs at localhost:8765/mcp — any compatible client connects with two lines of JSON config.

Benchmark results: Camel Run 2

To measure the cross-session memory benefit, the benchmark uses a two-phase design: Phase 1 explores the codebase and stores notes with vectr_remember; Phase 2 opens a cold session, calls vectr_recall(), and implements. Vanilla Phase 2 re-reads from scratch.

The Camel codebase is 5,856 files of enterprise Java — the kind of thing where the model has no meaningful training coverage.

| Task | Vanilla Phase 2 | Vectr Phase 2 | Cost Δ | Tool calls Δ | Output |
| --- | --- | --- | --- | --- | --- |
| custom_component | $0.56 · 134s · 51 tools | $0.36 · 195s · 11 tools | −35% | −78% | 0 bytes (failure) vs 9,398 bytes (5 files) |
| route_policy | $1.15 · 430s · 59 tools | $0.35 · 177s · 16 tools | −70% | −73% | both 280-line impl |
| type_converter | $0.48 · 187s · 25 tools | $0.20 · 86s · 11 tools | −57% | −56% | both working |
| Totals (Camel) | $2.19 · 751s · 135 tools | $0.92 · 458s · 38 tools | −58% | −72% | −40% input tokens |

The custom_component result shows the failure mode most clearly: vanilla ran out of context budget navigating the unfamiliar Java package hierarchy and produced nothing. Vectr's Phase 2 started with structured notes from Phase 1 (~200 tokens standing in for most of the re-discovery work that cost vanilla 51 tool calls) and delivered a complete 5-file implementation.

route_policy shows the efficiency case where both sides succeeded: 3× cheaper, 2.4× faster.
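For readers who want to check the headline figures, the arithmetic behind the totals row and the route_policy ratios can be reproduced directly from the numbers in the table. The variable and function names below are just for this example.

```python
# Totals row from the Camel table: cost in USD, wall-clock seconds, tool calls.
vanilla = {"cost": 2.19, "seconds": 751, "tools": 135}
vectr = {"cost": 0.92, "seconds": 458, "tools": 38}


def pct_change(before, after):
    """Relative change from `before` to `after`, rounded to a whole percent."""
    return round((after - before) / before * 100)


cost_delta = pct_change(vanilla["cost"], vectr["cost"])
tool_delta = pct_change(vanilla["tools"], vectr["tools"])
time_delta = pct_change(vanilla["seconds"], vectr["seconds"])

# route_policy row: how many times cheaper and faster the Vectr run was.
cost_ratio = 1.15 / 0.35
speed_ratio = 430 / 177

print(f"cost {cost_delta}%, tool calls {tool_delta}%, time {time_delta}%")
print(f"route_policy: {cost_ratio:.1f}x cheaper, {speed_ratio:.1f}x faster")
```

The totals reproduce the −58% cost and −72% tool-call figures, and the route_policy ratios come out at about 3.3× and 2.4×.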

Vectr helps in proportion to how much re-discovery work Phase 2 would otherwise do. Single-session tasks on well-known codebases see minimal benefit. Large unfamiliar codebases and cross-session continuation tasks see the most.

Django results were mixed: complex ORM internals showed −24% tokens, −60% cost; well-known APIs where the model already has training coverage showed no benefit. The mechanism is the same in both cases — Vectr just doesn't help where re-discovery cost is already low.

A session with the full stack

Morning — session start (3 calls, ~5 seconds):

vectr_map()                                          → structural overview (247 tokens)
vectr_recall()                                       → yesterday's notes, verbatim
vectr_locate("EvaluateSegments")                     → file:line, no code read


During the session:

vectr_search("visitor_id nil handling")              → 3 chunks, 580 tokens
vectr_trace("EvaluateSegments", direction="callers") → 2 callers identified


End of session:

vectr_remember("Segment targeting done...")          → note stored
vectr_evict_hint()                                   → lists ~3,840 tokens of chunks safe to drop
vectr_snapshot("segment-targeting-day1")             → full session saved


Full context in three calls, five seconds. No file reading on reconnect.

What's next

Vectr is open source at github.com/swapnanil/vectr. The current build supports Python, JavaScript, TypeScript, Go, Rust, and Java for AST chunking and symbol extraction. Planned: adaptive retrieval strategy selection based on codebase fingerprint (Java monorepos benefit from graph traversal; dynamic Python codebases respond better to semantic search), and LLM-written symbol descriptions, generated lazily on first access.

If you work on a large codebase and your AI assistant spends the first five minutes of every session re-reading the same files, try Vectr. The full tool page is at swapnanilsaha.com/tools/vectr/.