A 3-layer memory system that gives Claude Code persistent context across sessions.
Ssanvi Build · 2026-05-26 · via DEV Community

Claude Code forgets everything between sessions. You explain your project, your architecture, and your preferences, and by the next session you're back at square one.

I spent 45 minutes setting this up manually: debugging ports, copying tokens, configuring MCP servers by hand. Then I spent days debugging the Smart Connections integration.

So I built ObsiForge, a single command that does all of it:

obsiforge init --name myproject --path ~/vaults/myproject


30 seconds + 2 plugin clicks. The rest of this article explains what it configures and why each layer matters.

The architecture

Three layers, each with a distinct job. No duplication. No redundancy.

  • Layer 3: MEMORY.md (pointers only) → "Where to find things"
  • Layer 2: Obsidian vault (knowledge) → "What we know"
  • Layer 1: claude-mem (observations) → "What happened this session"

Layer 3 - MEMORY.md (pointers, not content)

Each project gets a MEMORY.md that points to where knowledge lives. It never stores knowledge itself, just pointers and rules.

This is the only file Claude reads on every session start. ~800 tokens to know where everything is.

The rules are simple:

  • NEVER store actual knowledge here, only pointers.
  • Project context goes to the Obsidian vault (Layer 2).
  • Session observations go to claude-mem (Layer 1).
  • This file is a pointer only.

The search tools are listed here:

  • Semantic search: mcp__smart-connections__search_notes
  • Specific file: mcp__obsidian-mcp-tools__get_vault_file
  • Session history: mcp__plugin_claude-mem_mcp-search__search
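To make the pointer-only idea concrete, here is a minimal Python sketch that renders a MEMORY.md from the rules and tool names listed above. The function names, the section headings, and the "about 4 characters per token" estimate are my assumptions for illustration, not ObsiForge's actual template.

```python
# Rules and tool names are copied from the article; everything else
# (function names, headings, token heuristic) is hypothetical.
RULES = [
    "NEVER store actual knowledge here, only pointers.",
    "Project context goes to the Obsidian vault (Layer 2).",
    "Session observations go to claude-mem (Layer 1).",
]

SEARCH_TOOLS = {
    "Semantic search": "mcp__smart-connections__search_notes",
    "Specific file": "mcp__obsidian-mcp-tools__get_vault_file",
    "Session history": "mcp__plugin_claude-mem_mcp-search__search",
}


def render_memory_md(project: str) -> str:
    """Build a pointer file: rules plus where to look, no project knowledge."""
    lines = [f"# MEMORY.md for {project}", "", "## Rules"]
    lines += [f"- {rule}" for rule in RULES]
    lines += ["", "## Search tools"]
    lines += [f"- {label}: {tool}" for label, tool in SEARCH_TOOLS.items()]
    return "\n".join(lines) + "\n"


def rough_token_count(text: str) -> int:
    """Crude size check, assuming roughly 4 characters per token."""
    return len(text) // 4


memory_md = render_memory_md("myproject")
print(memory_md)
print("approx tokens:", rough_token_count(memory_md))
```

A size check like this is a cheap way to notice when real content starts creeping into a file that is supposed to hold only pointers.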

Layer 2 - Obsidian vault (project knowledge)

Each project has a vault with self-contained notes. No wikilinks, no "see also": each note carries its full context.

  • project.md — architecture, stack, decisions, roadmap
  • user.md — preferences, code style, what works/doesn't
  • MEMORY.md — search tools, session lifecycle, vault index
  • [other].md — assessments, gap analyses, ADRs

Smart Connections indexes these at block level (384-dim embeddings via bge-micro-v2, running locally inside Obsidian). The MCP server exposes search_notes to Claude Code.

Layer 1 - claude-mem (session observations)

Automated capture via hooks. Every tool use gets logged as an observation. At session start, the last 50 observations get injected into context. At session end, /consolidate distills what matters into the vault.

SQLite + Chroma (vector DB) running locally. Zero cloud dependencies.
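The capture-and-inject loop can be sketched with Python's built-in sqlite3 module. This is not claude-mem's real schema: the table layout, the function names, and the per-project filter are assumptions, and the Chroma vector side is left out entirely.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical observation store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " project TEXT NOT NULL,"
        " tool TEXT NOT NULL,"
        " summary TEXT NOT NULL)"
    )
    return conn


def log_observation(conn, project, tool, summary):
    """Record one tool use as an observation."""
    conn.execute(
        "INSERT INTO observations (project, tool, summary) VALUES (?, ?, ?)",
        (project, tool, summary),
    )
    conn.commit()


def recent_observations(conn, project, limit=50):
    """Return the newest `limit` observations for a project, oldest first."""
    rows = conn.execute(
        "SELECT tool, summary FROM observations"
        " WHERE project = ? ORDER BY id DESC LIMIT ?",
        (project, limit),
    ).fetchall()
    return rows[::-1]


conn = open_store()
for i in range(60):
    log_observation(conn, "myproject", "Bash", f"step {i}")
log_observation(conn, "other", "Edit", "unrelated")

context = recent_observations(conn, "myproject")
print(len(context), context[0], context[-1])
```

Only the 50 newest rows for the project come back, so the injected context stays bounded however long the history grows.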

The session lifecycle

START → /dashboard

  • Read 3 core notes (project.md, user.md, MEMORY.md)
  • Check claude-mem for recent activity
  • Smart Connections semantic search for related context
  • Brief the user

WORK → normal Claude Code session

  • claude-mem captures observations automatically
  • Smart Connections available for semantic lookups
  • Vault notes readable/writable via MCP

END → /consolidate

  • Search claude-mem for session observations
  • Filter: what belongs in vault vs. what stays in claude-mem
  • Update vault notes
  • Report what was consolidated

The Smart Connections problem

The Smart Connections plugin for Obsidian generates embeddings fine: bge-micro-v2, 384 dimensions, block-level indexing. The problem was the MCP server that bridges Claude Code to those embeddings.

Its search_notes function was doing regex matching. The code literally said:

"For now, we'll do a simple keyword match since we don't have a way to generate embeddings for arbitrary text without the model."

Every time Claude Code "searched semantically," it was grepping. A query like "how to handle offline data sync" would only find notes containing those exact words, not notes about "event sourcing" or "conflict resolution". Finding those related notes is the entire point of embeddings.

The other search tools (get_similar_notes, get_embedding_neighbors) do real semantic search, but they require a pre-existing note path or a raw 384-dim vector. search_notes is the only tool that accepts free text, and it's the one MCP clients call most often.

The fix

I added @xenova/transformers to the MCP server and rewrote searchByQuery to generate embeddings from the query text, then find nearest neighbors by cosine similarity.

Before (regex): 0 results — no note contains those exact words.

After (embeddings): 5 results — similarity 0.60-0.63, conceptually relevant.
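The difference between the two behaviours can be sketched in a few lines of Python. The actual fix lives in the MCP server and uses @xenova/transformers. Here the note names, the 3-dimensional vectors, and the query vector are made-up stand-ins for real 384-dim embeddings, so only the ranking logic is illustrated.

```python
import math
import re

# Hypothetical notes: (text, toy embedding). Real vectors come from a model.
NOTES = {
    "event-sourcing.md": ("Append-only event log replayed on reconnect", [0.9, 0.1, 0.2]),
    "conflict-resolution.md": ("Merging divergent replicas with CRDTs", [0.8, 0.3, 0.1]),
    "css-theme.md": ("Dark mode color palette", [0.0, 0.2, 0.9]),
}


def keyword_search(query):
    """Old behaviour: a note matches only if it contains the literal query."""
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    return [name for name, (text, _) in NOTES.items() if pattern.search(text)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vec, top_k=2):
    """New behaviour: rank notes by similarity to the query embedding."""
    scored = [(cosine(query_vec, vec), name) for name, (_, vec) in NOTES.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Pretend this vector is the embedding of the query text below.
QUERY_TEXT = "how to handle offline data sync"
QUERY_VEC = [0.85, 0.2, 0.15]

print(keyword_search(QUERY_TEXT))   # no note contains the literal phrase
print(semantic_search(QUERY_VEC))   # the two sync-related notes rank first
```

The keyword path returns nothing because no note repeats the query's words, while the similarity ranking surfaces the conceptually related notes and leaves the unrelated one out.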

I used Claude Code itself to diagnose the bug and write the fix. This is now PR #7 on the original repo.

This is the kind of thing ObsiForge's doctor command catches: if your semantic search isn't actually semantic, you want to know before you rely on it.
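One way to test this yourself is a paraphrase probe: query with wording that shares no terms with a note you know is relevant, and see whether the note comes back. The Python sketch below is a hypothetical check, not ObsiForge's doctor implementation. The vault content and both search functions are stubs that stand in for a real MCP call.

```python
def looks_semantic(search, probe_query, expected_note):
    """True if `search` returns `expected_note` for a paraphrased query."""
    return expected_note in search(probe_query)


# Hypothetical vault: the note text shares no words with the probe query.
VAULT = {"replication.md": "Event sourcing and conflict resolution for replicas"}


def keyword_only_search(query):
    """Stub for a search tool that only does literal matching."""
    q = query.lower()
    return [name for name, text in VAULT.items() if q in text.lower()]


def stub_semantic_search(query):
    """Stub for an embedding search; related notes are mapped by hand."""
    related = {"how to handle offline data sync": ["replication.md"]}
    return related.get(query.lower(), [])


PROBE = "how to handle offline data sync"
keyword_ok = looks_semantic(keyword_only_search, PROBE, "replication.md")
semantic_ok = looks_semantic(stub_semantic_search, PROBE, "replication.md")
print(keyword_ok, semantic_ok)
```

A search tool that fails this probe is matching strings, whatever its name says.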

Token economics (real data)

Measured from an active project with 21 vault notes totaling 243KB:

| Layer | What | Token cost |
|---|---|---|
| Layer 3 | MEMORY.md (pointers) | ~800 tokens, loaded every session start |
| Layer 2 | 3 core vault notes | ~5,300 tokens on /dashboard call |
| Layer 1 | claude-mem (50 observations) | ~5,000-15,000 tokens auto-injected at start |
| Layer 2 | Smart Connections query | ~300-800 tokens on demand per search |
| Layer 2 | Additional vault note | ~1,000-4,500 tokens on demand per read |

Without memory (baseline): re-explaining project context costs ~3,000-6,000 tokens per session. And Claude still doesn't have the full picture.

With memory: ~6,100-20,000 tokens for full context. But you get:

  • Architecture decisions that would take 10+ minutes to explain.
  • Bug fixes from 3 weeks ago that Claude remembers.
  • User preferences without re-stating them.

Without memory, to get the same context you'd need to manually re-explain:

| What you'd re-explain | Token cost |
|---|---|
| Project architecture | ~4,200 tokens |
| User preferences | ~640 tokens |
| Past decisions and why | ~2,700-10,000 tokens |
| What was tried and what failed | ~5,000-15,000 tokens |
| Total without memory | 10,000-30,000 tokens per session |
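As a quick arithmetic check on the table above, the short Python sketch below sums the per-row estimates and compares them with the ~6,100-20,000 "with memory" range quoted earlier. The figures are copied from the article. Treating single-value rows as a range with equal low and high ends is my simplification.

```python
# (low, high) token estimates copied from the re-explain table.
RE_EXPLAIN = {
    "Project architecture": (4200, 4200),
    "User preferences": (640, 640),
    "Past decisions and why": (2700, 10000),
    "What was tried and what failed": (5000, 15000),
}
WITH_MEMORY = (6100, 20000)  # range quoted for the memory setup


def total(rows):
    """Sum the low ends and the high ends separately."""
    low = sum(lo for lo, _ in rows.values())
    high = sum(hi for _, hi in rows.values())
    return low, high


without_low, without_high = total(RE_EXPLAIN)
ratio_low = without_low / WITH_MEMORY[0]
ratio_high = without_high / WITH_MEMORY[1]

print(without_low, without_high)
print(round(ratio_low, 2), round(ratio_high, 2))
```

The rows add up to roughly 12,500-29,800 tokens, which the table rounds to 10,000-30,000. Dividing by the with-memory range gives a ratio of about 1.5x to 2x.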

And the real problem isn't the cost; it's that you wouldn't remember what to include. The session from 3 weeks ago where approach X didn't work? You wouldn't think to mention it. The architectural decision from last month? You'd summarize it differently each time.

Real token savings: 1.5-2x fewer tokens per session. But what's priceless is that claude-mem captures "we tried X and it failed because Y" knowledge, the kind you wouldn't re-explain because you simply wouldn't remember it.

The key insight: we never load all 60,700 tokens of the vault. The pointer file (800 tokens) tells Claude where things are, and it pulls knowledge on demand via semantic search. Most sessions need 2-3 vault reads and 1-2 searches.

| Session type | Searches | Reads | Tokens used |
|---|---|---|---|
| Quick (1 search, 1 read) | 1 | 1 | ~7,100 |
| Deep (3 searches, 5 reads) | 3 | 5 | ~18,000 |
| Continuation (dashboard only) | 0 | 3 | ~6,100 |

claude-mem's context injection (~5,000-15,000 tokens for 50 observations) is the main cost. But "we already tried X and it failed because Y" saves far more tokens than it costs.

What's not automatic (and what I did about it)

Consolidation is a skill, not a script.

The /consolidate skill requires Claude's judgment: what knowledge belongs in the vault (permanent) vs. what stays in claude-mem (session-level). A script can't decide if "we switched from REST to gRPC" is a vault-worthy architectural decision or a temporary debugging note.

What I did automate: at session end, a Stop hook runs that checks whether you're in a project with an Obsidian vault and reminds you to consolidate. The actual distillation still needs Claude's judgment, but you no longer have to remember to run /consolidate; the system reminds you.
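The reminder logic itself is small. The Python sketch below shows one way such a check could work, assuming a project counts as vault-backed when both MEMORY.md and .mcp.json sit in its directory. That detection rule and the function name are hypothetical, and the wiring that makes Claude Code call it from a Stop hook is omitted.

```python
import tempfile
from pathlib import Path

# Hypothetical markers for "this directory is a vault-backed project".
# The real hook may detect this differently.
MARKERS = ("MEMORY.md", ".mcp.json")


def consolidate_reminder(project_dir):
    """Return a reminder string for vault-backed projects, else None."""
    root = Path(project_dir)
    if all((root / name).is_file() for name in MARKERS):
        return f"Reminder: run /consolidate before leaving {root.name}."
    return None


# Demo against two throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    plain = Path(tmp) / "scratch"
    plain.mkdir()
    project = Path(tmp) / "myproject"
    project.mkdir()
    for name in MARKERS:
        (project / name).write_text("")

    no_reminder = consolidate_reminder(plain)
    reminder = consolidate_reminder(project)

print(no_reminder)
print(reminder)
```

The function only decides whether to nudge. Deciding what to consolidate stays with Claude, which matches the split described above.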

Cross-project boundaries are solved by separate sessions. Each project gets its own vault, its own MEMORY.md, and its own .mcp.json. When you work on project A, Claude loads project A's context. When you switch to project B, you open a new Claude Code session in project B's directory. No cross-contamination. claude-mem is global (it sees all sessions), but /consolidate filters by project, so observations go to the right vault.

Re-indexing happens inside Obsidian. If you add notes outside Obsidian, re-open Obsidian to trigger re-embedding. The Smart Connections plugin handles this automatically when it's running.

ObsiForge automates all of this

The 3-layer architecture is a pattern, not a product. You can set it up manually; the steps above work. But the setup is fragile: one wrong port, expired token, or plugin version mismatch, and nothing works.

ObsiForge handles:

  • Plugin download and configuration
  • MCP server setup with port collision prevention
  • Token generation and configuration
  • MEMORY.md template with search tools
  • /dashboard and /consolidate skills
  • Multi-vault support (separate ports, separate configs)
  • obsiforge doctor — 9 health checks to diagnose any setup issue
uv tool install https://github.com/ssanvi-builds/ObsiForge
obsiforge init --name myproject --path ~/vaults/myproject


Enable 2 community plugins in Obsidian Settings (security requirement, not bypassable). Done.

If you want to set it up manually, the steps in the original guide still work, but expect to spend 45 minutes or more instead of 30 seconds.

github.com/ssanvi-builds/ObsiForge

Takeaways

  1. Verify your semantic search is actually semantic. The search_notes tool was doing regex matching for months. If your "semantic search" only finds exact keyword matches, it's not semantic.
  2. Pointers over copies. MEMORY.md stores ~800 tokens that point to 60,700 tokens of knowledge.
  3. Self-contained notes beat wikilinks. Each vault note has full context. No "see X for details" that breaks when X moves.
  4. The lifecycle matters more than the storage. /dashboard → work → /consolidate is the loop. Without it, knowledge rots.
  5. Consolidation needs judgment. Automate the reminder, not the action. A script can't decide if "we switched frameworks" is vault-worthy or just a session note.
  6. Separate projects, separate sessions. One vault per project, one Claude Code session per project. No cross-contamination.
  7. Open source tools can have quiet gaps. The MCP server's most-called tool was its least-tested for semantic correctness. Test your tools' claims.
  8. Manual setup takes 45+ minutes and breaks easily. Automated setup takes 30 seconds. Use the tool.

The smart-connections-mcp fix is PR #7 on an MIT-licensed project.

ObsiForge is open source and free: github.com/ssanvi-builds/ObsiForge

The 3-layer architecture is just a pattern: no special code needed, only discipline about what goes where.

If you're building something similar, the key question isn't "what tool do I use?" but "what does each layer store, and who reads it?" Get that right, and the tools are interchangeable.