llm-wiki: Equip AI Agents with auditable research brains (710 stars)
iTech · 2026-06-21 · via 博客园 - iTech

llm-wiki: Give your AI agent an auditable research brain. How to use the 710-star Claude Code plugin

Bookmark this first, and try it in Claude Code later.

Does this sound familiar? You use Claude Code or Codex for technical research, it searches from scratch every time you ask, and it starts over again the next time you ask the same thing. After three days of chatting, the context overflows and the agent loses its memory. The material it dug up and the pitfalls it already hit are all gone.

I recently found a project that tackles this problem head-on: nvk/llm-wiki, an open-source tool that lets any AI agent compile its own knowledge base. It has 710 stars, is MIT-licensed, and is written in Python. The core idea fits in one sentence: let the agent condense its research process into a traceable wiki, so that next time it can look things up directly instead of searching again.

It is inspired by the LLM wiki concept mentioned by Andrej Karpathy, and the implementation is engineering-minded: it is not a notebook for you to read, but a "second brain" for the agent's own use.

Outline of this article

  1. What problems does llm-wiki solve?
  2. Core mechanism: Parallel multi-agent + traceability wiki
  3. Its essential difference from RAG
  4. Five runtimes: Claude Code is a first-class citizen
  5. Quick overview of core commands
  6. What does a complete research process look like?
  7. Who should use it, and who can wait?

What problems does llm-wiki solve?

First, what it is not, to avoid misunderstandings:

  • Not a code-generation accelerator, and not something for boosting HumanEval/MBPP scores
  • Not an Obsidian plugin for taking reading notes (although the output format is compatible with Obsidian)
  • Not a RAG framework (explained in detail later; the difference is critical)

The problem it really solves is that agent research is "one-off".

Take a real scenario. You need to study "threat models of hardware wallets" and ask Claude Code to look into it. It searches the web, reads articles, and gives you a review. But that review is gone once the chat ends. Next week you want to continue with "the specific attack surface of the ColdCard wallet", and the agent cannot remember what it found last week, so it has to search from scratch.

What llm-wiki does: it compiles the results of the first round of research into a structured wiki directory (Markdown files + source citations + an index) stored locally. The next time the agent is asked a related question, it checks the wiki first, uses the result directly on a hit, and searches only on a miss. Every conclusion can be traced back to its original source.

[Diagram: the llm-wiki work cycle]

The diagram above shows the work cycle: a question is first checked against the wiki, and a hit is returned directly; on a miss, parallel research starts and the results are automatically compiled into the wiki when it finishes. The second time the same question is asked, it takes the fast green path.
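The cycle can be sketched in a few lines of Python. This is only an illustration of the check-first, research-on-miss pattern, not llm-wiki's actual code: the `answer` function, the dictionary standing in for the wiki, and the exact-match lookup are all assumptions made for the example.

```python
from typing import Callable, Dict, Tuple


def answer(question: str,
           wiki: Dict[str, str],
           research: Callable[[str], str]) -> Tuple[str, str]:
    """Return (answer, path), where path is 'wiki-hit' or 'researched'."""
    key = question.strip().lower()       # naive normalization (assumption)
    if key in wiki:                      # hit: reuse the compiled result
        return wiki[key], "wiki-hit"
    result = research(question)          # miss: run the expensive research
    wiki[key] = result                   # compile the result back into the wiki
    return result, "researched"


calls = []


def fake_research(q: str) -> str:
    """Hypothetical stand-in for a real research run; records each call."""
    calls.append(q)
    return f"summary of {q}"


wiki: Dict[str, str] = {}
first = answer("ColdCard attack surface", wiki, fake_research)
second = answer("coldcard attack surface ", wiki, fake_research)
print(first[1], second[1], len(calls))
```

The second call never reaches `fake_research`, which is the fast path the diagram describes. A real implementation would need looser matching than an exact key.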

Core mechanism: Parallel multi-agent + traceability wiki

These are the two most interesting designs of llm-wiki.

Parallel multi-agent research. When a /wiki:research command is issued, it does not send a single agent to search serially; it dispatches 5 to 10 agents to retrieve different subtopics in parallel. Standard mode runs at the low end of that range, --deep runs 8, and the maximum mode --retardmax runs 10 agents at once and also "snowballs": further agents are automatically sent to dig into the new subtopics discovered in each round.

How long can one command run? The parameter --min-time 1h means at least one hour of research. You can issue a command before leaving work and come back the next morning to a fully compiled wiki. This relies on Claude Code's 200K context window, which lets a single agent hold all the material needed for the next round of research.
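A minimal sketch of the fan-out and "snowball" idea, assuming each agent can be modeled as a function that returns newly discovered subtopics. The `DISCOVERIES` table, `research_subtopic`, and `snowball` are hypothetical names for illustration, and the loop here is bounded by a round count rather than by --min-time.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Set

# Hypothetical stand-in for one agent's research: maps a subtopic to the
# new subtopics it discovers. A real agent would search and read sources.
DISCOVERIES: Dict[str, List[str]] = {
    "hardware wallets": ["secure elements", "supply chain"],
    "secure elements": ["fault injection"],
    "supply chain": [],
    "fault injection": [],
}


def research_subtopic(topic: str) -> List[str]:
    return DISCOVERIES.get(topic, [])


def snowball(seeds: List[str], max_agents: int, max_rounds: int) -> List[str]:
    """Research the frontier in parallel, then follow newly found subtopics."""
    seen: Set[str] = set(seeds)
    order: List[str] = list(seeds)
    frontier = list(seeds)
    for _ in range(max_rounds):
        if not frontier:
            break
        # One worker per subtopic, capped at max_agents.
        with ThreadPoolExecutor(max_workers=max_agents) as pool:
            results = list(pool.map(research_subtopic, frontier))
        next_frontier: List[str] = []
        for found in results:
            for topic in found:
                if topic not in seen:      # skip subtopics already covered
                    seen.add(topic)
                    order.append(topic)
                    next_frontier.append(topic)
        frontier = next_frontier
    return order


print(snowball(["hardware wallets"], max_agents=8, max_rounds=3))
```

Each round only researches subtopics that have not been seen before, so the snowball stops on its own once nothing new turns up.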

Traceable wiki compilation. When the research ends, it does not just hand you a summary; it compiles the results into a directory structure:

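topics/
  hardware-wallet-threat-models/
    raw/              # original sources (URLs, PDFs, screenshots)
    notes/            # the agent's working notes
    articles/         # compiled, structured articles
    inventory/        # asset inventory
    datasets/         # datasets (if any)
    .sessions/        # session snapshots + user feedback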

The key parts are raw/ and articles/: every compiled conclusion can be traced back to an original source file. This leads to the core difference between llm-wiki and RAG.
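To make the raw/ to articles/ link concrete, here is a small sketch that scaffolds the layout above in a temporary directory and reports which cited sources are missing from raw/. The subdirectory names come from the listing; the `citations` mapping and both helper functions are assumptions for illustration, since the article does not describe how llm-wiki records citations on disk.

```python
import tempfile
from pathlib import Path

SUBDIRS = ["raw", "notes", "articles", "inventory", "datasets", ".sessions"]


def scaffold_topic(root: Path, topic: str) -> Path:
    """Create the topic directory with the subdirectories listed above."""
    topic_dir = root / "topics" / topic
    for name in SUBDIRS:
        (topic_dir / name).mkdir(parents=True, exist_ok=True)
    return topic_dir


def missing_sources(topic_dir: Path, citations: dict) -> dict:
    """Map each article to the cited raw files that do not exist."""
    gaps = {}
    for article, sources in citations.items():
        absent = [s for s in sources if not (topic_dir / "raw" / s).is_file()]
        if absent:
            gaps[article] = absent
    return gaps


with tempfile.TemporaryDirectory() as tmp:
    topic_dir = scaffold_topic(Path(tmp), "hardware-wallet-threat-models")
    (topic_dir / "raw" / "vendor-advisory.txt").write_text("source text")
    # Hypothetical citation map: article file -> raw files it relies on.
    citations = {
        "overview.md": ["vendor-advisory.txt"],
        "attacks.md": ["vendor-advisory.txt", "teardown.pdf"],
    }
    created = sorted(p.name for p in topic_dir.iterdir())
    gaps = missing_sources(topic_dir, citations)

print(created)
print(gaps)
```

An article whose citation has no matching file in raw/ shows up in `gaps`, which is the kind of broken trace a traceable layout makes easy to detect.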

Its essential difference from RAG

Many people's first reaction is: "Isn't this just RAG? Vector retrieval plus generation."

No. The difference lies in who organizes the knowledge and what form the knowledge takes.

Dimension | Traditional RAG | llm-wiki
Knowledge form | Vector embeddings scattered across a vector store | Markdown files, human-readable
Organizer | Offline embedding pipeline | Agent compiles in real time
Traceability | Difficult (vectors are unreadable) | Strong (each conclusion points to a raw source file)
Auditability | Hardly possible | /wiki:audit performs dedicated trust audits
Manual intervention | Requires changing the vectors and re-embedding | Edit the Markdown directly

RAG knowledge is stored as vectors meant for machines. You cannot read them or edit them. llm-wiki knowledge is stored as Markdown meant for people. After the agent compiles it, you can open it in an editor and change it directly, and the agent reads the revised version next time.

The biggest benefit of this design is auditability. When researching sensitive topics such as security threat models or medical options, /wiki:audit --project coldcard-threat-model checks each conclusion for the credibility of its sources, for missing counter-evidence, and for a complete citation chain. RAG cannot do this: you cannot audit a vector.
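The three checks named above (source credibility, missing counter-evidence, complete citation chain) can be sketched as a simple pass over conclusions. This is an illustrative model only: the `Conclusion` record, the numeric trust ratings, and the `audit` function are assumptions, not the behavior of /wiki:audit.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Conclusion:
    claim: str
    sources: List[str] = field(default_factory=list)          # supporting raw files
    counter_sources: List[str] = field(default_factory=list)  # contrary evidence reviewed


def audit(conclusions: List[Conclusion],
          source_trust: Dict[str, float],
          min_trust: float = 0.5) -> List[Tuple[str, str]]:
    """Return (claim, problem) pairs for every failed check."""
    findings = []
    for c in conclusions:
        if not c.sources:                          # broken citation chain
            findings.append((c.claim, "no source cited"))
        for s in c.sources:
            if s not in source_trust:              # source was never rated
                findings.append((c.claim, f"unrated source: {s}"))
            elif source_trust[s] < min_trust:      # weak source credibility
                findings.append((c.claim, f"low-trust source: {s}"))
        if not c.counter_sources:                  # no negative evidence considered
            findings.append((c.claim, "no counter-evidence reviewed"))
    return findings


# Hypothetical trust ratings for two raw sources.
trust = {"raw/vendor-advisory.html": 0.9, "raw/forum-post.html": 0.2}
report = audit(
    [
        Conclusion("Claim A", ["raw/vendor-advisory.html"], ["raw/forum-post.html"]),
        Conclusion("Claim B", ["raw/forum-post.html"]),
        Conclusion("Claim C"),
    ],
    trust,
)
for claim, problem in report:
    print(f"{claim}: {problem}")
```

Checks like these are only possible because each conclusion carries explicit, readable pointers to its sources.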

Five runtimes: Claude Code is a first-class citizen

The cleverest design in llm-wiki is one behavior layer with five runtime shells. Claude Code is the primary target (22K-token system prompt, 200K context), but the same wiki protocol can run on other agents:

Runtime | Installation | System prompt size | Best for
Claude Code | claude plugin install wiki@llm-wiki | ~22K tokens | Full agent research
OpenAI Codex | codex plugin marketplace add nvk/llm-wiki | ~3K tokens | OpenAI ecosystem
OpenCode | configure an instructions URL in opencode.json | ~3K tokens | Multiple providers
Pi | --instructions SKILL.md | ~1K tokens | Local models
Any agent | copy AGENTS.md | Varies | Generic fallback

The underlying logic: Claude Code's skills/wiki-manager/SKILL.md is the behavioral source of truth, and the Codex and OpenCode versions are generated automatically by sync scripts (sync-codex-plugin.sh, sync-opencode-plugin.sh) rather than maintained by hand as two separate codebases. Test scripts check synchronization consistency and fail as soon as the copies drift.
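The "generate, then test for drift" pattern can be illustrated as follows. This sketch is not taken from the repository's sync or test scripts: `render_variant` is a hypothetical generator (here it simply truncates the source and adds a header), and the check compares hashes of the regenerated text and the committed copy.

```python
import hashlib


def render_variant(source: str, max_chars: int) -> str:
    """Hypothetical generator: derive a smaller prompt from the source of truth."""
    header = "# GENERATED FILE - do not edit by hand\n"
    return header + source[:max_chars]


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def has_drifted(source: str, committed_copy: str, max_chars: int) -> bool:
    """True when the committed copy no longer matches a fresh regeneration."""
    return digest(render_variant(source, max_chars)) != digest(committed_copy)


source_of_truth = "Check the wiki first. Cite every source. Compile findings."
in_sync_copy = render_variant(source_of_truth, max_chars=40)
hand_edited_copy = in_sync_copy + "\nExtra rule added by hand."

print(has_drifted(source_of_truth, in_sync_copy, 40))
print(has_drifted(source_of_truth, hand_edited_copy, 40))
```

A test built this way fails whenever someone edits a generated file directly or forgets to rerun the generator after changing the source.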

This means you do not have to change knowledge bases when you change agents. The wiki directory is neutral and travels with you; the agent is just a front end for accessing it.

Installing it in Claude Code takes one line:

claude plugin install wiki@llm-wiki

Quick overview of core commands

llm-wiki's command design is restrained. These are the core commands:

# Research: create a topic wiki from scratch, with parallel agents running for 1 hour
/wiki:research "gut microbiome" --new-topic --min-time 1h

# Deep mode: 8 agents, running for 2 hours
/wiki:research "fasting" --deep --min-time 2h

# Thesis-style research: given a claim, gather evidence for and against, then deliver a verdict
/wiki:thesis "fiber reduces neuroinflammation via SCFAs"

# Collect: build a catalog with provenance (memes, tools, entities all work)
/wiki:collect "bitcoin memes" --wiki memes-bitcoin

# Query: ask the wiki; a hit is returned directly
/wiki:query "How does fiber affect mood?"
/wiki:query "compare keto and mediterranean" --deep

# Ingest: manually add a source
/wiki:ingest https://example.com/article

# Audit: check a project's citation chain and credibility
/wiki:audit --project coldcard-threat-model

Besides the explicit colon commands, there is also a fuzzy router. Typing /wiki what do we know about CRISPR? is recognized as a query, and /wiki add https://... is recognized as an ingest. This matches how people naturally phrase requests.

Queries also come in different depths. --deep cross-references multiple topics, pulling related conclusions together for comparison rather than looking up a single topic.
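The fuzzy routing described above can be approximated with a few string rules. This is a deliberately naive sketch: the `route` function and its rules (explicit colon form first, then "add" plus a URL means ingest, everything else is a query) are assumptions for illustration, and the real router is presumably driven by the agent rather than by pattern matching.

```python
import re
from typing import Tuple

URL_RE = re.compile(r"https?://\S+")


def route(text: str) -> Tuple[str, str]:
    """Return (command, argument) for a /wiki input line."""
    body = text.strip()
    if body.startswith("/wiki:"):
        # Explicit form: the command name follows the colon.
        command, _, rest = body[len("/wiki:"):].partition(" ")
        return command, rest.strip()
    if body.startswith("/wiki"):
        body = body[len("/wiki"):].strip()
    first, _, rest = body.partition(" ")
    match = URL_RE.search(rest)
    if first.lower() == "add" and match:
        return "ingest", match.group(0)   # "add <url>" reads as an ingest
    return "query", body                  # anything else is treated as a question


print(route("/wiki what do we know about CRISPR?"))
print(route("/wiki add https://example.com/article"))
print(route("/wiki:audit --project coldcard-threat-model"))
```

Defaulting to a query keeps the failure mode harmless: a misrouted question only reads from the wiki and never writes to it.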

What does a complete research process look like?

Put the pieces together, and a real research loop runs like this:

[Diagram: a complete research loop across two weeks]

In the first week, you issue the research command; eight agents dig in parallel for one hour, and the results are compiled into a wiki stored locally. In the second week, you query directly: the local wiki is hit and the answer comes back in seconds, with links to the original sources.

One detail is worth mentioning: session snapshots and feedback capture (added in v0.11 and v0.12). The agent automatically redacts and saves the key content of each session under .sessions/. If you correct it, state a preference, or reject something midway, that is also captured as a feedback candidate. The next time you open a new session, the agent rehydrates these snapshots first, so it remembers your research history. These snapshots do not enter the official wiki automatically, however: you must explicitly run @wiki feedback promote before a candidate becomes a rule, which keeps noise out of the knowledge base.
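The capture-then-promote gate can be modeled with two lists. This is a sketch of the idea only: the `WikiState` class and its method names are hypothetical and say nothing about how llm-wiki stores snapshots or feedback on disk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WikiState:
    rules: List[str] = field(default_factory=list)       # accepted into the wiki
    candidates: List[str] = field(default_factory=list)  # captured, not yet trusted

    def capture(self, correction: str) -> None:
        """Record a correction from a session as a candidate only."""
        if correction not in self.candidates and correction not in self.rules:
            self.candidates.append(correction)

    def promote(self, correction: str) -> bool:
        """Move a candidate into the rules; requires an explicit call."""
        if correction not in self.candidates:
            return False
        self.candidates.remove(correction)
        self.rules.append(correction)
        return True


state = WikiState()
state.capture("Prefer primary sources over blog summaries")
state.capture("Ignore vendor marketing pages")
promoted = state.promote("Prefer primary sources over blog summaries")

print(promoted)
print(state.rules)
print(state.candidates)
```

Nothing reaches `rules` without a deliberate `promote` call, which mirrors the explicit promotion step that keeps offhand remarks out of the knowledge base.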

Who should use it, and who can wait?

A good fit

  • People who use Claude Code for in-depth technical or academic research, where the research cycle is long and findings need to accumulate
  • People doing security research and threat modeling, who need traceability and auditability
  • Enterprise knowledge-base maintainers who want to systematize scattered documents
  • People who switch between multiple agent tools and need a neutral, portable knowledge layer

No rush for now

  • If you only ask one-off questions, RAG or a direct search is lighter
  • If you do not use an agent such as Claude Code or Codex at all: copying AGENTS.md works, but the experience is degraded
  • If you are sensitive to disk space: raw/ keeps the original sources, so deep research takes up a lot of space

The best scenario for llm-wiki is long-cycle, multi-round research that needs to be credible. For one-off tasks it is on the heavy side.

A few practical pitfalls

Know these pitfalls before installing:

iCloud users: watch the permissions. Many people put the wiki directory in iCloud to sync across devices. macOS privacy controls get in the way: stat succeeds, but reading wikis.json fails with Operation not permitted. The fix is not to switch to a local path, but to grant Full Disk Access to the app that launches the agent, and then use /wiki config hub-path to set the iCloud path explicitly instead of relying on the default ~/wiki.
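A quick way to tell this symptom apart from a missing file is to test stat and read separately. The sketch below is a generic diagnostic, not part of llm-wiki; the `probe` function is hypothetical, and the demo runs against a temporary file so it behaves the same on any machine.

```python
import os
import tempfile
from pathlib import Path


def probe(path: Path) -> str:
    """Classify access to a file: metadata first, then an actual read."""
    try:
        os.stat(path)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "stat-denied"
    try:
        with open(path, "rb") as fh:
            fh.read(1)
    except PermissionError:
        return "stat-ok-read-denied"   # the symptom described above
    return "readable"


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "wikis.json"
    sample.write_text("{}")
    print(probe(sample))
    print(probe(Path(tmp) / "absent.json"))
```

Pointing `probe` at the wikis.json file inside your own wiki directory, from the same app that launches the agent, shows whether you are hitting the stat-succeeds-but-read-fails case.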

Sandbox environments (nono) require additional permissions. The wiki directory lives outside the project, so the sandbox cannot read it by default. Claude Code and OpenCode need read permission on $HOME/.config/llm-wiki plus read-write permission on the wiki directory; Codex additionally needs read-write permission on $HOME/.codex (the plugin cache must be writable).

Don't use SSH for updates. When an agent upgrades the plugin inside a sandbox, use gh auth login --web --git-protocol https to avoid hanging on SSH host-key prompts.

Version synchronization. If claude plugin update does not pull the new version (the marketplace cache is stale), the README provides a manual sync script that clones the repository into the plugin cache directory; restart Claude Code for it to take effect.


Does your agent still research from scratch every time? Try installing llm-wiki, then come back to the comments and share how it went. If you found this useful, give it a like so more people can see it.


author: itech001
source: Public Account: AI Artificial Intelligence Era
Share the most cutting-edge AI news and technical research every day.

This article was first published on the public account AI Artificial Intelligence Era. Please credit the source when reprinting.