llm-wiki: Equip AI Agents with auditable research brains (710 stars)

iTech · 2026-06-21 · via 博客园 - iTech

llm-wiki: give your AI agent an auditable research brain. How to use the 710-star Claude Code plugin

Bookmark it now and try it in Claude Code later.

Does this sound familiar? You use Claude Code or Codex for technical research, it searches from scratch every time you ask, and it starts over again the next time you ask the same thing. After three days of chatting the context overflows and the agent loses its memory. The material it dug up and the pitfalls it already hit are all gone.

I recently found a project that tackles this problem head-on: nvk/llm-wiki, an open-source tool that lets any AI agent compile a knowledge base. It has 710 stars, is MIT licensed, and is written in Python. The core idea fits in one sentence: let the agent condense its own research process into a traceable wiki, then consult that wiki next time instead of searching again.

It is inspired by the LLM wiki concept described by Andrej Karpathy, and the implementation is engineering-minded: it is not a notebook for you to read, but a "second brain" for the agent's own use.

Outline of this article

What problems does llm-wiki solve?
Core mechanism: Parallel multi-agent + traceable wiki
Five runtimes: Claude Code is a first-class citizen
Quick overview of core commands
What does a complete research process look like?
Its essential difference from RAG
Who should use it, and who should hold off?

What problems does llm-wiki solve?

First, what it is not, to avoid misunderstandings:

It is not a code-generation accelerator, and it is not about HumanEval/MBPP scores
It is not an Obsidian plugin for taking reading notes (although the output format is compatible with Obsidian)
It is not a RAG framework (explained in detail later; the difference is critical)

The problem it actually solves: agent research is "one-off".

Take a real scenario. You need to study the threat model of hardware wallets, so you ask Claude Code to look into it. It searches the web, reads articles, and gives you a survey. But that survey is gone once the chat ends. Next week you want to dig into the attack surface specific to the ColdCard wallet, and the agent cannot remember what it found last week, so it has to search from scratch.

What llm-wiki does: it compiles the results of the first round of research into a structured wiki directory (Markdown files + source references + an index), stored locally. The next time the agent gets a related question, it checks the wiki first, uses the entry directly on a hit, and searches only on a miss. Every conclusion can be traced back to its original source.

[Figure: flowchart of the llm-wiki work cycle]

The diagram above shows the work cycle: for each question, check the wiki first and return directly on a hit; on a miss, start parallel research and automatically compile the results into the wiki when it finishes. The second time you ask the same question, you take the fast green path.
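The check-first cycle described here resembles a cache-aside lookup, and can be sketched in Python. This is a minimal illustration under stated assumptions, not llm-wiki's code: `WikiStore`, `answer`, and the `research` callback are hypothetical names, and exact matching on a normalized question stands in for whatever matching the real agent performs.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class WikiStore:
    """Hypothetical in-memory stand-in for the local wiki directory."""
    articles: dict[str, str] = field(default_factory=dict)

    @staticmethod
    def key(question: str) -> str:
        # Assumption: normalize case and whitespace so repeated questions match.
        return " ".join(question.lower().split())

    def lookup(self, question: str) -> Optional[str]:
        return self.articles.get(self.key(question))

    def compile(self, question: str, findings: str) -> None:
        self.articles[self.key(question)] = findings


def answer(question: str, wiki: WikiStore,
           research: Callable[[str], str]) -> tuple[str, str]:
    """Return (path_taken, answer): check the wiki first, research only on a miss."""
    cached = wiki.lookup(question)
    if cached is not None:
        return "hit", cached
    findings = research(question)     # slow path: run the research
    wiki.compile(question, findings)  # persist it so the next ask is a hit
    return "miss", findings
```

Calling `answer` twice with the same question invokes `research` only once; the second call returns from the store.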

Core mechanism: Parallel multi-agent + traceable wiki

These are the two most interesting design decisions in llm-wiki.

Parallel multi-agent research. When you issue a /wiki:research command, it does not send one agent to search serially; it dispatches 5 to 10 agents to research different subtopics in parallel. --deep mode runs 8 agents, and the extreme mode --retardmax runs 10 at once and also "snowballs": new agents are dispatched automatically to dig into the subtopics discovered in each round.
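The fan-out and "snowball" behavior can be illustrated with a small asyncio sketch. Everything here is an assumption made for illustration: `research_rounds`, the `Worker` signature, and the `max_rounds` cap are hypothetical, and the real tool dispatches LLM agents rather than Python coroutines.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical worker: takes a subtopic, returns (notes, newly discovered subtopics).
Worker = Callable[[str], Awaitable[tuple[str, list[str]]]]


async def research_rounds(seeds: list[str], worker: Worker,
                          max_agents: int = 8,
                          max_rounds: int = 3) -> dict[str, str]:
    """Run up to max_agents workers at once; feed new subtopics into the next round."""
    limit = asyncio.Semaphore(max_agents)
    notes: dict[str, str] = {}
    frontier = list(dict.fromkeys(seeds))  # de-duplicate while keeping order

    async def run_one(topic: str) -> tuple[str, str, list[str]]:
        async with limit:  # cap concurrency at max_agents
            text, found = await worker(topic)
        return topic, text, found

    for _ in range(max_rounds):
        if not frontier:
            break
        results = await asyncio.gather(*(run_one(t) for t in frontier))
        discovered: list[str] = []
        for topic, text, found in results:
            notes[topic] = text
            discovered.extend(found)
        # Snowball: only subtopics not yet researched go into the next round.
        frontier = [t for t in dict.fromkeys(discovered) if t not in notes]
    return notes
```

The semaphore bounds how many workers run at once, while the round loop is what makes the search widen as new subtopics turn up.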

How long can one command run? The parameter --min-time 1h means at least one hour of research. You can issue the command before leaving work and come back the next morning to a fully compiled wiki. This leans on Claude Code's 200K context window, which is what lets a single agent hold all the material for one round of research.

Traceable wiki compilation. When the research finishes, it does not just hand you a summary; it compiles the results into a directory structure:

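topics/
  hardware-wallet-threat-models/
    raw/              # original sources (URLs, PDFs, screenshots)
    notes/            # the agent's working notes
    articles/         # compiled, structured articles
    inventory/        # asset inventory
    datasets/         # datasets (if any)
    .sessions/        # session snapshots + user feedback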

The key parts are raw/ and articles/: every compiled conclusion can be traced back to an original source file. This leads to its core difference from RAG.
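To make "traceable" concrete, here is a hedged sketch of a check that every source cited in articles/ actually exists in raw/. The citation format is an assumption: the sketch supposes articles cite sources with Markdown links of the form `](../raw/<file>)`, which the original article does not confirm, and `untraceable_citations` is a hypothetical helper, not part of llm-wiki.

```python
import re
from pathlib import Path

# Assumption: articles cite sources with Markdown links into ../raw/.
RAW_LINK = re.compile(r"\]\(\.\./raw/([^)\s]+)\)")


def untraceable_citations(topic_dir: Path) -> dict[str, list[str]]:
    """Map each article to the cited raw files that do not exist on disk."""
    missing: dict[str, list[str]] = {}
    for article in sorted((topic_dir / "articles").glob("*.md")):
        cited = RAW_LINK.findall(article.read_text(encoding="utf-8"))
        gone = [name for name in cited
                if not (topic_dir / "raw" / name).is_file()]
        if gone:
            missing[article.name] = gone
    return missing
```

An empty result means every citation in the topic resolves to a file under raw/; any entry names the article and the sources it can no longer be traced to.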

Its essential difference from RAG

Many people's first reaction is: "Isn't this just RAG? Vector retrieval + generation."

No. The difference lies in who organizes the knowledge and what form the knowledge takes.

Dimension	Traditional RAG	llm-wiki
Knowledge form	Vector embeddings, scattered across a vector store	Markdown files, human-readable
Organizer	Offline embedding pipeline	Compiled by the agent in real time
Traceability	Difficult (vectors are unreadable)	Strong (each conclusion points to a raw source file)
Auditability	Hardly possible	`/wiki:audit` performs dedicated trust audits
Manual intervention	Modify the vectors and re-embed	Edit the Markdown directly

RAG knowledge is vectors meant for machines; you can neither read nor edit them. llm-wiki knowledge is Markdown meant for people. After the agent compiles it, you can open it in an editor and change it, and the agent reads the revised version next time.

The biggest benefit of this design is auditability. When researching sensitive topics such as security threat models or medical treatment options, /wiki:audit --project coldcard-threat-model checks each conclusion for source credibility, missing counter-evidence, and completeness of the citation chain. RAG cannot do this: you cannot audit a vector.

Five runtimes: Claude Code is a first-class citizen

The cleverest design choice in llm-wiki is one behavior layer with five runtime shells. Claude Code is the primary target (22K-token system prompt, 200K context), but the same wiki protocol runs on other agents:

Runtime	Installation	System prompt size	Best for
Claude Code	`claude plugin install wiki@llm-wiki`	~22K tokens	Complete agent research
OpenAI Codex	`codex plugin marketplace add nvk/llm-wiki`	~3K tokens	OpenAI ecosystem
OpenCode	Configure the instructions URL in opencode.json	~3K tokens	Multiple providers
Pi	`--instructions SKILL.md`	~1K tokens	Local models
Any agent	Copy AGENTS.md	Varies	General fallback

The underlying logic: Claude Code's skills/wiki-manager/SKILL.md is the behavioral source of truth, and the Codex and OpenCode versions are generated automatically by sync scripts (sync-codex-plugin.sh, sync-opencode-plugin.sh) rather than maintained by hand as two separate codebases. Test scripts check that the versions stay in sync and fail as soon as they drift.

This means you do not have to switch knowledge bases when you switch agents. The wiki directory is neutral and travels with you; the agent is just a front end for accessing it.

To install it in Claude Code, one line is enough:

claude plugin install wiki@llm-wiki

Quick overview of core commands

llm-wiki keeps its command set deliberately small. The core commands are:

# Research: create a topic wiki from scratch, with parallel agents running for 1 hour
/wiki:research "gut microbiome" --new-topic --min-time 1h

# Deep mode: 8 agents, running for 2 hours
/wiki:research "fasting" --deep --min-time 2h

# Thesis-style research: give it a claim, gather evidence for and against, and end with a verdict
/wiki:thesis "fiber reduces neuroinflammation via SCFAs"

# Collect: build a catalog with provenance (memes, tools, entities, anything)
/wiki:collect "bitcoin memes" --wiki memes-bitcoin

# Query: ask the wiki; a hit returns directly
/wiki:query "How does fiber affect mood?"
/wiki:query "compare keto and mediterranean" --deep

# Ingest: add a source manually
/wiki:ingest https://example.com/article

# Audit: check a project's citation chain and credibility
/wiki:audit --project coldcard-threat-model

Besides the explicit commands with a colon, there is also a fuzzy router. Typing /wiki what do we know about CRISPR? is recognized as a query, and /wiki add https://... is recognized as an ingest. This matches the way people naturally phrase requests.
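The routing idea can be shown with a toy heuristic. This is only an illustration: the rules below (a URL means ingest, anything else means query) are assumptions, `route` is a hypothetical function, and the real router presumably relies on the agent's own language understanding rather than regular expressions.

```python
import re

URL = re.compile(r"https?://\S+")
EXPLICIT = re.compile(r"(/wiki:[\w-]+)\s*(.*)", re.S)


def route(command: str) -> tuple[str, str]:
    """Map `/wiki ...` input to (explicit_command, argument).

    Toy heuristic: illustrative assumptions, not llm-wiki's actual router.
    """
    text = command.strip()
    explicit = EXPLICIT.match(text)
    if explicit:                       # already an explicit colon command
        return explicit.group(1), explicit.group(2)
    if text.startswith("/wiki"):
        text = text[len("/wiki"):].strip()
    url = URL.search(text)
    if url:                            # a URL suggests "add this source"
        return "/wiki:ingest", url.group(0)
    return "/wiki:query", text         # otherwise treat it as a question
```

With these rules, `route("/wiki add https://example.com/article")` resolves to the ingest command with the URL as its argument, and a plain question falls through to query.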

Queries also come in two depths. --deep cross-references multiple topics, pulling related conclusions together and comparing them rather than looking up a single topic.

What does a complete research process look like?

Put the pieces above together and a real research loop runs like this:

[Figure: diagram of a complete research loop]

In week one, you issue the research command; eight agents dig in parallel for an hour, and the results are compiled into a wiki stored locally. In week two, you query directly; the local wiki hits and returns in seconds, with links to the original sources.

One detail worth mentioning is session snapshots and feedback capture (added in v0.11 and v0.12). The agent automatically redacts the key content of each session and saves it under .sessions/. If you correct it, state a preference, or reject something midway, that is also captured as a feedback candidate. The next time you open a new session, the agent rehydrates these snapshots first, which amounts to remembering your research history. These snapshots do not enter the official wiki automatically, though: you have to promote them explicitly with @wiki feedback promote before they become rules, which keeps noise out of the knowledge base.
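The separation between captured candidates and promoted rules can be modeled in a few lines. This is a hypothetical data model for illustration only: `SessionStore`, `FeedbackCandidate`, and their methods are invented names, and the article does not describe how llm-wiki stores this on disk.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackCandidate:
    text: str
    promoted: bool = False


@dataclass
class SessionStore:
    """Hypothetical model of .sessions/: candidates stay apart from wiki rules."""
    candidates: list[FeedbackCandidate] = field(default_factory=list)
    rules: list[str] = field(default_factory=list)

    def capture(self, correction: str) -> int:
        """Record a correction as a candidate and return its index."""
        self.candidates.append(FeedbackCandidate(correction))
        return len(self.candidates) - 1

    def promote(self, index: int) -> None:
        """Explicit user action: only now does the candidate become a rule."""
        candidate = self.candidates[index]
        if not candidate.promoted:
            candidate.promoted = True
            self.rules.append(candidate.text)
```

Capturing never touches `rules`, so an offhand correction cannot leak into the knowledge base unless someone promotes it.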

Who should use it, and who should hold off?

A good fit:

People who use Claude Code for in-depth technical or academic research, where the research cycle is long and findings need to accumulate
People doing security research and threat modeling who need traceability and auditability
Enterprise knowledge base maintainers who want to systematize scattered documents
People who switch between multiple agent tools and need a neutral, portable knowledge layer

Hold off for now:

If you only do one-off Q&A, RAG or a direct search is lighter
If you do not use an agent such as Claude Code or Codex at all; AGENTS.md still works, but the experience is degraded
If you are sensitive to disk space: raw/ keeps the original sources, and deep research takes up a lot of space

llm-wiki fits best with long-running, multi-round research that needs to be trustworthy. For one-off tasks it is too heavy.

A few practical pitfalls

Know these pitfalls before installing:

iCloud users: watch out for permissions. Many people put their wiki directory in iCloud to sync across devices. macOS privacy controls get in the way: stat succeeds, but reading wikis.json fails with Operation not permitted. The fix is not to switch to a local path but to grant Full Disk Access to the app that launches the agent, then use /wiki config hub-path to set the iCloud path explicitly instead of relying on the default ~/wiki.
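The "stat succeeds but the read fails" symptom is easy to confirm with a short diagnostic. This is a generic sketch and not part of llm-wiki: `diagnose_access` is a hypothetical helper, and you would pass it the path of your own wikis.json.

```python
import os
from pathlib import Path


def diagnose_access(path: Path) -> str:
    """Distinguish a missing file, a blocked stat, a blocked read, and success."""
    try:
        os.stat(path)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "stat-denied"
    try:
        with open(path, "rb") as handle:
            handle.read(1)
    except PermissionError:
        # The symptom described above: metadata is visible, contents are blocked.
        return "stat-ok-read-denied"
    return "readable"
```

A result of `stat-ok-read-denied` points at a privacy or sandbox restriction rather than a wrong path, which is the case where granting Full Disk Access is the relevant fix.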

Sandboxed environments (nono) need extra permissions. The wiki directory lives outside the project, so the sandbox cannot read it by default. Claude Code and OpenCode need read permission on $HOME/.config/llm-wiki plus read and write permission on the wiki directory; Codex additionally needs read and write permission on $HOME/.codex (the plugin cache must be writable).

Do not use SSH for updates. When upgrading the plugin from an agent inside a sandbox, use gh auth login --web --git-protocol https to avoid getting stuck on SSH host-key prompts.

Version sync. If claude plugin update does not pull the new version (the marketplace cache is stale), the README provides a manual sync script that copies from a clone of the repository into the plugin cache directory; restart Claude Code for it to take effect.

Reference documents and links

llm-wiki project website: project showcase and introduction
GitHub: nvk/llm-wiki (710 stars, MIT license, Python, the core repository)
Claude Code plugin installation: one-line install with claude plugin install wiki@llm-wiki
How It Works documentation: parallel multi-agent research and the wiki compilation mechanism in detail
Research Modes: the three research modes deep / retardmax / thesis
Nono Sandbox Permissions: permission configuration for sandboxed environments
AGENTS.md universal protocol: a single-file solution for any agent

Does your agent still research from scratch every time? Try installing llm-wiki, then come back to the comments and share how it went. If you found this useful, give it a like so more people can see it.

author: itech001
source: Public Account: AI Artificial Intelligence Era
Share the most cutting-edge AI news and technical research every day.

This article was first published on the AI Artificial Intelligence Era public account. Please credit the source when reprinting.
