











Bookmark this first, and try it in Claude Code later.
Does this sound familiar? You use Claude Code or Codex for technical research, it searches the web every time you ask, and the next time you ask the same thing it starts from scratch. After three days of chatting the context overflows and the agent loses its memory. The information it dug up and the pitfalls it already hit are all gone.
I recently found a project that tackles this problem head-on: nvk/llm-wiki, an open-source tool that lets any AI agent compile a knowledge base. It has 710 stars, uses the MIT license, and is written in Python. The core idea fits in one sentence: let the agent condense its own research process into a traceable wiki, and consult that wiki next time instead of searching again.
It is inspired by the LLM wiki concept mentioned by Andrej Karpathy, and the implementation is engineering-minded: it is not a notebook for you to read, but a "second brain" for the agent's own use.
First, what it is not, to avoid misunderstandings:
The problem it really solves is that agent research is "one-time".
Here is a real scenario. You need to study the "threat model of hardware wallets" and ask Claude Code to look into it. It searches the web, reads articles, and gives you a review. But that review disappears when the chat ends. Next week you want to go on to "the specific attack surface of the ColdCard wallet", and the agent cannot remember what it checked last week, so it has to search from scratch.
What llm-wiki does: it compiles the results of the first round of research into a structured wiki directory (Markdown files + source references + index) stored locally. The next time the agent is asked a related question, it checks the wiki first, uses the result directly on a hit, and searches only on a miss. Every conclusion can be traced back to its original source.
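[Diagram: work cycle. A question is checked against the wiki first; a hit returns directly, a miss triggers parallel research whose results are compiled into the wiki.]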
The diagram above shows its work cycle: a question is checked against the wiki first and answered directly on a hit; on a miss, parallel research starts, and the results are automatically compiled into the wiki when the research finishes. The second time you ask the same question, you take the fast green path.
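The cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not llm-wiki's implementation: `slug`, `answer`, the `research` callback, and the one-file-per-question layout are all hypothetical names chosen for the example.

```python
import tempfile
from pathlib import Path

def slug(question: str) -> str:
    """Turn a question into a filesystem-friendly key (hypothetical scheme)."""
    cleaned = "".join(c if c.isalnum() else " " for c in question.lower())
    return "-".join(cleaned.split())

def answer(question: str, wiki_dir: Path, research) -> tuple[str, str]:
    """Check the local wiki first; only run research on a miss."""
    page = wiki_dir / f"{slug(question)}.md"
    if page.exists():
        # Hit: reuse the result compiled by an earlier session.
        return page.read_text(encoding="utf-8"), "hit"
    # Miss: do the expensive research, then store it for next time.
    text = research(question)
    wiki_dir.mkdir(parents=True, exist_ok=True)
    page.write_text(text, encoding="utf-8")
    return text, "miss"

if __name__ == "__main__":
    def fake_research(q: str) -> str:
        # Stand-in for real agent research (assumption for the demo).
        return f"# {q}\n\nSummary with sources."

    with tempfile.TemporaryDirectory() as tmp:
        wiki = Path(tmp) / "wiki"
        print(answer("ColdCard attack surface", wiki, fake_research)[1])
        print(answer("ColdCard attack surface", wiki, fake_research)[1])
```

The second call never reaches `fake_research`, which is the whole point of the design: the expensive step runs once and the compiled file is reused afterwards.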
Two designs in llm-wiki are the most interesting.
Parallel multi-agent research. When you issue a /wiki:research command, it does not send a single agent to search serially; it dispatches 5 to 10 agents to retrieve different subtopics in parallel. --deep mode runs 8, and the extreme mode --retardmax runs 10 agents at once and also "snowballs": agents are automatically dispatched to dig into the new subtopics discovered in each round.
How long can one command run? The parameter --min-time 1h means at least one hour of research. You can issue a command before leaving work and come back the next morning to a fully compiled wiki. This leans on Claude Code's 200K context window, which is what lets a single agent hold all the material needed for the next round of research.
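The fan-out and "snowball" behavior can be pictured with a small Python sketch. Everything here is an assumption made for illustration: `research_rounds`, the `investigate` callback, and the round limit are hypothetical, threads stand in for agents, and the time budget of `--min-time` is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor

def research_rounds(seed_topics, investigate, max_agents=8, max_rounds=3):
    """Run one worker per subtopic each round; subtopics discovered in a
    round become the work list of the next round ("snowballing")."""
    seen = set(seed_topics)
    notes = {}
    frontier = list(seed_topics)
    for _ in range(max_rounds):
        if not frontier:
            break
        # Fan out: up to max_agents subtopics are investigated concurrently.
        with ThreadPoolExecutor(max_workers=max_agents) as pool:
            results = list(pool.map(investigate, frontier))
        next_frontier = []
        for topic, (note, discovered) in zip(frontier, results):
            notes[topic] = note
            for sub in discovered:
                if sub not in seen:  # avoid researching a subtopic twice
                    seen.add(sub)
                    next_frontier.append(sub)
        frontier = next_frontier
    return notes

if __name__ == "__main__":
    # Toy topic graph standing in for what real agents would discover.
    graph = {"wallets": ["firmware", "supply chain"], "firmware": ["bootloader"]}
    found = research_rounds(["wallets"], lambda t: (f"notes on {t}", graph.get(t, [])))
    print(sorted(found))
```

`investigate` is expected to return a `(note, discovered_subtopics)` pair; the default of 8 workers simply mirrors the `--deep` figure quoted above.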
Traceable wiki compilation. When the research is over, it does not just hand you a summary; it compiles the results into a directory structure:
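topics/
  hardware-wallet-threat-models/
    raw/        # original sources (URLs, PDFs, screenshots)
    notes/      # the agent's working notes
    articles/   # compiled, structured articles
    inventory/  # asset inventory
    datasets/   # datasets (if any)
    .sessions/  # session snapshots + user feedback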
The key parts are raw/ and articles/: each compiled conclusion can be traced back to its original source file. This leads to the core difference between llm-wiki and RAG.
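To make the raw/ to articles/ link concrete, here is a minimal Python sketch that creates the layout listed above and reads back which raw files an article cites. The `[source: raw/...]` tag is a hypothetical citation syntax invented for this example; the article does not say how llm-wiki actually encodes its references.

```python
import re
import tempfile
from pathlib import Path

SUBDIRS = ("raw", "notes", "articles", "inventory", "datasets", ".sessions")
# Hypothetical citation syntax, assumed only for this sketch.
SOURCE_TAG = re.compile(r"\[source:\s*(raw/[^\]\s]+)\]")

def init_topic(root: Path, topic: str) -> Path:
    """Create the topic directory with the subfolders named in the article."""
    topic_dir = root / "topics" / topic
    for name in SUBDIRS:
        (topic_dir / name).mkdir(parents=True, exist_ok=True)
    return topic_dir

def sources_of(article: Path) -> list[str]:
    """Return the raw/ paths an article cites, in order of appearance."""
    return SOURCE_TAG.findall(article.read_text(encoding="utf-8"))

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        topic = init_topic(Path(tmp), "hardware-wallet-threat-models")
        (topic / "raw" / "vendor-docs.html").write_text("<html></html>", encoding="utf-8")
        page = topic / "articles" / "overview.md"
        page.write_text("Claim A. [source: raw/vendor-docs.html]\n", encoding="utf-8")
        print(sources_of(page))
```

Because both sides are plain files, following a conclusion back to its evidence is just a path lookup inside the topic directory.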
Many people's first reaction is: "Isn't this just RAG? Vector retrieval plus generation."
No. The difference lies in who organizes the knowledge and what form the knowledge takes.
| Dimension | Traditional RAG | llm-wiki |
|---|---|---|
| Knowledge form | Vector embeddings, scattered across a vector store | Markdown files, human-readable |
| Organizer | Offline embedding pipeline | Agent compiles in real time |
| Traceability | Difficult (vectors are unreadable) | Strong (each conclusion points to a raw source file) |
| Auditability | Barely possible | /wiki:audit is dedicated to trust audits |
| Manual intervention | Must change the vectors and re-embed | Edit the Markdown directly |
RAG knowledge is a set of vectors meant for machines. You cannot read it or edit it. llm-wiki knowledge is Markdown meant for people. After the agent compiles it, you open it in an editor and change it directly, and the next time the agent reads your revised version.
The biggest benefit of this design is auditability. When researching sensitive topics such as security threat models or medical options, /wiki:audit --project coldcard-threat-model checks each conclusion for the credibility of its source, for missing counter-evidence, and for a complete citation chain. RAG cannot do this, because you cannot audit a vector.
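One of the checks mentioned above, citation-chain completeness, is easy to picture in code. The following Python sketch is not what `/wiki:audit` does internally; it assumes a hypothetical `[source: raw/...]` tag per claim line and a hypothetical `audit_topic` helper, and it ignores source credibility and counter-evidence entirely.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical citation syntax, assumed only for this sketch.
SOURCE_TAG = re.compile(r"\[source:\s*(raw/[^\]\s]+)\]")

def audit_topic(topic_dir: Path) -> list[str]:
    """Report claim lines without a citation and citations whose file is missing."""
    problems = []
    for article in sorted((topic_dir / "articles").glob("*.md")):
        lines = article.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and headings
            cited = SOURCE_TAG.findall(line)
            if not cited:
                problems.append(f"{article.name}:{number}: no source cited")
            for ref in cited:
                if not (topic_dir / ref).is_file():
                    problems.append(f"{article.name}:{number}: missing {ref}")
    return problems

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        topic = Path(tmp)
        (topic / "articles").mkdir()
        (topic / "raw").mkdir()
        (topic / "raw" / "ok.txt").write_text("evidence", encoding="utf-8")
        (topic / "articles" / "a.md").write_text(
            "# Title\nClaim one. [source: raw/ok.txt]\nClaim two.\n", encoding="utf-8")
        for problem in audit_topic(topic):
            print(problem)
```

A check like this only works because the knowledge is stored as readable files; there is no equivalent walk over an embedding index.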
The cleverest design in llm-wiki is one behavioral layer with five runtime shells. Claude Code is the primary target (22K-token system prompt, 200K context), but the same wiki protocol can run on other agents:
| Runtime | Installation | System prompt size | Best for |
|---|---|---|---|
| Claude Code | claude plugin install wiki@llm-wiki | ~22K tokens | Full agent research |
| OpenAI Codex | codex plugin marketplace add nvk/llm-wiki | ~3K tokens | OpenAI ecosystem |
| OpenCode | Set the instructions URL in opencode.json | ~3K tokens | Multiple providers |
| Pi | --instructions SKILL.md | ~1K tokens | Local models |
| Any agent | Copy AGENTS.md | Varies | Generic fallback |
The underlying logic: Claude Code's skills/wiki-manager/SKILL.md is the "behavioral source of truth", and the Codex and OpenCode versions are generated from it automatically by sync scripts (sync-codex-plugin.sh, sync-opencode-plugin.sh) rather than maintained by hand as two separate codebases. Test scripts watch synchronization consistency and report an error as soon as the copies drift.
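The "single source of truth plus drift test" pattern can be sketched as follows. The real project uses shell scripts whose contents are not shown in the article, so this Python version is only an illustration of the idea: `render_variant`, its truncation rule, and `has_drifted` are hypothetical.

```python
import hashlib

def render_variant(source_text: str, max_chars: int) -> str:
    """Hypothetical generator: derive a smaller prompt from the source of truth."""
    header = "<!-- generated file, do not edit by hand -->\n"
    return header + source_text[:max_chars]

def digest(text: str) -> str:
    """Stable fingerprint used to compare two versions of a file."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def has_drifted(source_text: str, committed_variant: str, max_chars: int) -> bool:
    """True when the committed copy no longer matches a fresh regeneration."""
    return digest(render_variant(source_text, max_chars)) != digest(committed_variant)

if __name__ == "__main__":
    source = "Always cite raw sources. Check the wiki before searching."
    committed = render_variant(source, max_chars=40)
    print(has_drifted(source, committed, max_chars=40))
    print(has_drifted(source + " New rule.", committed, max_chars=400))
```

A consistency test built this way fails both when someone edits a generated copy by hand and when the source changes without the copies being regenerated.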
This means you do not have to change your knowledge base when you change agents. The wiki directory is neutral and follows you; the agent is just a front end that accesses it.
Install for Claude Code in one line:
claude plugin install wiki@llm-wiki
The command set of llm-wiki is deliberately small. These are the core commands:
# Research: create a topic wiki from scratch, with parallel agents running for 1 hour
/wiki:research "gut microbiome" --new-topic --min-time 1h
# Deep mode: 8 agents, running for 2 hours
/wiki:research "fasting" --deep --min-time 2h
# Thesis-style research: give a claim, gather evidence for and against, end with a verdict
/wiki:thesis "fiber reduces neuroinflammation via SCFAs"
# Collect: build a catalog with provenance (memes, tools, entities all work)
/wiki:collect "bitcoin memes" --wiki memes-bitcoin
# Query: ask the wiki, return directly on a hit
/wiki:query "How does fiber affect mood?"
/wiki:query "compare keto and mediterranean" --deep
# Ingest: manually add a source
/wiki:ingest https://example.com/article
# Audit: check a project's citation chain and credibility
/wiki:audit --project coldcard-threat-model
Besides the explicit commands with colons, there is also a fuzzy router. Typing /wiki what do we know about CRISPR? directly is recognized as a query, and /wiki add https://... is recognized as an ingest. This matches the way people naturally phrase things.
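A toy version of this routing is shown below in Python. It is a keyword-based stand-in written for illustration only: the article does not describe how llm-wiki's router decides (in an agent setting the model itself most likely does the interpretation), and `route` with its word lists is hypothetical.

```python
import re

URL = re.compile(r"https?://\S+")
INGEST_VERBS = {"add", "ingest", "save"}                    # assumed trigger words
QUESTION_WORDS = {"what", "how", "why", "who", "compare"}   # assumed trigger words

def route(message: str) -> tuple[str, str]:
    """Map free-form text after /wiki to a (command, argument) pair."""
    body = message.removeprefix("/wiki").strip()
    first = body.split(maxsplit=1)[0].lower() if body else ""
    url = URL.search(body)
    if url and first in INGEST_VERBS:
        return "ingest", url.group(0)
    if body.endswith("?") or first in QUESTION_WORDS:
        return "query", body
    return "unknown", body  # fall back to asking the user

if __name__ == "__main__":
    print(route("/wiki what do we know about CRISPR?"))
    print(route("/wiki add https://example.com/article"))
```

The explicit colon commands remain the unambiguous path; a router like this only needs to cover the common phrasings and hand everything else back to the user.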
Queries also come in different depths. --deep cross-references, pulling together and comparing relevant conclusions from multiple topics rather than looking up a single topic.
String all of this together, and a real research loop runs like this:
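[Diagram: research loop. Week 1: a research command runs parallel agents and compiles a local wiki. Week 2: a query hits the local wiki and returns with links to the original sources.]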
In the first week you issue the research command; eight agents dig in parallel for one hour, and the results are compiled into a wiki stored locally. In the second week you query directly, hit the local wiki, and get an answer in seconds, with links to the original sources.
One detail is worth mentioning: session snapshots and feedback capture (added in v0.11 and v0.12). The agent automatically redacts and saves the key content of each session in .sessions/. If you correct one of its preferences or reject something midway, that is also captured as a feedback candidate. The next time you open a new session, the agent rehydrates these snapshots first, which means it remembers your research history. These snapshots do not enter the official wiki automatically, though: you have to run @wiki feedback promote explicitly before they become rules, which keeps noise out of the knowledge base.
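The snapshot-then-promote flow can be sketched in Python as below. None of this reflects llm-wiki's actual file formats: the JSON layout, the redaction patterns, and the `save_snapshot` and `promote` helpers are hypothetical, chosen only to show why candidates stay separate from rules until promoted.

```python
import json
import re
import tempfile
from pathlib import Path

# Hypothetical redaction patterns: API-key-like tokens and email addresses.
SECRET = re.compile(r"(sk-[A-Za-z0-9]{8,}|[\w.+-]+@[\w-]+\.[\w.]+)")

def redact(text: str) -> str:
    """Replace anything that looks like a secret before it is stored."""
    return SECRET.sub("[REDACTED]", text)

def save_snapshot(sessions_dir: Path, session_id: str, summary: str, feedback: list[str]) -> Path:
    """Store a redacted session summary plus unpromoted feedback candidates."""
    sessions_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "summary": redact(summary),
        "feedback": [{"text": redact(item), "promoted": False} for item in feedback],
    }
    path = sessions_dir / f"{session_id}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path

def promote(snapshot: Path, index: int, rules_file: Path) -> None:
    """Copy one feedback candidate into the rules file, exactly once."""
    record = json.loads(snapshot.read_text(encoding="utf-8"))
    item = record["feedback"][index]
    if item["promoted"]:
        return  # already a rule; do not duplicate it
    with rules_file.open("a", encoding="utf-8") as handle:
        handle.write(f"- {item['text']}\n")
    item["promoted"] = True
    snapshot.write_text(json.dumps(record, indent=2), encoding="utf-8")

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        snap = save_snapshot(root / ".sessions", "s1",
                             "Used key sk-abcdefgh1234 for search",
                             ["prefer primary sources"])
        promote(snap, 0, root / "rules.md")
        print((root / "rules.md").read_text(encoding="utf-8"))
```

Keeping the promotion step explicit means a stray remark in one session never silently becomes a standing rule.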
Who it suits:
Who should hold off:
raw/ preserves the original sources, so in-depth research takes up a lot of disk space.
The best fit for llm-wiki is long-running, multi-round research that needs to be credible. For one-off tasks it is too heavy.
Know these pitfalls before installing:
iCloud users, watch the permissions. Many people put their wiki directory on iCloud to sync across devices, and macOS privacy controls get in the way: stat succeeds, but reading wikis.json reports "Operation not permitted". The fix is not to switch to a local path, but to grant Full Disk Access to the app that launches the agent, and then use /wiki config hub-path to set the iCloud path explicitly instead of relying on the default ~/wiki.
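The symptom described above (stat works, reading fails) can be told apart from a plain missing file with a short Python check. This is a generic diagnostic sketch, not part of llm-wiki; `diagnose` is a hypothetical helper and the path you pass in is whatever location your own setup uses.

```python
import os
import tempfile
from pathlib import Path

def diagnose(path: Path) -> str:
    """Distinguish a missing file from one that is visible but unreadable."""
    try:
        os.stat(path)  # metadata lookup only
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "stat-denied"
    try:
        with open(path, "rb") as handle:
            handle.read(1)  # actually touch the contents
    except PermissionError:
        # stat succeeded but the read was refused: the case described above.
        return "read-denied"
    return "ok"

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "wikis.json"
        print(diagnose(target))
        target.write_text("{}", encoding="utf-8")
        print(diagnose(target))
```

If the result is "read-denied" for a file inside an iCloud folder, the privacy setting of the launching app is a more likely cause than the file's own permission bits.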
Sandbox environments (nono) need extra permissions. The wiki directory lives outside the project, so a sandbox cannot read it by default. Claude Code and OpenCode need read permission on $HOME/.config/llm-wiki plus read and write permission on the wiki directory; Codex additionally needs read and write permission on $HOME/.codex (the plugin cache has to be writable).
Do not use SSH for updates. When upgrading the plugin from an agent inside a sandbox, use gh auth login --web --git-protocol https to avoid getting stuck on SSH host-key prompts.
Version synchronization. If claude plugin update does not pull the new version (the marketplace cache is stale), the README provides a manual sync script that copies from a clone of the repository into the plugin cache directory; restart Claude Code for it to take effect.
Install with one line: claude plugin install wiki@llm-wiki
Does your agent still research from scratch every time? Try installing llm-wiki, then come back to the comments and share how it went. If you find it useful, give it a like so more people can see it.
author: itech001
source: Public Account: AI Artificial Intelligence Era
This article was first published on the public account "AI Artificial Intelligence Era". Please credit the source when reprinting.