




















It’s time for context management to become Agentic.
The internal version has a better reading experience bytetech
Currently, most context management focuses on what should be put in and how to find the right things to put in, such as RAG, MEM, etc., with little discussion on how to clean it up.
Current cleanup mainly relies on reaching a certain threshold of the context window, such as 80%, triggering a compression to achieve cleanup. This might have been first introduced by Claude Code, and has now become a basic function.
Claude Code compression logic: Modify the system prompt, bring the full history of messages, and append a compression prompt.
As you can see, this is a non-cached, full history message call, which consumes quite a few Tokens.
1 |
|
Many people should have encountered the situation where after compression, a lot of things are lost, making the conversation difficult.
Theoretically, an Agent should be able to perceive “I need to compress” earlier, achieving more semantic/task-level context compression.
Ideally, an Agent can actively manage its own context, actively choosing to load and unload content, which should be very useful in long conversations/multi-topic scenarios.
Today’s Agents are like a program that can only allocate memory but cannot free memory, relying only on compression and then restarting.
I probably first saw the d-mail feature on kimi-cli.
When AI finds that it has done some low information density things, such as reading a large file where actually only a little bit is useful, it calls d-mail to perform time travel, letting the agent return to the context before reading, and brings a message telling its past self: read xxx and found xxx.

kimi docs https://github.com/MoonshotAI/kimi-cli/blob/main/src/kimi_cli/tools/dmail/dmail.md
Bytedance internal docs https://bytetech.info/articles/7571069998476165146
Public research materials https://leslieo2.github.io/posts/agent-control-via-timetravel-checkpoints/
Later I saw pi agent, where the design of session is very interesting.
The /tree command allows jumping to any node, optionally carrying a summary, which is very similar to d-mail.
The author wrote about the design philosophy here, recommended reading https://mariozechner.at/posts/2025-11-30-pi-coding-agent/
In case you didn’t know, openclaw is developed using pi.
Of course, currently many agents have context storage and jump functions.
/resume./fork command, which might be too obscure, and the fork command is not even introduced in the documentation.But anyway, these are all human-oriented, not agent-oriented.
Pi is very easy to develop extensions for, so simply put, find a way to give /tree to the AI.
I think the session tree is easy to analogize to a git workflow.
For example:
1 | ├─ user: "Develop feature X" |
The data on the left is what the pi tree can provide. To allow the agent to jump accurately, all messages are tagged with IDs, and the agent only needs to call the tree jump with the ID.
But in reality, after generating a lot of dialogue, the session tree will be huge. The AI will explode if it looks at the tree context at once, so streamlining is a must.
Pi tree contains all branches. A complete tree might look like this, allowing infinite nested backtracking, and even backtracking to a historical branch again.

But the agent actually only needs to perceive the content of the current session, and does not need to perceive other branches, because the content of all branch messages is already included in the SUM node.
Then at this point, you will find that only looking at current red line messages = all conversations of current session, and everything seems to return to the matter of summary compression.
But there is one difference: we need to attach jump markers to this summary.
Session Log = Tagged Summary Message of the current session.
Just like git log.
1 | 35d4182f (ROOT) |
When deciding to jump, it is git checkout, carrying a message.
1 | context_checkout("8c5265a1", "summary...") |
The ReAct loop will become like this:

It should be noted that the summary here is a non-cached, full-history call. If it is called in every ReAct loop, the cost should be explosive.
There are several improvement ideas:
Adjust trigger timing
Lower frequency? Trigger based on specific scenarios/rules?
In a sense, it seems to go back to the question of when and how to compress, and it is the same in terms of cost, just different in structural compression logic.
Build within the session
Summary and jump decisions continue in the current session, so caching can be used.
Although there are all historical conversations in the current session, there are no IDs. How to provide markers during summary?
2.1. Message content is ID. Agent: I want to go back to the time containing the “xxxxxx” message.
2.2. The agent builds it itself in history, recording key nodes during the dialogue. The skeleton map of key nodes = session log.
I think ‘b’ is more interesting.
Such a loop needs to be embedded in the agent’s dialogue.
1 | 35d4182f (ROOT) |
Still borrowing the concept of git, 3 tools are designed:
context_tag: git tag, mark nodes.context_log: git log, view context skeleton.context_checkout: git checkout, jump on the skeleton.In order to allow AI to better perceive and decide, in addition to the context skeleton, it should also perceive context usage, dialogue depth, how far away from the nearest tag, and remind to tag in time. A HUD is designed at the front, and context_log looks something like this:
1 | [Context Dashboard] |
There is still a lot of work to design a better context log:
In order to let the Agent use these tools better, a skill was also added:
d-mail jumping is going back to the past. I also want to go to the future.
For example, a simple bug fixing problem to simulate a multi-thread dialogue scenario.

The green line is dmail-like going back to the past. A purple line is also needed to go back to the future. All time travel is lossless.
The implementation is also quite simple. Mark where it came from in all SUMs, so you can checkout back at any time.
1 | 35d4182f (ROOT) |
Another advantage of session tree is that as long as the branch is not too old, it is in the cache.
There is still a lot of work to be done for better time travel:
Just developed, don’t know how much improvement it can bring, need more business verification, welcome to try.
1 | npm install -g @mariozechner/pi-coding-agent |
https://github.com/ttttmr/pi-context
Theoretically, it can also be migrated to other tools, after all, they all have session storage functions.
I developed other pi extensions, welcome to use
https://github.com/ttttmr/pi-web-search
Directly reuse antigravity/gemini-cli/gemini for search
https://github.com/ttttmr/pi-wakatime
Integrate wakatime
https://github.com/ttttmr/planning-with-files/tree/master/.pi/skills/planning-with-files
Ported the plan skill, already merged into the main repository
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。