Kimi's Integration, Manus's Layering
阮一峰 · 2026-01-29 · via 阮一峰的网络日志

I.

The day before yesterday, Kimi released its flagship model K2.5 without any prior announcement. In China, Kimi is a relatively low-key company that gets little public attention, but its products are far from weak.

Six months ago, its K2 model made a big splash, earned high praise, and was widely regarded as being in the global top tier. So when the new K2.5 version came out, it immediately made headlines and became a hot topic on platforms like Hacker News and Twitter.

Renowned developer Simon Willison wrote an in-depth introduction the same day. However, the truly interesting part this time isn't the model itself, but something else Kimi did.

II.

K2.5 is very strong, improving on K2 in every respect. The benchmark scores published by Kimi itself are mostly in the global top three, and some are in first place (see the release notes).

According to the LMArena ranking (LMArena has since been renamed arena.ai), Kimi K2.5's coding capabilities are the best among all open-source models, and in the overall ranking it is second only to Claude and Gemini (see image below).

However, the biggest highlight is not the model itself, but that Kimi also released an Agent built on this model.

In other words, two things were actually released at the same time: the K2.5 model and the K2.5 Agent. K2.5 is the underlying model, and K2.5 Agent is a web application aimed at end users.

As far as I can recall, this is the first time a major model company has done this. Previous releases were of models only; I have never seen anyone release a model and an Agent together.

Put another way, Kimi has taken the path of integration.

III.

As everyone knows, large models are the underlying processing engines, and Agents are upper-layer applications for users.

Their relationship takes essentially one of two forms: layered development and integration. In the former, the large model and the agent are developed separately; in the latter, they are developed as a single, unified whole.
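To make the distinction concrete, here is a minimal sketch in Python. Every name in it (`Model`, `EchoModel`, `LayeredAgent`, `IntegratedAgent`) is hypothetical and made up for illustration; it does not reflect how Manus or Kimi are actually built. The layered agent accepts any model that satisfies an interface, while the integrated agent ships with its own model.

```python
from dataclasses import dataclass
from typing import Protocol


class Model(Protocol):
    """Hypothetical interface: anything that turns a prompt into text."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Stand-in for a real model; it only echoes the prompt with its name."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class LayeredAgent:
    """Layered: the agent is built separately and accepts any model backend."""

    model: Model

    def run(self, task: str) -> str:
        return self.model.complete(f"plan and execute: {task}")


class IntegratedAgent:
    """Integrated: the agent and its model are shipped as one unit."""

    def __init__(self) -> None:
        self._model = EchoModel("in-house")

    def run(self, task: str) -> str:
        return self._model.complete(f"plan and execute: {task}")
```

With the layered design, swapping the backend is a one-line change (`LayeredAgent(EchoModel("third-party"))`); with the integrated design, the model is fixed, which is what lets the two be tuned together.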

Manus, recently acquired by Meta at a high price, is the best example of layered development.

Manus built an independent agent on top of Anthropic's Claude model, and was eventually acquired.

Its success encouraged many people to get into agent development. Building a model costs more than most can afford, whereas building an agent is relatively cheap, so even the smallest developers can manage it.

Kimi's attempt this time takes a big step in the other direction by combining the large model and the Agent. After all, it is more convenient for a large model company to build the Agent itself, and doing so helps it expand market share and attract users.

It's hard to say which of these two approaches is better. It is like smartphones: third-party apps on Apple and Android devices can better meet users' varied needs, while built-in apps integrate fully with the operating system and feel smoother to use.

IV.

Plenty of model tests have already been done, so I'll test the newly released K2.5 Agent instead.

It's clear that Kimi values the Agent highly and has put a lot of effort into it. Most of the release notes are devoted to introducing the Agent's features.

Among them, a few features are fairly conventional:

(1) Kimi Office Agent: expert-level generation of Word, Excel, and PowerPoint files.

(2) Kimi Code: a command-line tool for code generation, comparable to Claude Code.

(3) Long-range operation: capable of completing up to 1,500 steps in one go, which clearly targets Manus, known for its multi-step operations.

What particularly interests me are two brand-new features that I am seeing for the first time; other companies don't seem to have mentioned anything like them.

(4) Visual programming: using the model's visual capabilities to understand images and videos, which then drive the programming. Simply upload design drafts or videos of web pages, and it can generate the web pages.

(5) Agent swarm: when faced with a complex task, the Agent automatically calls up to 100 sub-agents to form a cluster and execute tasks concurrently, such as concurrent downloads and concurrent generation.
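The swarm idea, many sub-agents working on pieces of a task at once, can be sketched with Python's `asyncio`. This is only an illustration under assumptions: the release notes as described here give the cap of 100, but `sub_agent` and `run_swarm` are hypothetical names, and the worker is a placeholder rather than a real model call. Kimi's actual scheduling is not described in this article.

```python
import asyncio

MAX_AGENTS = 100  # cap mentioned in the article; everything else is hypothetical


async def sub_agent(task: str) -> str:
    """Placeholder worker; a real sub-agent would call a model or a tool here."""
    await asyncio.sleep(0)  # yield control, standing in for network or tool I/O
    return f"done: {task}"


async def run_swarm(tasks: list[str], max_agents: int = MAX_AGENTS) -> list[str]:
    """Run sub-tasks concurrently, with at most max_agents active at once."""
    limit = asyncio.Semaphore(max_agents)

    async def guarded(task: str) -> str:
        async with limit:  # wait for a free slot before starting
            return await sub_agent(task)

    # gather returns results in the same order as the input tasks
    return await asyncio.gather(*(guarded(t) for t in tasks))
```

Calling `asyncio.run(run_swarm([...]))` with more than 100 tasks still completes all of them; the semaphore simply makes the extra ones wait for a free slot.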

Due to space constraints, I'll only cover my test results for visual programming.

V.

First, open the Kimi official website; K2.5 is already live and can be used directly (see image below).

Note that the model needs to be switched to Agent mode, labeled "K2.5 Agent".

My first test was animation generation: upload a video of an animation and let the Agent recreate it. Below is the original animation, created with the Lottie library.

After uploading, enter the prompt in the web interface:

Reproduce the animation effect in the video exactly, as a web page

The model quickly inferred that this was an animation of an orange cat playing with a ball. Then, to my surprise, it took a screenshot of every frame of the animation in order to recreate it.

Finally, it used Python to generate an SVG animation file.
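For readers unfamiliar with this technique, here is a minimal sketch of generating an animated SVG from Python. It is not the code Kimi produced, which is not reproduced in this article; the function name `rolling_ball_svg` and all of its parameters are made up for illustration. It relies on SVG's built-in `<animate>` element to move a circle back and forth.

```python
def rolling_ball_svg(width: int = 200, height: int = 120, duration_s: float = 1.5) -> str:
    """Return an SVG document in which a ball moves left to right and back."""
    radius = 10
    start_x, end_x = radius, width - radius  # keep the ball inside the canvas
    cy = height - radius  # rest the ball on the bottom edge
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">\n'
        f'  <circle cx="{start_x}" cy="{cy}" r="{radius}" fill="orange">\n'
        # animate the horizontal position: start -> end -> start, looping forever
        f'    <animate attributeName="cx" values="{start_x};{end_x};{start_x}" '
        f'dur="{duration_s}s" repeatCount="indefinite"/>\n'
        f'  </circle>\n'
        f'</svg>\n'
    )
```

Writing the returned string to a `.svg` file and opening it in a browser should show the ball moving. A character like the cat would need many such shapes, each with its own animation.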

The animations of the tail, the eyes, and the rolling ball were all accurately reproduced. Unfortunately, the kitten itself is stitched together from multiple SVG shapes, so it can't be made to look very realistic.

You can visit this website address to check the final result and the web page code.

VI.

The second test is to upload a video of a website and let the model generate the website.

I picked a video of a designer's website more or less at random on Bilibili.

You can visit this website to see what the original web page looks like.

I uploaded the video to the model and then asked it to "restore the website inside the video."

The generated result (below) far exceeded my expectations. The reproduction is extremely faithful and almost ready to launch.

You can go to this website to view the generated result.

VII.

After this simple testing, my evaluation is that the K2.5 Agent's visual programming is not just a gimmick: it genuinely understands visual input and can generate usable results.

So far, Kimi's attempt at integrating "model + Agent" looks successful. On the one hand, a powerful Agent unlocks the capabilities of the underlying model and makes it easier for users to use. On the other hand, the Agent opens up more use cases for the model, which attracts more users and helps promote the model itself.

Finally, in the current landscape of international competition, integration has one more advantage.

Manus relied on an American model and ultimately had to register the company overseas, while Kimi's underlying model is self-developed and open-source, so it carries no risk of being cut off by a supplier.

(End)