I.
The day before yesterday, Kimi suddenly released its flagship model K2.5 without any prior announcement. In China, Kimi is a relatively low-key company that gets little public attention, but its products are far from weak.

Six months ago, the K2 model made a big splash and was widely praised as being in the top tier globally. So when the new version, K2.5, came out, it immediately made headlines and became a hot topic on Hacker News and Twitter.
Renowned developer Simon Willison wrote an in-depth introduction the same day. However, the truly interesting part this time isn't the model itself, but something else Kimi did.

II.
K2.5 is very strong, improving on K2 in every respect. Most of the official benchmark scores rank in the global top three, and some rank first (see the release notes).
According to the LMArena (now renamed arena.ai) ranking, Kimi K2.5's coding ability is the best among all open-source models and second only to Claude and Gemini overall (see image below).

However, the biggest highlight is not the model itself, but that Kimi also released an Agent based on this model.
In other words, two things were released at the same time: the K2.5 model and the K2.5 Agent. K2.5 is the underlying model, and K2.5 Agent is a web application aimed at end users.

As far as I can recall, this is the first time a major model company has done this. Previous releases covered only the models themselves; I have never seen anyone release a model and an Agent together.
Put another way, Kimi has taken the path of integration.
III.
As everyone knows, large models are the underlying processing engines, and Agents are upper-layer applications for users.
Their relationship takes essentially two forms: layered development and integration. In the former, the large model and the agent are developed separately; in the latter, they are developed as a single, unified whole.
Manus, recently acquired by Meta at a high price, is the best example of layered development.

Manus built an independent agent on top of Anthropic's Claude model and was eventually acquired.
Its success encouraged many people to take up agent development, because building a model costs more than most can afford, while building an agent costs relatively little and is within reach of even the smallest developers.
Kimi's attempt this time takes a big step in the other direction by combining the large model and the Agent. After all, it is more convenient for a large model company to build the agent itself, and doing so helps it expand market share and attract users.
It's hard to say which of the two approaches is better. It is like smartphones: third-party apps on iOS and Android can better meet users' needs, while built-in apps integrate fully with the operating system and run more smoothly.
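To make the distinction concrete, here is a minimal Python sketch of the two arrangements. All class and method names (`ChatModel`, `VendorModel`, `LayeredAgent`, `IntegratedAgent`) are hypothetical illustrations, not the actual design of Manus or Kimi. The only point is where the model dependency lives: a layered agent receives its model from outside, while an integrated agent ships with it.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Minimal interface an agent needs from any underlying model (hypothetical)."""

    def complete(self, prompt: str) -> str: ...


class VendorModel:
    """Stand-in for a model served by some vendor; it just echoes the prompt."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class LayeredAgent:
    """Layered: written against an interface, with the model supplied from outside."""

    def __init__(self, model: ChatModel) -> None:
        self.model = model

    def run(self, task: str) -> str:
        return self.model.complete(f"plan and execute: {task}")


class IntegratedAgent:
    """Integrated: the agent creates and owns its model; callers cannot swap it."""

    def __init__(self) -> None:
        self._model = VendorModel("in-house")

    def run(self, task: str) -> str:
        return self._model.complete(f"plan and execute: {task}")


layered_out = LayeredAgent(VendorModel("third-party")).run("book a flight")
integrated_out = IntegratedAgent().run("book a flight")
print(layered_out)
print(integrated_out)
```

In this sketch the layered agent can change models by passing a different object to its constructor, whereas the integrated agent trades that flexibility for tighter control over the model it depends on.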
IV.
Models have been tested plenty already, so let me test the K2.5 Agent released this time.
It's clear that Kimi takes the Agent very seriously and has put a lot of effort into it. Most of the release notes are devoted to introducing the Agent's features.
Among them, a few features are fairly conventional:
(1) Kimi Office Agent: expert-level generation of Word, Excel, and PowerPoint files.
(2) Kimi Code: a command-line tool for code generation, comparable to Claude Code.
(3) Long-range operation: capable of completing up to 1,500 steps in one go, clearly aimed at Manus, which is known for its multi-step operations.
What interests me most are two brand-new features that I am seeing for the first time; other companies don't seem to have mentioned them.
(4) Visual programming: uses the model's visual capabilities to understand images and videos and then program from them. Just upload design drafts and web page videos and it can generate web pages.
(5) Swarm (agent swarm): when faced with a complex task, the Agent automatically calls up to 100 sub-agents to form a cluster and execute work concurrently, such as concurrent downloads and generation.
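The swarm idea in (5) is a bounded fan-out: many sub-tasks run concurrently, with a cap on how many are in flight at once. The sketch below shows that pattern with Python's `asyncio`. It is an illustration under assumptions, not Kimi's implementation: `run_sub_agent` and `run_swarm` are hypothetical names, and the sub-agent is a stub that does no real work. Only the cap of 100 comes from the release notes.

```python
import asyncio

MAX_AGENTS = 100  # cap mentioned in the article; everything else here is hypothetical


async def run_sub_agent(task: str) -> str:
    """Stub for one sub-agent; a real one would call a model or a tool."""
    await asyncio.sleep(0)  # yield control, standing in for I/O-bound work
    return f"done: {task}"


async def run_swarm(tasks: list[str], max_agents: int = MAX_AGENTS) -> list[str]:
    """Run all tasks concurrently with at most `max_agents` in flight."""
    limiter = asyncio.Semaphore(max_agents)

    async def guarded(task: str) -> str:
        async with limiter:
            return await run_sub_agent(task)

    # gather returns results in the same order as the input tasks
    return await asyncio.gather(*(guarded(t) for t in tasks))


results = asyncio.run(run_swarm([f"download page {i}" for i in range(250)]))
print(len(results), results[0])
```

With 250 tasks and a cap of 100, the semaphore keeps the excess tasks waiting until a slot frees up, which is one simple way to keep a large batch from overwhelming whatever the sub-agents call.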
Due to space constraints, I'll only cover my "visual programming" test results.
V.
First, open the Kimi official website; K2.5 is already live and can be used directly (see image below).

Note that the model needs to be switched to Agent mode ("K2.5 Agent").

My first test was animation generation: I uploaded an animation video and had the model reproduce it. Below is the original animation, created with the Lottie library.

After uploading, enter the prompt in the web interface:
Reproduce the animation effect in the video exactly as it appears on the web page
The model quickly inferred that this was an animation of an orange cat playing with a ball. Then, it surprisingly took screenshots of every frame of the animation to recreate it.

Finally, it used Python to generate an SVG animation file.

The animations of the tail, the eyes, and the small rolling ball have all been reproduced accurately. Unfortunately, the kitten itself is composed of multiple SVG shapes stitched together, so it can't look very realistic.
You can visit this website address to check the final result and the web page code.
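For readers unfamiliar with generating an SVG animation from Python, here is a minimal sketch of the general technique. It is not the code the Agent produced: the function name `rolling_ball_svg` and all dimensions are made up for illustration. It draws a single ball that rolls back and forth using the standard SVG `<animate>` element.

```python
import xml.etree.ElementTree as ET


def rolling_ball_svg(width: int = 300, height: int = 120, radius: int = 15,
                     seconds: float = 2.0) -> str:
    """Return an SVG document in which a ball rolls left to right and back."""
    floor_y = height - radius          # the ball rests on the bottom edge
    start_x, end_x = radius, width - radius
    return f"""<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">
  <circle cx="{start_x}" cy="{floor_y}" r="{radius}" fill="orange">
    <animate attributeName="cx" values="{start_x};{end_x};{start_x}"
             dur="{seconds}s" repeatCount="indefinite"/>
  </circle>
</svg>
"""


svg_text = rolling_ball_svg()
root = ET.fromstring(svg_text)  # sanity check: the output is well-formed XML
print(svg_text)
```

Saving `svg_text` to a `.svg` file and opening it in a browser should show the ball moving. A character like the kitten would be built the same way, from many such shapes, each with its own animation, which is why the result looks assembled rather than realistic.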
VI.
The second test was to upload a video of a website and have the model generate the website.
I picked a video of a designer's website more or less at random on Bilibili.
You can visit this website to see the original web page.

I uploaded the video to the model and then requested "restore the website inside the video."
The generated result (below) far exceeded my expectations, with extremely high restoration accuracy, almost ready for launch.


Everyone can go to this website to view the generated result.
VII.
After simple testing, my evaluation is that Kimi K2.5 Agent's "visual programming" is not just a gimmick; it indeed has visual comprehension capabilities and can generate usable results.
Currently, it seems that Kimi's attempt at integrating "model + Agent" is successful. On one hand, the powerful Agent unleashes the capabilities of the underlying model, making it easier for users to use. On the other hand, the model expands various use cases through the Agent, attracting more users and benefiting its own promotion.
Finally, in the current international competition landscape, integration has an additional advantage.
Manus relies on an American model and ultimately had to register its company overseas, while Kimi's underlying model is self-developed and open-source, leaving it completely free from the risk of being cut off.
(End)












