AI native Workspace might be the next stage for agents

I. The Form of Intelligent Agents

Let me ask everyone a question: what is the product form of AI?

Large models are just underlying processing engines; you always need an application-level product to meet user needs. This application layer of AI is referred to as an "intelligent agent" (agent).

So, the question becomes, what should an "intelligent agent" look like?

Early intelligent agents were just conversational applications (as shown in the image above). Later, they incorporated reasoning so they could think through complex problems.

After that, they evolved towards specialized fields, giving rise to programming agents (coding agents), image agents, video agents, and more. Others integrated with MCP to gain the ability to operate external applications, such as generating Office documents or controlling browsers.

These forms are basically mature, and many companies are now exploring what the next generation of intelligent agents will look like.

I recently started using the newly released AI Native Workspace, and I am happy to say that this might be the answer.

II. Cowork and Skill

This new product also incorporates two new concepts recently proposed by Anthropic: Cowork and Skill.

Cowork, simply put, is a "computer operation assistant." It is essentially the graphical interface version of a programming agent: users who don't understand programming express their needs in natural language, and the AI generates the underlying code and executes it to operate the local computer automatically and complete the task.

Skill is even simpler. It is a preset prompt, equivalent to a "user manual," which describes in detail to the AI how to complete a specific task. You can understand it this way: each Skill is an expert, giving AI specific skills in a particular field.
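To make the "preset prompt" idea concrete, here is a minimal sketch of how a skill's instructions could be placed ahead of a user's request. The `Skill` class, its fields, the `build_prompt` function, and the sample instructions are all hypothetical and made up for illustration; this is not the actual API or file format of MiniMax or Anthropic.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    """A hypothetical skill: a name plus preset instructions for one kind of task."""
    name: str
    description: str
    instructions: str


def build_prompt(skill: Skill, user_request: str) -> str:
    """Put the skill's instructions ahead of the user's request."""
    return (
        f"You are acting as the expert '{skill.name}'.\n"
        f"{skill.description}\n\n"
        f"Instructions:\n{skill.instructions}\n\n"
        f"User request:\n{user_request}"
    )


# Invented example content, not the real preset expert's instructions.
logo_skill = Skill(
    name="Icon Maker",
    description="Designs simple logos and icons.",
    instructions="Ask for a design style first, then produce two variants.",
)
prompt = build_prompt(logo_skill, "Create a logo of a panda eating ice cream.")
print(prompt)
```

The point of the sketch is only that the "manual" travels with every request, so the model does not have to be told again how the task should be done.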

Of these two things, one is an operation assistant and the other is an expert mode. The former lets AI operate the computer, and the latter gives AI specialized skills.

What will happen when they are combined?

MiniMax's AI Native Workspace is such a product. It is an exploratory attempt to combine Cowork and Skill, offering both capabilities in a completely new product form.

Its desktop version (desktop) provides Cowork capabilities, while expert mode (experts) provides Skill capabilities.

III. Desktop Operation Assistant

Next, I will demonstrate how it differs from traditional intelligent agents.

Its desktop client is positioned as an "AI native workspace," with the following capabilities.

Directly access local files: able to read and write, as well as automatically upload or download files.

Automated workflows: capable of breaking down tasks and running web automation.

Deliver professional results: After execution, high-quality deliverables can be generated, such as Excel spreadsheets, PowerPoint presentations, and formatted documents.

Long-running tasks: complex tasks can run for a long time without being affected by dialogue timeouts or context limitations.

Note that since it can operate computers and communicate with the internet, you must specify a directory before execution to prevent reading or writing to directories that should not be accessed, and you must have backups to prevent original files from being deleted or modified.
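The idea of confining operations to one directory can be sketched as a simple path check. This is only an illustration of the principle, not how the product itself enforces the restriction (the article does not say); the function name `is_inside` and the example paths are made up.

```python
from pathlib import Path


def is_inside(workdir: str, target: str) -> bool:
    """Return True if target, once resolved, lies inside workdir."""
    root = Path(workdir).resolve()
    # Relative targets are interpreted against the working directory;
    # an absolute target replaces it, so it is checked as given.
    candidate = (root / target).resolve()
    # Inside means: the root itself, or a path that has the root as an ancestor.
    return candidate == root or root in candidate.parents


print(is_inside("/tmp/invoices", "2024/jan.pdf"))
print(is_inside("/tmp/invoices", "../../etc/passwd"))
```

Resolving the path before comparing matters: a target such as `../../etc/passwd` looks like it starts inside the working directory, but it points outside once the `..` segments are collapsed.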

First, go to the official website to download the desktop client, which is available for Windows and Mac. Newly registered users can currently try it for free for 3 days.

After installation, run it directly to enter the task interface, which is a traditional dialog box.

Once you specify a run directory, it enters "workbench" mode and can operate on that directory. The software pops up a warning message about the risk.

Now you can have it perform various tasks. For example, I had it organize PDF invoices from various online services and then generate a summary Excel document.

It automatically creates a Python virtual environment in the current directory, then generates and executes a Python script.

The Excel file is ready shortly afterwards.
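The script the tool generated is not shown here, but the general shape of such a task can be sketched. The example below makes several assumptions: the invoice text has already been extracted from the PDFs (real PDF parsing needs a third-party library), the summary is written as CSV, which Excel can open, rather than as a native Excel file, and the sample data, field labels, and function names are all invented.

```python
import csv
import io
import re

# Made-up sample text standing in for what a PDF text extractor might return.
SAMPLE_INVOICES = {
    "cloud-2024-01.pdf": "Invoice No: A-1001\nDate: 2024-01-15\nTotal: 128.50",
    "domain-2024-02.pdf": "Invoice No: B-2002\nDate: 2024-02-03\nTotal: 12.00",
}

# Hypothetical field labels; real invoices will differ from vendor to vendor.
FIELDS = {
    "number": re.compile(r"Invoice No:\s*(\S+)"),
    "date": re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})"),
    "total": re.compile(r"Total:\s*(\d+(?:\.\d+)?)"),
}


def parse_invoice(filename: str, text: str) -> dict:
    """Pull the number, date, and total out of one invoice's text."""
    row = {"file": filename}
    for key, pattern in FIELDS.items():
        match = pattern.search(text)
        row[key] = match.group(1) if match else ""
    return row


def summarize(invoices: dict) -> str:
    """Return a CSV summary with one row per invoice plus a grand total."""
    rows = [parse_invoice(name, text) for name, text in sorted(invoices.items())]
    grand_total = sum(float(r["total"]) for r in rows if r["total"])
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["file", "number", "date", "total"])
    writer.writeheader()
    writer.writerows(rows)
    writer.writerow(
        {"file": "TOTAL", "number": "", "date": "", "total": f"{grand_total:.2f}"}
    )
    return out.getvalue()


print(summarize(SAMPLE_INVOICES))
```

Missing fields are left blank instead of raising an error, so one oddly formatted invoice does not stop the whole summary.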

Similarly, various file organization tasks can be handed over to it, such as sorting photos, renaming files, and more.
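A file organization task of this kind might boil down to a script like the following sketch. The extension-to-folder mapping and the function names are hypothetical, and the demo runs in a temporary directory so that no real files are touched.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from file extension to destination folder.
FOLDERS = {".jpg": "photos", ".png": "photos", ".pdf": "documents"}


def plan_moves(directory: str) -> list:
    """List (source, destination) pairs without touching any file."""
    root = Path(directory)
    moves = []
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue  # leave existing subfolders alone
        folder = FOLDERS.get(path.suffix.lower(), "other")
        moves.append((path, root / folder / path.name))
    return moves


def apply_moves(moves: list) -> None:
    """Carry out a plan, refusing to overwrite existing files."""
    for source, destination in moves:
        if destination.exists():
            raise FileExistsError(destination)
        destination.parent.mkdir(parents=True, exist_ok=True)
        source.rename(destination)


# Demo in a throwaway directory with three dummy files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.JPG", "b.pdf", "notes.txt"):
        (Path(tmp) / name).write_text("demo")
    apply_moves(plan_moves(tmp))
    organized = sorted(
        p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*") if p.is_file()
    )
    print(organized)
```

Separating the plan from its execution makes it possible to review what will be moved before anything changes, which fits the earlier advice about keeping backups.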

It can also perform web automation, such as browsing a webpage automatically, extracting information, and summarizing the content.

IV. Expert Systems

The section above demonstrated its workbench functions, where it acts as a "digital employee." Now let's take a look at its "expert system."

So-called "expert systems" involve injecting specific prompt files to expand the skills of an intelligent agent, equivalent to injecting deep knowledge and capabilities. Users can also upload private knowledge bases.

Open the web version and click "Explore Experts" in the left sidebar.

The system comes with some "preset experts" that can be used directly.

I chose the "Icon Maker" provided by the system, which is a skill for making logos, to see how it works.

I asked it to create a "Panda Eating Ice Cream" logo, and the system prompted me to select a design style.

Finally, two files (sitting posture and standing posture) were generated for selection, and the results are quite good.

V. Create a New Skill

In addition to the preset experts, the system also allows you to create "My Experts," which are custom skills of a certain type.

You need to enter the capability description and instructions, and you can also add corresponding MCP, SubAgent, environment variables, Supabase database, and so on.

I directly entered the Skill file provided by Anthropic to see the effect.

I selected the frontend-design (Frontend Design) skill. After I entered it, it appeared on the "My Experts" page.

Note that the system currently only supports entering skill description files and does not support uploading static resource files (assets). I hope this can be added later.

After selecting this expert, I asked it to generate an algorithm visualization page.

"Create a sorting algorithm visualization website that lists animations of common sorting algorithms. Selecting a particular algorithm will display its animation effect."

The generation process took about ten minutes. The system generated animations for ten sorting algorithms and deployed the site online directly.
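The generated site's code is not reproduced here, and a web page would normally do this in the browser. Still, the core idea behind a sorting animation can be sketched in a few lines: run the algorithm and record a snapshot of the list at every step, then draw the snapshots one after another. The sketch below does this for bubble sort; the function name `bubble_sort_frames` is my own.

```python
def bubble_sort_frames(values):
    """Yield the initial list, then a snapshot after every swap."""
    items = list(values)  # work on a copy so the input is left untouched
    yield list(items)
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
                yield list(items)
        if not swapped:
            break  # a full pass without swaps means the list is sorted


frames = list(bubble_sort_frames([3, 1, 2]))
for frame in frames:
    print(frame)
```

Each frame is an independent copy of the list, so an animation layer can replay, pause, or step backwards through them without re-running the sort.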

I adjusted the animation colors again later. You can visit this website to check out the effect; it's pretty cool.

VI. Summary

AI Native Workspace brings the AI agent to the local computer, enabling automated operations, and it also incorporates a skill interface that allows external knowledge and capabilities to be injected. Moreover, all operations can be completed through natural language dialogue, so the demands on the user are low.

This instantly opens up the space of possibilities for AI agents. The tasks they can accomplish will no longer be limited by the capabilities of the model, but only by our imagination.

I believe that this product represents the direction of AI agent development in the next stage. It will open up many brand new possibilities for us to explore.

(End)
