laike9m's blog

2025-10-20 · via laike9m's blog

I Tried the Agentic Browsers

Today I tried three agentic browsers: Comet, Dia, and Fellou. I gave each of them a real task I wanted to automate: extract data from a webpage and write it to a Google Sheet in another tab. The task requires clicking buttons to reveal the data, which I thought would be the most challenging part.

The results were quite disappointing.

  • Comet: decent at parsing the page and extracting data (including clicking buttons), but it froze at the very end. Maybe it exceeded the context limit or hit some other issue. As for writing to the Google Sheet, it said it couldn't do it.
  • Fellou: page parsing was poor, and it also froze. However, it could at least interact with the Google Sheet, although its CPU usage spiked to 20%.
  • Dia: can extract information that's already on the webpage, but it cannot click the buttons or write to a Google Sheet.

Overall, I see two major hurdles for agentic/AI browsers:

  • Problem 1: Webpages are not designed for AI.

    Webpages contain a massive amount of redundant information, which wastes a huge amount of context, interferes with AI's judgment, and slows down execution. Meanwhile, services that convert webpages to Markdown don't handle dynamic content well. I feel this ultimately needs to be solved by websites and content providers, for example, by offering paid, AI-friendly Markdown APIs. Trying to handle this entirely on the client-side is extremely difficult.

  • Problem 2: Screen Reading.

    On a Mac, for instance, reading screen content relies on the accessibility API. This creates a problem: if the AI wants to "see" the webpage (rather than just its HTML), the browser must be the foreground app; it can't just run in the background. But the whole point of using an AI assistant is to save time so I can do other things, right? If I still have to keep the browser in the foreground, I might as well just do it myself. Allowing the AI to take screenshots through a browser API might solve this to some extent, but it would be less precise than the accessibility API.
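
To make the redundancy point in Problem 1 concrete, here is a minimal sketch of the kind of client-side reduction an agent might attempt before handing a page to a model. It uses only Python's standard `html.parser`; the `SKIP_TAGS` set, the `TextExtractor` class, and the `compact_text` helper are my own illustrative names, not anything these browsers are known to use. It also only sees the static HTML it is given, so content revealed by clicking a button would still be missing, which is exactly the dynamic-content gap described above.

```python
from html.parser import HTMLParser

# Assumption: content inside these tags rarely helps an agent understand a page.
SKIP_TAGS = {"script", "style", "noscript", "svg", "template"}


class TextExtractor(HTMLParser):
    """Collects visible text and drops everything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while we are inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = " ".join(data.split())  # collapse runs of whitespace
            if text:
                self.chunks.append(text)


def compact_text(html: str) -> str:
    """Return the page's text content, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = (
    "<html><head><style>p{color:red}</style>"
    "<script>var x = 1;</script></head>"
    "<body><h1>Prices</h1><p>Widget:  <b>$5</b></p></body></html>"
)
print(compact_text(page))
```

Even on this toy page the output is a fraction of the input, but the structure (which text was a heading, which was a price) is lost, which is one reason a server-provided, AI-friendly format looks more attractive than client-side cleanup.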

If we're just talking about integrating an AI sidebar, Comet already does a great job, and Chrome is about to catch up. But they are still a long way from being true, general-purpose agentic browsers. I think the future directions might be:

  • From the client/browser perspective: Focus on optimizing the most common, daily tasks (e.g., writing emails, summarizing news) and polishing the user experience.
  • Combine agents with old-school "record & replay", allowing users to create their own workflows more easily with the help of AI.
  • From the server/website perspective: Explore business models for providing AI-friendly content. I believe there is a huge demand and a large market.
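
As a rough illustration of the "record & replay" idea, the sketch below models a workflow as a list of recorded steps that can be replayed without calling a model again. Everything here is hypothetical: the `Step` fields, the three action names, and the `FakeDriver` stand-in for a real browser are assumptions made for the example, and an actual product would need a real automation backend behind the same interface. The intended division of labor is that an agent helps produce or repair the step list once, and replay is then deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str      # "click", "read", or "write" (hypothetical action set)
    target: str      # selector or cell reference; the format is an assumption
    value: str = ""  # "read": variable name to store into; "write": template


class FakeDriver:
    """In-memory stand-in for a browser, used only to make the sketch runnable."""

    def __init__(self, page):
        self.page = page      # maps selector -> text on the "page"
        self.clicked = []     # selectors clicked, in order
        self.sheet = {}       # maps cell -> written value

    def click(self, target):
        self.clicked.append(target)

    def read(self, target):
        return self.page[target]

    def write(self, target, value):
        self.sheet[target] = value


def replay(steps, driver):
    """Run the recorded steps in order and return the variables that were read."""
    variables = {}
    for step in steps:
        if step.action == "click":
            driver.click(step.target)
        elif step.action == "read":
            variables[step.value] = driver.read(step.target)
        elif step.action == "write":
            # Fill {name} placeholders with values captured by earlier reads.
            driver.write(step.target, step.value.format(**variables))
        else:
            raise ValueError(f"unknown action: {step.action}")
    return variables


workflow = [
    Step("click", "#show-price"),
    Step("read", "#price", "price"),
    Step("write", "A1", "{price}"),
]
driver = FakeDriver({"#price": "$5"})
replay(workflow, driver)
print(driver.sheet)
```

The workflow mirrors the task from the start of this post: click to reveal data, read it, and write it into a sheet cell.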