

























A browser agent today is four separate tools wired together: Chrome, a CDP library, an LLM, and an agent framework. Our native agent collapses those four layers into one binary.
You talk to it in natural language and it does the work against a real browser. “Go to this website”, “log in”, “extract this data”, “tell me that”: language is the new native interface for the browser. And if you want to replay your session, it hands you a reproducible script. No CDP, no heavy browser server, no complex setup, no LLM at runtime: everything is built into Lightpanda.
To run a browser agent today, you install Chrome, drive it over CDP with Puppeteer or Playwright, wire in an LLM to make decisions, then wrap the whole thing in an agent framework to orchestrate the calls. It works. Well, kind of: it’s a lot of moving parts for what is often just “load a page and read some elements off it.”
Each of those layers exists to translate between a human-facing browser and a machine that wants to drive it. CDP (the Chrome DevTools Protocol) was built so a developer could inspect a running browser from the outside, not so a program could operate one. And to wire an LLM, you need an MCP-to-CDP layer. The agent wraps a model around a tool that was never meant to be driven by a model. You are paying a translation tax at every layer.
When we started Lightpanda four years ago, the question was: “what would a browser look like if you built it for machines instead of people looking at a screen?” is that bet applied to the agentic stack. It is one binary that contains the browser, the runtime, and the agent.
The browser is the same engine behind and : it loads webpages, runs JavaScript, and handles the DOM.
The runtime consists of a small set of native tools (, , , , , ), that let you drive the browser (slash commands). The LLM that reads your request and picks tools is optional. It runs against Anthropic, OpenAI, Gemini, Hugging Face or local Ollama, or with no key at all in slash-only mode.
This is the idea that everything else hangs off. Every other browser agent is a black box that calls a model on every step. With , the model figures out the task, we capture that work as code, and generate a script (we call it PandaScript) that is reproducible and deterministic. When you replay it, you don’t need a model and the LLM is gone from the loop.
There’s no model call sitting between you and the next action at replay time, no Chrome process to host, no NodeJS/Python environment to set up, and no Playwright/Puppeteer code to write. You just pass the script to the Lightpanda binary, so a run is bound only by how fast the engine drives the page. And because nothing non-deterministic runs at replay, the same script produces the same result every time. You pay the model once to write the file. After that, it’s plain JavaScript that you own.
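To make the "pay the model once, replay without it" idea concrete, here is a minimal sketch in plain JavaScript. Everything in it is an assumption made for illustration: `makeTools`, `plan`, and `replay` are hypothetical names, the tools act on an in-memory object rather than a browser, and `plan` is a hard-coded stand-in for a model. It is not PandaScript and not Lightpanda's API. It only shows that once the steps are saved, replay needs no planner and gives the same result on every run.

```javascript
// Hypothetical stand-ins: none of these names are Lightpanda APIs.
// A tiny in-memory "page" that the fake tools operate on.
function makeTools() {
  const state = { url: null, fields: {} };
  return {
    goto: (url) => { state.url = url; return url; },
    fill: (name, value) => { state.fields[name] = value; return name; },
    read: (name) => state.fields[name] ?? null,
  };
}

// Stand-in for the model: turns a request into tool calls.
// It is called once, at authoring time, and counted so we can check that.
let plannerCalls = 0;
function plan(request) {
  plannerCalls += 1;
  return [
    { tool: "goto", args: ["https://example.com/search"] },
    { tool: "fill", args: ["q", request] },
    { tool: "read", args: ["q"] },
  ];
}

// Replay: run the saved steps in order against fresh state. No planner involved.
function replay(steps) {
  const tools = makeTools();
  let last;
  for (const { tool, args } of steps) {
    last = tools[tool](...args);
  }
  return last;
}

const saved = JSON.stringify(plan("lightpanda")); // the one planner call
const firstRun = replay(JSON.parse(saved));
const secondRun = replay(JSON.parse(saved));
console.log(firstRun, secondRun, plannerCalls); // prints: lightpanda lightpanda 1
```

The saved artifact here is a JSON step list for brevity; the article's approach saves a JavaScript file instead, which keeps loops and error handling available at replay time.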
Here’s what that PandaScript looks like. This script grabs the top 3 Hacker News stories, then visits each thread for its top comments:
Readable vanilla JavaScript with loops, map, filter, and try/catch. And a few built-in primitives for our native tools. That’s it. The last top-level expression auto-prints as JSON.
And of course you can also write or edit a PandaScript manually, or ask your AI coding assistant to do so if you prefer.
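For readers who want to see the general shape described above, here is a sketch in plain JavaScript that runs under Node. It is not the article's script and does not use Lightpanda's primitives: `load` is a hypothetical stub, and `FAKE_SITE` is invented in-memory data standing in for live pages. What it does show is the structure the text describes: take the top three items, visit each thread, use `map`, `filter`, and `try`/`catch`, and finish with a JSON result.

```javascript
// Invented data standing in for live pages; a real script would read the DOM.
const FAKE_SITE = {
  "/news": [
    { title: "Story A", thread: "/item/1" },
    { title: "Story B", thread: "/item/2" },
    { title: "Story C", thread: "/item/3" },
    { title: "Story D", thread: "/item/4" },
  ],
  "/item/1": ["first", "second", "third"],
  "/item/2": ["only one"],
  // "/item/3" is deliberately missing to exercise the error path.
};

// Hypothetical primitive: NOT a Lightpanda API, just a stub that "loads" a page.
function load(path) {
  if (!(path in FAKE_SITE)) throw new Error(`cannot load ${path}`);
  return FAKE_SITE[path];
}

// Top 3 stories from the listing page.
const stories = load("/news").slice(0, 3);

// Visit each thread and keep its first two non-empty comments.
const report = stories.map((story) => {
  try {
    const comments = load(story.thread)
      .filter((c) => c.trim().length > 0)
      .slice(0, 2);
    return { title: story.title, comments };
  } catch (err) {
    // Keep going if one thread fails; record the failure instead.
    return { title: story.title, comments: [], error: err.message };
  }
});

// The article says the last top-level expression auto-prints as JSON;
// in plain Node we print it explicitly.
console.log(JSON.stringify(report, null, 2));
```

Wrapping each thread visit in its own `try`/`catch` means one broken page yields a partial report rather than a failed run, which matters for scripts meant to be replayed unattended.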
Lightpanda browser still speaks CDP with , and we are actively developing it. The decision here is narrower: we chose not to put CDP inside the agent.
Traditional headless automation marshals every action across CDP, with hundreds of methods running against a browser in a separate process. Our agent skips that. It runs in-process against Lightpanda’s engine and calls a small set of native commands directly.
This gives you two things:
As a bonus, the native commands are the same tool surface whether you drive them with natural language instructions, with commands, or with an external LLM through .
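One way to picture a single tool surface shared by several front ends is a dispatch table with one entry point. The sketch below is an assumption-laden illustration, not Lightpanda's implementation: the tool names `goto` and `click`, the `/name value` slash syntax, and the `{ name, arguments }` message shape are all hypothetical choices made for this example.

```javascript
// One table of tools. Names and behavior are hypothetical, for illustration only.
const calls = [];
const tools = {
  goto: ({ url }) => { calls.push(`goto ${url}`); return { ok: true, url }; },
  click: ({ selector }) => { calls.push(`click ${selector}`); return { ok: true, selector }; },
};

// Single entry point that every front end goes through.
function invoke(name, args) {
  if (!Object.prototype.hasOwnProperty.call(tools, name)) {
    throw new Error(`unknown tool: ${name}`);
  }
  return tools[name](args);
}

// Front end 1: a slash command typed by a person (assumed syntax: "/name value").
const PARAM = { goto: "url", click: "selector" };
function fromSlash(line) {
  const [head, ...rest] = line.trim().split(/\s+/);
  const name = head.replace(/^\//, "");
  return invoke(name, { [PARAM[name]]: rest.join(" ") });
}

// Front end 2: a structured tool call, as a model or an external client might send.
function fromToolCall(message) {
  return invoke(message.name, message.arguments);
}

const viaSlash = fromSlash("/goto https://example.com");
const viaToolCall = fromToolCall({ name: "goto", arguments: { url: "https://example.com" } });
console.log(JSON.stringify(viaSlash) === JSON.stringify(viaToolCall)); // prints: true
```

Because both front ends end in the same `invoke` call, a behavior change in a tool is picked up by every driver at once.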
That’s one of the many advantages of developing a browser from scratch instead of forking Chromium: it allows us to build new AI features natively.
Point an API key at it, or run it with none:
In the REPL, explore in English or with , , and the rest. You can generate a reproducible PandaScript from the current session with (alpha feature), and then , then replay it with .
The agent tutorial takes you through the whole loop: log in to Hacker News, extract stories, save, and replay. If you want the full reference, the agent docs cover every provider, slash command, and flag, and the script format docs explain the JavaScript API.
It is a built-in agent that drives a headless browser by translating your requests into native browser actions, in natural language or as slash commands. It runs in-process against Lightpanda’s own engine and can use a model from Anthropic, OpenAI, Gemini, Hugging Face or local Ollama, or no model at all.
PandaScript is a script to automate browser actions and workflows, designed to replace Puppeteer or Playwright. It can be run directly on the Lightpanda binary without needing a separate client setup with NodeJS/Python. It’s vanilla JavaScript with a small set of native primitives.
No, and that is deliberate. The agent calls native in-process commands instead. Lightpanda is still actively developing CDP for , so existing Puppeteer and Playwright workflows are unaffected.
Yes. Use for slash-commands-only mode, where you type , , and directly. Replaying a saved script doesn’t require a model or an API key, so recorded sessions run token-free and deterministically.
During a session, every state-changing call is recorded. With an LLM connected, generates a reproducible PandaScript from the session intent (this is an alpha feature that we are actively developing and improving). Without an LLM, it transcribes the recorded calls verbatim.
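The no-LLM path, recording state-changing calls and transcribing them verbatim, can be sketched as follows. This is a hypothetical model of the idea, not Lightpanda's recorder: the tool names, the `mutates` flag, and the one-call-per-line output format are assumptions chosen for the example.

```javascript
// Hypothetical tool set; `mutates` marks calls that change page or session state.
const TOOLS = {
  goto: { mutates: true, run: (url) => `loaded ${url}` },
  fill: { mutates: true, run: (selector) => `filled ${selector}` },
  read: { mutates: false, run: (selector) => `text of ${selector}` },
};

function createSession() {
  const log = [];
  return {
    log,
    call(name, ...args) {
      const tool = TOOLS[name];
      if (!tool) throw new Error(`unknown tool: ${name}`);
      if (tool.mutates) log.push({ name, args }); // record only state changes
      return tool.run(...args);
    },
  };
}

// Verbatim transcription: one line of script per recorded call, in order.
function transcribe(log) {
  return log
    .map(({ name, args }) => `${name}(${args.map((a) => JSON.stringify(a)).join(", ")});`)
    .join("\n");
}

const session = createSession();
session.call("goto", "https://example.com/login");
session.call("read", "h1"); // read-only, so it is not recorded
session.call("fill", "#user", "alice");

const script = transcribe(session.log);
console.log(script);
// prints:
// goto("https://example.com/login");
// fill("#user", "alice");
```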
Secrets are written as placeholders, like . They are resolved at runtime inside Lightpanda, so they never enter the model context or the saved script file.
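As a rough illustration of placeholder resolution, the sketch below keeps only a placeholder in the saved step and substitutes the real value at the moment of use. The `{{NAME}}` syntax, the `resolve` function, and the `secrets` object are assumptions for this example only; they are not Lightpanda's placeholder format or API.

```javascript
// Assumed placeholder syntax for this sketch only: {{NAME}}.
const PLACEHOLDER = /\{\{([A-Z0-9_]+)\}\}/g;

// Resolve placeholders from a secret store at the moment a value is used.
function resolve(text, secrets) {
  return text.replace(PLACEHOLDER, (match, name) => {
    if (!(name in secrets)) throw new Error(`missing secret: ${name}`);
    return secrets[name];
  });
}

// What gets saved, and what a model would see: placeholders only.
const savedStep = { tool: "fill", args: ["#password", "{{SITE_PASSWORD}}"] };

// What the browser receives at run time: the resolved value.
// In practice the store would be fed from the environment, not hard-coded.
const secrets = { SITE_PASSWORD: "hunter2" };
const runtimeArgs = savedStep.args.map((arg) => resolve(arg, secrets));

console.log(JSON.stringify(savedStep).includes("hunter2")); // prints: false
console.log(runtimeArgs[1] === "hunter2"); // prints: true
```

Failing loudly on a missing secret, rather than passing the raw placeholder to the page, avoids typing a literal placeholder string into a login form.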
Yes. The native MCP server exposes the same tool surface to any MCP-aware client, like Claude Code, with no model running inside Lightpanda. See the MCP server guide for setup.

Francis previously cofounded BlueBoard, an ecommerce analytics platform acquired by ChannelAdvisor in 2020. While running large automation systems he saw how limited existing browsers were for this kind of work. Lightpanda grew from his wish to give developers a faster and more reliable way to automate the web.