Automate Browser Tasks with xbrowser: A Developer's Guide to Web Automation
许映洲 · 2026-05-27 · via DEV Community

Browser automation has been stuck in a rut for years. The dominant tools — Selenium, Puppeteer, Playwright — are powerful, but they're built for testing, not for real-world task automation. You want to scrape a competitor's pricing page? Write a 40-line script. Need to search Google and Bing simultaneously and compare results? That's another script. Want to chain a login flow with a data extraction step? Now you're managing async state, waiting for selectors, and praying nothing times out.

I've been writing browser automation code for years, and I kept running into the same friction: too much boilerplate for tasks that should take one command. That frustration led me to xbrowser, a CLI tool designed specifically for developers and AI agents who need to get things done in a browser without writing a full test suite every time.

The Problem with Current Tools

Let's be clear — Playwright and Selenium are excellent at what they do. If you're writing end-to-end tests for a web application, they're the right choice. But when your use case shifts from "test my app" to "interact with the web," the cracks start to show:

  • Heavy setup: You need a Node.js project, dependency installation, browser downloads, and boilerplate before you can even navigate to a page.
  • Script-first: Every task requires writing a script. There's no quick "just do this one thing" mode.
  • No domain helpers: Want to search Google? You're navigating to google.com, typing in a selector, waiting for results, and parsing the DOM yourself.
  • Not agent-friendly: AI agents need simple, composable commands. A 50-line async script is the opposite of that.

What I wanted was something like curl but for interactive browser tasks — a single command that handles the complexity and gives me the result.

Enter xbrowser

xbrowser is an open-source (MIT) browser automation CLI that ships as a single npm package:

npm i -g @dyyz1993/xbrowser

That's the entire installation. No separate browser download, no WebDriver setup, no configuration files. It comes with a managed Chromium build that includes CDP fingerprint protection — meaning the sites you visit can't easily detect that you're running an automated browser.

The tool is designed around composable commands that map to real-world tasks rather than low-level browser APIs. Let me walk through the core features.

Multi-Engine Search

Searching the web from the command line shouldn't require an API key. xbrowser handles the browser interaction for you:

# Search Google
xbrowser search "headless browser automation tools" --engine google --num 10

# Search Bing
xbrowser search "headless browser automation tools" --engine bing --num 10

# Search Baidu (for Chinese-language results)
xbrowser search "无头浏览器自动化工具" --engine baidu --num 10

Each command returns structured JSON results with titles, URLs, and snippets. You can pipe them into jq for filtering, save them to a file, or feed them directly into an AI agent's context.

This is particularly useful for competitive analysis. Want to see how your brand ranks across search engines?

# Find your domain among the top 30 Google results (rerun with --engine bing or baidu to compare)
xbrowser search "my product name" --engine google --num 30 | jq '.results[] | select(.url | contains("myproduct.com"))'

No API keys, no API quotas to track, no OAuth flows. Just search and get results.
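
The ranking check above queries a single engine. As a minimal sketch, assuming the search output has the .results[].url shape that the jq filters in this post rely on, a small helper can turn a list of result URLs into a rank, and a loop can repeat that per engine. The rank_of function and the myproduct.com domain are placeholders for illustration, not part of xbrowser.

```shell
# rank_of DOMAIN: read result URLs (one per line) on stdin and print the
# 1-based position of the first URL containing DOMAIN, or "-" if none match.
# Hypothetical helper for illustration; not an xbrowser command.
rank_of() {
  awk -v domain="$1" '
    index($0, domain) { print NR; found = 1; exit }
    END { if (!found) print "-" }
  '
}

# Usage sketch (needs xbrowser and jq; assumes the .results[].url shape):
#   for engine in google bing baidu; do
#     printf '%s: ' "$engine"
#     xbrowser search "my product name" --engine "$engine" --num 30 |
#       jq -r '.results[].url' | rank_of "myproduct.com"
#   done
```

Keeping the rank logic in a plain text filter means it can be checked against a saved URL list without launching a browser.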

Web Scraping Without the Script

The scrape command extracts clean, structured content from any URL:

# Get page content as markdown
xbrowser scrape https://example.com/blog/my-article

# Crawl an entire site
xbrowser crawl https://example.com --depth 3 --max-pages 100

# Generate a URL sitemap
xbrowser map https://example.com

The scrape output is markdown by default, which means it's immediately usable — paste it into a document, feed it to an LLM, or parse it with standard text tools.

crawl follows internal links and respects depth limits, giving you a content snapshot of a site up to the depth and page limits you set. map produces a flat list of every reachable URL, which is invaluable for SEO audits.

Here's a practical example — auditing your own site's internal link structure:

# Map all URLs on your site
xbrowser map https://mysite.com > sitemap.txt

# Count the links on each mapped page
# (scrape emits markdown by default, so match "](" rather than "href=")
while IFS= read -r url; do
  count=$(xbrowser scrape "$url" < /dev/null | grep -o '](' | wc -l)
  echo "$url: $count links"
done < sitemap.txt
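
The loop above counts links per page; it does not by itself reveal orphaned pages. One way to approximate that check, sketched under assumptions: collect every link target from the scraped markdown, then list the URLs from a reference list (for example your published sitemap) that never appear as a target. The helper names md_links and unlinked and the file known-urls.txt are made up for this example, and the sketch assumes scrape emits standard inline markdown links with absolute URLs.

```shell
# md_links: read markdown on stdin and print each inline link target,
# i.e. the URL part of [text](url), one per line.
md_links() {
  grep -oE '\]\([^) ]+' | sed 's/^](//'
}

# unlinked KNOWN LINKED: print the lines of file KNOWN that do not appear
# (as exact, whole-line matches) anywhere in file LINKED.
unlinked() {
  grep -vxFf "$2" "$1"
}

# Usage sketch (needs xbrowser; known-urls.txt is a hypothetical reference list):
#   while IFS= read -r url; do
#     xbrowser scrape "$url" < /dev/null | md_links
#   done < sitemap.txt | sort -u > linked.txt
#   unlinked known-urls.txt linked.txt
```

The comparison is an exact string match, so URLs that differ only by a trailing slash or a fragment would need normalizing first.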

Chain Commands: The Power Move

This is the feature that sets xbrowser apart. Instead of writing multi-step scripts, you chain operations in a single command:

# Navigate, interact, and extract
xbrowser chain "goto https://news.ycombinator.com && click '.titleline > a' && scrape"

# Complete login flow with data extraction
xbrowser chain "goto https://app.example.com/login \
  && fill '#email' 'user@example.com' \
  && fill '#password' 'my-password' \
  && click '#login-button' \
  && wait '#dashboard' \
  && scrape '#dashboard'"

The chain syntax reads like natural language: go to this page, click this element, fill in that field, scrape the result. It mirrors how you'd describe the task to another person.
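
Typing a password straight into the command leaves it in your shell history. A minimal sketch of one alternative: keep the secret in an environment variable and let the shell expand it inside the double-quoted chain string. The variable name XB_PASSWORD and the check_secret helper are hypothetical, and the sketch assumes chain treats single-quoted arguments literally, as the examples above suggest. The expanded value may still be visible in the process list while the command runs.

```shell
# check_secret: succeed only if XB_PASSWORD is set, non-empty, and free of
# single quotes (a single quote would end the quoted chain argument early).
check_secret() {
  case "${XB_PASSWORD-}" in
    '')    echo "XB_PASSWORD is not set" >&2; return 1 ;;
    *\'*)  echo "XB_PASSWORD must not contain a single quote" >&2; return 1 ;;
  esac
  return 0
}

# Usage sketch (needs xbrowser):
#   check_secret && xbrowser chain "goto https://app.example.com/login \
#     && fill '#email' 'user@example.com' \
#     && fill '#password' '$XB_PASSWORD' \
#     && click '#login-button' \
#     && wait '#dashboard' \
#     && scrape '#dashboard'"
```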

For AI agent workflows, this is a game-changer. An agent can construct chain commands dynamically based on user intent:

User: "Go to Hacker News, click the top story, and summarize it for me"

Agent constructs:
xbrowser chain "goto https://news.ycombinator.com && click '.titleline > a:first-of-type' && scrape"

No script generation, no debugging async code, no selector management. The agent just builds a chain string and executes it.
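
Since a chain is just steps joined with &&, an agent or wrapper script can assemble it from a list of steps. A rough sketch, with build_chain as a made-up helper name rather than anything xbrowser provides:

```shell
# build_chain STEP...: join the given steps with " && " and print the result.
# Returns 1 without printing anything if no steps are given.
build_chain() {
  [ "$#" -gt 0 ] || return 1
  chain=$1
  shift
  for step in "$@"; do
    chain="$chain && $step"
  done
  printf '%s\n' "$chain"
}

# Usage sketch (needs xbrowser):
#   xbrowser chain "$(build_chain \
#     "goto https://news.ycombinator.com" \
#     "click '.titleline > a'" \
#     "scrape")"
```

Building the string from discrete steps keeps each step easy to log or validate before anything runs in the browser.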

SEO and Backlink Analysis

xbrowser ships with 67+ plugins, and the SEO suite is particularly comprehensive:

# Analyze backlinks for a domain
xbrowser seo backlinks --domain example.com

# Check on-page SEO factors
xbrowser seo audit https://example.com/page

# Analyze search engine results for a keyword
xbrowser search "target keyword" --engine google --num 30 --analyze

The backlink plugin crawls referring domains, checks link status, and reports on link quality metrics. The audit plugin checks meta tags, heading structure, image alt text, and other on-page factors.

For link-building workflows, you can combine search and scraping:

# Find guest post opportunities
xbrowser search "write for us + web development" --engine google --num 20 | \
  jq -r '.results[].url' | \
  while read url; do
    xbrowser scrape "$url" | grep -i "guidelines\|submit\|contribute"
  done
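
Search results often include several pages from the same site, so the loop above can scrape one host many times. A small sketch that keeps only the first URL per host before scraping; one_per_host is a hypothetical helper, and it assumes absolute URLs of the form scheme://host/path.

```shell
# one_per_host: read URLs on stdin and print only the first URL seen for each
# host. Splitting on "/" makes field 3 the host in scheme://host/path.
one_per_host() {
  awk -F/ '!seen[$3]++'
}

# Usage sketch (needs xbrowser and jq):
#   xbrowser search "write for us + web development" --engine google --num 20 |
#     jq -r '.results[].url' | one_per_host |
#     while IFS= read -r url; do
#       xbrowser scrape "$url" < /dev/null | grep -i "guidelines\|submit\|contribute"
#     done
```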

Record and Replay

Sometimes you need to automate a complex workflow that's hard to express as a chain. That's where recording comes in:

# Start recording (opens a visible browser window)
xbrowser record my-workflow

# Do your thing — click around, fill forms, navigate

# Stop recording when done
# The workflow is saved as a replayable script

# Replay it headlessly
xbrowser replay my-workflow --headless

Record your workflow once in a visible browser, then replay it on a schedule or in CI. This is perfect for:

  • Daily report generation that requires login
  • Monitoring competitor pricing pages
  • Regression testing without writing test code
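
Replays that run unattended can fail for transient reasons such as a slow page load. A minimal sketch of a generic retry wrapper for scheduled or CI runs; the retry function and the RETRY_DELAY variable are invented for this example and are not xbrowser features.

```shell
# retry MAX CMD...: run CMD until it succeeds, at most MAX times, sleeping
# RETRY_DELAY seconds (default 5) between attempts. Returns 1 if all fail.
retry() {
  max=$1
  shift
  attempt=1
  until "$@"; do
    [ "$attempt" -ge "$max" ] && return 1
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-5}"
  done
}

# Usage sketch (needs xbrowser and a workflow recorded as my-workflow):
#   retry 3 xbrowser replay my-workflow --headless
```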

How It Compares

Let me be straightforward about when to use what:

| Feature | xbrowser | Playwright | Selenium |
| --- | --- | --- | --- |
| Installation | npm i -g (one step) | npm install + browser download | npm install + WebDriver |
| CLI-first | Yes | No (library-first) | No (library-first) |
| Search helpers | Google/Bing/Baidu built-in | None | None |
| Plugins | 67+ built-in, including an SEO suite | None | None |
| Chain syntax | goto && click && scrape | Requires script | Requires script |
| Record/Replay | Built-in | Codegen (code output) | IDE plugins |
| Anti-detection | CDP fingerprint protection | Basic stealth plugins | External tools |
| Test framework | Not designed for this | Primary use case | Primary use case |

The key distinction: xbrowser is for doing things on the web. Playwright and Selenium are for testing things on the web. Different goals, different tools.

If you're building an AI agent that needs to browse the web, scrape data, perform SEO analysis, or automate repetitive browser tasks, xbrowser gives you composable commands that map directly to those tasks. If you're writing integration tests for your React app, stick with Playwright.

Getting Started

npm i -g @dyyz1993/xbrowser
xbrowser --help
xbrowser search "hello world" --engine google

Three commands and you're up and running. The full documentation, plugin directory, and API reference are available at xbrowser.dev. The source code is on GitHub under the MIT license.

If you're building AI agents that interact with the web, or if you're tired of writing 50-line scripts for tasks that should take one command, give it a try. Contributions and plugin submissions are welcome.
