The Quiet AI War Inside Your Browser
Beto Muniz · 2026-05-26 · via DEV Community

Google shipped the Prompt API in Chrome 148 on May 5, 2026. Mozilla objected. Apple's WebKit team objected. The W3C TAG objected. Microsoft Edge disabled the feature entirely despite running on the same Chromium engine. It was, by any measure, one of the most contested browser feature launches in recent memory.

And it doesn't matter. Google has already won this one.

What Actually Happened

Chrome 148 quietly gave every website on earth the ability to run AI inference locally (text generation, summarization, classification, image captioning) by talking to Gemini Nano, a 4GB model that Chrome now ships to users' devices without asking. The API is dead simple:

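// Assumes an async context, such as a module script.
const systemPrompt = "You are a concise assistant."; // example value
const session = await self.ai.languageModel.create({ systemPrompt });
const result = await session.prompt("Your prompt here");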


That's it. No API key. No network round trip. No server cost. No data leaving the device.

The opposition's core argument is a legitimate one: unlike fetch() or addEventListener(), an AI model isn't a deterministic spec. Two browsers implementing the "same API" with different underlying models could produce wildly different outputs, breaking the foundational promise of web standards: write once, run identically everywhere.

It's a real concern. It's also, in practice, irrelevant.

The Web Has Never Guaranteed Identical Outputs

Font rendering differs across browsers. Canvas pixels vary by GPU driver. Audio processing behaves differently on macOS versus Windows. Math.random() is, by definition, non-deterministic. None of these killed the web. Developers adapted, and they'll adapt here too.

The "we can't standardize non-deterministic output" argument proves too much. If it were applied consistently, half the modern web platform wouldn't exist.
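One way developers adapt, sketched here under stated assumptions: rather than depending on exact model wording, the application maps whatever text comes back onto a small set of values it already understands. The label set and the `normalizeLabel` helper are hypothetical illustrations, not part of any browser API.

```javascript
// Hypothetical label set for a sentiment-classification task.
const LABELS = ["positive", "negative", "neutral"];

// Map free-form model text onto exactly one known label.
// Returns null when zero or several labels appear, so the caller
// can retry or fall back instead of trusting an ambiguous answer.
function normalizeLabel(rawText, labels = LABELS) {
  const text = String(rawText).toLowerCase();
  const hits = labels.filter((label) => new RegExp(`\\b${label}\\b`).test(text));
  return hits.length === 1 ? hits[0] : null;
}
```

With this in place, two models that phrase the same verdict differently both resolve to the same value, and anything unrecognized is handled as a miss instead of leaking into application logic.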

Cloud Is the Real Baseline, Not Firefox's Future Model

Here's the thing critics seem to be missing: developers building serious AI features today aren't choosing between Chrome's Prompt API and Firefox's theoretical equivalent. They're calling cloud APIs from OpenAI, Anthropic, and Google (Gemini). That's where the quality is, where the large context windows are, where the capable models live.

Gemini Nano is a small model. It's good at lightweight, well-scoped tasks: summarizing a paragraph, classifying sentiment, extracting a date from a string. It's not replacing GPT-4o or Claude Sonnet for anything that actually matters.

So the Prompt API isn't competing with cloud AI. It's filling a specific niche:

  • Low-latency tasks that need to feel instant
  • Offline-capable features in PWAs
  • Privacy-sensitive processing where data must stay on device
  • Cost-sensitive at-scale operations (spell check, auto-tagging, content filtering)
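
The niche above can be read as a routing rule. Here is a minimal sketch, assuming a hypothetical `chooseBackend` helper and an arbitrary size cutoff; `MAX_LOCAL_CHARS` is an illustrative number, not a measured limit of Gemini Nano.

```javascript
// Hypothetical cutoff: inputs longer than this are sent to the cloud.
const MAX_LOCAL_CHARS = 4000;

// Decide where a task should run.
// Returns "local", "cloud", or "unavailable" (the task needs on-device
// inference but no local model is present).
function chooseBackend({
  inputChars,
  offline = false,
  mustStayOnDevice = false,
  localAvailable = false,
}) {
  const needsLocal = offline || mustStayOnDevice;
  if (!localAvailable) return needsLocal ? "unavailable" : "cloud";
  if (needsLocal) return "local";
  return inputChars <= MAX_LOCAL_CHARS ? "local" : "cloud";
}
```

Offline and privacy constraints are treated as hard requirements, while input size is only a preference, which is why a large but private input still stays local.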

Developers will reach for it as a progressive enhancement layer: use the Prompt API when available, fall back to a cloud call when not. The non-determinism objection collapses entirely in this framing: nobody is relying on Chrome and Firefox producing the same tokens. They're relying on "good enough local inference" vs "cloud inference." That gap is fine.
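
The progressive enhancement pattern described above might look like the following sketch. It assumes the local model object has the shape used in the article's snippet (an async `create()` that returns a session with an async `prompt()`); `promptWithFallback` and `cloudPrompt` are hypothetical names, and `cloudPrompt` stands in for whatever call your own backend exposes.

```javascript
// Prefer on-device inference; fall back to a cloud call when the local
// model is missing or fails.
// `localModel`: object with async create() -> session with async prompt().
// `cloudPrompt`: hypothetical async function (input) -> text.
async function promptWithFallback(input, { localModel, cloudPrompt, systemPrompt = "" }) {
  if (localModel && typeof localModel.create === "function") {
    try {
      const session = await localModel.create({ systemPrompt });
      const text = await session.prompt(input);
      return { source: "local", text };
    } catch (err) {
      // Model not downloaded, unsupported device, or inference error:
      // fall through to the cloud path.
    }
  }
  const text = await cloudPrompt(input);
  return { source: "cloud", text };
}
```

In a page, `localModel` could be passed as `globalThis.ai?.languageModel`, which is simply `undefined` where the feature does not exist. Returning `source` alongside the text lets the caller log how often each path is taken.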

We Have Seen This Movie Before

PWAs. Web Components. Service Workers. WebRTC. Each time, the pattern is the same:

  1. Google ships something useful but contested
  2. Mozilla and Apple raise principled standards objections (sometimes valid, sometimes a proxy for business interests)
  3. Developers adopt it anyway, because Chrome is 65% of global browser traffic
  4. The holdouts implement their own version 2–5 years later
  5. It retroactively becomes a "web standard"

PWAs are the sharpest example. Apple resisted for years, not primarily because of standards purity, but because native apps and the App Store are a multibillion-dollar business. It eventually shipped support, incompletely at first, then more fully as the pressure became undeniable. Web Components took a similarly winding road: Google and Mozilla aligned early, Apple dragged its feet, and today Custom Elements and Shadow DOM are universally supported.

The Prompt API will follow the same arc. The only open questions are how long the lag is and what compromises get made along the way. (My guess: Firefox and Safari eventually ship something with a compatible API surface but their own models underneath: Mozilla with something open-source, Apple with something Core ML-optimized. The outputs will differ. Nobody will care.)

The Real Concern Nobody Is Saying Out Loud

Apple's strategic worry isn't about spec compliance. It's about this: Google just normalized the browser as an AI delivery vehicle and installed its model on over 4 billion devices. That's not a web standards problem. That's an ecosystem control problem.

Whoever controls the model layer of the browser controls a significant surface area of how users interact with the web: what gets summarized, how content gets classified, what gets surfaced and what doesn't. Apple understands this better than anyone; it's exactly the kind of leverage they've built with the App Store for 15 years.

That's a legitimate concern worth having a serious conversation about. But dressing it up as a standards integrity argument dilutes it and, frankly, makes the objectors look like they're arguing in bad faith. That weakens their position when the real fight (model governance, content policies, on-device data access) eventually arrives.

What This Means for You

If you're building web products today:

Developers: Start experimenting with the Prompt API now for lightweight, latency-sensitive tasks. Design with graceful degradation: the API isn't available in Firefox or Safari yet, so treat it as an enhancement, not a baseline. WebGPU-based bring-your-own-model approaches (via Transformers.js or ONNX Runtime Web) remain the cross-browser story for anything more demanding. If you want a unified abstraction over both, check out web-ai-sdk.dev.
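
A minimal sketch of that degradation order, assuming the `ai.languageModel` entry point from the article's snippet and the standard `navigator.gpu` check for WebGPU. The function name and the tier strings are hypothetical.

```javascript
// Pick the best available inference tier for the current environment.
// `env` defaults to the global object; pass a stub to test other cases.
function detectInferenceTier(env = globalThis) {
  // Built-in Prompt API, in the shape shown in the article's snippet.
  if (env.ai?.languageModel) return "prompt-api";
  // WebGPU is exposed as navigator.gpu where supported.
  if (env.navigator?.gpu) return "webgpu";
  // Nothing local is available: use a server-side model.
  return "cloud";
}
```

Checking once at startup and storing the result keeps the rest of the code free of scattered feature tests.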

Product and business: The interesting unlock here isn't replacing your cloud AI pipeline. It's enabling AI features that previously couldn't exist on the web: instant, offline, private, zero marginal cost. Think client-side content moderation, on-device personalization, local draft assistance. The economics and privacy story are genuinely new.

The browser is becoming an AI runtime. Google didn't ask for permission. That ship has sailed.


The Prompt API is available in Chrome 148+. WebGPU-based inference works cross-browser today via libraries like Transformers.js. WebNN remains experimental across all browsers.