What is Firecracker?
2026-05-12 · via Hacker News

Every day, AWS Lambda runs trillions of function invocations. AWS Fargate schedules millions of containers. Every one of those is a full virtual machine, with its own kernel, booted in a fraction of a second.

How? About 50,000 lines of Rust called Firecracker, which exists because the industry finally admitted that a Linux container, a mechanism for controlling visibility and resource usage, was never designed to be a security boundary.[1]

The isolation problem

Every Docker container on your laptop is three Linux kernel features in a trench coat:

  • Namespaces are blindfolds. A process inside one gets a private view of the system: its own PID list, network stack, mount table, hostname, and user IDs. PID 1 inside the container is some random PID on the host; the container can't even see the other processes.
  • cgroups are budgets. Control groups are the kernel's accounting and rate-limiting layer. They cap how much CPU, memory, disk IO, and network bandwidth a process tree is allowed to consume.
  • seccomp + capabilities are allowlists. Capabilities chop root's powers into ~40 separate privileges (bind low ports, load kernel modules, mount filesystems, etc.) so you can grant only the ones you need. seccomp is a filter attached to each thread (and inherited by its children) that decides which syscalls (userspace's only API into the kernel) that thread is even allowed to make.

You can prove it yourself without Docker installed:

# spin up your own "container" in one line
unshare --user --map-root-user --mount --pid --net --uts --ipc --fork --mount-proc bash

Everything else Docker does (image layers, registries, DNS) is orchestration on top.

All of that protection funnels through a single Linux kernel, around 30 million lines of code exposing 400+ syscalls. Every container on the host calls into that same kernel. One bug in any one of those syscalls and it's game over for every tenant on that machine.

Full virtual machines solve isolation by brute force: every VM gets its own kernel.

Modern CPUs have a "guest mode" that runs guest instructions on the real silicon. The host only gets pulled in when the guest does something privileged (touches real hardware, faults, gets interrupted). A hypervisor is the thin layer that arbitrates those moments.

Linux ships its hypervisor as a kernel module called KVM, exposed at /dev/kvm. It rides on hardware virt extensions (vmx on Intel, svm on AMD):

# do you have hardware virt?
grep -E 'vmx|svm' /proc/cpuinfo | head -1
ls -l /dev/kvm

The problem with full VMs is they're slow and fat. A classic QEMU VM emulates a whole imaginary PC (BIOS, PCI bus, IDE controller, VGA card, PS/2 keyboard) because that's what a 1998 OS expected to boot against. The image is hundreds of megabytes. Boot takes seconds. Memory footprint is hundreds of MiB before your workload even starts. For a web request that lives 40ms, you'd spend 40× that booting the machine.

So you're caught between:

  • Containers: 50ms boot, 5 MiB overhead, shared-kernel attack surface.
  • VMs: 5+ second boot, 300+ MiB overhead, hardware-isolated.

Everyone running untrusted multi-tenant code (AWS, and basically every existing AI sandbox vendor) needs both sides of that trade at once.

Enter microVMs

A VMM (Virtual Machine Monitor) is the user-space process that drives the hypervisor: it sets up guest memory, plugs in virtual devices, and tells KVM to start running guest code.

A microVM is a VM with the 1998 PC deleted: no BIOS, no PCI bus, no VGA, no USB, no ACPI (none of the legacy hardware a real desktop boots through, and none of it relevant to a 40ms function call). What's left: KVM, a serial console, and a handful of virtio devices (net, block, vsock).

virtio is the standard "I know I'm running in a VM" device interface. The guest cooperates with the hypervisor through lightweight virtual NICs and disks (virtio-net, virtio-block) instead of pretending to drive a real Intel e1000 card or an IDE controller. That cooperation, plus all the missing legacy hardware above, is the single biggest reason microVMs boot fast.

The result:

  • ~125ms boot from VMM launch to guest userspace running init.
  • <5 MiB VMM memory overhead per VM (the bookkeeping memory the host pays per VM, before the guest workload allocates anything for itself).
  • 150 VMs/second creation rate on a single host.
  • ~2–8% runtime performance hit vs bare metal.

Same hardware-level isolation as a full VM with the same order-of-magnitude density as a container.
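
To make the density claim concrete, here is a minimal sketch that plugs the figures above into a capacity estimate. The constants are this post's numbers treated as bounds, and `estimateFleet` is a hypothetical helper, not part of any Firecracker tooling.

```typescript
// Capacity math from the figures quoted above. These constants are the
// article's numbers, not measurements from a real host.
const MICROVM_OVERHEAD_MIB = 5;   // "<5 MiB" per microVM, used as an upper bound
const FULL_VM_OVERHEAD_MIB = 300; // "300+ MiB" per classic VM, used as a lower bound
const CREATE_RATE_PER_SEC = 150;  // microVMs created per second on one host

interface FleetEstimate {
  microVmOverheadMiB: number;
  fullVmOverheadMiB: number;
  secondsToCreate: number;
}

// estimateFleet is a hypothetical helper for illustration only.
function estimateFleet(vmCount: number): FleetEstimate {
  if (!Number.isInteger(vmCount) || vmCount < 0) {
    throw new RangeError("vmCount must be a non-negative integer");
  }
  return {
    microVmOverheadMiB: vmCount * MICROVM_OVERHEAD_MIB,
    fullVmOverheadMiB: vmCount * FULL_VM_OVERHEAD_MIB,
    secondsToCreate: vmCount / CREATE_RATE_PER_SEC,
  };
}
```

For 3,000 VMs this gives at most 15,000 MiB of VMM overhead versus at least 900,000 MiB for classic VMs, and roughly 20 seconds of creation time at the quoted rate.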

[Interactive figure: bare container / gVisor / Firecracker / full VM, showing what is shared vs isolated in each]

Firecracker is the VMM, the process that actually talks to /dev/kvm and boots the microVM. The rest of this post is that stack end to end.

In November 2018, AWS open-sourced Firecracker at re:Invent. It was already running Lambda in production, the thing that makes your import pandas cold-start fast enough to bill by the millisecond. In 2020, the team published the architecture at NSDI '20[2].

The architecture

Forked from Google's crosvm (a VMM that was already written in Rust) and cut down, with more than half the code removed. Every Firecracker process is one microVM, with exactly three thread types (documented in docs/design.md):

  • API thread is the order desk. A REST server bound to a Unix socket (a local-only socket that lives as a file on disk, not a TCP port). Accepts configuration before boot and limited actions after.
  • VMM thread is the hardware shop floor. It pretends to be every device the guest can see. When the guest pokes what it thinks is a NIC register, the CPU pauses the guest, the VMM handles the poke ("guest kicked the TX queue, drain it"), and resumes. The mechanism is memory-mapped I/O (MMIO): the guest reads/writes magic addresses; the CPU traps those out to the host.
  • vCPU threads are the runners. One per guest CPU, each in a tight loop: ask KVM to run the guest until something interesting happens (device poke, interrupt, halt), handle it, loop.
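
The vCPU loop described above can be sketched as a toy model. Nothing here touches /dev/kvm: `runGuest` stands in for the KVM_RUN ioctl and simply replays a scripted list of exits, and `VmExit`, `NOTIFY_ADDR`, and `runVcpu` are hypothetical names made up for illustration.

```typescript
// Toy model of one vCPU thread. The exit kinds mirror the "interesting"
// events named above: a device poke (MMIO write), an interrupt, a halt.
type VmExit =
  | { kind: "mmio_write"; addr: number; value: number }
  | { kind: "interrupt" }
  | { kind: "halt" };

// Hypothetical address of a virtio "queue notify" register.
const NOTIFY_ADDR = 0xd0000050;

function runVcpu(scriptedExits: VmExit[]): string[] {
  const log: string[] = [];
  let next = 0;

  // Stand-in for KVM_RUN: run the guest until something interesting happens.
  // When the script runs out, the guest halts.
  const runGuest = (): VmExit => {
    const exit = scriptedExits[next];
    if (exit === undefined) return { kind: "halt" };
    next += 1;
    return exit;
  };

  for (;;) {
    const exit = runGuest();
    switch (exit.kind) {
      case "mmio_write":
        log.push(
          exit.addr === NOTIFY_ADDR
            ? "mmio write: queue kicked, drain it"
            : "mmio write: unhandled address",
        );
        break;
      case "interrupt":
        log.push("interrupt: re-enter guest");
        break;
      case "halt":
        log.push("halt: stop vCPU");
        return log;
    }
  }
}
```

The real loop hands device work to the VMM thread rather than logging strings, but the shape (run, inspect the exit reason, handle, loop) is the same.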

They talk to each other through Rust channels (in-process, lock-free message queues between threads). The guest sees exactly four devices.

The four devices

  • virtio-net is the VM's NIC, no 1998 emulation. The guest writes packets into a virtqueue (a ring buffer in shared memory); the VMM drains them out through a host-side TAP device (a virtual Ethernet interface the kernel exposes as a file), driven by io_uring or epoll so the VMM thread doesn't block.
  • virtio-block is the VM's disk, just file IO on the host. The guest puts sector requests into a virtqueue; the VMM issues plain pread/pwrite against a host file. No IDE, no AHCI, no SCSI.
  • virtio-vsock is the VM's intercom to the host. Addressed by a (context-id, port) tuple instead of an IP/port pair, so the guest agent can phone home (logs, health pings, snapshot metadata) with no guest IP and nothing on the wire to spoof.
  • 8250 serial UART is the boot console. A tiny legacy serial chip emulated at a fixed address. Used for early-boot logs and crash dumps before virtio comes up. Cheap, universal, never going away.
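
The "guest writes, VMM drains" pattern behind virtio-net and virtio-block can be sketched as a plain ring buffer. This is a deliberate simplification: a real virtqueue splits the work across a descriptor table plus "available" and "used" rings in guest memory. `ToyQueue` is a hypothetical name.

```typescript
// Simplified stand-in for a virtqueue: a fixed-size ring with a producer
// index (guest side) and a consumer index (VMM side).
class ToyQueue<T> {
  private readonly capacity: number;
  private readonly slots: (T | undefined)[];
  private head = 0; // total items the guest has written
  private tail = 0; // total items the VMM has consumed

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.slots = new Array<T | undefined>(capacity).fill(undefined);
  }

  // Guest side: returns false when the ring is full and the guest must wait.
  push(item: T): boolean {
    if (this.head - this.tail === this.capacity) return false;
    this.slots[this.head % this.capacity] = item;
    this.head += 1;
    return true;
  }

  // VMM side: called after a "kick"; drains everything currently queued.
  drain(): T[] {
    const out: T[] = [];
    while (this.tail < this.head) {
      const idx = this.tail % this.capacity;
      const item = this.slots[idx];
      if (item !== undefined) out.push(item);
      this.slots[idx] = undefined;
      this.tail += 1;
    }
    return out;
  }
}
```

In the virtio-net case, each drained item would be a packet written out to the TAP device; in the virtio-block case, a sector request turned into a pread or pwrite.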

Booting a microVM, end to end

The API is the entire control plane: the configuration channel, kept deliberately separate from the data plane (the vCPU threads that actually run guest code). You start the binary pointed at a Unix socket:

rm -f /tmp/fc.sock
./firecracker --api-sock /tmp/fc.sock &

Then you PUT configuration into it:

# 1. Configure boot source
curl --unix-socket /tmp/fc.sock -X PUT 'http://localhost/boot-source' \
  -H 'Content-Type: application/json' \
  -d '{
    "kernel_image_path": "./vmlinux-6.1",
    "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
  }'

# 2. Configure rootfs
curl --unix-socket /tmp/fc.sock -X PUT 'http://localhost/drives/rootfs' \
  -H 'Content-Type: application/json' \
  -d '{
    "drive_id": "rootfs",
    "path_on_host": "./rootfs.ext4",
    "is_root_device": true,
    "is_read_only": false
  }'

# 3. Configure network
curl --unix-socket /tmp/fc.sock -X PUT 'http://localhost/network-interfaces/eth0' \
  -H 'Content-Type: application/json' \
  -d '{
    "iface_id": "eth0",
    "guest_mac": "06:00:AC:10:00:02",
    "host_dev_name": "tap0"
  }'

# wait for async config writes to apply
sleep 0.015

# 4. Trigger actions (start VM)
curl --unix-socket /tmp/fc.sock -X PUT 'http://localhost/actions' \
  -H 'Content-Type: application/json' \
  -d '{ "action_type": "InstanceStart" }'

Four HTTP calls. That's the entire control plane.
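
If you drive this from code rather than curl, the same four calls reduce to a small list of request descriptions. A minimal sketch, assuming you bring your own HTTP-over-Unix-socket client: `MicroVmSpec`, `ApiCall`, and `bootSequence` are hypothetical names, while the paths and JSON fields are copied from the curl commands above.

```typescript
interface MicroVmSpec {
  kernelImagePath: string;
  bootArgs: string;
  rootfsPath: string;
  guestMac: string;
  tapDevice: string;
}

interface ApiCall {
  method: "PUT";
  path: string;
  body: Record<string, unknown>;
}

// Returns the four PUTs in the order they must be sent; InstanceStart is last.
function bootSequence(spec: MicroVmSpec): ApiCall[] {
  return [
    {
      method: "PUT",
      path: "/boot-source",
      body: { kernel_image_path: spec.kernelImagePath, boot_args: spec.bootArgs },
    },
    {
      method: "PUT",
      path: "/drives/rootfs",
      body: {
        drive_id: "rootfs",
        path_on_host: spec.rootfsPath,
        is_root_device: true,
        is_read_only: false,
      },
    },
    {
      method: "PUT",
      path: "/network-interfaces/eth0",
      body: { iface_id: "eth0", guest_mac: spec.guestMac, host_dev_name: spec.tapDevice },
    },
    { method: "PUT", path: "/actions", body: { action_type: "InstanceStart" } },
  ];
}
```

Keeping the sequence as data makes it easy to log, replay, or validate before anything is sent to the socket.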

[Interactive figure: the four PUT calls flow into the VMM, vCPUs spin up, and the guest kernel hits init in ~125ms]

The security onion

A single KVM boundary is already strong. Firecracker wraps two more layers around it.

The jailer is a sandbox-builder. Its only job is to box up the VMM before it ever runs. It creates a chroot (a Linux feature that locks a process to a single directory subtree as if that directory were the root of the filesystem; the process literally cannot name anything above it), drops into a new PID namespace so it can't see the host's other processes, switches to an unprivileged uid/gid, applies cgroup CPU/memory limits, and only then execs the Firecracker binary inside that jail:

jailer \
  --id vm-42 \
  --uid 1000 --gid 1000 \
  --chroot-base-dir /srv/jailer \
  --exec-file /usr/local/bin/firecracker \
  -- \
  --api-sock /run/fc.sock

Now the VMM process itself has no filesystem except a dedicated chroot, no view of other processes on the host, and no root capabilities. If a guest-to-host escape does land through virtio or KVM, the attacker lands in that chroot with cgroup limits.
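
The jailer invocation above can also be assembled programmatically. A minimal sketch: `JailSpec`, `jailerArgv`, and `jailRoot` are hypothetical helpers, the flags are the ones shown in the command above, and the jail root layout (chroot base, then the exec file's basename, then the id, then root) is an assumption based on the jailer documentation that you should verify against your version.

```typescript
interface JailSpec {
  id: string;
  uid: number;
  gid: number;
  chrootBaseDir: string;
  execFile: string;
  apiSock: string; // path as seen from inside the jail
}

// Builds the argv for the jailer command shown above.
function jailerArgv(spec: JailSpec): string[] {
  return [
    "jailer",
    "--id", spec.id,
    "--uid", String(spec.uid),
    "--gid", String(spec.gid),
    "--chroot-base-dir", spec.chrootBaseDir,
    "--exec-file", spec.execFile,
    "--", // everything after this is passed to the Firecracker binary
    "--api-sock", spec.apiSock,
  ];
}

// Assumed layout: <chroot-base-dir>/<exec-file basename>/<id>/root
function jailRoot(spec: JailSpec): string {
  const parts = spec.execFile.split("/");
  const base = parts[parts.length - 1];
  if (base === undefined || base === "") {
    throw new Error("execFile must name a file");
  }
  return [spec.chrootBaseDir, base, spec.id, "root"].join("/");
}
```

Under that assumption, the API socket the VMM creates at /run/fc.sock shows up on the host under the jail root, which is where your control plane would connect.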

Seccomp is a per-thread syscall allowlist. Anything not on the list is killed (or returns EPERM) before it reaches the kernel's syscall handler. Firecracker ships three levels:

  1. Level 0: off. Don't use in prod.
  2. Level 1: allow-list by syscall number.
  3. Level 2: also constrain argument values (e.g. ioctl is fine, but only with KVM_RUN as the command). Default and recommended.

Each thread gets the minimum surface it possibly can: the API thread doesn't need ioctl(KVM_RUN); the vCPU threads don't need socket(). A simplified view of what one rule looks like:

{
  "vcpu": {
    "default_action": "trap",
    "filter": [
      { "syscall": "ioctl", "args": [{ "index": 1, "value": "KVM_RUN" }] },
      { "syscall": "read" },
      { "syscall": "write" },
      { "syscall": "epoll_wait" }
    ]
  }
}
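
To see how a rule like this gets applied, here is a minimal sketch of an evaluator over the same simplified shape. It is a model of the allowlist logic only: the real filter is compiled to BPF and matches numeric syscall and argument values, not strings. `evaluateSyscall` and the type names are hypothetical.

```typescript
type SeccompAction = "allow" | "trap";

interface ArgConstraint {
  index: number; // which syscall argument to inspect
  value: string; // symbolic value, as in the simplified JSON above
}

interface SyscallRule {
  syscall: string;
  args?: ArgConstraint[];
}

interface ThreadFilter {
  default_action: SeccompAction;
  filter: SyscallRule[];
}

// The vCPU rule set from the simplified JSON above.
const vcpuFilter: ThreadFilter = {
  default_action: "trap",
  filter: [
    { syscall: "ioctl", args: [{ index: 1, value: "KVM_RUN" }] },
    { syscall: "read" },
    { syscall: "write" },
    { syscall: "epoll_wait" },
  ],
};

// Allow only if some rule names the syscall and all of its argument
// constraints match; otherwise fall through to the default action.
function evaluateSyscall(
  filter: ThreadFilter,
  syscall: string,
  args: string[],
): SeccompAction {
  for (const rule of filter.filter) {
    if (rule.syscall !== syscall) continue;
    const constraints = rule.args ?? [];
    if (constraints.every((c) => args[c.index] === c.value)) return "allow";
  }
  return filter.default_action;
}
```

With this filter, ioctl with KVM_RUN as its second argument is allowed, ioctl with any other command is trapped, and socket is trapped because it is not listed at all.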

Each layer has to fail independently for an attacker to reach the host.

Snapshots: the cheat code behind Lambda SnapStart

Take a snapshot of a running microVM. Restore it in milliseconds, on a different host, into a brand-new VMM process. Skip kernel boot, skip init, skip JIT warmup.

You freeze the running VM and dump memory + device state to disk:

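# 1. Pause the vCPUs so memory and device state stop changing
curl --unix-socket /tmp/fc.sock -X PATCH 'http://localhost/vm' \
  -H 'Content-Type: application/json' \
  -d '{"state": "Paused"}'

# 2. Write device state and guest memory to disk
curl --unix-socket /tmp/fc.sock -X PUT 'http://localhost/snapshot/create' \
  -H 'Content-Type: application/json' \
  -d '{
    "snapshot_type": "Full",
    "snapshot_path": "/snap/vm.state",
    "mem_file_path": "/snap/vm.mem"
  }'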

A snapshot captures the post-warmup state, so the restored VM wakes up in the middle of its life, not at the beginning of it.

This is exactly what AWS Lambda SnapStart does: initialize a Java Lambda once, snapshot the microVM, and restore that snapshot on every subsequent cold start (announcement). JVM cold starts suddenly go from 8+ seconds to sub-second.
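
The economics are simple arithmetic: pay the expensive init once, then pay only the restore on each cold start. A minimal sketch, where `startupCost` is a hypothetical helper and the restore time is an input you would measure yourself; it ignores the one-time cost of writing the snapshot.

```typescript
interface StartupCost {
  withoutSnapshotMs: number;
  withSnapshotMs: number;
}

// Total startup time across `coldStarts` cold starts.
// Without snapshots every cold start pays the full init.
// With snapshots the init is paid once, then each cold start pays a restore.
function startupCost(
  coldStarts: number,
  initMs: number,
  restoreMs: number,
): StartupCost {
  return {
    withoutSnapshotMs: coldStarts * initMs,
    withSnapshotMs: initMs + coldStarts * restoreMs,
  };
}
```

Taking the 8-second JVM init from above and assuming a 500ms restore (an illustrative value within "sub-second"), 100 cold starts cost 800,000ms without snapshots and 58,000ms with them.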

[Figure: SnapStart, one expensive init followed by N cheap restores]


How they fit together

gVisor is a different design: a user-space kernel in Go, a re-implementation of the Linux syscall interface that runs as a normal process. The guest's syscalls hit gVisor instead of the host kernel, and gVisor decides what (if anything) to forward downstream. Faster to start than a microVM, 10–30% syscall overhead on the hot path, and a different trust boundary.

On a line from shared host kernel to hardware-isolated guest kernel:

             runc          bubblewrap    gVisor        Firecracker   QEMU
kernel       shared host   shared host   user-space    own (KVM)     own (emulated)
cold start   ~50ms         ~50ms         ~50–100ms     ~125ms        5s+

Firecracker sits in the "my own kernel, but no PCI BIOS" box: hardware isolation, tiny device model, and boot in milliseconds.

Pick your tool:

Do you trust the code running in the container?
├── Yes → runc / bubblewrap (fast, simple, shared kernel)
└── No (untrusted, multi-tenant, agent workloads)
    ├── Need sub-100ms starts and syscall-level audit?
    │   └── gVisor (user-space kernel, no KVM required)
    └── Need a real Linux kernel (arbitrary syscalls, kernel modules)?
        ├── Already have a long-lived VM you're reusing?
        │   └── Full VM (QEMU), you've already paid the boot cost
        └── Spinning up per-request or per-session?
            └── Firecracker microVM ✓
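
The same tree reads naturally as a function. A minimal sketch: `Workload` and `pickRuntime` are hypothetical names, and the two questions on the untrusted branch are collapsed into a single `needsRealKernel` flag, which is a simplification of the tree above.

```typescript
interface Workload {
  trusted: boolean;           // do you trust the code in the container?
  needsRealKernel: boolean;   // arbitrary syscalls, kernel modules
  reusesLongLivedVm: boolean; // boot cost already paid
}

type Runtime =
  | "runc / bubblewrap"
  | "gVisor"
  | "Full VM (QEMU)"
  | "Firecracker microVM";

function pickRuntime(w: Workload): Runtime {
  if (w.trusted) return "runc / bubblewrap";
  if (!w.needsRealKernel) return "gVisor";
  return w.reusesLongLivedVm ? "Full VM (QEMU)" : "Firecracker microVM";
}
```

An untrusted per-session workload that needs a real kernel lands on Firecracker, which is the case the rest of this post is about.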

Who uses this

It's almost faster to list the serverless platforms that don't sit on top of microVMs.

Firecracker in production:

  • AWS Lambda and AWS Fargate: the original use case. Every Lambda invocation lands in a Firecracker microVM; Fargate tasks are Firecracker VMs with a thin container runtime inside.
  • Fly.io Machines: every fly machine run is a Firecracker microVM, globally distributed, with sub-second cold starts and persistent disks.
  • Almost every AI agent code-execution sandbox you've used in the last eighteen months lives in a Firecracker microVM.

The shape of a sandbox API is roughly the same across vendors at this point:

const sbx = await Sandbox.create({ template: "python-3.11" });
const { stdout } = await sbx.commands.run("python -c 'print(sum(range(100)))'");
console.log(stdout); // "4950"
await sbx.kill();

In around four lines of code: a Firecracker microVM boots, a kernel initializes, an agent process inside the guest receives your command over vsock, runs it, streams results back, and the VM dies.

The Agent era: why this all matters now

A year ago, "what's an AI sandbox?" was a niche question. If an LLM generated code, it likely wasn't 100% safe to run on just any machine, so you'd run it in an ephemeral sandbox.

Today every serious AI product ships an agent. Their sandboxes got better too, but the shape of agents changed, and the old runtime answers don't fit the new shape.

In-process agents vs host-level agents

Round one of AI agents lived inside your application. You imported a library, wired up a loop, and ran it in your existing backend:

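// Something like this; webSearch is your own search helper, defined elsewhere
import { streamText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const result = await streamText({
  model: openai("gpt-4.1"),
  tools: {
    search: tool({
      description: "Search the web",
      parameters: z.object({ q: z.string() }),
      // runs inside your own server process, not in a sandbox
      execute: async ({ q }) => webSearch(q),
    }),
  },
  prompt: "Find the top 3 posts about Firecracker",
});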

Every call was an HTTP round-trip to a model. Every tool call was a function in your own process. The "sandbox" was your own server. This is the Vercel AI SDK, LangChain, OpenAI Agents SDK world. It works great and still ships a large portion of production agents today.

Round two is different. Claude Code, Codex, and OpenCode are host-level agents: binaries that take over a machine, not libraries that live inside yours. They expect a real shell, a package manager, and a writable disk. When you give Claude Code a task, it runs this kind of thing:

# inside an agent's sandbox
apt-get install -y git ripgrep build-essential
git clone https://github.com/user/project && cd project
npm install
npm run test       # runs your test suite
rg 'TODO' -l       # greps the codebase
# edits files in place
# git commit

That's a full shell session. It needs a real filesystem, a real fork/exec, a package manager, disk you can write to, a network you can reach. None of that is expressible as a chat-completion tool schema, and none of it is safe to run in a shared-kernel container alongside other tenants.

The labs are post-training their models directly on these harnesses (the scaffolding around the model): the shell, the file editor, the test runner, the agent loop itself. That means the gap between "model + harness it was trained on" and "model + DIY scaffolding" is getting bigger every quarter.

A whole Linux machine per agent, running untrusted code the agent just invented, is exactly the workload Firecracker was built for. The convergence above wasn't an accident.

We're starting to see more experimentation with separating compute from the harness. Anthropic's Managed Agents is an example of this: the agent harness runs next to the sandbox, not inside it.

Some companies are even building full hosted file systems (like Archil and Mesa) to give agents better search and storage.

As agents get better and change over time, there are going to be many more interesting infra offerings built on Firecracker.

What you're actually paying agent infra platforms for

The generic "run arbitrary code" sandboxes are a commodity now. The infrastructure is fully open-source. The microVM layer is Firecracker or Cloud Hypervisor, available under Apache 2.0. The container-to-rootfs conversion is a 200-line Go script. Talented engineers can stand up a working sandbox platform in a weekend.

You pay for what's connected to the VM. The bare microVM is table stakes.

The interesting product surface:

  • Observability is the product, not a debug aid. Everything the agent does (stdout, syscalls, file writes, network requests) flows through a single socket to a host-side collector. Agent builders need full session replay and per-action artifacts to build the best products.
  • Secrets are brokered at the wire, never handed to the guest. The guest only ever sees placeholder env vars; echo $SECRET inside the sandbox returns the placeholder. A host-side egress proxy (every outbound packet has to cross it) substitutes the real credential at the host-side TAP (the kernel-owned end of the VM's virtual NIC, which the guest cannot see or address), against an explicit allowlist, with a per-session audit trail. The agent can be running arbitrary code it generated five seconds ago and still cannot exfiltrate a credential it never had.
  • Identity is signed at the host, not inside the agent. Outbound requests can carry a cryptographic per-session identity (including Web Bot Auth signatures, built on HTTP Message Signatures + Ed25519) minted by the host before the packet leaves the bridge. The signing key never enters the microVM.
  • The other compute is bundled in the same microVM as the runtime. Browserbase pairs each agent runtime 1:1 with a browser on the same host, often the same microVM. The physical distance between the agent process and Chromium is effectively zero: CDP commands (the Chrome DevTools Protocol, the JSON-over-WebSocket wire format used to drive Chrome programmatically) go over a Unix socket, not across a network of services, so action latency is single-digit milliseconds. Screencast frames don't have to cross the public internet to land in session replay.
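
The secrets bullet is the easiest of these to sketch. A minimal model of what a host-side egress proxy does with an outbound request, assuming header-based credentials: `OutboundRequest` and `brokerEgress` are hypothetical names, and a real proxy would also handle TLS, request bodies, and audit logging.

```typescript
interface OutboundRequest {
  host: string;
  headers: Record<string, string>;
}

// Returns a rewritten copy of the request, or null if the destination is not
// on the allowlist. The guest's original request object is never mutated.
function brokerEgress(
  req: OutboundRequest,
  secrets: Record<string, string>, // placeholder -> real credential, host-side only
  allowedHosts: string[],
): OutboundRequest | null {
  if (allowedHosts.indexOf(req.host) === -1) return null;

  const headers: Record<string, string> = {};
  for (const name of Object.keys(req.headers)) {
    let value = req.headers[name];
    if (value === undefined) continue;
    for (const placeholder of Object.keys(secrets)) {
      const real = secrets[placeholder];
      if (real === undefined) continue;
      // split/join replaces every occurrence without regex escaping concerns.
      value = value.split(placeholder).join(real);
    }
    headers[name] = value;
  }
  return { host: req.host, headers };
}
```

The guest only ever holds the placeholder string, so even code the agent wrote five seconds ago has nothing real to leak; requests to hosts outside the allowlist never get the substitution at all.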

And you can't just stitch all of this together cleanly on top of Docker. The seams aren't there. Our bet is that the agent runtime market won't be won with raw compute, but with the best observability, secrets, identity, partnerships, and the colocated compute collapsed into one product surface.

[Interactive figure: what the host can do at each seam of the microVM: egress proxy, vsock telemetry, snapshot fork, observability mount]

Runtime alternatives worth watching

  • Bubblewrap: unprivileged user-namespace sandboxing. A non-root user can spin up a sandbox without sudo, using the same kernel primitives Flatpak uses to confine desktop apps. Lighter than a VM, still shares the host kernel, so it's not a substitute for microVMs against truly untrusted code. But it's a great nested-isolation layer to run inside a microVM, or a fine choice for trusted-ish code on your own host.
  • V8 isolates: Cloudflare Workers' model. Each isolate is a separate JS execution context with its own heap, all sharing a single V8 process with potentially thousands of other tenants. Startup is ~5ms, more than an order of magnitude faster than a microVM's ~125ms. The trust boundary is V8's own sandbox; historically it's held up well, but it's a much thinner line than a hypervisor's. The other catch: you only get Node-flavored semantics. No fork, no exec, no native modules, simulated filesystems. Devastatingly effective for pure JS agent code; useless if you need to pip install numpy.
  • gVisor: Google's user-space kernel in Go. Strong isolation without nested virt (a guest VM running inside another VM, which most cloud providers disable by default; gVisor doesn't need it, so it works in GKE out of the box). Pays ~10–30% on syscall-heavy workloads. A solid middle ground when hardware virt isn't available.
  • WASM sandboxes (wasmtime, wasmer): deterministic, small, fast, but the ecosystem is shallow. WASI (the standard syscall API for WASM) is maturing. Not a drop-in target for "run this arbitrary Python/Node binary" yet.

[Interactive figure: where each runtime sits on startup time, memory overhead, attack surface, and syscall compatibility]

If you're building for untrusted general-purpose code: Firecracker (or Cloud Hypervisor, a similar VMM/virtio design). If you're building for known JS workloads: V8 isolates. Everything else is a specialized answer to a specialized question.

The bigger picture

Firecracker took one of the oldest ideas in computing, a virtual machine, and deleted enough of it to make it cheap. It's betting that hardware-enforced isolation is worth it if you can make it fast enough.

That bet was always going to pay off for serverless. What's changed is that the "untrusted multi-tenant code" workload has grown from "a web function I don't want to sandbox" to "an agent generating arbitrary commands that might touch prod." The perimeter moved and the tolerance for shared-kernel escapes went from "acceptable risk" to "unshippable."

And the bet paid off. It's a Rust binary, about 50,000 lines long, that talks to /dev/kvm.

Containers package software. MicroVMs isolate it. The interesting engineering of the next decade is everything you wrap around the box.

→ Kyle

Footnotes

1. For the formal threat model and a hardening checklist, see NIST SP 800-190 and the OWASP Docker Security Cheat Sheet. Kernel CVEs in shared namespaces drive the rest of the story.

2. The Firecracker NSDI '20 paper by Agache et al. is the canonical reference: threat model, design decisions, and production lessons from running Lambda on it (hyperthreading bugs, memory ballooning tradeoffs, snapshot semantics). Worth reading cover to cover.