Here's something the container ecosystem doesn't say loudly enough: runc is not the only option, and for a growing number of production workloads, it's the wrong one.
AWS Lambda doesn't run your function in a Docker container. It runs it in a Firecracker microVM. Fly.io's Machines? Firecracker fork. Google's multi-tenant GKE nodes? gVisor. Cloudflare Workers? V8 isolates, with WASM as a supported target. These companies didn't reach for exotic runtimes because they were bored; they reached for them because the default isolation model was insufficient for their threat model, their latency requirements, or both.
This article takes one tiny Go HTTP server and runs it through all five of them: runc/distroless, gVisor, Kata + QEMU, Kata + Firecracker, and WASM/WASI. You'll see exactly what changes (almost nothing), what the real numbers look like, and — most importantly — which runtime belongs in which situation.
TL;DR: gVisor, Kata, and Firecracker all run the exact same 3 MB OCI image; only --runtime=X changes. WASM is a different compilation target entirely. Cold-start ranges from ~20 ms (runc) to ~500 ms (Kata/QEMU), with Firecracker splitting the difference at ~125 ms. Request latency overhead at steady state is shockingly small across all of them. The real cost is memory and compatibility, not throughput.
The App
Before the runtimes, the subject: a Go HTTP server with one meaningful endpoint, which returns a small JSON description of the environment it is running in.
RUNTIME_NAME is injected at docker run time. Everything else — Go version, arch, PID, uptime — is live from inside whatever sandbox is holding it. When the runtime changes, the response field tells the story.
Runtime 1: Distroless + runc
What it is
The default Docker runtime (runc) but with a distroless base image. No shell, no package manager, no apt, no curl. Just the Go binary and CA certificates.
The image comes out at 3.0 MB. Alpine would be ~18 MB. Ubuntu ~80 MB.
The honest security story
Distroless does not change your isolation model. The container still shares the host kernel. What it does is remove every tool an attacker would use after a successful exploit — no shell to drop into, no package manager to pull more tools from, no /tmp scripts to run. You're not preventing the breach; you're making the post-breach environment hostile.
Ultimate use cases
- Internal microservices in a trusted, single-tenant cluster
- GitOps pipelines where you control every image in the registry
- Replacing fat Alpine images — the size drop alone is worth it
- The security baseline every team should hit before adding runtime overhead
Runtime 2: gVisor (runsc)
What it is
gVisor ships a user-space Linux kernel called the Sentry, written in Go, that runs alongside your container. Every syscall your container makes goes to the Sentry, which services it itself and makes only a narrow, filtered set of host syscalls on its own behalf. The host kernel never handles your container's syscalls directly.
# Same 3 MB image. One flag.
docker run --rm --runtime=runsc \
-p 8080:8080 -e RUNTIME_NAME=gvisor \
micro-containers
The Sentry re-implements the Linux ABI. In ptrace mode it intercepts via ptrace; the newer Systrap mode (shipped 2023, ~2× faster) uses seccomp to intercept. Either way, a kernel exploit attempted from your container hits the Sentry's implementation rather than the host kernel's; there is no direct path.
The honest security story
gVisor's threat model is syscall isolation. A container escape that relies on a host kernel bug (Dirty Pipe, for example) has to get through the Sentry first. But gVisor is not a VM: the Sentry is still an ordinary process on the host kernel, relying on the host's scheduler and memory management. It's a strong sandbox, not a hardware-enforced boundary.
2025 state of the world
- GKE Sandbox is gVisor, enabled per node pool; pods opt in with runtimeClassName: gvisor
- Systrap mode is now the default, which nearly removes the performance cliff that made early gVisor a tough sell
- GPU support covers NVIDIA data-center GPUs such as A100/H100 by proxying driver calls through the Sentry (nvproxy) rather than raw device passthrough, which is relevant if you're sandboxing AI inference workloads
Ultimate use cases
- CI/CD runners — the #1 production use case. GitHub Actions self-hosted, GitLab runners, Buildkite agents that execute arbitrary user pipelines. You don't control the code; gVisor limits the blast radius.
- ML inference APIs where users submit model weights or custom code — you can't trust what's in those pickles
- SaaS plugin execution — any platform that lets users run custom logic (Zapier-style automations, Retool actions, webhook processors)
- Cloud IDE backends — Codespace-style environments where each user gets a container that feels like root
Runtime 3: Kata Containers (QEMU VMM)
What it is
Kata Containers boots a lightweight QEMU MicroVM per container. Your app runs inside a VM with its own kernel. containerd sees an OCI runtime; your process sees a dedicated Linux instance.
docker run --rm --runtime=kata-runtime \
-p 8080:8080 -e RUNTIME_NAME=kata-qemu \
micro-containers
The host sees a qemu-system-x86_64 process — nothing inside leaks out. The container image is mounted via virtiofs. The kernel boundary is real.
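One way to see the boundary from the inside is to ask the process which kernel it thinks it is running on. The sketch below reads /proc/version and extracts the release string. Under runc this should match the host's `uname -r`; under Kata it should be the guest kernel's release; under gVisor it is whatever the Sentry chooses to report. The helper name `kernelRelease` is illustrative and not taken from the repo.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// kernelRelease extracts the release from the contents of /proc/version,
// which conventionally starts with "Linux version <release> ...".
// It returns "unknown" if the text does not have that shape.
func kernelRelease(procVersion string) string {
	fields := strings.Fields(procVersion)
	if len(fields) >= 3 && fields[0] == "Linux" && fields[1] == "version" {
		return fields[2]
	}
	return "unknown"
}

func main() {
	raw, err := os.ReadFile("/proc/version")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /proc/version:", err)
		return
	}
	fmt.Println("kernel seen by this process:", kernelRelease(string(raw)))
}
```

Running the same binary under each --runtime value and comparing the output against the host is a quick sanity check that the runtime you asked for is the one you got.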
The honest security story
Kata is the only runtime family here that provides a hardware-enforced boundary (CPU virtualization) between container and host, whichever VMM sits underneath. gVisor is software isolation. Kata is a VM. If your threat model requires that a kernel exploit inside the container lands in a guest kernel rather than the host's, Kata is the answer.
2025 state of the world
- Kata 3.x ships with confidential container support: Intel TDX and AMD SEV-SNP give you hardware-attested memory encryption. The host operator can't inspect container memory — relevant for regulated data.
- Cloud Hypervisor is now a supported VMM alternative to QEMU, lighter and faster to boot
- Confidential Containers (CoCo) as a CNCF project wraps Kata + hardware attestation into a first-class primitive — watch this space
Ultimate use cases
- PCI-DSS, HIPAA, FedRAMP: when the compliance checklist literally says "VM-level isolation," Kata checks that box while keeping the container workflow, so you don't have to manage VMs by hand
- Financial services — trade processing, settlement systems, anything touching payment card data
- Healthcare data pipelines — PHI processing where you need a kernel boundary in the audit trail
- Multi-tenant databases — giving each tenant a database that physically cannot escape its VM
- Government/defense workloads — environments where the security control plane doesn't trust the container runtime
Runtime 4: Kata + Firecracker VMM
What it is
Firecracker was built by AWS for Lambda and Fargate and open-sourced in 2018. With Kata, it replaces QEMU as the VMM. The device model is stripped to the minimum a serverless function needs: one network interface, one block device, one serial port. No BIOS. No PCI bus. No USB enumeration. No legacy device emulation of any kind.
# Kata reads configuration-fc.toml and invokes Firecracker instead of QEMU
docker run --rm --runtime=kata-fc \
-p 8080:8080 -e RUNTIME_NAME=kata-firecracker \
micro-containers
Cold start drops from ~500 ms (QEMU) to ~125 ms. Memory overhead drops by nearly half.
The honest security story
Same VM isolation guarantee as Kata/QEMU — a dedicated kernel per container. The tradeoff for the speed gain is device compatibility: no GPU passthrough, no USB, fewer PCIe options. For stateless functions, you don't need any of that.
2025 state of the world
- Firecracker 1.7+ — production-stable, used in billions of Lambda invocations per day. AWS open-sourced it and it ships new major versions regularly.
- Fly.io Machines use a Firecracker fork as the core primitive: every fly machine run is a microVM
- AWS Serverless Aurora uses Firecracker to isolate query execution environments
- Confidential Firecracker is in active development — combining Firecracker's boot speed with AMD SEV memory encryption
Ultimate use cases
- Serverless function platforms — this is what Firecracker was made for. If you're building the next Lambda, Railway, or Render, Firecracker is the substrate.
- AI/ML inference bursts: inference traffic is bursty, and Firecracker's ~125 ms cold start makes scale-to-zero viable. Because Firecracker has no GPU passthrough, this fits CPU-bound inference and the request-handling tier in front of GPU workers.
- Short-lived test runners — each test run gets a clean VM, boots in 125 ms, exits, gets GC'd. No shared state, no contamination between runs.
- Multi-tenant job queues — background jobs that process user-submitted data. Firecracker gives you VM isolation at a price point runc used to own.
- Preview environments — spin up a full-stack environment for each PR, destroy it on merge. The economics work at ~125 ms boot + minimal memory overhead.
Runtime 5: WASM / WASI preview1
What it is
The binary is compiled to WebAssembly with Go's WASI target (GOOS=wasip1 GOARCH=wasm, available since Go 1.21). The result is an entirely different binary and an entirely different image.
The resulting image: 3.1 MB scratch base + one .wasm binary. The sandbox is enforced at the language-runtime level — no syscalls, capabilities explicitly granted by the host.
The honest HTTP story
net/http doesn't work in WASI preview1. The spec has no socket API. This demo outputs JSON to stdout. That's not a cop-out — it's the current state of the standard. The wasi-http proposal shipped as part of WASI 0.2, which is ratified. Fermyon Spin 2.x implements it today. Go's WASI 0.2 support is in progress.
2025 state of the world
- WASI 0.2 (Component Model) is ratified and shipping in wasmtime, WasmEdge, and Fastly Compute. wasi-http is a real, stable interface.
- Docker+Wasm is available in Docker Desktop 4.27+ as a beta feature: run a WASM container with --platform=wasi/wasm and a containerd shim
- Fermyon Spin 2.x compiles Go to WASM with a full HTTP server abstraction; the framework papers over the WASI/HTTP gap today
- WasmPlugin in Istio: Istio's WasmPlugin resource loads WASM extensions into Envoy for custom policy, auth, and observability logic
- Extism — a cross-language WASM plugin framework that lets you embed sandboxed user code in any Go/Rust/Python host
Ultimate use cases
- Edge functions: Fastly Compute runs WASM directly, while Cloudflare Workers, Deno Deploy, and Vercel Edge Functions run V8 isolates that can execute WASM modules alongside JavaScript. The same binary runs in London, Singapore, and São Paulo with no containers to spin up.
- Cross-platform CLI tools — compile once, run on Linux/macOS/Windows/browser with no CGO, no cross-compilation matrix
- Sandboxed plugin systems — give users scriptable extensions with a real capability boundary. Zellij (terminal multiplexer) uses WASM plugins; VS Code extensions are moving this direction.
- Business logic in the browser + server — tax calculation, pricing rules, validation logic that needs to run identically client-side and server-side
- AI prompt/response filters — fast, sandboxed, hot-reloadable logic at the edge before a request hits your inference endpoint
The Numbers
All OCI runtimes run the same 3 MB distroless image. The distroless/runc row was measured on real hardware; the sandboxed-runtime rows are reference numbers from project documentation. Run make bench-md for your own numbers.
| Runtime | Image | Cold Start | p50 | p95 | Memory |
|---|---|---|---|---|---|
| distroless / runc | 3.0 MB | ~20 ms¹ | 0.28 ms | 0.41 ms | 6.9 MB |
| gVisor (runsc) | 3.0 MB | ~50 ms | ~0.5 ms | ~1.0 ms | ~18 MB |
| Kata / QEMU | 3.0 MB | ~500 ms | ~0.8 ms | ~1.5 ms | ~52 MB |
| Kata / Firecracker | 3.0 MB | ~125 ms | ~0.7 ms | ~1.3 ms | ~28 MB |
| WASM (wasmtime) | 3.1 MB | N/A² | — | — | — |
¹ First run ~174 ms (overlay FS init); subsequent ~20 ms on warm cache.
² WASM has no HTTP server in wasip1; exec time ~8 ms for the stdout variant.
Three things the numbers tell you that prose doesn't:
Image size is not the story. All five runtimes land at 3–3.1 MB. Switching from runc to Firecracker doesn't touch your image pipeline.
Latency overhead at steady state is negligible. Even inside a Kata VM, p50 latency is under 1 ms. The isolation boundary costs you cold-start and memory, not throughput. If you're worried about runtime overhead on a running service, stop — that's not where the overhead lives.
Firecracker hits the practical sweet spot. 125 ms is the number AWS decided was fast enough for Lambda. 500 ms (QEMU) is where users start feeling it. Firecracker lands right where microVM isolation becomes viable for interactive-latency workloads.
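The arithmetic behind those three points can be checked directly against the table. The sketch below hard-codes the table's values (approximate reference numbers, as noted above) and prints each runtime's cold-start and memory multiple relative to runc, plus the absolute p50 difference. The type and function names are illustrative.

```go
package main

import "fmt"

// row holds one line of the table above; values are copied from it verbatim.
type row struct {
	name        string
	coldStartMs float64
	p50Ms       float64
	memoryMB    float64
}

// ratio reports how many times larger v is than baseline.
func ratio(v, baseline float64) float64 { return v / baseline }

func main() {
	base := row{"distroless / runc", 20, 0.28, 6.9}
	others := []row{
		{"gVisor (runsc)", 50, 0.5, 18},
		{"Kata / QEMU", 500, 0.8, 52},
		{"Kata / Firecracker", 125, 0.7, 28},
	}
	for _, r := range others {
		fmt.Printf("%-20s cold start %.2fx  memory %.2fx  p50 +%.2f ms\n",
			r.name,
			ratio(r.coldStartMs, base.coldStartMs),
			ratio(r.memoryMB, base.memoryMB),
			r.p50Ms-base.p50Ms)
	}
}
```

Kata/QEMU's cold start works out to 25 times runc's, while its p50 penalty is about half a millisecond. That asymmetry is the whole argument: the multiples are large for startup and memory, and the steady-state cost is small in absolute terms.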
The Decision Framework
Stop asking "which is more secure?" Start asking "what's my threat model?"
Your tenants are you. You control every image, every workload, every user. → runc + distroless. Fast, simple, no overhead.
Your tenants are your users, but you control the runtime environment. CI runners, SaaS execution engines. → gVisor. Drop-in, no KVM, syscall isolation stops the most common container escapes.
You have compliance paperwork that says "VM-level isolation." → Kata + QEMU. A dedicated guest kernel satisfies an auditor asking for a kernel boundary, and QEMU offers the broadest device compatibility of the two VMMs.
You're building a platform. Functions, jobs, preview environments, AI inference. Cold-start matters. → Kata + Firecracker. This is the production-proven answer for platforms.
Your code runs everywhere, or users supply the code. Edge compute, plugins, sandboxed scripts. → WASM/WASI. The sandbox is portable; the isolation model is capability-based, not kernel-based.
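The framework above can be written down as a small decision function. This is one possible encoding, not a definitive one: the field names and the order in which the questions are asked are assumptions made for illustration, and real decisions will weigh more factors than four booleans.

```go
package main

import "fmt"

// workload captures the questions the framework asks. Field names are illustrative.
type workload struct {
	portableSandbox    bool // edge or plugin code that can target WASM/WASI
	complianceVM       bool // paperwork demands VM-level isolation
	untrustedCode      bool // tenants run code you did not write
	coldStartSensitive bool // functions, jobs, preview environments
}

// recommend maps a workload to the runtime the framework suggests.
// The case order is a judgment call: the most specific constraint wins.
func recommend(w workload) string {
	switch {
	case w.portableSandbox:
		return "WASM/WASI"
	case w.complianceVM:
		return "Kata + QEMU"
	case w.untrustedCode && w.coldStartSensitive:
		return "Kata + Firecracker"
	case w.untrustedCode:
		return "gVisor"
	default:
		return "runc + distroless"
	}
}

func main() {
	ci := workload{untrustedCode: true}
	platform := workload{untrustedCode: true, coldStartSensitive: true}
	fmt.Println("CI runner:     ", recommend(ci))
	fmt.Println("FaaS platform: ", recommend(platform))
	fmt.Println("internal svc:  ", recommend(workload{}))
}
```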
Run It Yourself
git clone https://github.com/copyleftdev/micro-containers
cd micro-containers
make check # see what's installed
make bench-fast # quick smoke-test with 20 samples
make bench-md # full benchmark → Markdown table
Each runtime has an install.sh in runtimes/<name>/. The benchmark driver skips unavailable runtimes and tells you exactly what to install.
Source, Dockerfiles, benchmark driver, and install scripts: copyleftdev/micro-containers

















