Docker Multi-Platform Builds for Node.js: Stop Playing Architecture Roulette in CI
The Practica · 2026-06-15 · via The Practical Developer

A three-line deploy script pushed a new image to ECR. The CI passed. The tag was latest. Twelve minutes later, every EC2 instance in the auto-scaling group had the new container running. And every single one was crashing with exec format error on startup. The developer who built the image was on an M2 Mac. The CI runner was AMD64. The Dockerfile had no --platform flag, so the build created an arm64 image that would not run on any x86 CPU. The sprint was not off to a good start.

This is the most common multi-architecture failure mode, and it is completely preventable. If you work on a team where some developers use Apple Silicon Macs, others use x86 Linux workstations, and production runs on AMD64 (or a mix of Graviton and x86 instances), you need a build pipeline that produces images for both architectures from a single Dockerfile. This post covers the full setup: buildx drivers, QEMU emulation, platform-specific native addons, CI integration with GitHub Actions, and the caching strategy that keeps multi-platform builds fast enough to run on every push.

Why platform mismatch breaks your images

Docker images are not architecture-agnostic. The base image you specify in FROM node:22-slim is a manifest list that points to different images for different CPU architectures. When you build without specifying a platform, Docker uses the architecture of the build host. On an M-series Mac, that is linux/arm64. On a standard GitHub Actions runner, that is linux/amd64. The resulting image layers contain binaries compiled for the host architecture. Attempting to run an arm64 binary on an amd64 CPU without emulation produces the exec format error you saw above.

The problem is worse for Node.js applications that depend on native addons. Packages like sharp, node-canvas, pg-native, and anything using node-gyp either compile C++ code or download a prebuilt binary during npm install (or npm rebuild). Those artifacts are architecture-specific. An arm64 sharp binary cannot be loaded by an x64 Node.js process, so copying node_modules from a developer's Mac, or from a build stage that ran on a different architecture, produces an image that starts and then fails the first time the addon is required. That is much harder to debug than a clean exec format error at startup.

The fix is to build for the target architecture explicitly, in a controlled build environment, and push a manifest list that lets each host pull exactly the image it needs.

Step 1: Install buildx and register QEMU binfmt

Docker Desktop on Mac includes buildx by default. On Linux, buildx is a CLI plugin that Docker's official packages install as docker-buildx-plugin; if docker buildx version fails, install that package. The docker buildx command extends the classic docker build with multi-platform support through BuildKit, and it requires Docker 19.03 or newer.

For multi-platform builds, Docker uses QEMU binaries to emulate foreign architectures during the build phase. This is needed because even the build steps (RUN commands that compile native addons) must run on the target architecture. To register the QEMU handlers with the Linux kernel, run this on every build host (including CI runners):

# Install QEMU static binaries and register them with binfmt_misc
docker run --privileged --rm tonistiigi/binfmt --install all

This container mounts the host’s /proc/sys/fs/binfmt_misc and installs handlers for arm64, armv7, s390x, ppc64le, and riscv64. Once registered, the kernel can transparently execute foreign-architecture binaries through QEMU.

You can verify the registration:

$ docker run --rm --platform linux/arm64 node:22-slim uname -m
aarch64

If that command works on an AMD64 host, QEMU is configured correctly. If it hangs or errors, the binfmt handlers are not installed.

Important: GitHub Actions hosted runners do not have binfmt handlers pre-installed. You must register QEMU on every CI run. The docker/setup-qemu-action does this in one step, which we will cover in the CI section below.

Step 2: Create a builder instance

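Buildx uses builder instances to manage the build context and cache. The default builder uses the docker driver, which cannot produce a multi-platform image in a single build unless the containerd image store is enabled. The portable option is to create a builder with the docker-container driver: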

docker buildx create --name multiplatform --driver docker-container --bootstrap
docker buildx use multiplatform

The docker-container driver starts a BuildKit container that handles the multi-platform build orchestration. It persists its cache in a named volume, so subsequent builds are incremental. The --bootstrap flag starts the builder immediately so your first build does not wait for a cold start.

On CI, you will not run these commands directly. The docker/setup-buildx-action handles them for you.

Step 3: Write a platform-aware Dockerfile

A well-written Dockerfile should produce identical images for both architectures with the same set of layers. Most Dockerfiles need zero changes to become platform-agnostic if they follow standard practices. The tricky parts are native addon installation and platform-specific package manager repos.

Here is a Dockerfile for a Node.js service that handles both architectures correctly:

# docker/Dockerfile
FROM node:22-slim AS base
WORKDIR /app

# Stage 1: install production dependencies
FROM base AS deps
COPY package.json package-lock.json ./
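# The target platform comes from the build invocation (--platform)
RUN npm ci --omit=dev --ignore-scripts
# Run the skipped install scripts so native addons are built for the target platform.
# Addons that compile from source also need python3, make, and g++ installed in this stage.
RUN npm rebuild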

# Stage 2: build step (only if you have a build step)
FROM base AS build
COPY package.json package-lock.json ./
RUN npm ci
COPY tsconfig.json ./
COPY src/ src/
RUN npm run build

# Stage 3: production image
FROM base AS production
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    tini \
    && rm -rf /var/lib/apt/lists/*

COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist

ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["node", "dist/index.js"]

The key line is RUN npm rebuild in the deps stage. Because npm ci ran with --ignore-scripts, no install script has executed yet; npm rebuild runs them, which compiles native addons (or fetches their prebuilt binaries) for whatever architecture the build is targeting (arm64 or amd64). If you skip this step and let the install scripts run implicitly, sharp and similar packages will usually pick the correct prebuilt binary for the target platform, but only if they detect it correctly inside QEMU. Explicit rebuild is safer.

When prebuilt binaries fail

Some packages ship prebuilt binaries for popular platforms. sharp, for example, distributes prebuilt libvips binaries for linux-x64 and linux-arm64. These download automatically during npm install and do not require a C++ toolchain. This works well for multi-platform builds because npm downloads the correct binary for the emulated architecture.

Other packages always compile from source. The node-rdkafka package, for instance, requires librdkafka and a C compiler. When this happens inside QEMU, the build is slower (sometimes 5-10x slower) but it produces correct output. The solution is not to skip the package. It is to accept the slower build and use Docker caching to avoid rebuilding on every push.

If a native addon genuinely fails to build under QEMU (some packages check /proc/cpuinfo or use architecture-specific inline assembly), you have three options:

  1. Pin the package version and build on a native host using a multi-architecture build matrix (discussed below).
  2. Use a multi-stage cross-compilation toolchain that compiles for the target architecture on the native host, bypassing QEMU entirely.
  3. Switch to a JS-native alternative if one exists that meets your performance requirements.

Option 1 is the most practical for most teams. We cover it in the CI section.

Step 4: Build and push both architectures

With buildx configured and a platform-agnostic Dockerfile, the build command is straightforward:

docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag myregistry/myapp:latest \
  --tag myregistry/myapp:$(git rev-parse --short HEAD) \
  --cache-to type=registry,ref=myregistry/myapp:cache,mode=max \
  --cache-from type=registry,ref=myregistry/myapp:cache \
  --push \
  -f docker/Dockerfile \
  .

Breaking this down:

  • --platform linux/amd64,linux/arm64 tells buildx to build for both architectures. BuildKit creates separate layer sets for each platform, then assembles a manifest list (also called a fat manifest) that points to both images.
  • --cache-to type=registry,... exports the build cache to your container registry. The mode=max flag caches all layers, including intermediate stages. Without a cache export, every CI build on a fresh runner starts from scratch.
  • --cache-from type=registry,... pulls the cache from the registry before building, so unchanged layers are reused.
  • --push pushes the manifest list and both platform-specific images to the registry in one command.

The first build will take 3-5 minutes because QEMU emulation slows down native addon compilation. Subsequent builds take 30-60 seconds because the cache hits for everything except changed layers.

Step 5: CI/CD with GitHub Actions

Here is the complete GitHub Actions workflow. It builds multi-platform images on every pull request against main and pushes them on every push to main:

# .github/workflows/docker-build.yml
name: Docker Multi-Platform Build

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix=
            type=ref,event=branch
            type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }}

      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          file: docker/Dockerfile
          platforms: linux/amd64,linux/arm64
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

The critical steps are docker/setup-qemu-action (registers binfmt handlers on the runner) and docker/setup-buildx-action (creates a docker-container builder). Without these two steps, the build either fails or produces single-architecture images.

Note the use of type=gha for cache in this example. GitHub Actions cache (type=gha) is stored with the repository's other Actions caches and does not require additional registry storage. For larger images, registry-based cache (type=registry,ref=...) is the better fit because it is not subject to the 10GB GitHub Actions cache limit, past which older entries are evicted. Choose whichever fits your infra.

Using a native arm64 runner (avoiding QEMU entirely)

GitHub offers ubuntu-24.04-arm hosted runners (public beta). If your team uses ARM-based infrastructure (Graviton EC2 instances, for example), you can build the arm64 image natively and skip QEMU entirely:

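jobs:
  build-amd64:
    runs-on: ubuntu-latest
    outputs:
      digest: ${{ steps.build.outputs.digest }}
    steps:
      # ... checkout, buildx setup, login ...
      - name: Build amd64
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          file: docker/Dockerfile
          platforms: linux/amd64
          # push=true uploads the image; push-by-digest leaves it untagged
          outputs: type=image,name=myapp,push-by-digest=true,name-canonical=true,push=true

  build-arm64:
    runs-on: ubuntu-24.04-arm
    outputs:
      digest: ${{ steps.build.outputs.digest }}
    steps:
      # ... same setup steps ...
      - name: Build arm64
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          file: docker/Dockerfile
          platforms: linux/arm64
          outputs: type=image,name=myapp,push-by-digest=true,name-canonical=true,push=true

  merge:
    needs: [build-amd64, build-arm64]
    runs-on: ubuntu-latest
    steps:
      # ... buildx setup, login ...
      - name: Create manifest list
        # Job outputs carry each digest across jobs; files under /tmp do not survive between runners
        run: |
          docker buildx imagetools create -t myapp:latest \
            myapp@${{ needs.build-amd64.outputs.digest }} \
            myapp@${{ needs.build-arm64.outputs.digest }}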

This approach is faster (no QEMU slowdown) but requires more CI configuration and two parallel job runs. It is worth it if your native addon compilation takes more than 30 seconds inside QEMU. For most Node.js projects with typical packages like sharp, ioredis, and kafkajs, the single-job buildx approach is fast enough and much simpler.

Step 6: Handle platform-specific dependencies

Some Node.js packages ship different versions for different platforms, or need platform-specific configuration at build time. Three patterns handle this cleanly:

Pattern A: Conditional npm scripts

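Use scripts in package.json that detect the platform. npm skips postinstall when it runs with --ignore-scripts, as the deps stage in Step 3 does, so invoke the script explicitly there (after copying scripts/ into the stage) if you rely on it: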

{
  "scripts": {
    "postinstall": "node scripts/platform-check.js"
  }
}

// scripts/platform-check.js
const os = require('os');
const arch = os.arch(); // 'arm64' or 'x64'
const platform = os.platform(); // 'linux' or 'darwin'

if (platform === 'linux' && arch === 'arm64') {
  // Apply arm64-specific patches or configurations
  console.log('Applying arm64-specific configuration');
}

Pattern B: Lockfile with platform-specific dependencies

Modern package managers handle platform-specific packages through optional dependencies. The lockfile records which packages were resolved for which platform. When you run npm ci inside an emulated arm64 build, npm downloads the arm64 variants automatically. This is why multi-platform builds work without changes for most projects. The packages that cause trouble are the ones that do not publish platform-specific variants and always compile from source.

Pattern C: Conditional Dockerfile stages

If one architecture needs different base packages, use build arguments:

ARG TARGETARCH

FROM node:22-slim AS base
WORKDIR /app

FROM base AS deps-arm64
# Illustrative package name: substitute whatever your arm64 build actually needs
RUN apt-get update && apt-get install -y --no-install-recommends \
    libvips-dev-arm64-cross

FROM base AS deps-amd64
RUN apt-get update && apt-get install -y --no-install-recommends \
    libvips-dev

FROM deps-${TARGETARCH} AS deps
# Continue with npm ci and npm rebuild
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

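BuildKit sets the TARGETARCH variable automatically to arm64 or amd64 based on the platform being built. It is available in FROM lines as shown; to read it inside a stage (in a RUN command, for example), redeclare it there with ARG TARGETARCH. Use it to select architecture-specific dependencies without manual conditionals.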
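
Docker and Node.js name architectures differently (amd64 versus x64), which matters as soon as a script compares the two. The sketch below is a hypothetical build-time guard: it assumes the stage declares ARG TARGETARCH, so the value is visible to RUN commands as an environment variable, and it fails if the Node.js binary in that stage is not running as the target architecture. That situation arises, for example, when a stage is pinned to the build host's platform for cross-compilation.

```javascript
// arch-guard.js (hypothetical build-time check)
// Docker's TARGETARCH values mapped to the names Node.js uses in process.arch.
const DOCKER_TO_NODE_ARCH = { amd64: 'x64', arm64: 'arm64', arm: 'arm', '386': 'ia32' };

// Returns true when the Docker target architecture corresponds to the
// architecture this Node.js binary runs as.
function archMatches(targetArch, nodeArch = process.arch) {
  return DOCKER_TO_NODE_ARCH[targetArch] === nodeArch;
}

// TARGETARCH is only present when the Dockerfile stage declares ARG TARGETARCH.
// Outside such a stage the variable is unset and the check is skipped.
const targetArch = process.env.TARGETARCH;
if (targetArch && !archMatches(targetArch)) {
  console.error(`stage targets ${targetArch} but Node.js runs as ${process.arch}`);
  process.exitCode = 1;
}

module.exports = { archMatches };
```

Placed before RUN npm rebuild, a failure here tells you the addons would be compiled for the wrong architecture before you spend minutes compiling them.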

Testing both architectures before deployment

Building for two architectures is step one. Verifying both images actually work is step two. Add a smoke test stage to your workflow:

- name: Smoke test arm64
  run: |
    docker run --rm --platform linux/arm64 \
      ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.meta.outputs.version }} \
      node -e "require('./dist/index.js'); console.log('arm64 OK')"

- name: Smoke test amd64
  run: |
    docker run --rm --platform linux/amd64 \
      ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.meta.outputs.version }} \
      node -e "require('./dist/index.js'); console.log('amd64 OK')"

For production confidence, run a full integration test suite against both images:

- name: Integration test arm64
  env:
    # Read by the Docker CLI on the runner, so Compose selects the arm64 variant.
    # Passing it with "run -e" would only set it inside the container.
    DOCKER_DEFAULT_PLATFORM: linux/arm64
  run: |
    docker compose -f docker-compose.test.yml -p test-arm64 up \
      --abort-on-container-exit --exit-code-from app
    docker compose -f docker-compose.test.yml -p test-arm64 run app npm test

The caching trap that makes multi-platform builds slow

The most common complaint about multi-platform builds is that they are too slow. The complaint is almost always caused by missing or misconfigured cache. Here are the three cache rules:

Rule 1: Use mode=max for --cache-to. The default mode (mode=min) only caches the final stage layers. Intermediate stages (deps, build) are rebuilt every time. With mode=max, all stages are cached across builds.

Rule 2: Do not mix type=registry cache and type=gha cache in the same workflow unless you understand the tradeoffs. Registry cache is persistent and shared across branches. GHA cache counts against the repository's 10GB limit, and a workflow run can only restore entries written on its own branch, its base branch, or the default branch. For a monorepo with multiple services, registry cache is the safer choice.

Rule 3: Cache keys include the platform. BuildKit automatically scopes cache entries by platform, so an amd64 build does not pollute the arm64 cache. You do not need to manage this yourself, but you should know it works so you do not add manual platform suffixes that break the built-in scoping.

The practical takeaway

Multi-platform Docker builds for Node.js require exactly four things:

  • QEMU binfmt registration via tonistiigi/binfmt or docker/setup-qemu-action.
  • A buildx builder with docker-container driver.
  • A platform-agnostic Dockerfile that uses RUN npm rebuild for native addons and ARG TARGETARCH for platform-specific dependencies.
  • Registry or GHA cache with mode=max to keep build times reasonable.

The workflow above takes about 15 minutes to copy into your repo and adapt. The first multi-platform build will take 3-5 minutes. Every subsequent build will take under a minute for a typical Node.js service. The cost of not having this setup is the deploy that silently pushes arm64 images to an amd64 fleet, which costs far more than 15 minutes to debug.

Before your next deploy, run through this checklist:

  • docker/setup-qemu-action is in your workflow (or binfmt is registered on self-hosted runners).
  • docker/setup-buildx-action creates a builder with docker-container driver.
  • The build action specifies platforms: linux/amd64,linux/arm64.
  • Native addons rebuild with npm rebuild in a Docker stage (not just npm ci).
  • Cache is configured with mode=max using either type=gha or type=registry.
  • Both images pass a smoke test (at minimum node -e "require('./dist/index.js')") before anything deploys from the new tag.
  • The deploy target matches one of the built platforms.

Do not let the next architecture mismatch incident be the one that wakes you up at 2 AM. A dozen lines of YAML, a handful of Dockerfile conventions, and you ship to every architecture your infra runs.


A note from Yojji

Infrastructure patterns like multi-platform container builds are exactly the kind of production plumbing that separates a service that deploys cleanly from one that crashes on the first Graviton instance. Getting the QEMU setup, cache strategy, and native addon handling right requires the same attention to detail that Yojji applies to every client engagement, from CI/CD pipelines to full-stack product delivery. Yojji is an international custom software development company founded in 2016, with offices in Europe, the US, and the UK. Their senior engineering teams specialize in the JavaScript ecosystem, cloud infrastructure on AWS, Azure, and Google Cloud, and the full cycle of product delivery from discovery through DevOps.