Linux Capabilities and Container Security for Node.js: Running Without Root
The Practica · 2026-06-24 · via The Practical Developer

Your Node.js container runs as root. You know this because your Dockerfile says FROM node:20-slim and you never added a USER directive. The process runs with uid 0 inside the container, which means that if an attacker gets RCE through a vulnerability in express, lodash, or any of the other 1,200 packages in node_modules, they have full root privileges in the container. From there, given a kernel exploit or a misconfigured seccomp profile, host access is one CVE away.

The Dockerfile that ships with half the tutorials on the internet looks exactly like this:

FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]

No non-root user. No capability drops. No read-only filesystem. No seccomp. It builds, it runs, it passes every smoke test. And it is one curl command from a container breakout that exposes the host.

This post covers the exact four things you need to harden a Node.js container: dropping Linux capabilities, running as a non-root user, mounting the root filesystem read-only, and applying a seccomp profile. Every step is deployable today, compatible with Docker and Kubernetes, and breaks nothing if you account for the side effects.

Why root inside a container is still root

A common misconception is that Docker containers run in a sandbox and that root inside a container is somehow less powerful than root on the host. That is partially true and dangerously misleading.

Docker applies a default seccomp profile and drops some Linux capabilities. But the default set of capabilities Docker keeps is generous. A node:20-slim container running as root has the following capabilities by default:

CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_FSETID, CAP_FOWNER, CAP_MKNOD, CAP_NET_RAW, CAP_SETGID, CAP_SETUID, CAP_SETFCAP, CAP_SETPCAP, CAP_NET_BIND_SERVICE, CAP_SYS_CHROOT, CAP_AUDIT_WRITE, CAP_KILL

That is fourteen capabilities, including CAP_DAC_OVERRIDE (bypass file permission checks), CAP_NET_RAW (raw socket access for ARP spoofing), and CAP_SYS_CHROOT (chroot escapes). If an attacker compromises your Node.js process, they inherit all of these.
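To see which of these a running process actually holds, you can decode the CapEff bitmask that the kernel exposes in /proc/self/status. The following is a minimal sketch rather than part of the original setup: the helper name decodeCaps is hypothetical, and the sample mask 00000000a80425fb is assumed to be what an unmodified Docker container running as root reports.

```javascript
const fs = require('node:fs');

// Capability names indexed by bit position, as defined in linux/capability.h.
const CAP_NAMES = [
  'CAP_CHOWN', 'CAP_DAC_OVERRIDE', 'CAP_DAC_READ_SEARCH', 'CAP_FOWNER',
  'CAP_FSETID', 'CAP_KILL', 'CAP_SETGID', 'CAP_SETUID', 'CAP_SETPCAP',
  'CAP_LINUX_IMMUTABLE', 'CAP_NET_BIND_SERVICE', 'CAP_NET_BROADCAST',
  'CAP_NET_ADMIN', 'CAP_NET_RAW', 'CAP_IPC_LOCK', 'CAP_IPC_OWNER',
  'CAP_SYS_MODULE', 'CAP_SYS_RAWIO', 'CAP_SYS_CHROOT', 'CAP_SYS_PTRACE',
  'CAP_SYS_PACCT', 'CAP_SYS_ADMIN', 'CAP_SYS_BOOT', 'CAP_SYS_NICE',
  'CAP_SYS_RESOURCE', 'CAP_SYS_TIME', 'CAP_SYS_TTY_CONFIG', 'CAP_MKNOD',
  'CAP_LEASE', 'CAP_AUDIT_WRITE', 'CAP_AUDIT_CONTROL', 'CAP_SETFCAP',
  'CAP_MAC_OVERRIDE', 'CAP_MAC_ADMIN', 'CAP_SYSLOG', 'CAP_WAKE_ALARM',
  'CAP_BLOCK_SUSPEND', 'CAP_AUDIT_READ', 'CAP_PERFMON', 'CAP_BPF',
  'CAP_CHECKPOINT_RESTORE',
];

// Hypothetical helper: turn a hex capability mask into a list of names.
function decodeCaps(hexMask) {
  const mask = BigInt('0x' + hexMask.trim());
  return CAP_NAMES.filter((_, bit) => (mask >> BigInt(bit)) & 1n);
}

// Assumed sample: the mask of a default Docker container running as root.
console.log(decodeCaps('00000000a80425fb').length); // 14

// On Linux, decode the live process as well.
if (fs.existsSync('/proc/self/status')) {
  const status = fs.readFileSync('/proc/self/status', 'utf8');
  const match = status.match(/^CapEff:\s*([0-9a-fA-F]+)$/m);
  if (match) console.log(decodeCaps(match[1]));
}
```

After the hardening steps in this post, the live check should print an empty list.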

The attack chain looks like this:

  1. A prototype pollution vulnerability in a dependency lets the attacker write a file to disk.
  2. Because the container runs as root, the file lands with uid 0 and can overwrite any binary in /usr, /sbin, or anywhere else in the container.
  3. The attacker writes a malicious binary that uses CAP_SYS_CHROOT and a mounted /proc to escape the container namespace.
  4. The attacker now has a foothold on the host.

Every step of this chain is blocked by the hardening techniques below.

Step 1: Drop every capability, then add back only what you need

The first and easiest hardening step is to drop all capabilities and only add back the ones your application actually needs.

For a typical Node.js HTTP server, the only capability you need is CAP_NET_BIND_SERVICE if you want to bind to a privileged port (under 1024). If your application listens on port 3000 or above (which it should), you do not even need that.
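On the application side, this only requires reading the port from configuration and keeping it out of the privileged range. A minimal sketch, assuming a PORT environment variable and a hypothetical resolvePort helper (neither appears in the original setup):

```javascript
// Hypothetical helper: pick the listen port and refuse privileged ports,
// so the container never needs CAP_NET_BIND_SERVICE.
function resolvePort(env = process.env) {
  const port = Number(env.PORT ?? 3000);
  if (!Number.isInteger(port) || port < 1024 || port > 65535) {
    throw new Error(`PORT must be an integer between 1024 and 65535, got "${env.PORT}"`);
  }
  return port;
}

console.log(resolvePort({}));               // 3000
console.log(resolvePort({ PORT: '8080' })); // 8080
```

Pass the result to server.listen() and expose port 80 or 443 at the load balancer or through the ports: mapping instead of inside the container.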

Docker Compose:

services:
  app:
    build: .
    cap_drop:
      - ALL
    cap_add: []
    ports:
      - "3000:3000"

Docker run (add NET_BIND_SERVICE back only if you bind to a port below 1024):

docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE my-app

But wait. If you test --cap-drop=ALL on a Node.js container running as root, you might see something unexpected. Node.js’s fs module calls uv_fs_open(), which ends up in the open()/openat() system call. Without CAP_DAC_OVERRIDE, the kernel enforces file permission bits even for uid 0, so root can no longer write to files and directories it does not own. If your application writes a log file or stores an upload, the uid and gid of the running process must have write permission on the target directory. That is an ownership problem rather than a reason to keep the capability, and the next step solves it.

The key insight: capability drops are free. They add zero runtime overhead, they require no code changes, and they block entire classes of kernel-level exploits. There is no reason not to drop ALL and add back only what you need.

Step 2: Run as a non-root user

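This is the single highest-impact change you can make. A process running as uid 1000 inside the container cannot write to /usr/bin, cannot modify /etc/passwd, and cannot call chroot(). Strictly speaking, the kernel gates privileged operations on capabilities rather than on the uid, but a process that starts with a non-zero uid begins with empty effective and permitted capability sets. Whatever the container's bounding set allows, the process cannot use it unless it executes a setuid or file-capability binary, and the no-new-privileges option covered later closes that door too.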

The Dockerfile change is small:

FROM node:20-slim

# The official image already ships a "node" user and group at uid/gid 1000,
# so rename them instead of creating a second account with the same ids
RUN groupmod --new-name appuser node && \
    usermod --login appuser --home /home/appuser --move-home node

WORKDIR /app
COPY package*.json ./
# Install production dependencies, then hand the app directory to the app user
RUN npm ci --omit=dev && \
    chown -R appuser:appuser /app
COPY --chown=appuser:appuser . .

USER appuser
EXPOSE 3000
CMD ["node", "server.js"]

If you are using Alpine-based images (node:20-alpine), the commands are different because Alpine uses busybox:

FROM node:20-alpine

# Remove the built-in "node" account (uid/gid 1000) before reusing its ids
RUN deluser node && \
    (delgroup node 2>/dev/null || true) && \
    addgroup -S -g 1000 appuser && \
    adduser -S -u 1000 -G appuser appuser

WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev && \
    chown -R appuser:appuser /app
COPY --chown=appuser:appuser . .

USER appuser
EXPOSE 3000
CMD ["node", "server.js"]

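The uid 1000 is conventional rather than required; the official node images already assign it to their built-in node user. Any uid of 1000 or above works, as long as it matches the user: and runAsUser settings you deploy with. Do not use uids below 1000, which distributions conventionally reserve for system accounts, for application processes.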

What breaks when you switch to a non-root user?

Anything that writes to filesystem paths controlled by root. The most common issues:

  • Log files written to /var/log. Your application cannot create files there. Write logs to stdout/stderr (which you should be doing anyway for containerized apps) or to a directory under /app that has the right ownership.
  • Socket files in /var/run. If you use Unix domain sockets, create the socket in a directory owned by the app user.
  • npm install with lifecycle scripts. Some npm packages run postinstall scripts that need to write to protected paths. If you npm install as the appuser, those scripts fail. Always run npm ci during the build (as root or with a temporary build user) and copy the result.
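The first bullet is the easiest to act on in application code. A minimal sketch of a logger that writes structured lines to stdout instead of a file under /var/log; the helper names formatLogLine and log and the field layout are hypothetical, not a specific library's API:

```javascript
// Hypothetical helper: format one structured log line as JSON.
function formatLogLine(level, message, fields = {}) {
  return JSON.stringify({
    time: new Date().toISOString(),
    level,
    message,
    ...fields,
  });
}

// Write to stdout; the container runtime collects the stream,
// so the process needs no writable log directory.
function log(level, message, fields) {
  process.stdout.write(formatLogLine(level, message, fields) + '\n');
}

log('info', 'server started', { port: 3000 });
```

Because nothing touches the filesystem, the same logger keeps working once the root filesystem is mounted read-only.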

Once you switch to a non-root user and drop all capabilities, your container is dramatically harder to exploit.
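It can help to have the application confirm this at startup. The sketch below checks the uid and the effective capability mask and reports anything that looks unhardened. The helper names parseCapEff and checkPrivileges are hypothetical, and it assumes a Linux /proc filesystem:

```javascript
const fs = require('node:fs');

// Hypothetical helper: extract the CapEff mask from /proc/<pid>/status text.
function parseCapEff(statusText) {
  const match = statusText.match(/^CapEff:\s*([0-9a-fA-F]+)$/m);
  return match ? BigInt('0x' + match[1]) : null;
}

// Hypothetical helper: return a list of problems; an empty list means
// the process is non-root and holds no effective capabilities.
function checkPrivileges(uid, statusText) {
  const problems = [];
  if (uid === 0) problems.push('process is running as root (uid 0)');
  const capEff = parseCapEff(statusText);
  if (capEff !== null && capEff !== 0n) {
    problems.push(`effective capabilities are not empty: 0x${capEff.toString(16)}`);
  }
  return problems;
}

// Only meaningful on Linux, where getuid() and /proc exist.
if (typeof process.getuid === 'function' && fs.existsSync('/proc/self/status')) {
  const status = fs.readFileSync('/proc/self/status', 'utf8');
  for (const problem of checkPrivileges(process.getuid(), status)) {
    console.warn(`hardening check: ${problem}`);
  }
}
```

This version only warns. Whether to exit instead is a policy choice; failing fast in production and warning in development is one reasonable split.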

Step 3: Mount the root filesystem read-only

A read-only root filesystem means the process cannot write to any path on the root filesystem, period. Combined with a non-root user, this closes the entire class of binary-overwrite and configuration-tampering attacks.

Docker:

docker run --read-only --tmpfs /tmp --tmpfs /app/data my-app

Docker Compose:

services:
  app:
    build: .
    read_only: true
    tmpfs:
      - /tmp
      - /app/data
    cap_drop:
      - ALL

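Kubernetes (pod-level and container-level securityContext; readOnlyRootFilesystem is only valid on the container):

spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
  containers:
    - name: app
      securityContext:
        readOnlyRootFilesystem: true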

The --read-only flag makes the container’s union filesystem immutable. Node.js writes to /tmp and /app/data are redirected to an in-memory tmpfs. No files survive a container restart, which is fine because containers are ephemeral.

What needs a writable path that is not /tmp?

Node.js itself writes to a few paths at runtime:

  • V8 compilation cache. Node.js 20 does not persist compiled bytecode to disk by default. Newer releases offer an opt-in on-disk compile cache (NODE_COMPILE_CACHE); if you enable it, point it at a writable path such as a directory under /tmp.
  • npm cache. If your application runs npm commands at runtime (which it should not), the npm cache directory needs to be writable. Run npm config set cache /tmp/.npm in your Dockerfile.
  • Temporary files. Libraries like sharp (image processing), puppeteer (headless Chrome), and node-gyp (native compilation) write to /tmp. As long as /tmp is mounted as tmpfs, they work fine.
  • Upload directories. If your application accepts file uploads, the upload destination must be a tmpfs mount or a PersistentVolumeClaim in Kubernetes.

The rule is simple: everything under / is read-only. Anything that needs writes goes to /tmp or a named volume.
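In application code, that rule translates into never assuming a path is writable. A minimal sketch that probes a directory and creates a private scratch directory under it; isWritable and makeScratchDir are hypothetical helper names, and os.tmpdir() is assumed to resolve to the tmpfs mount at /tmp:

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: true if the process may create files in dir.
function isWritable(dir) {
  try {
    fs.accessSync(dir, fs.constants.W_OK);
    return true;
  } catch {
    return false; // typically EROFS, EACCES, or ENOENT
  }
}

// Hypothetical helper: create a private scratch directory under a writable base.
function makeScratchDir(base = os.tmpdir()) {
  if (!isWritable(base)) {
    throw new Error(`${base} is not writable; mount it as tmpfs or a volume`);
  }
  return fs.mkdtempSync(path.join(base, 'app-'));
}

const scratch = makeScratchDir();
console.log(`scratch directory: ${scratch}`);
fs.rmSync(scratch, { recursive: true });
```

Running the probe at startup turns a missing tmpfs mount into an immediate, readable error rather than a failed upload hours later.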

Step 4: Apply a seccomp profile

seccomp (secure computing mode) restricts the system calls a process can make. Docker ships with a default seccomp profile that blocks around 44 of the 300+ available syscalls (such as mount, reboot, and swapon). That default is deliberately permissive so that most applications run without issues. You can tighten it.

A custom seccomp profile for a Node.js application should block syscalls that are never used by a JavaScript runtime: mount, umount2, ptrace, perf_event_open, bpf, kexec_file_load, swapon, swapoff, create_module, init_module, finit_module, delete_module.

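Here is an allowlist-style seccomp profile that is stricter than the Docker default: anything not listed fails with an error. Treat it as a starting point, run your test suite under it, and add any syscall your dependencies turn out to need: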

{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_AARCH64"],
  "syscalls": [
    {
      "names": [
        "accept", "accept4", "access", "arch_prctl", "bind",
        "brk", "capget", "capset", "chdir", "chmod", "chown",
        "clock_getres", "clock_gettime", "clock_nanosleep",
        "clone", "clone3", "close", "connect", "copy_file_range",
        "creat", "dup", "dup2", "dup3", "epoll_create1",
        "epoll_ctl", "epoll_pwait", "eventfd2", "execve",
        "exit", "exit_group", "faccessat2", "fadvise64",
        "fallocate", "fchdir", "fchmod", "fchmodat", "fchown",
        "fchownat", "fcntl", "fdatasync", "fgetxattr",
        "flistxattr", "flock", "fork", "fremovexattr",
        "fsetxattr", "fstat", "fstatfs", "fsync", "ftruncate",
        "futex", "getcwd", "getdents64", "getegid", "geteuid",
        "getgid", "getpeername", "getpgid", "getpgrp",
        "getpid", "getppid", "getpriority", "getrandom",
        "getresgid", "getresuid", "getrlimit", "getrusage",
        "getsockname", "getsockopt", "gettid", "gettimeofday",
        "getuid", "getxattr", "inotify_add_watch",
        "inotify_init1", "inotify_rm_watch", "ioctl",
        "ioprio_get", "ioprio_set", "kcmp", "kill",
        "lgetxattr", "link", "linkat", "listen", "listxattr",
        "llistxattr", "lremovexattr", "lseek", "lsetxattr",
        "lstat", "madvise", "mbind", "memfd_create",
        "membarrier", "mincore", "mkdir", "mkdirat",
        "mlock", "mlock2", "mmap", "mprotect", "mremap",
        "msgctl", "msgget", "msgrcv", "msgsnd",
        "msync", "munlock", "munmap", "name_to_handle_at",
        "nanosleep", "newfstatat", "open", "openat",
        "openat2", "pause", "pidfd_getfd", "pidfd_open",
        "pidfd_send_signal", "pipe", "pipe2", "poll",
        "ppoll", "prctl", "pread64", "preadv", "preadv2",
        "prlimit64", "process_vm_readv", "pselect6",
        "pwrite64", "pwritev", "pwritev2", "read",
        "readlink", "readlinkat", "readv", "recvfrom",
        "recvmmsg", "recvmsg", "rename", "renameat",
        "renameat2", "restart_syscall", "rmdir", "rseq",
        "rt_sigaction", "rt_sigpending", "rt_sigprocmask",
        "rt_sigqueueinfo", "rt_sigreturn", "rt_sigsuspend",
        "rt_sigtimedwait", "sched_getaffinity",
        "sched_getattr", "sched_getparam", "sched_getscheduler",
        "sched_rr_get_interval", "sched_setaffinity",
        "sched_setattr", "sched_setparam", "sched_setscheduler",
        "sched_yield", "seccomp", "select", "semctl",
        "semget", "semop", "semtimedop", "sendfile",
        "sendmmsg", "sendmsg", "sendto",
        "set_robust_list", "set_tid_address",
        "setgid", "setgroups", "setitimer",
        "setpgid", "setpriority", "setregid", "setresgid",
        "setresuid", "setreuid", "setrlimit", "setsid",
        "setsockopt", "setuid", "shmctl", "shmdt",
        "shmget", "shutdown", "sigaltstack", "signalfd4",
        "socket", "socketpair", "splice", "stat", "statfs",
        "statx", "symlink", "symlinkat", "sync",
        "sync_file_range", "sysinfo", "tee", "tgkill",
        "time", "timer_create", "timer_delete",
        "timer_getoverrun", "timer_gettime", "timer_settime",
        "timerfd_create", "timerfd_gettime", "timerfd_settime",
        "tkill", "truncate", "umask", "uname", "unlink",
        "unlinkat", "utimensat", "utimes",
        "vfork", "vmsplice", "wait4", "waitid", "write",
        "writev"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}

Save this as node-seccomp.json and apply it:

docker run --security-opt seccomp=node-seccomp.json my-app
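A long allowlist is easy to get wrong, so it is worth linting the profile in CI before shipping it. The sketch below flags any allowed syscall that appears on the never-needed list from earlier in this section. The lintProfile helper is hypothetical, and the inline sample stands in for the parsed contents of node-seccomp.json:

```javascript
// Syscalls this section says a JavaScript runtime should never need.
const FORBIDDEN = [
  'mount', 'umount2', 'ptrace', 'perf_event_open', 'bpf',
  'kexec_file_load', 'swapon', 'swapoff', 'create_module',
  'init_module', 'finit_module', 'delete_module',
];

// Hypothetical helper: list the problems found in a parsed seccomp profile.
function lintProfile(profile, forbidden = FORBIDDEN) {
  const problems = [];
  if (profile.defaultAction === 'SCMP_ACT_ALLOW') {
    problems.push('defaultAction allows every syscall that is not listed');
  }
  for (const rule of profile.syscalls ?? []) {
    if (rule.action !== 'SCMP_ACT_ALLOW') continue;
    for (const name of rule.names ?? []) {
      if (forbidden.includes(name)) {
        problems.push(`forbidden syscall allowed: ${name}`);
      }
    }
  }
  return problems;
}

// Inline sample; in CI you would JSON.parse the real profile file instead.
const sample = {
  defaultAction: 'SCMP_ACT_ERRNO',
  syscalls: [{ names: ['read', 'write', 'mount'], action: 'SCMP_ACT_ALLOW' }],
};
console.log(lintProfile(sample)); // [ 'forbidden syscall allowed: mount' ]
```

Failing the build when the returned list is non-empty keeps the profile from drifting as syscalls get added over time.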

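In Kubernetes, a custom profile is referenced from the securityContext: set seccompProfile.type to Localhost and localhostProfile to the profile's path relative to the kubelet's seccomp directory, which means the JSON file has to be distributed to every node. The simplest approach is to use the runtime's default profile (type: RuntimeDefault) and tighten capabilities instead, since custom seccomp profiles are harder to manage across a cluster.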

Putting it all together: the hardened Dockerfile

Here is the complete hardened Dockerfile that combines every technique above:

FROM node:20-slim AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

FROM node:20-slim
# Install tini while still root; apt-get cannot run after USER appuser
RUN apt-get update && apt-get install -y --no-install-recommends tini && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

# The official image already ships a "node" user and group at uid/gid 1000,
# so rename them instead of creating a second account with the same ids
RUN groupmod --new-name appuser node && \
    usermod --login appuser --home /home/appuser --move-home node

WORKDIR /app
COPY --from=builder --chown=appuser:appuser /app/node_modules ./node_modules
COPY --chown=appuser:appuser . .

USER appuser
EXPOSE 3000

# Use tini for proper signal handling
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["node", "server.js"]

And the corresponding docker-compose.yml:

version: '3.8'
services:
  app:
    build: .
    user: "1000:1000"
    cap_drop:
      - ALL
    cap_add: []
    read_only: true
    tmpfs:
      - /tmp
      - /app/data
    security_opt:
      - no-new-privileges:true
      - seccomp:node-seccomp.json
    ports:
      - "3000:3000"

Kubernetes Pod Security Context

In Kubernetes, all of these settings go into the Pod spec:

apiVersion: v1
kind: Pod
metadata:
  name: node-app
  labels:
    app: node-app
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: my-app:latest
      ports:
        - containerPort: 3000
      securityContext:
        allowPrivilegeEscalation: false
        privileged: false
        readOnlyRootFilesystem: true
        capabilities:
          drop:
            - ALL
          add: []
        runAsNonRoot: true
        runAsUser: 1000
      volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: data
          mountPath: /app/data
  volumes:
    - name: tmp
      emptyDir:
        medium: Memory
    - name: data
      emptyDir:
        medium: Memory

The allowPrivilegeEscalation: false flag is critical. It sets the no_new_privs bit on the container process, which prevents it and its children from gaining additional privileges by executing setuid binaries or executables with file capabilities. Combined with runAsNonRoot: true, this means that even if an attacker manages to plant a setuid-root binary, the kernel will not elevate the process when it executes that binary.
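The kernel reports this bit as the NoNewPrivs field in /proc/self/status, so the application can confirm it from inside the container. A minimal sketch, assuming a Linux /proc filesystem; the readStatusField helper is a hypothetical name:

```javascript
const fs = require('node:fs');

// Hypothetical helper: read one "Key:<tab>value" field from
// the text of /proc/<pid>/status.
function readStatusField(statusText, key) {
  for (const line of statusText.split('\n')) {
    const idx = line.indexOf(':');
    if (idx !== -1 && line.slice(0, idx) === key) {
      return line.slice(idx + 1).trim();
    }
  }
  return null;
}

if (fs.existsSync('/proc/self/status')) {
  const status = fs.readFileSync('/proc/self/status', 'utf8');
  // "1" means no_new_privs is set for this process; "0" means it is not.
  console.log(`NoNewPrivs: ${readStatusField(status, 'NoNewPrivs')}`);
}
```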

Testing that the hardening is actually enforced

A quick smoke test to verify your container is not running as root:

# Verify the user inside the container
docker run --rm --cap-drop=ALL --read-only my-app id
# Expected output: uid=1000(appuser) gid=1000(appuser) groups=1000(appuser)

# Verify you cannot write anywhere outside /tmp
docker run --rm --cap-drop=ALL --read-only my-app touch /test.txt
# Expected output: touch: cannot touch '/test.txt': Read-only file system

# Verify no-new-privileges is set on the container process
docker run --rm --security-opt no-new-privileges:true my-app \
  grep NoNewPrivs /proc/self/status
# Expected output: NoNewPrivs: 1

In your CI pipeline, add a step that runs these checks after the image build:

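# GitHub Actions
- name: Security smoke test
  run: |
    # Node must start with a read-only root FS and no capabilities
    docker run --rm --read-only --cap-drop=ALL \
      my-app node -e "process.exit(0)"
    # The image must not default to root
    test "$(docker run --rm my-app id -u)" != "0"
    echo "Container runs as non-root with read-only root FS and dropped capabilities"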

What about npm audit and image scanning?

The container hardening in this post is about runtime security: what happens after the container starts. It is complementary to image-level scanning (Trivy, Grype, Snyk) that checks for known CVEs in your base image and dependencies. You need both.

A container that passes every CVE scan can still be exploited if the process runs as root with too many capabilities. And a hardened container running as non-root with read-only filesystem can still be exploited if a dependency has a deserialization vulnerability. Layer the defenses.

A note from Yojji

Container security is easy to defer until after a breach, and nearly impossible to retrofit without breaking something if you do not plan for it from the start. The four layers covered here (non-root user, capability drops, read-only filesystem, seccomp) cost nothing to implement and require no architectural changes if applied during initial setup. Yojji is an international custom software development company founded in 2016, with offices in Europe, the US, and the UK, and their teams regularly design and deploy Node.js services on AWS, Azure, and Google Cloud with the kind of security-first container posture that makes platform engineers breathe a little easier during incident calls.