Your Node.js Server is Using Just One CPU. Here's How to Fix It.
Blackwatch · 2026-05-24 · via DEV Community

CLUSTERING

You have built your Node application and deployed it on an 8 vCPU instance. Everything works fine, but without realizing it you are not using the full potential of the deployment. Node.js runs your JavaScript on a SINGLE THREAD, which means one Node process keeps only one vCPU busy at a time. You are paying for 8 vCPUs, and the other 7 are sitting there mostly idle.

The solution is CLUSTERING: running multiple instances of the application, where each one is an independent process but all of them listen on the same port. Now the question is: how does this work? Won't the instances get in each other's way? The short answer is no.

HOW IT WORKS

When clustering is done, we end up with multiple processes. There are two kinds:

  1. Primary – There is exactly one primary. It spins up the worker processes, manages them, and respawns any that die. The primary only manages; it doesn't run the application code (connecting to the DB, starting the server, etc.). Note – if the primary goes down, the entire cluster goes down with it.
  2. Workers – These are the actual instances where the application runs, and they are the ones that serve users.

Key Facts

  1. Since there are 8 vCPUs in our case, there will be 9 total processes — 1 primary + 8 workers.
  2. Each worker has its own memory – nothing is shared among workers.
  3. Workers share a single port – connections are distributed across them.
  4. Primary is intentionally dumb – it never runs application code or connects to the DB.
  5. Workers can't see their siblings — for each worker, only itself exists.

CODE SNIPPET

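import cluster from "node:cluster";
import os from "node:os";
import { createServer } from "node:http";
import app from "./src/app";
import { connectDB } from "./src/config/database";

const PORT = process.env.PORT || 3000;
// Clustering is switched on only when NODE_ENV is "development"; per the
// explanation below, production setups usually leave this to a tool like pm2.
const enableCluster = process.env.NODE_ENV === "development";

if (enableCluster && cluster.isPrimary) {
  // Primary: fork one worker per logical CPU core.
  const numWorkers = os.cpus().length;
  for (let i = 0; i < numWorkers; i++) cluster.fork();

  // Replace any worker that exits so the pool stays at full size.
  cluster.on("exit", (worker) => {
    console.log(`worker ${worker.process.pid} died, respawning`);
    cluster.fork();
  });
} else {
  // Worker (or clustering disabled): connect to the DB, then start the server.
  const httpServer = createServer(app);
  connectDB().then(() => httpServer.listen(PORT));
}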
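
One thing to watch in the exit handler above: it re-forks unconditionally, so a worker that crashes on startup (say, the DB is unreachable) would be respawned in a tight loop. Here is a minimal sketch of a guard against that. The function name `shouldRespawn` and its `maxCrashes` / `windowMs` options are my own hypothetical names, not part of Node's cluster API.

```javascript
// Hypothetical guard: allow a respawn only while recent crashes stay under a limit.
// crashTimes: timestamps (ms) of earlier worker exits; now: current timestamp (ms).
function shouldRespawn(crashTimes, now, { maxCrashes = 5, windowMs = 60_000 } = {}) {
  // Count only the crashes inside the sliding window that ends at `now`.
  const recent = crashTimes.filter((t) => now - t < windowMs);
  return recent.length < maxCrashes;
}

// Three crashes within the window and a limit of 3: stop respawning.
console.log(shouldRespawn([1_000, 2_000, 3_000], 4_000, { maxCrashes: 3 })); // false
// One old crash that has fallen out of the window: respawning is fine.
console.log(shouldRespawn([1_000], 120_000, { maxCrashes: 3 })); // true
```

In the exit handler you would record `Date.now()` for each exit and call `cluster.fork()` only when the guard returns true.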


EXPLANATION OF CODE
In production we generally use a process manager like pm2 to handle clustering, but here we are doing it with Node's built-in modules. For that, we first need the cluster and os modules.
Then we check whether the current process is the primary. If it is, we spawn one worker per available core. The count is not hard-coded and we can change it as needed, but going above the number of cores/vCPUs rarely helps, because the extra workers just compete for the same CPUs. If it isn't the primary (meaning we are already inside a worker), we run the actual backend code: connecting to the DB and starting the server. On our 8 vCPU instance, that gives us 8 worker instances up and running (plus the primary watching over them).
Each worker is a separate OS process, so its process ID (worker.process.pid from the primary, process.pid inside the worker) uniquely identifies it.
Note – this ID, and whatever happens inside an instance, stays inside that instance. Other instances can't access its data or memory.
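
The sizing rule from the explanation can be sketched as a small helper. This is only an illustration: `resolveWorkerCount` and its `requested` argument are hypothetical names (the value could come from an env var of your choosing), not something the snippet above or Node itself provides.

```javascript
// Hypothetical helper: decide how many workers to fork.
// requested: a user-supplied count (string or undefined); cores: e.g. os.cpus().length.
function resolveWorkerCount(requested, cores) {
  const n = Number.parseInt(requested, 10);
  // No usable request: default to one worker per core.
  if (!Number.isInteger(n) || n < 1) return cores;
  // Cap at the core count so workers do not compete for the same CPU.
  return Math.min(n, cores);
}

console.log(resolveWorkerCount("4", 8)); // 4
console.log(resolveWorkerCount("16", 8)); // 8
console.log(resolveWorkerCount(undefined, 8)); // 8
```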

PROS/CONS

Pros:

  1. Uses all CPU cores
  2. Crash isolation
  3. Built into Node
  4. Higher throughput, especially for CPU-bound work
  5. Auto-respawns dead workers

Cons:

  1. Each worker has its own RAM (no shared state)
  2. In-memory caches/sessions break silently
  3. WebSockets/SSE need extra infrastructure
  4. Harder to debug – 'which worker logged that?'
  5. Primary crash = whole cluster dies

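Note – Node distributes incoming connections round-robin by default on every platform except Windows; on Windows, workers accept connections directly and the OS decides which worker gets each one.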
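
Round-robin is easy to picture with a toy model. This is a simplified sketch and not Node's internal scheduler; `createRoundRobin` and the worker labels are made up for illustration.

```javascript
// Toy model of round-robin: each new connection goes to the next worker in turn.
function createRoundRobin(workers) {
  let next = 0;
  return function assign() {
    const worker = workers[next];
    next = (next + 1) % workers.length; // wrap around after the last worker
    return worker;
  };
}

const assign = createRoundRobin(["w1", "w2", "w3"]);
const order = [assign(), assign(), assign(), assign()];
console.log(order.join(" ")); // w1 w2 w3 w1
```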

BIG CAVEAT

This much is enough for simple clustering or for learning purposes, as long as the app is stateless (for example, REST APIs backed by a DB).

In this case, the DB is the source of truth. Workers don't need to know about each other. Any worker can serve any request.

STATEFUL connections (WebSocket)
Prerequisite — knowledge of websockets.

Now things change. Once a connection is established and the HTTP request is upgraded to a WebSocket, the socket connection details (which user is on which socket) are stored in memory, inside that worker. So if User A connects through Worker 1 and User B connects through Worker 2, both are logged in and both users' data is stored in the DB. But the live sockets sit on different workers. Now when A sends a message to B, Worker 1 tries to push it to B's socket — but B's socket lives in Worker 2's memory, not Worker 1's. So the message gets saved to the DB, but real-time delivery to B fails.
Also, workers have no direct channel to each other (the built-in IPC only links each worker to the primary), so Worker 1 can't simply ask a sibling "do you have this user with you?"
A TCP socket lives inside one process.
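
The failure described above can be reproduced with a small in-memory model. This is a sketch under loud assumptions: `FakeWorker` is a hypothetical class standing in for a worker process, and its Map stands in for that process's private memory. No real sockets or cluster APIs are involved.

```javascript
// Each FakeWorker keeps its own socket table, like a separate process's memory.
class FakeWorker {
  constructor(name) {
    this.name = name;
    this.sockets = new Map(); // userId -> messages pushed to that user's socket
  }
  connect(userId) {
    this.sockets.set(userId, []);
  }
  // Returns true only if the recipient's socket lives in THIS worker.
  deliver(toUserId, message) {
    const inbox = this.sockets.get(toUserId);
    if (!inbox) return false; // the socket is in another worker's memory
    inbox.push(message);
    return true;
  }
}

const worker1 = new FakeWorker("worker-1");
const worker2 = new FakeWorker("worker-2");
worker1.connect("A");
worker2.connect("B");

// A's message is handled by worker 1, which has no socket for B.
const delivered = worker1.deliver("B", "hi from A");
console.log(delivered); // false
```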

STICKY Session
Imagine a user lands on Worker A and creates a socket connection. Details regarding the session are stored in Worker A's memory. Somehow, on the next request, the user is shifted to Worker B. Now the user tries to continue the conversation. The worker checks if this session exists or not, but there is no record of it in Worker B (that detail lives in Worker A). So the interaction fails.

To make it easier to picture, here are two ways to think about it:

Analogy 1 (hotel front desk) — You check into Hotel A. The front desk writes your name against Room 204. Later, you walk into Hotel B and ask for your room key. Hotel B has no idea who you are, because your check-in details only exist at Hotel A's front desk.

Analogy 2 (locker at a station) — You drop your bag at locker #5 in Station A and get a ticket. Later, you go to Station B and try to use the same ticket. Station B has no locker matching that ticket, because the bag is sitting back in Station A.

To mitigate this issue, we need Sticky Sessions. It ensures that a user stays on a single worker only — pinning all of one client's requests to the same worker.
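
One common way to pin a client is to hash something stable about it, such as its IP address, to a worker index. The sketch below assumes that approach; the article does not say which method to use, and `pickWorker` is a hypothetical helper, not a library function.

```javascript
// Hypothetical sticky routing: hash the client address to a fixed worker index.
function pickWorker(clientAddress, workerCount) {
  let hash = 0;
  for (const ch of clientAddress) {
    // Simple 32-bit rolling hash; any stable hash would do.
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return hash % workerCount;
}

// The same address always maps to the same worker.
console.log(pickWorker("203.0.113.7", 8) === pickWorker("203.0.113.7", 8)); // true
```

The trade-off with IP-based pinning is that many clients behind one shared address all land on the same worker.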

One more thing worth knowing: by default, a Socket.IO client starts with HTTP long-polling, so establishing a connection takes several HTTP requests before it upgrades to WebSocket. Without stickiness, those requests can scatter across different workers, and the connection never even gets established. So sticky sessions are needed not just after the user is connected, but during the initial connection itself.

REDIS ADAPTER for SOCKET
Even with stickiness, workers still can't communicate with each other, so User A on Worker 1 has no way to push a message to User B sitting on Worker 2. This is a major issue for any application using sockets or real-time communication. Adapters solve it; one of them is the Redis adapter for Socket.IO, a coordination layer built on Redis pub/sub. With it in place, when Worker 1 emits a message, the adapter publishes that emit to a shared bus (Redis). Every worker is subscribed to this bus, so every worker receives the emit, and the one that actually owns B's socket delivers it locally. The application then behaves just like one running on a single instance.
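
The pub/sub idea can be sketched without Redis at all. In the model below, `FakeBus` is a hypothetical in-memory stand-in for the shared Redis bus and `BusWorker` stands in for a worker process; neither is the real adapter API, they only show the message flow.

```javascript
// In-memory stand-in for Redis pub/sub: every subscriber sees every published event.
class FakeBus {
  constructor() {
    this.subscribers = [];
  }
  subscribe(handler) {
    this.subscribers.push(handler);
  }
  publish(event) {
    for (const handler of this.subscribers) handler(event);
  }
}

class BusWorker {
  constructor(name, bus) {
    this.name = name;
    this.bus = bus;
    this.sockets = new Map(); // userId -> messages delivered to that user's socket
    // Each worker listens on the bus and delivers only to sockets it owns.
    bus.subscribe(({ to, message }) => {
      const inbox = this.sockets.get(to);
      if (inbox) inbox.push(message);
    });
  }
  connect(userId) {
    this.sockets.set(userId, []);
  }
  // Emitting publishes to the bus instead of looking only in local memory.
  emit(to, message) {
    this.bus.publish({ to, message });
  }
}

const bus = new FakeBus();
const w1 = new BusWorker("worker-1", bus);
const w2 = new BusWorker("worker-2", bus);
w1.connect("A");
w2.connect("B");

w1.emit("B", "hi from A");
console.log(w2.sockets.get("B").length); // 1
```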

STICKY + ADAPTER
The two solve different problems, and you actually need both together.

  • Sticky sessions make sure a user's requests always land on the same worker, so the connection (and the handshake) never breaks mid-way.
  • The Redis adapter makes sure that when a worker needs to push a message to a user sitting on a different worker, the message can still reach them through the shared pub/sub bus.

Sticky alone: your user stays connected, but messages between users on different workers still don't arrive. Adapter alone: workers can broadcast to each other, but the initial connection can keep failing whenever the handshake requests are split across workers. Together: your clustered app behaves like a single instance from the user's perspective.

TL;DR
Node runs JavaScript on a single thread. Clustering spawns one worker per core. REST scales for free because the DB is shared. Sockets don't, because each connection lives in one worker's RAM. Fix with sticky sessions (so handshakes complete) plus a pub/sub adapter (so workers can deliver each other's messages).

That sums up basic clustering in a Node.js application.
Thanks for reading.