Principal Components in TypeScript (Part 4)
bitanath · 2026-05-25 · via DEV Community

This is part four of the series Principal Components in TypeScript. It focuses on applying PCA to derive named insights from data.

TL;DR

If you need a TL;DR, just read the code or grab the package here on npm


Not a Code Blog

This is not a code blog. There’s no easy copy-paste solution here.
If that’s what you want, go straight to the source code above.


This post takes PCA in a totally different direction from the vanilla dimensionality reduction and data compression discussed in Parts 1 and 2. Part 3 used it for neural network explanation. Part 4 uses it for something else entirely: attributing causation to data through factor analysis.


Remember how PCA collapses data with 100 dimensions into a single dimension? Wouldn't it be cool if that dimension were interpretable? For example, say the 100 columns were things like stress, smoking frequency, alcohol in ml, and so on. You see where I am going with this: the final dimension would be something like cardiac arrest or premature demise. On that cheery note, let's figure out how PCA can actually be used to label this reduced dimension.

Wait really? Can it?

Nope, not PCA, but the starting point remains the same: SVD. We still decompose the data to get a bunch of eigenvectors and then use them to get a single factor or a combination of factors.


From SVD to named factors

The standard PCA pipeline goes: center data → covariance → eigendecomposition. That gives you eigenvectors that are abstract math objects. To get something you can name, you need to go one step further — compute how each original variable correlates with each factor score.
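For reference, the standard pipeline mentioned above can be sketched without any SVD at all. This is a rough illustration only: `centerColumns`, `covarianceMatrix` and `leadingEigenvector` are hypothetical helper names, and power iteration is swapped in as a simple stand-in for a full eigendecomposition (it only finds the first eigenvector).

```typescript
// Subtract each column's mean so every column averages to 0.
function centerColumns(data: number[][]): number[][] {
  const m = data.length;
  const n = data[0].length;
  const means = new Array<number>(n).fill(0);
  for (const row of data) for (let j = 0; j < n; j++) means[j] += row[j] / m;
  return data.map(row => row.map((x, j) => x - means[j]));
}

// Population covariance matrix (divide by m) of already-centered data.
function covarianceMatrix(centered: number[][]): number[][] {
  const m = centered.length;
  const n = centered[0].length;
  const cov = Array.from({ length: n }, () => new Array<number>(n).fill(0));
  for (const row of centered)
    for (let a = 0; a < n; a++)
      for (let b = 0; b < n; b++) cov[a][b] += (row[a] * row[b]) / m;
  return cov;
}

// Power iteration: multiply by the matrix and renormalize until the vector settles.
// Assumes the start vector is not orthogonal to the leading eigenvector.
function leadingEigenvector(matrix: number[][], iterations = 200): number[] {
  const n = matrix.length;
  let v = new Array<number>(n).fill(1 / Math.sqrt(n));
  for (let it = 0; it < iterations; it++) {
    const next = matrix.map(row => row.reduce((acc, x, k) => acc + x * v[k], 0));
    const norm = Math.sqrt(next.reduce((acc, x) => acc + x * x, 0));
    if (norm === 0) break;
    v = next.map(x => x / norm);
  }
  return v;
}

// Toy data: the second column is exactly twice the first.
const toy = [[1, 2], [2, 4], [3, 6], [4, 8]];
const pc1 = leadingEigenvector(covarianceMatrix(centerColumns(toy)));
console.log(pc1.map(x => x.toFixed(3)));
```

For the toy data the first direction comes out proportional to (1, 2), i.e. roughly (0.447, 0.894). Those numbers are unit-scaled weights, not correlations, which is exactly the gap the rest of this post fills.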

Here's the entire pipeline in one exported function plus two small helpers, building directly on SVD:

import { svd } from "./svd";
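// loadings[variable][factor], scores[observation][factor], variance[factor] (share of total variance).
interface FactorResult {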
  loadings: number[][];
  scores: number[][];
  variance: number[];
}
// Convert each column to z-scores (mean 0, population standard deviation 1).
function standardize(data: number[][]): number[][] {
  const m = data.length;
  const n = data[0].length;
  const means = new Array(n).fill(0);
  const stds = new Array(n).fill(0);
  const standardized: number[][] = Array.from({ length: m }, () => new Array(n));
  for (let j = 0; j < n; j++) {
    let sum = 0;
    for (let i = 0; i < m; i++) sum += data[i][j];
    means[j] = sum / m;
  }
  for (let j = 0; j < n; j++) {
    let sumSq = 0;
    for (let i = 0; i < m; i++) {
      const diff = data[i][j] - means[j];
      sumSq += diff * diff;
    }
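    // Population standard deviation (divide by m); fall back to 1 for a constant column to avoid dividing by zero.
    stds[j] = Math.sqrt(sumSq / m) || 1;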
  }
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      standardized[i][j] = (data[i][j] - means[j]) / stds[j];
    }
  }
  return standardized;
}
// Pearson correlation of two equal-length series; returns 0 when the denominator is zero.
function correlation(x: number[], y: number[]): number {
  const n = x.length;
  let sumX = 0, sumY = 0, sumXY = 0, sumX2 = 0, sumY2 = 0;
  for (let i = 0; i < n; i++) {
    sumX += x[i]; sumY += y[i];
    sumXY += x[i] * y[i]; sumX2 += x[i] * x[i]; sumY2 += y[i] * y[i];
  }
  const num = n * sumXY - sumX * sumY;
  const den = Math.sqrt((n * sumX2 - sumX * sumX) * (n * sumY2 - sumY * sumY));
  return den === 0 ? 0 : num / den;
}
// Full pipeline: standardize, SVD, factor scores, loadings, sign alignment, variance share.
export function factor(data: number[][]): FactorResult {
  const standardized = standardize(data);
  const result = svd(standardized);
  const n = data[0].length;
  const m = data.length;
  // Factor scores = standardized · V  (equiv. to U · Σ)
  const factorScores: number[][] = Array.from({ length: m }, () => new Array(n));
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let sum = 0;
      for (let k = 0; k < n; k++)
        sum += standardized[i][k] * result.V[k][j];
      factorScores[i][j] = sum;
    }
  }
  // Loadings = correlation between each variable and each factor score
  const loadings: number[][] = Array.from({ length: n }, () => new Array(n));
  for (let varIdx = 0; varIdx < n; varIdx++) {
    const variableCol = standardized.map(row => row[varIdx]);
    for (let factorIdx = 0; factorIdx < n; factorIdx++) {
      const factorCol = factorScores.map(row => row[factorIdx]);
      loadings[varIdx][factorIdx] = correlation(variableCol, factorCol);
    }
  }
  // Sign consistency
  for (let j = 0; j < n; j++) {
    let sum = 0;
    for (let i = 0; i < n; i++) sum += loadings[i][j];
    const sign = Math.abs(sum) < 1e-10 ? 1 : (sum < 0 ? -1 : 1);
    for (let i = 0; i < n; i++) loadings[i][j] *= sign;
    for (let i = 0; i < m; i++) factorScores[i][j] *= sign;
  }
  // Share of total variance per factor: the mean squared loading in that column
  const variance = new Array(n);
  for (let j = 0; j < n; j++) {
    let sumSq = 0;
    for (let i = 0; i < n; i++) sumSq += loadings[i][j] * loadings[i][j];
    variance[j] = sumSq / n;
  }
  return { loadings, scores: factorScores, variance };
}



What's happening here

Three things, and they're all important.

First, we standardize the data — z-scores, mean 0, variance 1. This puts every variable on the same scale so the SVD isn't biased by units (millilitres vs cigarettes vs hours of sleep).
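To see why the units matter, here is a small sketch using two columns (stress and alcohol) from the sample rows used later in this post. `columnVariance` and `zScoreColumn` are hypothetical helpers written just for this illustration; they use the same divide-by-m convention as standardize().

```typescript
// Population variance (divide by m) of column j.
function columnVariance(data: number[][], j: number): number {
  const m = data.length;
  const mean = data.reduce((acc, row) => acc + row[j], 0) / m;
  return data.reduce((acc, row) => acc + (row[j] - mean) * (row[j] - mean), 0) / m;
}

// z-scores of column j; a constant column falls back to std = 1 to avoid dividing by zero.
function zScoreColumn(data: number[][], j: number): number[] {
  const m = data.length;
  const mean = data.reduce((acc, row) => acc + row[j], 0) / m;
  const std = Math.sqrt(columnVariance(data, j)) || 1;
  return data.map(row => (row[j] - mean) / std);
}

// Columns: stress, alcohol (values copied from the post's sample rows).
const sample = [[8, 200], [6, 150], [9, 300], [3, 50], [7, 180]];
const rawVariances = [columnVariance(sample, 0), columnVariance(sample, 1)];
const zAlcohol = zScoreColumn(sample, 1);
const zVariance = zAlcohol.reduce((acc, z) => acc + z * z, 0) / zAlcohol.length;
console.log(rawVariances, zVariance);
```

The raw variances come out around 4.24 for stress and 6504 for alcohol, so without standardizing, alcohol would dominate the decomposition purely because it is measured in bigger numbers. After z-scoring, the variance is 1 up to floating-point error.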

Second, we compute factor scores as standardized · V. Since the SVD gives us X = UΣVᵀ, multiplying by V is equivalent to UΣ. Each row of the result is how that observation scores on each factor. The first column is the score on the strongest latent dimension.
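The equivalence comes from the columns of V being orthonormal. Spelled out, with X standing for the standardized matrix:

```latex
X = U \Sigma V^{\top}, \qquad V^{\top} V = I
\quad\Longrightarrow\quad
X V = U \Sigma V^{\top} V = U \Sigma
```

So column j of the factor scores is just column j of U stretched by the j-th singular value.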

Third, we compute loadings as the correlation between each original variable and each factor score. A loading of 0.8 means that variable moves tightly with that factor. This is what makes interpretation possible — you read the loadings column by column and ask: what do the high-loading variables have in common?
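If you want to see where those correlations come from, here is a short derivation. The symbols are my own shorthand, not from the code: m rows, n columns, x_k is the k-th standardized column, s_j the j-th score column, v_kj an entry of V, and σ_j the j-th singular value. It assumes standardize() divides by m (as it does above), that no column is constant, and that σ_j > 0.

```latex
\begin{aligned}
s_j &= X v_j, \qquad \operatorname{var}(s_j) = \tfrac{1}{m}\, s_j^{\top} s_j = \tfrac{\sigma_j^{2}}{m} \\
\operatorname{cov}(x_k, s_j) &= \tfrac{1}{m}\,\bigl(X^{\top} X v_j\bigr)_k = \tfrac{\sigma_j^{2}}{m}\, v_{kj} \\
\ell_{kj} = \operatorname{corr}(x_k, s_j) &= \frac{\sigma_j^{2}\, v_{kj} / m}{1 \cdot \sigma_j / \sqrt{m}} = \frac{\sigma_j}{\sqrt{m}}\, v_{kj} \\
\sum_{k=1}^{n} \ell_{kj}^{2} &= \frac{\sigma_j^{2}}{m}, \qquad \sum_{j=1}^{n} \frac{\sigma_j^{2}}{m} = n
\end{aligned}
```

In words: a loading is the eigenvector entry rescaled by the factor's strength, and the squared loadings in a column add up to that factor's eigenvalue. Dividing by n, which is what the variance loop in factor() does, turns that into a share of total variance.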

Reading the result

Pass your health data into factor() and inspect the loadings:

const healthData = [
  [8, 20, 200, 1, 5],  // stress, smoking, alcohol, exercise, sleep
  [6, 15, 150, 3, 6],
  [9, 25, 300, 0, 4],
  [3, 5,  50,  5, 8],
  [7, 18, 180, 2, 5],
];

const result = factor(healthData);

const varNames = ["Stress", "Smoking", "Alcohol", "Exercise", "Sleep"];

result.loadings.forEach((row, i) => {
  console.log(`${varNames[i]}: [${row.map(l => l.toFixed(3)).join(", ")}]`);
});

console.log("Variance explained:", result.variance.map(v => v.toFixed(3)).join(", "));


Now let's assume this is yours truly, a poverty-stricken, stressed-out developer of things.

Factor 1 loads high on stress, smoking, and alcohol, and negative on exercise and sleep. That's my "Cardiac Risk" axis. Factor 2 might split exercise and sleep against the rest; call it "Recovery." I named my latent space without guessing. I'd better make some lifestyle changes soon or ☠️
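The "read the column and see what stands out" step can be sketched as a tiny helper. Everything here is illustrative: `dominantVariables` is a hypothetical function, the 0.6 cutoff is an arbitrary choice rather than a standard, and the loadings are made-up numbers, not output from factor().

```typescript
interface NamedLoading {
  name: string;
  loading: number;
}

// List the variables whose absolute loading on one factor clears a threshold,
// strongest first. The threshold default is an illustrative assumption.
function dominantVariables(
  loadings: number[][],
  names: string[],
  factorIdx: number,
  threshold = 0.6,
): NamedLoading[] {
  return loadings
    .map((row, i) => ({ name: names[i], loading: row[factorIdx] }))
    .filter(entry => Math.abs(entry.loading) >= threshold)
    .sort((a, b) => Math.abs(b.loading) - Math.abs(a.loading));
}

// Made-up loadings for illustration only: rows are variables, columns are factors.
const exampleLoadings = [
  [0.95, 0.1],
  [0.9, -0.2],
  [-0.85, 0.4],
  [-0.4, 0.8],
];
const exampleNames = ["Stress", "Smoking", "Sleep", "Exercise"];
const strongOnFirst = dominantVariables(exampleLoadings, exampleNames, 0);
console.log(strongOnFirst.map(e => `${e.name} (${e.loading.toFixed(2)})`).join(", "));
// prints "Stress (0.95), Smoking (0.90), Sleep (-0.85)"
```

The naming itself is still a human job. The helper only narrows down which variables you should be looking at, signs included.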


Why this works better than raw eigenvectors

Raw eigenvectors are arbitrary in sign and scale. A value of 0.5 on an eigenvector doesn't mean the same thing as a correlation of 0.5 — eigenvectors are unit-scaled by construction. Computing correlations with factor scores gives you directly interpretable numbers between -1 and 1. You can look at a loading of 0.92 and say "this variable is strongly associated with this factor" with the same confidence you'd have looking at any Pearson correlation.
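A two-variable case makes the scale difference concrete. For two standardized, positively correlated variables, the first eigenvector of the correlation matrix is (1, 1)/√2 regardless of how strongly they correlate, so the sketch below hard-codes it instead of calling the SVD. `pearson` and `zScore` are hypothetical helpers for this example; the data is the stress and smoking columns from the sample above.

```typescript
// Pearson correlation via centered sums; returns 0 when the denominator is zero.
function pearson(x: number[], y: number[]): number {
  const n = x.length;
  const mx = x.reduce((a, b) => a + b, 0) / n;
  const my = y.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - mx) * (y[i] - my);
    sxx += (x[i] - mx) * (x[i] - mx);
    syy += (y[i] - my) * (y[i] - my);
  }
  const den = Math.sqrt(sxx * syy);
  return den === 0 ? 0 : sxy / den;
}

// z-scores with population standard deviation (divide by n).
function zScore(x: number[]): number[] {
  const n = x.length;
  const mean = x.reduce((a, b) => a + b, 0) / n;
  const std = Math.sqrt(x.reduce((a, b) => a + (b - mean) * (b - mean), 0) / n) || 1;
  return x.map(v => (v - mean) / std);
}

const stressZ = zScore([8, 6, 9, 3, 7]);
const smokingZ = zScore([20, 15, 25, 5, 18]);

// Known first eigenvector entry for the two-variable, positive-correlation case.
const eigenEntry = 1 / Math.sqrt(2);
const firstScores = stressZ.map((z, i) => eigenEntry * z + eigenEntry * smokingZ[i]);

const r = pearson(stressZ, smokingZ);
const loadingStress = pearson(stressZ, firstScores);
console.log(eigenEntry.toFixed(3), loadingStress.toFixed(3));
```

The eigenvector entry is stuck at about 0.707 no matter what, while the loading works out to sqrt((1 + r) / 2), which lands close to 1 here because stress and smoking move almost in lockstep. Same factor, two very different-looking numbers.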

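The sign-flip loop (it runs right after the loadings are computed) keeps things consistent: each factor is flipped, if needed, so that its loadings sum to a non-negative number. Factor 1 then always points in the dominant direction of its loadings, and you don't get a random sign reversal between runs.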
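Pulled out on its own, the sign rule looks roughly like this. `alignFactorSign` is a hypothetical standalone version of the loop inside factor(), using the same 1e-10 tolerance, and the numbers are made up for illustration.

```typescript
// Flip one factor (a column of loadings and scores) in place so that its
// loadings sum to a non-negative number. Sums within 1e-10 of zero are left alone.
function alignFactorSign(loadings: number[][], scores: number[][], factorIdx: number): void {
  let sum = 0;
  for (const row of loadings) sum += row[factorIdx];
  if (sum > -1e-10) return; // already pointing the "positive" way, or too close to call
  for (const row of loadings) row[factorIdx] *= -1;
  for (const row of scores) row[factorIdx] *= -1;
}

// Made-up single-factor example: the loadings sum to -1.0, so the factor gets flipped.
const demoLoadings = [[-0.9], [-0.8], [0.7]];
const demoScores = [[-2], [1]];
alignFactorSign(demoLoadings, demoScores, 0);
console.log(demoLoadings.map(row => row[0]), demoScores.map(row => row[0]));
```

After the call the loadings read 0.9, 0.8, -0.7 and the scores 2, -1. Note that the scores have to flip along with the loadings, otherwise the two would stop agreeing with each other. One caveat: when a factor's loadings sum to almost exactly zero, the rule leaves the sign as the SVD returned it, so that factor's sign is not pinned down.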


PCA vs Factor Analysis redux

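PCA decomposes total variance. This function decomposes standardized data via SVD and then derives loadings from correlations, which is close in spirit to what factor analysis does conceptually (modelling shared variance through latent variables). Strictly speaking, the result is still PCA on the correlation matrix: there is no separate unique-variance term per variable, which a true factor model has. The practical difference tends to be small when the variables share most of their variance, though rotation alone does not make it vanish, and purists will note it.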

For naming dimensions and moving on, this approach is better than raw PCA because the output is already in interpretable units. You don't need a second step to convert eigenvectors into something readable. The loadings are the interpretation.

Everything shown here uses the same SVD from the package pca-js, just applied differently. The SVD is the Golub-Reinsch implementation — same decomposition, different post-processing. The difference between dimensionality reduction and factor analysis isn't the math. It's what you do after the decomposition lands.

I will also be writing a chonky implementation that does this quite easily; here is my GitHub in case I actually manage to release it. If not, life is hard and then we ☠️


The end, and mostly just a donut

With this we come to the end of this 4-part series. I hope you liked it. If not, that's ok too. Bye, and have a great rest of your day!