The 90-year-old idea behind JEPA models: Canonical Correlation Analysis (CCA) – Shon Czinner’s Blog
Shon Czinner · 2026-06-11 · via Hacker News

Introduction

Concepts of correlation and regression may be applied not only to ordinary one-dimensional variates but also to variates of two or more dimensions.

This is the first sentence from the paper “Relations Between Two Sets of Variates” (Hotelling 1936) by statistician and economist Harold Hotelling. This paper introduced Canonical Correlation Analysis (CCA). In modern terminology, “CCA is used to find a common signal among two large matrices” (Bykhovskaya and Gorin 2025).

In JEPA, the objective is the same, except that the second data matrix is simply a different view of the data in the first (e.g. obtained via data augmentation or spatial or temporal proximity). One recent paper that acknowledges the connection states, “JEPA-based models implicitly perform a non-linear generalization of Canonical Correlation Analysis” (Huang 2026).

CCA’s connection to JEPA is relevant to Schmidhuber’s debate on who invented JEPA, which is directed at Yann LeCun. Personally, I think Hotelling deserves the credit for the idea of maximizing correlation in embedding space.

Of course, the CCA model has many differences from JEPA.

For one, CCA does not enforce a shared encoder. But the biggest difference is that CCA is linear. Non-linear neural variants of CCA have been studied, and the earliest use of the term “Deep CCA” is in (Andrew et al. 2013).

Connecting JEPA models back to their CCA roots is genuinely useful. For example, another Deep CCA paper (Benton et al. 2017) relaxed the assumption of two sets of variables to an arbitrary number, based on a generalization of CCA proposed in 1961 (Horst 1961). Conceivably, JEPAs could be extended to handle more than two views as well.

CCA vs. JEPA Overview

Suppose we have zero-mean matrices \(X=(x_1,...,x_n)^T\in \mathbb R^{n\times d_x}\) and \(Y=(y_1,...,y_n)^T\in\mathbb R^{n\times d_y}\).

Let \(k\leq \min(d_x,d_y, n)\) and \(A\in \mathbb R^{d_x\times k}\) and \(B\in \mathbb R^{d_y\times k}\) so that \(XA=z_x\in\mathbb R^{n \times k}\) and \(YB=z_y\in\mathbb R^{n \times k}\).

CCA solves the following maximization problem,

\[\max_{A,B} \text{tr}\left(\frac{1}{n}z_x^Tz_y\right) \] \[\text{s.t.}\] \[\frac{1}{n}z_x^Tz_x=\frac{1}{n}z_y^Tz_y=I\]

This maximizes the trace of the cross-correlation matrix between the two embeddings, while constraining the \(k\) coordinates of each embedding to have unit variance and zero covariance with one another.
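To make the optimization concrete, here is a minimal NumPy sketch of one standard closed-form route to linear CCA: whiten each view, then take the SVD of the whitened cross-covariance. This route is not spelled out in the post itself. The helper names `inv_sqrt` and `linear_cca` and the synthetic data are illustrative assumptions, and the sketch assumes both covariance matrices are full rank.

```python
import numpy as np

def inv_sqrt(S, eps=1e-10):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(np.maximum(w, eps))) @ V.T

def linear_cca(X, Y, k):
    """Return A, B and the top-k canonical correlations for zero-mean X, Y."""
    n = X.shape[0]
    Sxx, Syy, Sxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n
    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    # SVD of the whitened cross-covariance yields the canonical directions.
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    A = Wx @ U[:, :k]
    B = Wy @ Vt.T[:, :k]
    return A, B, s[:k]

# Synthetic two-view data (hypothetical): both views contain a shared signal.
rng = np.random.default_rng(0)
n, dx, dy, k = 2000, 5, 4, 2
shared = rng.normal(size=(n, k))
X = np.hstack([shared, rng.normal(size=(n, dx - k))]) @ rng.normal(size=(dx, dx))
Y = np.hstack([shared + 0.5 * rng.normal(size=(n, k)),
               rng.normal(size=(n, dy - k))]) @ rng.normal(size=(dy, dy))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

A, B, rho = linear_cca(X, Y, k)
zx, zy = X @ A, Y @ B

print(np.round(zx.T @ zx / n, 6))             # whitening constraint on z_x
print(np.round(zy.T @ zy / n, 6))             # whitening constraint on z_y
print(rho.sum(), np.trace(zx.T @ zy) / n)     # objective value, two ways
```

In this construction the constraints hold by design, and the trace objective equals the sum of the top \(k\) singular values, which are the canonical correlations.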

Similar to the equivalence between maximizing variance and minimizing reconstruction error in PCA, we have a relationship between the trace of the cross-correlation matrix and embedding prediction error. Writing \(z_x^{(i)}\) and \(z_y^{(i)}\) for the \(i\)-th rows of \(z_x\) and \(z_y\),

\[\frac{1}{n}\sum_{i=1}^n ||z_x^{(i)}-z_y^{(i)}||^2=\frac{1}{n}||z_x-z_y||_F^2= \frac{1}{n}\text{tr}(z_x^Tz_x) + \frac{1}{n}\text{tr}(z_y^Tz_y) - \frac{2}{n}\text{tr}(z_x^Tz_y)\] Due to the whitening constraints, the first two terms each equal \(\text{tr}(I)=k\), so this is \[=2k- \frac{2}{n}\text{tr}(z_x^Tz_y)\]

So maximizing the trace of the cross-correlation under the whitening constraints is equivalent to minimizing the MSE between the two embeddings. Therefore we can write CCA as,

\[\min_{A,B} \frac{1}{n}\sum_{i=1}^n ||z_x^{(i)}-z_y^{(i)}||^2\] \[\text{s.t.}\] \[\frac{1}{n}z_x^Tz_x=\frac{1}{n}z_y^Tz_y=I\]
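The identity between the MSE and the trace can be checked numerically. The sketch below does not solve CCA. It only builds two arbitrary whitened embeddings and compares both sides of the equation. The `whiten` helper and the random data are illustrative assumptions.

```python
import numpy as np

def whiten(M):
    """Center M and map it linearly so that (1/n) Z^T Z = I."""
    M = M - M.mean(axis=0)
    w, V = np.linalg.eigh(M.T @ M / M.shape[0])
    return M @ V @ np.diag(w ** -0.5) @ V.T

rng = np.random.default_rng(1)
n, k = 500, 3
zx = whiten(rng.normal(size=(n, k)))
zy = whiten(0.7 * zx + rng.normal(size=(n, k)))   # correlated with zx

# Left-hand side: mean squared distance between paired embedding rows.
mse = np.mean(np.sum((zx - zy) ** 2, axis=1))
# Right-hand side: 2k - (2/n) tr(z_x^T z_y).
rhs = 2 * k - 2.0 / n * np.trace(zx.T @ zy)

print(mse, rhs)   # the two values agree up to floating-point error
```

The agreement depends on the whitening constraints. Without them, the first two trace terms are no longer pinned to \(k\) and the equivalence breaks.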

JEPA

Adopting the previous notation, JEPA is constrained to \(d_x=d_y=d\) because both views pass through the same encoder. In JEPA, we have the encoder \(f_\theta:\mathbb R^{d}\rightarrow \mathbb R^k\), and predictor \(g_\varphi:\mathbb R^{k}\rightarrow \mathbb R^k\).

Let \(z_x^{(i)}=g_\varphi(f_\theta(x_i))\), \(z_y^{(i)}=f_\theta(y_i)\).

Then we solve,

\[\min_{\theta,\varphi}\frac{1}{n} \sum_{i=1}^n ||z_x^{(i)}-z_y^{(i)}||^2\]

Note the similarity in the objective function but the lack of whitening constraints. Without them, the problem admits representational and dimensional collapse. For example, a trivial solution to the above problem is \(z_x^{(i)}=z_y^{(i)}=c\) for every \(i\), where \(c\) is a constant vector.
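The collapsed solution is easy to reproduce. The sketch below uses a toy linear encoder as a stand-in for a neural network and takes the predictor to be the identity. Both are simplifying assumptions, as are the name `encoder` and the synthetic views.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 256, 8, 4
x = rng.normal(size=(n, d))
y = x + 0.1 * rng.normal(size=(n, d))   # second view: a noisy copy of x

def encoder(v, W, b):
    """Toy linear encoder f_theta(v) = v W + b (stand-in for a neural net)."""
    return v @ W + b

# Collapsed parameters: zero weights and a constant bias c.
W, c = np.zeros((d, k)), np.ones(k)
zx = encoder(x, W, c)   # predictor g is taken to be the identity here
zy = encoder(y, W, c)

loss = np.mean(np.sum((zx - zy) ** 2, axis=1))
cov = np.cov(zx, rowvar=False)

print(loss)                # 0.0: the unconstrained objective is minimized
print(np.abs(cov).max())   # 0.0: the embedding covariance is far from I
```

The loss reaches its minimum even though the embeddings carry no information about the input, which is exactly what the whitening constraints in CCA rule out.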

As discussed in my previous blog post, SIGReg (Balestriero and LeCun 2025) fixes this problem. What does it do? It encourages the embeddings \(z_x\) and \(z_y\) to follow an isotropic (i.e. unit variance, uncorrelated) Gaussian distribution. As a result, it encourages,

\[\frac{1}{n}z_x^Tz_x=\frac{1}{n}z_y^Tz_y=I\]
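To illustrate the effect of encouraging this condition, here is a sketch of a much simpler stand-in: a penalty on the squared Frobenius distance between the embedding covariance and the identity. This is not the SIGReg procedure. It only looks at second moments, whereas matching an isotropic Gaussian constrains the whole distribution. The function name `whitening_penalty` and the test embeddings are illustrative assumptions.

```python
import numpy as np

def whitening_penalty(z):
    """Squared Frobenius distance between the embedding covariance and I."""
    zc = z - z.mean(axis=0)
    cov = zc.T @ zc / z.shape[0]
    return np.sum((cov - np.eye(z.shape[1])) ** 2)

rng = np.random.default_rng(3)
n, k = 10000, 4
isotropic = rng.normal(size=(n, k))                   # near unit variance, uncorrelated
collapsed = np.ones((n, k))                           # representational collapse
low_rank = np.outer(rng.normal(size=n), np.ones(k))   # dimensional collapse

for name, z in [("isotropic", isotropic),
                ("collapsed", collapsed),
                ("low_rank", low_rank)]:
    print(name, whitening_penalty(z))
```

Adding a term like this to the JEPA loss makes both kinds of collapse costly: the constant embedding has zero covariance, and the rank-one embedding has large off-diagonal covariance, so only the isotropic case scores close to zero.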

Conclusion

As I mentioned in the introduction, Schmidhuber has debated who invented JEPA and said this about LeCun,

Dr. LeCun’s heavily promoted Joint Embedding Predictive Architecture (JEPA) is the heart of his new company. However, the core ideas are not original to LeCun. Instead, JEPA is essentially identical to our 1992 Predictability Maximization system.

Schmidhuber references Yann LeCun’s response,

JEPA is merely a name for a general concept. The question is, and has always been, how do you make it work (particularly how do you prevent it from collapsing), and how do you make it work at scale with SOTA results on non-toy problems. That’s the hard part. Ideas are a dime a dozen. Making them work is what the community will give you credit for.

Do I agree with LeCun? Yes and no.

Yes, because of course you will get credit for making things work, and ideas are indeed arguably “a dime a dozen”.

No, because the thread of citations is important for progress. If important citations are missed, whether intentionally or not, the correct thing to do is just add them. We’re all only the better for doing so. The connection that JEPA models have to CCA is informative.

My opinion is that JEPA/Predictability Maximization models are architectural enhancements layered on top of CCA. Non-linearity is an enhancement.

Ultimately, these models all have the same objective function introduced by CCA: find the transformations that result in maximal correlation between sets of multidimensional data.

References

Andrew, Galen, Raman Arora, Jeff Bilmes, and Karen Livescu. 2013. “Deep Canonical Correlation Analysis.” International Conference on Machine Learning, 1247–55. https://proceedings.mlr.press/v28/andrew13.html.

Balestriero, Randall, and Yann LeCun. 2025. LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics. https://arxiv.org/abs/2511.08544.

Benton, Adrian, Huda Khayrallah, Biman Gujral, Dee Ann Reisinger, Sheng Zhang, and Raman Arora. 2017. Deep Generalized Canonical Correlation Analysis. https://arxiv.org/abs/1702.02519.

Bykhovskaya, Anna, and Vadim Gorin. 2025. Canonical Correlation Analysis: Review. https://arxiv.org/abs/2411.15625.

Horst, Paul. 1961. “Generalized Canonical Correlations and Their Application to Experimental Data.” Journal of Clinical Psychology.

Hotelling, Harold. 1936. “Relations Between Two Sets of Variates.” Biometrika 28 (3/4): 321–77. http://www.jstor.org/stable/2333955.

Huang, Yongchao. 2026. VJEPA: Variational Joint Embedding Predictive Architectures as Probabilistic World Models. https://arxiv.org/abs/2601.14354.