Bechtolsheim & Friends Breathe Life Into Pluggable Optics One Last Time
2026-04-17 · via The Next Platform: In-depth coverage of high end computing

Andy Bechtolsheim, legendary co-founder of Unix system maker Sun Microsystems and more than a few networking startups after that, has nothing against co-packaged optics. Well, except that thus far nobody has been able to manufacture CPO modules in volume.

There is no question that co-packaged optics is coming to datacenters at this point, particularly with AI industry juggernaut Nvidia embracing CPO links on its Quantum X800 InfiniBand and Spectrum X800 Ethernet switches, which were previewed in June 2024, launched in March 2025, and have been shipping since December 2025. Nvidia also told customers at GTC in March this year that with the future “Feynman” GPUs coming in 2028, the NVSwitch 8 coherent memory scale up network at the heart of its rackscale systems will be moving to CPO as well. This strongly implies that the Feynman GPU and whatever next generation of Arm server CPU Nvidia pairs with it (we don’t have a codename as yet) will also have CPO ports.

To address the manufacturability issues, ahead of GTC this year, Nvidia pumped $2 billion each into Lumentum and Coherent, which among other things make the lasers that drive CPO ports. Nvidia also signed multi-billion dollar, multi-year supply agreements with these companies. And later in March, Nvidia inked a similar $2 billion deal with Marvell to add NVLink Fusion ports to custom accelerators, but we think there is a chance that some CPO technology from the $2.5 billion Celestial AI acquisition by Marvell back in December 2025 is also part of this deal.

In the meantime, AI datacenters need more density in their networks, and they need optics that are denser than the SFP, QSFP, and OSFP pluggable modules that have dominated the datacenter for the past two decades. More specifically, they need more density than the OSFP pluggables that were first conceived in 2016 by Arista Networks, the upstart that took on Cisco Systems in datacenter networking and where Bechtolsheim is co-founder and chief development officer. Google and the industry adopted the standard and delivered pluggables at 400 Gb/sec speeds a year or so later. These have become the most popular pluggable optics in history.

The trouble is, the OSFP modules are too fat for the radix that is necessary for modern AI systems, particularly if you want to use Ethernet for scale up and scale out. Enter the Extra-dense Pluggable Optics (XPO) multi-source agreement, a standard started by Arista Networks, Microsoft, Marvell, Broadcom, and Ciena that now has over a hundred companies backing it. Google is not on the list, and that might mean that Google is going to try to jump from OSFP pluggables to some sort of on-chip CPO in the same timeframe that XPO modules are expected to come to market in volume.

The XPO module is clever in that it packs a whole lot more bandwidth into the same space as an OSFP module, and that means the front panels on switches have a lot more radix coming out of them as well as more bandwidth. Because this is physics we are dealing with, you have to pay for the XPO module in heat density, and that means the XPO module has to remove that heat with a cold plate and liquid cooling. But this is a minor thing in a world where GPUs and XPUs need to be connected in larger and larger scale up and scale out domains and liquid cooling is becoming normal for rackscale systems. There is no other way to get components closer to each other and therefore drive down latency and drive up performance.

According to Bechtolsheim, right now, using 1.6 Tb/sec OSFP modules, you can get 32 ports for a total of 51.2 Tb/sec coming out of the front panel of a 1U Ethernet switch. The OSFP modules burn somewhere between 30 watts and 40 watts each, and if you slap cold plates on them, you can’t really increase the cooling capacity or the switch front panel port density. What that means is if you have a 204.8 Tb/sec switch ASIC, which, believe it or not, we will have soon, you need 4U of chassis space to get 128 OSFP modules running at 1.6 Tb/sec.
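For anyone who wants to check the front panel arithmetic, here is a minimal Python sketch. The port counts and speeds are the figures quoted above; the function and variable names are our own and purely illustrative.

```python
import math

# Figures quoted above, kept in Gb/sec so the arithmetic stays in integers.
OSFP_GBPS = 1_600         # one 1.6 Tb/sec OSFP module
OSFP_PORTS_PER_1U = 32    # OSFP ports on a 1U front panel
ASIC_GBPS = 204_800       # the 204.8 Tb/sec switch ASIC


def front_panel_gbps(ports: int, gbps_per_port: int) -> int:
    """Aggregate bandwidth coming out of one front panel."""
    return ports * gbps_per_port


def modules_for_asic(asic_gbps: int, gbps_per_module: int) -> int:
    """Pluggable modules needed to expose all of the ASIC bandwidth."""
    return math.ceil(asic_gbps / gbps_per_module)


def rack_units(modules: int, modules_per_u: int) -> int:
    """Rack units of front panel needed to hold that many modules."""
    return math.ceil(modules / modules_per_u)


panel = front_panel_gbps(OSFP_PORTS_PER_1U, OSFP_GBPS)
modules = modules_for_asic(ASIC_GBPS, OSFP_GBPS)
height = rack_units(modules, OSFP_PORTS_PER_1U)

print(f"1U OSFP front panel: {panel / 1000} Tb/sec")
print(f"OSFP modules for the ASIC: {modules}")
print(f"Chassis height: {height}U")
```

The sketch only counts front panel slots. It ignores everything else that goes into a real chassis design, such as cooling and power delivery.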

The XPO module crams 64 channels running at 200 Gb/sec, for 12.8 Tb/sec in total, into the same space taken up by two OSFP modules, which increases lane density by a factor of four. Compared to a single 1.6 Tb/sec OSFP module, bandwidth goes up by a factor of eight with the XPO module, and heat dissipation goes up by a factor of ten to 400 watts.
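The three multipliers are measured against slightly different baselines, which a quick Python sketch makes explicit. It assumes that a 1.6 Tb/sec OSFP module carries eight lanes at 200 Gb/sec, and it uses the top of the 30 watt to 40 watt range for the OSFP heat figure; the variable names are ours.

```python
LANE_GBPS = 200           # per-channel speed
XPO_LANES = 64            # channels in one XPO module
XPO_OSFP_SLOTS = 2        # one XPO module fills the space of two OSFP modules
XPO_WATTS = 400

OSFP_GBPS = 1_600         # one 1.6 Tb/sec OSFP module
OSFP_WATTS_HIGH = 40      # assumption: top of the 30 W to 40 W range

# Assumption: a 1.6 Tb/sec OSFP module is 8 lanes of 200 Gb/sec.
osfp_lanes = OSFP_GBPS // LANE_GBPS

xpo_gbps = XPO_LANES * LANE_GBPS

# Lane density compares equal front panel space: one XPO versus two OSFPs.
lane_density_gain = XPO_LANES / (XPO_OSFP_SLOTS * osfp_lanes)

# Bandwidth and heat compare one XPO module against one OSFP module.
bandwidth_gain = xpo_gbps / OSFP_GBPS
heat_gain = XPO_WATTS / OSFP_WATTS_HIGH

print(f"XPO module bandwidth: {xpo_gbps / 1000} Tb/sec")
print(f"Lane density gain (same space): {lane_density_gain}x")
print(f"Bandwidth gain (per module): {bandwidth_gain}x")
print(f"Heat gain (per module): {heat_gain}x")
```

Put another way, the factor of eight is per module, while the gain per unit of front panel space is the factor of four.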

Here is the exploded view of the XPO module:

There is some lucky geometry going on here, says Bechtolsheim, in that the paddle cards – the little motherboards of circuits – fit into the same exact space as two OSFP modules. So the chips and paddle card designs did not have to be changed. You put two side by side and then stack a pair of those pairs belly to belly and you get eight times the lanes in twice the space.

When you look at it, XPO seems kind of intuitively obvious, like many good engineering ideas do in hindsight.

The XPO module will support a variety of front panel fiber connectors:

The upshot, says Bechtolsheim, is that XPO supports any optics standard, any optics technology, any type of driver, retimer, or gearbox, any optical connector, and any kind of cable, and it can give density improvements without having to shift to CPO. Presumably the economics will be better, but don’t jump to that conclusion too fast.

One other side benefit is that liquid-cooled XPO components run cooler. A 12.8 Tb/sec XPO ZR module, with its schematic and heat map shown below, runs anywhere from 20 degrees to 25 degrees Celsius cooler than an air-cooled 1.6 Tb/sec OSFP-ZR module. (That is around 45 degrees Celsius for the XPO module compared to 65 degrees to 70 degrees for the OSFP module at the same 1.6 Tb/sec bandwidth.)

That lower temperature means fewer failures in the field – how much remains to be seen. But this has real monetary value in an AI supercomputer where any outage stops a training run cold and you have to roll back to a checkpoint and start again. Time is money when it comes to GPUs and XPUs, and a lot of money at that.

In the switch designs that will come out using XPO, the power for the XPO modules will be drawn directly off of a 50 volt bus bar, rather than going through motherboard voltage converters used in switches using OSFP modules. This is a more efficient way to distribute the power.

The net-net is that a rack of switches using XPO modules can cram an aggregate of 6.5 Pb/sec into an Open Rack v3 switch rack from the Open Compute Project, which looks like this:

But here is the money math that will have AI datacenter builders interested in the XPO modules. They can cut the size of their datacenters in half because of the increase in density of the network racks. The change is dramatic.

Let’s say that you want to have a row of compute and networking based on Ethernet that links 512 XPUs together and that you want to use switches with top-of-the-line 1.6 Tb/sec OSFP ports. You will have 128 compute engines per rack and need four racks for the compute, but it will take eight racks of networking. With a shift to XPO, you have the same four racks of XPUs, but you only need two racks of switching. So twelve racks collapse down to six racks for the same compute and the same interconnectivity. The switch racks are hotter, but there are fewer of them and the power is mostly a wash. The cables between the XPUs and the switches are also shorter, which is a benefit because less fiber costs less money. Over a 1 gigawatt datacenter, such little things add up, as does having to pour less concrete and put up smaller shells for a given amount of compute and networking.
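The rack math in that example can be sketched in a few lines of Python. The compute rack count is derived from the XPU figures, while the network rack counts (eight with OSFP, two with XPO) are taken as given from the example rather than calculated; the names are ours.

```python
import math

XPUS = 512
XPUS_PER_RACK = 128

# Network rack counts come straight from the example; they are inputs here,
# not something this sketch derives from port counts.
NETWORK_RACKS = {"OSFP": 8, "XPO": 2}


def row_racks(xpus: int, xpus_per_rack: int, network_racks: int) -> tuple[int, int]:
    """Return (compute racks, total racks) for one row."""
    compute = math.ceil(xpus / xpus_per_rack)
    return compute, compute + network_racks


totals = {}
for optics, net in NETWORK_RACKS.items():
    compute, total = row_racks(XPUS, XPUS_PER_RACK, net)
    totals[optics] = total
    print(f"{optics}: {compute} compute + {net} network = {total} racks")

saving = 1 - totals["XPO"] / totals["OSFP"]
print(f"Row footprint shrinks by {saving:.0%}")
```

The drop from eight network racks to two is a factor of four, which lines up with the fourfold lane density gain quoted for the XPO module.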

So how far does XPO push out CPO? Bechtolsheim gave us the same answer he has given for years now.

“We have told both customers and said in public, we are not religious about any technology here,” Bechtolsheim tells The Next Platform. “So that is understood. The only thing we are religious about is that we can ship stuff in high volume. Everybody working on XPO did their own work based on their own nickel, and they all want to own what they developed, and they all are going to be there. This effort has been driven by this one large end customer, but I think everybody looked at this and concluded that this is the way to get the next level of density.”

There are more than 20 different vendors that are going to be making XPO modules, and they are expected to be in volume production in 2027.