
HPE to build Discovery exascale successor for Oak Ridge
2025-10-28 · via The Register - Special Features: Supercomputing Month


Supercomputing Month

HPE's Discovery to succeed Frontier supercomputer with next-gen Cray tech

Oak Ridge's $500M system due in 2028, paired with a separate Lux AI cluster arriving two years earlier

HPE is set to build a successor to the Frontier exascale system for America's Oak Ridge National Laboratory, based on the next generation of its Cray supercomputer platform, plus a separate AI cluster to advance machine learning with a multi-tenant cloud-like platform.

The Discovery system will "bolster productivity up to 10x," according to HPE, and like many other supercomputers will be used for scientific research into various areas including medicine, cancer research, nuclear energy, and aerospace.


Mock-up of HPE's forthcoming Discovery GX5000 system

Oak Ridge issued a request for proposals (RFP) for a successor to Frontier last year, with an expected delivery date of late 2027 to early 2028 and anticipated budget of $500 million.


HPE now says delivery of Discovery is expected in 2028, with user operations set to begin in 2029.


The national laboratory will also receive a second HPE-built system, Lux, the AI cluster intended to support both training and inference work at the site. This is expected to be installed early in 2026.

Discovery will be based on HPE's Cray Supercomputing GX5000, the next iteration of its supercomputing architecture, and will also feature a new Cray Storage Systems K3000 running the DAOS object storage platform, plus the next generation of Cray's Slingshot high-performance networking.

HPE says the Discovery nodes will be built with AMD's server processors code-named "Venice," which are not due to launch until next year, plus Instinct MI430X GPUs, also due next year, to deliver the level of performance required for modeling, simulation, and AI projects.

However, HPE did not disclose how many nodes or CPUs and GPUs will go into building Discovery, or how much memory the system will have.

For interconnect, it will use the next generation of the Slingshot networking HPE gained when it acquired Cray, although this has yet to launch and the company didn't say when it will. The current Slingshot 11 supports 200 Gbps per port, and can be regarded as a superset of Ethernet.

Discovery will be supported by Cray Storage Systems K3000, which HPE claims will support up to 75 million input/output operations per second (IOPS) per storage rack, 4x more performance than the next 30 storage systems on the IO500 list.

This will be based on the open source DAOS (Distributed Asynchronous Object Storage) platform, but will complement rather than replace the Lustre file system-based Cray Storage Systems E2000, which will also be included in Discovery.

DAOS was developed by Intel, but farmed out to an independent foundation after the chipmaker canceled its Optane memory technology in 2022 and lost interest. HPE then hired Intel's DAOS engineers and brought them into its own storage team.


Lux, meanwhile, is set to be an all-AMD affair, based on liquid-cooled HPE ProLiant Compute XD685 nodes with Epyc CPUs and Instinct MI355X GPUs, linked together using AMD's Pensando SmartNIC networking.

Liquid cooling innovations


Trish Damkroger, HPE's senior VP for HPC and AI Infrastructure Solutions, told The Register that the GX5000 had been in the works for years, but the company had "made some pivots over the last year and a half, as we've seen the growth of TDPs (thermal design points), the growth of different silicon coming out from all the vendors, and the need to be able to support all of these different workloads."

She said the racks will be able to accommodate up to 25 kilowatts per compute slot, 127 percent higher than before. But she seemed prouder of the liquid cooling for the GX5000 infrastructure, which now supports 40°C (104°F) water to meet new energy requirements for a lot of customers in Europe.

This means additional chillers and refrigerators are not needed, which cuts power, so it is a much more energy-efficient system for upcoming deployments.
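The two figures Damkroger quotes can be sanity-checked with some quick arithmetic. The sketch below is illustrative only: the implied earlier per-slot power is an inference from the "127 percent higher" claim, not a number HPE stated, and the function names are made up for this example.

```python
# Figures quoted in the article; anything derived from them is an estimate.
SLOT_POWER_KW = 25.0        # new per-slot power ceiling
INCREASE_PERCENT = 127.0    # "127 percent higher than before"
INLET_WATER_C = 40.0        # supported cooling-water temperature


def implied_previous_kw(new_kw: float, increase_percent: float) -> float:
    """Back out the earlier figure from a 'percent higher' claim."""
    return new_kw / (1.0 + increase_percent / 100.0)


def celsius_to_fahrenheit(celsius: float) -> float:
    """Standard Celsius-to-Fahrenheit conversion."""
    return celsius * 9.0 / 5.0 + 32.0


if __name__ == "__main__":
    previous = implied_previous_kw(SLOT_POWER_KW, INCREASE_PERCENT)
    print(f"Implied previous slot power: {previous:.1f} kW")
    print(f"Inlet water: {celsius_to_fahrenheit(INLET_WATER_C):.0f} F")
```

On those assumptions, the earlier slot limit would have been roughly 11 kW, and the 40°C water figure matches the 104°F given in the text.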

"It is a bookend design," she said. "So basically, the cooling pump is designed to be more compact. And can be placed on the side of the system instead of in the middle. And each pump is going to have redundancy to ensure that there's always-on operation."


HPE next-gen cooling

Damkroger added that users can now control the water flow rate, so instead of every blade receiving the same flow, it can be optimized for each blade and its workloads.


HPE said there will be an opportunity to see the new GX5000 infrastructure at the SC25 high-performance computing conference in St. Louis, Missouri, next month, though the platform is not expected to be available to customers until early 2027. ®