The Register - Special Features: SC25

Swiss boffins tease 'fully open' LLM trained on Alps super
Tobias Mann · 2025-07-10 · via The Register - Special Features: SC25

Swiss boffins just trained a 'fully open' LLM on the Alps supercomputer

Source code and weights coming later this summer with an Apache 2.0 bow on top

Supercomputers are usually associated with scientific exploration, research and development, and ensuring our nuclear stockpiles actually work.

Typically, these workloads rely on highly precise calculations, with 64-bit floating point mathematics being the gold standard. But as support for lower-precision datatypes continues to find its way into the chips used to build these systems, supercomputers are increasingly being used to train AI models.
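To make the precision trade-off concrete, here is a minimal sketch that stores the same value at 64-bit, 32-bit, and 16-bit floating point and reports the rounding error each one introduces. It uses only Python's standard `struct` module; the helper name `round_trip` is ours, and the FP8 and BF16 formats common in AI training are not covered because `struct` has no codes for them.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Pack a Python float into the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 0.1  # not exactly representable in binary floating point

as_fp64 = round_trip(value, "<d")  # 64-bit double precision
as_fp32 = round_trip(value, "<f")  # 32-bit single precision
as_fp16 = round_trip(value, "<e")  # 16-bit half precision

for name, stored in (("FP64", as_fp64), ("FP32", as_fp32), ("FP16", as_fp16)):
    # Error is measured against the FP64 value Python already holds.
    print(f"{name}: stored={stored!r} error={abs(stored - value):.3e}")
```

The error grows as the bit width shrinks. Simulation codes generally cannot tolerate that, while neural network training often can.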

This is exactly what the boffins at ETH Zürich and the Swiss Federal Institute of Technology in Lausanne (EPFL) have done. At the International Open-Source LLM Builders Summit in Geneva this week, researchers teased a pair of open large language models (LLMs) trained using Switzerland's Alps supercomputer.

As supercomputers go, Alps is better suited than most for running AI workloads alongside more traditional high-performance computing (HPC) applications. The system is currently the third-most powerful supercomputer in Europe, and eighth worldwide in the twice-yearly Top500 ranking. It's also among the first large-scale supercomputers based around Nvidia's Grace-Hopper Superchips.

Each of these GH200 Superchips features a custom Grace CPU powered by 72 Arm Neoverse V2 cores, connected via a 900GB/s NVLink-C2C fabric to a 96GB H100 GPU. Those GPUs account for the lion's share of Alps' total compute capacity, each delivering up to 34 teraFLOPS of FP64 vector performance. However, if you're willing to turn down the resolution a bit to, say, FP8, per-GPU performance jumps to nearly four petaFLOPS of sparse compute.

Built by HPE's Cray division, Alps features a little over 10,000 of these chips across 2,688 compute blades, which have been stitched together using the OEM's custom Slingshot-11 interconnects. Combined, the system boasts 42 exaFLOPS of sparse FP8 performance, or roughly half that when using the more precise BF16 data type.
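The system-wide figures follow from multiplying the per-chip numbers out. The sketch below is back-of-the-envelope arithmetic under two assumptions that are not stated in the text: four Superchips per blade (chosen because it yields a chip count "a little over 10,000"), and 3.958 petaFLOPS as the per-GPU sparse FP8 peak behind "nearly four petaFLOPS". These are theoretical peaks, not measured benchmark results.

```python
# Figures quoted in the article
BLADES = 2688
FP64_VECTOR_TFLOPS_PER_GPU = 34.0

# Assumptions for illustration (not stated in the article)
CHIPS_PER_BLADE = 4                   # gives "a little over 10,000" chips
FP8_SPARSE_PFLOPS_PER_GPU = 3.958     # "nearly four petaFLOPS"

total_chips = BLADES * CHIPS_PER_BLADE

# Convert per-GPU peaks to system-wide exaFLOPS (1 EF = 1,000 PF = 1,000,000 TF)
fp8_sparse_exaflops = total_chips * FP8_SPARSE_PFLOPS_PER_GPU / 1_000
bf16_sparse_exaflops = fp8_sparse_exaflops / 2   # "roughly half" per the article
fp64_vector_exaflops = total_chips * FP64_VECTOR_TFLOPS_PER_GPU / 1_000_000

print(f"Superchips:        {total_chips:,}")
print(f"Sparse FP8 peak:   {fp8_sparse_exaflops:.1f} exaFLOPS")
print(f"Sparse BF16 peak:  {bf16_sparse_exaflops:.1f} exaFLOPS")
print(f"FP64 vector peak:  {fp64_vector_exaflops:.3f} exaFLOPS")
```

Under these assumptions the FP8 total lands between 42 and 43 exaFLOPS, in line with the quoted figure. The FP64 vector peak is roughly two orders of magnitude lower, which is why the choice of data type matters so much for AI training.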

While Nvidia's H100 accelerators have been widely employed for AI training for years now, the overwhelming majority of these Hopper clusters have employed Nvidia's 8-GPU HGX form factor rather than its Superchips.

With that said, Alps isn't the only supercomputer to use them. The Jupiter supercomputer in Germany and the UK's Isambard AI, both of which came online this spring, also use Nvidia's GH200 Superchips.

"Training this model is only possible because of our strategic investment in 'Alps', a supercomputer purpose-built for AI," Thomas Schulthess, director of the Swiss National Supercomputing Centre (CSCS) and professor at ETH Zürich, said in a blog post.

The researchers have yet to name the models, but we do know they'll be offered in both eight-billion and 70-billion parameter sizes, and have been trained on 15 trillion tokens of data. They're also expected to be fluent in more than 1,000 languages, with roughly 40 percent of the training data being in languages other than English.

More importantly, the researchers say, the models will be fully open. Instead of simply releasing the models and weights for the public to scrutinize and tweak, as we've seen with models from Microsoft, Google, Meta, and others, researchers at ETH Zürich also intend to release the source code used to train the model and claim that the "training data will be transparent and reproducible."

"By embracing full openness — unlike commercial models that are developed behind closed doors — we hope that our approach will drive innovation in Switzerland, across Europe, and through multinational collaborations," EPFL professor Martin Jaggi said in the post.

According to Imanol Schlag, a research scientist at the ETH AI Center, this transparency is essential to building high-trust applications and advancing research in AI risks and opportunities.

What's more, researchers contend that for most tasks and general knowledge questions, circumventing web crawling protections wasn't necessary, and that complying with websites' crawling opt-outs showed no sign of performance degradation.
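The article doesn't say how the researchers detected opt-outs. As a minimal sketch, assuming the opt-outs are expressed through a site's robots.txt file, a data pipeline could filter candidate URLs with Python's standard `urllib.robotparser`. The crawler name `ExampleAIBot`, the sample rules, and the URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one AI crawler is opted out of the whole site,
# every other crawler is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs this user agent is permitted to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


parser = build_parser(ROBOTS_TXT)
urls = [
    "https://example.org/articles/1",
    "https://example.org/private/notes",
]

print(allowed_urls(parser, "ExampleAIBot", urls))  # opted-out crawler
print(allowed_urls(parser, "OtherBot", urls))      # falls under the * rules
```

A crawler that honors these rules simply drops the disallowed pages from its corpus rather than working around the restriction.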

The LLMs are expected to make their way into public hands later this summer under a highly permissive Apache 2.0 license. ®