China sidesteps the U.S. GPU ban with its 1.54-ExaFLOPS "LineShine" supercomputer: the CPU-only monster packs 2.4 million Huawei-designed Armv9 cores
ashilov@gmai · 2026-05-17 · via Latest from Tom's Hardware
(Image credit: Google)

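Today, the vast majority of leading supercomputers and AI clusters use CPUs for general-purpose tasks and orchestration, and AI GPUs for massively parallel compute workloads, which is how they reach ExaFLOPS-class performance. In China, however, we are seeing a different trend: in recent years the country has deployed a series of CPU-only supercomputers for AI and HPC workloads, largely because U.S. restrictions on GPUs leave it unable to procure enough of them for supercomputers. For example, China's National Supercomputing Center recently deployed a 1.54 ExaFLOPS-class machine built from 20,480 compute nodes with Armv9-based CPUs.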

The LineShine LX2 processor

(Image credit: China's National Supercomputing Center)

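Each LX2 processor uses two compute chiplets with a total of 304 CPU cores, organized into eight CPU clusters of 38 cores each. Every core includes Arm SVE (Scalable Vector Extension) and SME (Scalable Matrix Extension) units that accelerate the vector and matrix operations used in AI training and scientific computing, with support for the FP64, FP32, BF16, FP16, and INT8 data formats. Each core has a 32 KB L1 instruction cache and a 32 KB L1 data cache, and each cluster shares 28.5 MB of L2 cache.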


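The processor uses a highly unusual memory subsystem that combines 32 GB of on-package HBM (high-bandwidth memory), delivering up to 4 TB/s of bandwidth, with up to 256 GB of off-package DDR5 memory. Fujitsu's Arm-based A64FX processor, which powers the Fugaku supercomputer, used a similar memory subsystem, although the LX2 may be the industry's first Armv9-based AI and HPC CPU to adopt this kind of design.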

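Each chiplet contains four HBM domains and four DDR domains, so each processor has 16 NUMA domains. HBM access is highly sensitive to locality, whereas DDR memory is accessed more uniformly within the chip and is shared across clusters. This behavior forces developers to design topology-aware memory placement and scheduling techniques (which are particularly useful for AI training); these are carried out by dedicated SDMA engines that move data between DDR and HBM.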
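The domain counts above can be laid out explicitly. The following is a minimal sketch, not the LineShine runtime: the `Domain` type, the `preferred_domains` helper, the NUMA numbering, and the ranking policy are all hypothetical and introduced only for illustration. Only the counts (two chiplets, four HBM and four DDR domains per chiplet) come from the article.

```python
from __future__ import annotations

from dataclasses import dataclass

CHIPLETS_PER_PROCESSOR = 2      # from the article
HBM_DOMAINS_PER_CHIPLET = 4     # from the article
DDR_DOMAINS_PER_CHIPLET = 4     # from the article


@dataclass(frozen=True)
class Domain:
    numa_id: int   # numbering scheme is an assumption
    chiplet: int
    kind: str      # "HBM" or "DDR"


def build_domains() -> list[Domain]:
    """Enumerate the NUMA domains of one processor."""
    domains: list[Domain] = []
    for chiplet in range(CHIPLETS_PER_PROCESSOR):
        for kind, count in (("HBM", HBM_DOMAINS_PER_CHIPLET),
                            ("DDR", DDR_DOMAINS_PER_CHIPLET)):
            for _ in range(count):
                domains.append(Domain(len(domains), chiplet, kind))
    return domains


def preferred_domains(domains: list[Domain], chiplet: int) -> list[Domain]:
    """Hypothetical placement order for a thread running on `chiplet`.

    Local HBM first, then local DDR, then remote DDR, and remote HBM last,
    reflecting the article's point that HBM access is locality-sensitive
    while DDR access is more uniform.
    """
    def rank(d: Domain) -> int:
        local = d.chiplet == chiplet
        if d.kind == "HBM":
            return 0 if local else 3
        return 1 if local else 2

    return sorted(domains, key=lambda d: (rank(d), d.numa_id))


domains = build_domains()
print(len(domains))
print([d.numa_id for d in preferred_domains(domains, chiplet=1)][:4])
```

A real scheduler would also need to account for capacity and for the SDMA engines that stage data between DDR and HBM, neither of which is modeled here.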

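In terms of performance, a single LX2 processor delivers 60.3 TFLOPS of FP64 performance, 240 TFLOPS of BF16/FP16 throughput, and 960 TOPS of INT8 performance. Unlike conventional server CPUs, the architecture remains CPU-centric yet appears heavily optimized for dense AI and matrix workloads. The paper notes that keeping the SME matrix engines highly utilized requires extensive co-design across kernels, runtime scheduling, cache residency management, and tensor placement in the HBM and DDR hierarchy.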
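One way to see why data placement matters so much is a simple roofline estimate. The sketch below is an illustration under stated assumptions, not an analysis from the paper: it treats the quoted 4 TB/s HBM bandwidth as the only memory tier and ignores caches, DDR, and SDMA overlap. The function names are hypothetical.

```python
# Per-processor figures quoted in the article.
PEAK_FLOPS = {"FP64": 60.3e12, "BF16/FP16": 240e12}
HBM_BANDWIDTH_BYTES_PER_S = 4e12


def ridge_point(peak_flops: float, bandwidth: float) -> float:
    """FLOPs per byte above which a kernel is no longer bandwidth-bound."""
    return peak_flops / bandwidth


def attainable_flops(intensity: float, peak_flops: float, bandwidth: float) -> float:
    """Simple roofline bound: min(compute peak, intensity * bandwidth)."""
    return min(peak_flops, intensity * bandwidth)


for name, peak in PEAK_FLOPS.items():
    print(name, ridge_point(peak, HBM_BANDWIDTH_BYTES_PER_S))
```

Under these assumptions a BF16 kernel needs roughly 60 FLOPs per byte fetched from HBM to reach peak, and an FP64 kernel roughly 15, which is consistent with the emphasis on cache residency and tensor placement.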

The LineShine supercomputer

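The LineShine supercomputer comprises 20,480 compute nodes, each with two LX2 processors of 304 CPU cores apiece. The whole system therefore uses 40,960 LX2 processors, with a stated total of 2,451,840 CPU cores (note that 40,960 × 304 works out to 12,451,840). The nodes are interconnected by the LQLink high-speed network, with 1.6 Tb/s of bandwidth per node.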
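The system totals follow from simple multiplication of the per-node figures. The sketch below reproduces that arithmetic; the aggregate injection bandwidth is a naive sum of per-node link bandwidth and is an assumption of this example, not a figure from the article. Multiplying the stated counts gives 12,451,840 cores rather than the quoted 2,451,840, so one of the quoted figures is presumably off.

```python
# Figures quoted in the article.
NODES = 20_480
PROCESSORS_PER_NODE = 2
CORES_PER_PROCESSOR = 304
LINK_TBPS_PER_NODE = 1.6

processors = NODES * PROCESSORS_PER_NODE
cores = processors * CORES_PER_PROCESSOR

# Naive sum of per-node bandwidth, in Pb/s (assumption: says nothing
# about bisection bandwidth or topology).
aggregate_injection_pbps = NODES * LINK_TBPS_PER_NODE / 1000

print(f"processors: {processors:,}")
print(f"cores:      {cores:,}")
print(f"aggregate injection bandwidth: {aggregate_injection_pbps:.3f} Pb/s")
```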


(Image credit: China's National Supercomputing Center)

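The machine delivers 1.54 ExaFLOP/s of BF16 training performance and peaked at 2.16 ExaFLOP/s while training a 6.3-billion-parameter generative compression model for Earth observation. Because companies such as xAI do not publish the peak performance of their AI clusters built from hundreds of thousands of Nvidia AI GPUs, we cannot directly compare LineShine with Colossus or other advanced AI clusters. However, the theoretical peak performance of xAI's Colossus is believed to be 497.9 ExaFLOPS, so even at a model FLOPs utilization of about 15% (as LineShine achieves), it would deliver roughly 75 ExaFLOPS.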
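The utilization figure can be checked against the per-processor numbers quoted earlier. This is a rough sketch: it assumes the reported ExaFLOP/s values can be divided directly by the theoretical BF16 peak (40,960 processors at 240 TFLOPS each), and it takes the 497.9 ExaFLOPS Colossus figure at face value as the article does.

```python
# Figures quoted in the article.
PROCESSORS = 40_960
PEAK_BF16_PER_PROCESSOR = 240e12   # FLOP/s
SUSTAINED_BF16 = 1.54e18           # FLOP/s, reported training performance
BEST_RUN_BF16 = 2.16e18            # FLOP/s, reported peak during training
COLOSSUS_PEAK = 497.9e18           # FLOP/s, "believed" theoretical peak


def utilization(achieved: float, peak: float) -> float:
    """Fraction of theoretical peak that was actually achieved."""
    return achieved / peak


system_peak = PROCESSORS * PEAK_BF16_PER_PROCESSOR
sustained_util = utilization(SUSTAINED_BF16, system_peak)
best_util = utilization(BEST_RUN_BF16, system_peak)

# Applying the article's round 15% figure to the Colossus estimate.
colossus_at_15_percent = COLOSSUS_PEAK * 0.15

print(f"system BF16 peak: {system_peak / 1e18:.2f} ExaFLOP/s")
print(f"sustained utilization: {sustained_util:.1%}")
print(f"best-run utilization:  {best_util:.1%}")
print(f"Colossus at 15%: {colossus_at_15_percent / 1e18:.1f} ExaFLOP/s")
```

The sustained figure lands a little under 16% of theoretical peak, in line with the "about 15%" cited in the text.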

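In terms of theoretical peak FP64 performance, the 40,960 LX2 processors can deliver 2.47 ExaFLOPS (40,960 × 60.3 TFLOPS), though we know nothing about the machine's real-world FP64 throughput, which depends heavily on many factors.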

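Plenty of advantages, but one caveat

CPU-only AI and HPC supercomputers have several advantages over traditional heterogeneous CPU+GPU systems, particularly for complex scientific tasks that combine AI training with large-scale data ingestion, preprocessing, storage interaction, simulation, and orchestration.

Because everything runs on the same processors and in the same memory space, such systems avoid many of the complications of heterogeneous computing, such as costly, bandwidth-hungry CPU-to-GPU data transfers, complex programming models, GPU memory limits, and accelerator-specific software stacks.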

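In addition, homogeneous CPU-based systems can expose a larger coherent memory pool by combining HBM with high-capacity DDR, which is useful for large scientific datasets, retrieval-augmented generation, and long context windows.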
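To put rough numbers on that memory pool, the sketch below adds up the capacities quoted earlier (32 GB of HBM and up to 256 GB of DDR5 per processor). It assumes every processor is fitted with the maximum DDR5 capacity, which the article does not confirm, and the `weights_gb` helper is hypothetical: it counts BF16 weights only and ignores optimizer state, gradients, and activations.

```python
# Per-processor capacities quoted in the article (DDR5 is "up to").
HBM_GB = 32
DDR_GB_MAX = 256
PROCESSORS_PER_NODE = 2
PROCESSORS = 40_960

per_processor_gb = HBM_GB + DDR_GB_MAX
per_node_gb = per_processor_gb * PROCESSORS_PER_NODE
system_pb = PROCESSORS * per_processor_gb / 1e6   # decimal petabytes


def weights_gb(parameters: float, bytes_per_parameter: int = 2) -> float:
    """Size of model weights alone, in decimal GB (BF16 by default)."""
    return parameters * bytes_per_parameter / 1e9


model_gb = weights_gb(6.3e9)   # the 6.3-billion-parameter model from the article

print(f"per processor: {per_processor_gb} GB, per node: {per_node_gb} GB")
print(f"system total (max DDR5 assumed): {system_pb:.2f} PB")
print(f"6.3B-parameter BF16 weights: {model_gb:.1f} GB, "
      f"fits in one processor's HBM: {model_gb <= HBM_GB}")
```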

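They are also attractive for scientific AI applications that involve irregular control flow, distributed I/O, communication-heavy pipelines, and execution patterns that do not map efficiently onto GPUs.

At the same time, CPU-only systems integrate more naturally with traditional HPC environments and can run conventional supercomputer workloads such as simulations, which is especially useful for users who need both AI training or inference and HPC.

Last but not least, such systems reduce dependence on foreign accelerators and platforms such as Nvidia GPUs and the CUDA software ecosystem, which matters a great deal to China.

There is, however, a major trade-off: CPU-only systems are generally less energy-efficient and offer lower dense AI throughput than GPU-based supercomputers, which is exactly why the industry is betting on heterogeneous CPU+GPU architectures.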


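Anton Shilov is a contributing writer at Tom's Hardware. Over the past two-plus decades he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.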