UPDATED 16:30 EDT / JUNE 24 2026

OpenAI Group PBC today revealed a custom chip called Jalapeño that it will use to power its large language models.
The processor is the fruit of a collaboration with Broadcom Inc., which is no stranger to custom silicon design. The company helped Google LLC develop its TPU line of artificial intelligence accelerators. In April, the search giant extended its chip collaboration with Broadcom to 2031.
Nvidia Corp.’s flagship Rubin graphics cards can run both training and inference workloads. By contrast, Jalapeño is designed only for inference, the process of running AI models in response to queries. According to OpenAI, early testing indicates that the chip delivers significantly higher inference performance per watt than “current state-of-the-art,” which may be a reference to Nvidia chips.
The company has shared few details about Jalapeño’s design. However, the blog post in which it announced the chip specifies that the underlying “architecture reduces data movement.” That suggests the chip may be built to minimize traffic between its logic circuits and off-chip memory, one of the main performance bottlenecks in inference clusters.
AI chip suppliers take several approaches to reducing data movement. One of the most common methods is to equip an accelerator with a large amount of onboard SRAM, a type of high-speed memory. The more SRAM a chip includes, the less data must be sent to off-chip memory. Cerebras Systems Inc. and Groq Inc. are among the companies that have adopted that approach.
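The effect of onboard SRAM on off-chip traffic can be illustrated with a toy model. The sketch below is not a description of Jalapeño or of any vendor's chip: it assumes, purely for illustration, that generating one token reads every model weight once, that weights held in SRAM cost no off-chip traffic, and that throughput is limited only by off-chip memory bandwidth. The function names and all figures are hypothetical.

```python
GB = 10**9  # decimal gigabyte, used for the illustrative figures below


def offchip_bytes_per_token(weight_bytes: int, sram_bytes: int) -> int:
    """Bytes fetched from off-chip memory per generated token.

    Toy assumption: each token reads every weight once, and any weights
    that fit in on-chip SRAM are served without off-chip traffic.
    """
    if weight_bytes < 0 or sram_bytes < 0:
        raise ValueError("sizes must be non-negative")
    return max(weight_bytes - sram_bytes, 0)


def max_tokens_per_second(weight_bytes: int, sram_bytes: int,
                          offchip_bandwidth_bytes_per_s: float) -> float:
    """Upper bound on tokens per second when off-chip bandwidth is the only limit."""
    traffic = offchip_bytes_per_token(weight_bytes, sram_bytes)
    if traffic == 0:
        # Everything fits in SRAM, so this model imposes no bandwidth limit.
        return float("inf")
    return offchip_bandwidth_bytes_per_s / traffic


if __name__ == "__main__":
    weights = 16 * GB        # hypothetical model size
    bandwidth = 1000 * GB    # hypothetical off-chip bandwidth, bytes per second
    for sram in (0, 8 * GB, 16 * GB):
        rate = max_tokens_per_second(weights, sram, bandwidth)
        print(f"SRAM {sram // GB:>2} GB -> at most {rate} tokens/s")
```

Under these assumptions, holding half of the weights in SRAM halves the off-chip traffic per token and doubles the bandwidth-bound throughput ceiling. Real inference systems are also shaped by batching, key-value caches and compute limits, which this model ignores.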
OpenAI says that its Jalapeño-powered inference clusters will use multiple Broadcom networking technologies. One of them is the company’s Tomahawk chip series, which is designed to power Ethernet switches. Tomahawk-based switches can be used to move data both between servers in the same rack and between racks.
Broadcom’s newest Tomahawk chip, the Tomahawk 6, can process up to 102.4 terabits of traffic per second. A built-in congestion management engine mitigates network bottlenecks that might otherwise slow down connections.
OpenAI plans to deploy Jalapeño and its Broadcom-supplied network equipment in custom server racks. The ChatGPT developer is building the systems in collaboration with Celestica Inc., a Toronto-based provider of data center equipment design services. Celestica can also help customers optimize their server production lines.
OpenAI will bring its first Jalapeño servers online by year’s end and plans to expand its use of the chip over time. Its blog post describes Jalapeño as the “first step in a multi-generation compute platform,” which hints that it may be planning to develop additional inference processors in the future. Another possibility is that OpenAI will design custom chips for adjacent use cases such as model training.
Jalapeño may also open new revenue streams for OpenAI. Nvidia sells its graphics cards as part of systems called DGX appliances that also include central processing units, cooling modules and other hardware. OpenAI has the resources to bring competing Jalapeño-powered appliances to market. It could even enable customers to run its AI models on-premises using such systems.
A move into the lucrative AI hardware market might not only boost OpenAI’s revenue growth but also raise investor interest in its upcoming public offering. Anthropic PBC, the company’s top rival, recently filed for a listing of its own. An inference hardware offering could be a valuable differentiator for OpenAI during its roadshow, particularly if Anthropic goes public first.