Custom Silicon Advances Firm's Push Toward a Full AI Stack • June 24, 2026
OpenAI took its first step toward becoming a full-stack, end-to-end artificial intelligence company with the announcement of its first inference chip, Jalapeño.
The company developed Jalapeño in partnership with Broadcom and Canadian electronics manufacturer Celestica. OpenAI will mostly use Jalapeño for inference, meaning generating model responses and actions, rather than for training large language models, for which it uses Nvidia GPUs. Inference chips tend to use less energy than training chips and often run faster.
In a blog post, OpenAI said Jalapeño is built "for modern LLM inference, not a general purpose accelerator adapted from earlier AI workloads." Jalapeño will be deployed mostly for current and future AI models, and much of its design was informed by how OpenAI's models and products work.
"The goal is to combine the power and throughput of today's leading AI accelerators with latency closer to the fastest specialized inference systems, making Jalapeño well-suited for interactive LLM products at scale," OpenAI said.
OpenAI plans to offer the chip to data center partners this year, though the exact timeline is unclear.
The inference space has accelerated in the past few years as established companies like AMD, Intel, Google and AWS began designing silicon specifically for inferencing tasks. Investment in smaller competitors such as SambaNova and Groq has also grown. Making any chip is difficult, but an inference chip handles a narrower, more predictable computational workload than one built for training.
Jalapeño already runs workloads, including GPT-5.3-Codex-Spark, at "production target frequency and power." OpenAI said it's still measuring the chip's performance metrics, but early tests showed it delivers performance "better than the current state of the art." The company said one reason for the strong results is the chip's architecture, which reduces data movement and balances computation.
The AI company's first chip "was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers," said OpenAI hardware lead Richard Ho in the blog post. "We optimized the architecture around the kernels, memory movement, networking and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware's theoretical limits."
Jalapeño's biggest impact on OpenAI is the ability to move away from third-party inference chips. While its dependence on Nvidia GPUs to train its AI models is likely to continue, OpenAI can begin lessening its reliance on Nvidia or Google for inference tasks. OpenAI has used a mix of inference silicon over the years, inking deals for hardware from Nvidia and AMD and for Google's tensor processing units to further refine its models, agents and platforms.
OpenAI calls this the full-stack advantage. Should the chips become a product success, OpenAI will have greater control over costs, potentially lowering the cost of model development. OpenAI's Stargate project, a network of data centers across Texas, New Mexico and the Midwest, will also give the company more control over this area of the AI stack.
OpenAI seizing more control over its tech stack won't automatically mean lower costs for end users. Vertical integration does mean less vendor dependence, but it requires significant upfront capital expenditures without a guarantee of a return. OpenAI has already raised substantial funding and filed the necessary forms to go public, largely to fund many of its infrastructure projects. Other companies that are more vertically integrated still price their LLMs at a premium. Google, for example, makes its own chips, develops models, runs its own data centers, hosts its services in the cloud and has a built-in distribution system for its products.


























