On June 25, 2026, OpenAI publicly detailed Jalapeño, its first custom AI inference chip, co-developed with Broadcom, according to multiple tech outlets. The processor is designed to cut reliance on Nvidia GPUs and lower the cost of running services like ChatGPT and Codex, with early tests showing higher performance per watt than current accelerators.
This article aggregates reporting from 3 news sources. The TL;DR is AI-generated from original reporting. Race to AGI's analysis provides editorial context on implications for AGI development.
Custom silicon has quietly become one of the sharpest weapons in the AGI arms race, and Jalapeño is OpenAI planting its flag. By moving beyond commodity GPUs to a chip tailored for large language model inference, OpenAI is chasing not just speed but control over its cost curve and deployment roadmap. If Jalapeño really can match or beat top-tier accelerators on performance per watt, it would change the economics of running models like GPT‑5.x at global scale.
Strategically, this aligns OpenAI with the playbooks of Google (TPU), Amazon (Trainium/Inferentia), and Meta (MTIA): own your silicon, own your destiny. It also tightens the bond with Broadcom and, indirectly, with Microsoft’s Azure data centers, which are expected to host a large fraction of these chips. A notable detail is that OpenAI reportedly used its own models to accelerate Jalapeño’s design, hinting at a recursive loop in which AI helps design the next generation of AI hardware.
If Jalapeño succeeds, smaller labs dependent on Nvidia or other vendors could find themselves structurally disadvantaged on cost, latency, or both. That tilts the field further toward a handful of vertically integrated players who control data, models, and now the compute substrate itself.