On June 24, 2026, OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom “Intelligence Processor,” built specifically for large language model inference and manufactured by TSMC. Engineering samples are already running workloads such as GPT‑5.3‑Codex‑Spark in OpenAI’s labs, with large‑scale data center deployment planned to begin in 2026.
This article aggregates reporting from one news source. The TL;DR is AI-generated from original reporting. Race to AGI's analysis provides editorial context on implications for AGI development.
Jalapeño is OpenAI’s clearest signal yet that frontier labs now see custom silicon as strategic, not optional. By co‑designing an inference ASIC with Broadcom and moving from concept to tape‑out in just nine months, OpenAI is trying to own more of the stack that determines how quickly, cheaply, and reliably its models run in production. The chip is tuned specifically for LLM inference, with architecture choices around memory movement, networking, and utilization that OpenAI claims deliver substantially better performance per watt than today’s leading accelerators. ([openai.com](https://openai.com/index/openai-broadcom-jalapeno-inference-chip/))
From a race‑to‑AGI perspective, this matters less as a one‑off chip and more as the first turn of a multi‑generation compute flywheel. Cheaper, more efficient inference lets OpenAI drop API prices, run more agentic workflows, and reinvest cash flows into even larger training runs. The chip also deepens OpenAI’s ties to Microsoft, which will host Jalapeño at gigawatt‑scale data centers, and to TSMC, which sits at the cutting edge of 3nm manufacturing. ([openai.com](https://openai.com/index/openai-broadcom-jalapeno-inference-chip/)) If OpenAI can routinely design chips around upcoming model architectures, it gains a structural advantage over labs that must adapt to generic GPUs. At the same time, the launch accelerates the broader shift toward vertically integrated AI stacks, forcing Anthropic, Google, Meta, and others to sharpen their own hardware strategies.


