OpenAI is building its own AI inference chip, named Jalapeño, with Broadcom. The launch, announced by OpenAI on Tuesday, makes the company behind ChatGPT the latest hyperscaler to design custom silicon for running AI models rather than training them.
That distinction matters. Inference chips are specialized processors designed to execute already-trained models in production: answering user prompts, generating responses, processing images. Training hardware, typically Nvidia's flagship GPUs, builds those models in the first place. OpenAI's chip is for serving the model, not creating it.
The strategic signal is vertical integration. OpenAI helped create the surge in compute demand that has defined the AI buildout; now it is trying to own more of the pipe that satisfies that demand. Greg Brockman, OpenAI's co-founder, framed the move in unusually direct terms, pointing to the broader shift: "The world is moving to a compute-powered economy."
OpenAI is not the first hyperscaler to take this path. Google has run its own Tensor Processing Units in production for about a decade and announced its eighth generation of TPUs in April 2026. Amazon has Trainium and Inferentia. Microsoft has Maia. Meta has its MTIA chip. The shared logic: hyperscalers do not want a single-vendor dependency on Nvidia for the resource that powers their core product. OpenAI joining that list signals that custom silicon is now standard for any company whose business model depends on inference at scale.
But the announcement is short on specifics. OpenAI claims its first-generation accelerator delivers performance-per-watt "substantially better than current state-of-the-art." The company did not say whether that baseline is Nvidia's current GPUs, rival AI accelerators from companies like Groq or Cerebras, or the Cerebras-based inference stack OpenAI already deploys. No process node, transistor count, power draw, or throughput numbers were disclosed. The Broadcom investor release confirms the partnership and the "LLM-optimized intelligence processor" framing but does not add hard numbers either.
That ambiguity is the story. Without a stated comparison, "substantially better" is a claim, not a measurement, and EE Times treats that missing baseline, rather than the headline claim, as the news.
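To see why the unnamed baseline matters, here is a minimal arithmetic sketch in Python. Every number in it is a hypothetical placeholder, not a measurement of Jalapeño or of any real chip, and the baseline names are invented for illustration. It only shows that one fixed performance-per-watt figure can look like a large gain, a modest gain, or a loss depending on what it is compared against.

```python
# All figures below are hypothetical placeholders chosen for easy arithmetic.
# None of them describes Jalapeño or any real accelerator.

def perf_per_watt(tokens_per_second: float, watts: float) -> float:
    """Throughput per unit of power, in tokens per second per watt."""
    return tokens_per_second / watts

def relative_gain(candidate: float, baseline: float) -> float:
    """How many times the candidate's perf-per-watt exceeds the baseline's."""
    return candidate / baseline

# Placeholder candidate chip: 12,000 tokens/s at 600 W = 20 tokens/s/W.
candidate = perf_per_watt(12_000, 600)

# Three invented baselines standing in for the unnamed comparison point.
baselines = {
    "baseline_a": perf_per_watt(10_000, 1_000),  # 10 tokens/s/W
    "baseline_b": perf_per_watt(8_000, 500),     # 16 tokens/s/W
    "baseline_c": perf_per_watt(9_000, 360),     # 25 tokens/s/W
}

gains = {name: relative_gain(candidate, value) for name, value in baselines.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:.2f}x")
```

With these placeholder inputs the same candidate comes out at 2.00x, 1.25x, and 0.80x against the three baselines, which is why a performance-per-watt claim cannot be evaluated until the comparison point is named.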
Jalapeño is also not OpenAI's only inference bet. The company has a separate, ongoing partnership with Cerebras for wafer-scale inference systems, each of which uses an entire silicon wafer as a single processor. Calling Jalapeño OpenAI's "first custom chip," as some headlines did, obscures that OpenAI is building a portfolio of inference architectures rather than committing to a single design.
What to watch: OpenAI says Jalapeño targets gigawatt-scale deployment across multiple generations with data center partners, a scale that requires power commitments on the order of a large power plant's output. The next test is whether any independent benchmark, or any shipped system, will let observers measure whether "substantially better" actually holds up.