On Wednesday, OpenAI was revealed its first custom inference processor, designed and built in collaboration with Broadcom. Named Jalapeño, the new processor was designed specifically for the unique needs of OpenAI’s inference systems. OpenAI’s artificial intelligence models helped develop the chip, the company said.
While the chip is still being tested, OpenAI says early results show significantly better performance per watt than current state-of-the-art alternatives.
The collaboration was was officially announced in Octoberbut OpenAI’s chip designs have has been rumored for some time as a way to reduce the company’s reliance on Nvidia GPUs. Google and Amazon both have built custom chips to serve a similar purpose, often called “artificial intelligence accelerators” — silicon designed specifically to accelerate machine learning workloads.
OpenAI president Greg Brockman explained the company’s approach to chip development inside the podcastshortly after the Broadcom partnership was announced.
“We have a deep understanding of the workload,” Brockman said in the episode. “We were really looking for specific workloads that are underserved, [and asking] how can we make something that will be able to accelerate everything that is possible?’
Jalapeño is specifically designed for inference, the process of running pre-built AI models in response to user commands. In the announcement, OpenAI highlighted the chip’s low operational cost when running real-time coding models. It’s likely that more performance-intensive tasks like pre-training will still rely on Nvidia hardware, but even small reductions in inference costs could do a lot to improve the company’s bottom line.
Optimizing this inference system may prove to be a critical factor in AI economics in the future — and is likely to occur at every level of the stack. OpenAI already builds representative products like Codex and the models that feed them, as well as data centers to run those models. The move to custom-made chips allows the company to take this process even further, the company explained in its announcement.
“OpenAI doesn’t just develop frontier models or build products on top of them, it designs the infrastructure beneath them: chip architecture, cores, memory systems, networking, programming, development systems, and product experience,” the company wrote. “Because OpenAI works across the stack, each layer can be optimized around the same goal: to make its models faster, more reliable and more accessible for users.”
When you purchase through links in our articles, we may earn a small commission. This does not affect our editorial independence.
