Ampere and Qualcomm aren’t the most obvious partners. Both, after all, offer Arm-based chips for running data center servers (though Qualcomm’s biggest market remains mobile). But as the two companies announced today, they’re now joining forces to offer an AI-focused server that uses Ampere’s CPUs and Qualcomm’s Cloud AI 100 Ultra chips for AI inferencing — that is, for running models, not training them.
Like every other chip maker, Ampere wants to take advantage of the AI boom. The company’s focus, however, has always been on fast and power-efficient server chips, so while it can use Arm IP to add some AI capabilities to its chips, that’s not necessarily a core competency. That’s why Ampere decided to partner with Qualcomm (and with SuperMicro to integrate the two solutions), Ampere CTO Jeff Wittich tells me.
“The idea here is that while I’m going to show you some great performance for Ampere processors running AI inference on CPUs alone, if you want to scale out to even larger models — multi-100-billion-parameter models, for example — just like all the other workloads, AI isn’t one-size-fits-all,” Wittich told TechCrunch. “We’ve been working with Qualcomm on this solution, combining our highly efficient Ampere CPUs to do a lot of the general-purpose tasks that you run in conjunction with inferencing, and then using their really efficient cards, we’ve got a server-level solution.”
When it comes to working with Qualcomm, Wittich said Ampere wanted to bring together the best solutions.
“[It’s a] really good partnership we have with Qualcomm here,” he said. “That’s one of the things we’ve been working on, and I think we share a lot of really similar interests, so that’s really exciting. They build really, really efficient solutions in a lot of different parts of the market. We’re building really, really efficient solutions on the server CPU side.”
The Qualcomm partnership is part of Ampere’s annual roadmap update. Part of that roadmap is the new 256-core AmpereOne chip, built on a modern 3nm process. These new chips aren’t generally available yet, but Wittich says they’re ready at the fab and should launch later this year.
Beyond the extra cores, the defining feature of this new generation of AmpereOne chips is 12-channel DDR5 RAM, which lets Ampere’s data center customers better tune memory access to their users’ needs.
The selling point here isn’t just performance, though, but also the power consumption and cost of running these chips in the data center. That’s especially true for AI inferencing, where Ampere likes to compare its performance against Nvidia’s A10 GPUs.
It’s worth noting that Ampere isn’t sunsetting any of its existing chips in favor of these new ones. Wittich pointed out that these older chips still have plenty of use cases.
Ampere also announced another partnership today: the company is working with NETINT on a joint solution that pairs Ampere’s CPUs with NETINT’s video processing chips. This new server will be able to transcode 360 live video channels in parallel while simultaneously subtitling 40 streams using OpenAI’s Whisper speech-to-text model.
“We started down this path six years ago because it was clear it was the right path,” Ampere CEO Renee James said in today’s announcement. “Low power used to be synonymous with low performance. Ampere has proven that this is not true. We’ve pushed the envelope of computing performance and delivered performance beyond legacy CPUs in an efficient computing envelope.”