Three years ago, Luminal co-founder Joe Fioti was working in chip design at Intel when he had a realization. While he was working to make the best chips he could, the biggest bottleneck was software.
“You can make the best hardware on earth, but if it’s hard for developers to use it, they’re just not going to use it,” he told me.
Now, he has started a company that focuses exclusively on this problem. On Monday, Luminal announced $5.3 million in funding in a round led by Felicis Ventures, with angel investment from Paul Graham, Guillermo Rauch and Ben Porterfield.
Fioti’s co-founders, Jake Stevens and Matthew Gunton, come from Apple and Amazon, respectively, and the company was part of Y Combinator’s Summer 2025 batch.
Luminal’s core business is simple: the company sells compute, much like neo-cloud providers such as CoreWeave and Lambda Labs. But where those companies focus on amassing GPUs, Luminal focuses on optimization techniques that squeeze more performance out of the infrastructure it already has. Specifically, the company is optimizing the compiler that sits between written code and GPU hardware — the same layer of developer tooling that caused Fioti so many headaches in his previous work.
Currently, the industry’s dominant compiler stack is Nvidia’s CUDA system — an underrated component of that company’s success. But many elements of CUDA are open source, and Luminal is betting that, with much of the industry still scrambling for GPU capacity, there is real value in building out the rest of the stack.
It’s part of a growing cohort of inference optimization startups that have become more valuable as companies look for faster and cheaper ways to make their models work. Inference providers like Baseten and Together AI have long specialized in optimization, and smaller companies like Tensormesh and Clarifai are now emerging to focus on more specific technical tricks.
Luminal and the rest of this cohort will face stiff competition from optimization teams inside the large AI labs, which have the advantage of tuning for a single family of models. Working for outside clients, Luminal must adapt to whatever model comes its way. But even with the risk of being outflanked by hyperscalers, Fioti says the market is growing fast enough that he’s not worried.
“It’s always going to be possible to spend six months setting up a model architecture on a given piece of hardware, and you’re probably going to beat any kind of compiler performance,” Fioti says. “But our big bet is that, other than that, the all-purpose case is still very economical.”
