With private company default rates running at over 9.2%, the highest in years, VC firm Lux Capital recently advised AI companies to get their compute capacity commitments confirmed in writing. With economic volatility roiling the AI supply chain, Lux warned, a handshake deal is not enough.
But there is another option: stop relying on external computing infrastructure altogether. Smaller AI models that run directly on a user's device, with no data center, no cloud provider, and no counterparty risk, are now performing well enough to be worth considering. And Multiverse Computing is raising its hand.
The Spanish startup has so far kept a lower profile than some of its peers, but as demand for AI efficiency grows, that's changing. After compressing models from major AI labs, including OpenAI, Meta, DeepSeek, and Mistral AI, it released both an app that showcases the capabilities of its compressed models and an API portal that lets developers access and build with them.
The CompactifAI application, which shares its name with Multiverse's quantum-inspired compression technology, is an AI chat tool along the lines of ChatGPT or Mistral's Le Chat. Ask a question and the model answers. The difference is that Multiverse built in Gilda, a model so small it can run locally and offline, according to the company.
For end users, this is a taste of AI at the edge, with data that never leaves their devices and no connection required. But there is a caveat: their mobile devices must have enough RAM and storage. If they don't, and many older iPhones don't, the app falls back to cloud-based models via APIs. Routing between local and cloud processing is handled automatically by a system Multiverse calls Ash Nazg, a name Tolkien fans will recognize from the One Ring inscription in "The Lord of the Rings." But when the app routes to the cloud, it loses its primary privacy advantage in the process.
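The local-versus-cloud decision described above can be sketched in a few lines. This is a hypothetical illustration, not Multiverse's actual Ash Nazg implementation: the function names, RAM and storage thresholds, and fallback logic are all assumptions made for the example.

```python
# Hypothetical sketch of routing between an on-device model and a cloud
# fallback, in the spirit of the behavior described in the article.
# Thresholds and names are illustrative assumptions, not real product values.

from dataclasses import dataclass

@dataclass
class Device:
    ram_gb: float           # available RAM on the device
    free_storage_gb: float  # free storage for model weights

# Illustrative minimum requirements for hosting a small compressed model.
MIN_RAM_GB = 6.0
MIN_STORAGE_GB = 4.0

def route_request(device: Device, online: bool) -> str:
    """Prefer the local model when the device can host it; otherwise fall
    back to the cloud, which requires connectivity and gives up the
    on-device privacy advantage."""
    if device.ram_gb >= MIN_RAM_GB and device.free_storage_gb >= MIN_STORAGE_GB:
        return "local"
    if online:
        return "cloud"
    raise RuntimeError("device too constrained for local model and offline")

# A recent phone stays local; an older, RAM-constrained one goes to the cloud.
print(route_request(Device(ram_gb=8, free_storage_gb=32), online=True))
print(route_request(Device(ram_gb=3, free_storage_gb=10), online=True))
```

The key design point is that the fallback is silent to the user, which is exactly why the privacy guarantee only holds when the request stays on-device.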
These limitations mean that CompactifAI is not yet ready for mass consumer adoption, although that may never have been the goal. According to data from Sensor Tower, the app had fewer than 5,000 downloads last month.
The real target is businesses. Today, Multiverse is launching a self-service API portal that gives developers and businesses instant access to its compressed models, no AWS Marketplace required.
“The CompactifAI API Gateway gives developers instant access to compressed models with the transparency and control needed to run them in production,” CEO Enrique Lizaso said in a statement.
Real-time usage monitoring is one of the API's key features, and that is no accident. Beyond the potential advantages of edge deployment, lower compute costs are one of the main reasons businesses are considering smaller models as an alternative to large language models (LLMs).
It also helps that small models are less limited than they used to be. Earlier this week, Mistral updated its family of small models with the launch of Mistral Small 4, which it says is simultaneously optimized for general conversation, coding, agent tasks, and reasoning. The French company also released Forge, a system that lets businesses build custom models, including small models for which they can choose the trade-offs their use cases can best tolerate.
Recent results from Multiverse also suggest that the gap with large LLMs is narrowing. Its latest compressed model, HyperNova 60B, is built on gpt-oss-120b, an OpenAI model whose weights are publicly available. The company claims it now delivers faster responses at a lower cost than the original it was derived from, an advantage that matters especially for agentic coding workflows, where AI autonomously completes complex multi-step programming tasks.
Making models small enough to run on mobile devices while still being useful is a big challenge. Apple Intelligence sidestepped it by combining an on-device model with a cloud model. Multiverse's CompactifAI implementation can likewise route requests to gpt-oss-120b via API, but its main goal is to show that local models like Gilda and its future successors have advantages beyond cost savings.
For mission-critical workloads, a model that can run locally and offline offers greater privacy and resiliency. But the greatest value lies in the business use cases it can unlock: for example, embedding AI into drones, satellites, and other settings where connectivity can't be taken for granted.
The company already serves more than 100 clients worldwide, including the Bank of Canada, Bosch, and Iberdrola, but expanding its customer base could help it unlock more funding. After raising a $215 million Series B last year, it is now reportedly raising a new round of €500 million at a valuation of more than €1.5 billion.
