Chinese companies continue to release AI models that rival the capabilities of systems developed by OpenAI and other US-based AI firms.
This week, MiniMax, a startup backed by Alibaba and Tencent that has raised approximately $850 million in venture capital and is valued at more than $2.5 billion, debuted three new models: MiniMax-Text-01, MiniMax-VL-01 and T2A-01-HD. MiniMax-Text-01 is a text-only model, while MiniMax-VL-01 can understand both images and text. T2A-01-HD, meanwhile, generates audio, specifically speech.
MiniMax claims that MiniMax-Text-01, which is 456 billion parameters in size, outperforms models such as Google’s recently introduced Gemini 2.0 Flash on benchmarks such as MATH and SimpleQA, which measure a model’s ability to answer math problems and fact-based questions. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.
As for MiniMax-VL-01, MiniMax says it rivals Anthropic’s Claude 3.5 Sonnet on evaluations that require multimodal understanding, such as ChartQA, which tasks models with answering questions about graphs and charts (e.g., “What is the maximum value of the orange line on this chart?”). Admittedly, MiniMax-VL-01 doesn’t best Gemini 2.0 Flash on many of these tests. OpenAI’s GPT-4o and Meta’s Llama 3.1 also beat it on several.
It is worth noting that MiniMax-Text-01 has an extremely large context window. A model’s context, or context window, refers to input (for example, text) that a model examines before generating output (additional text). With a context window of 4 million tokens, MiniMax-Text-01 can parse about 3 million words in one go — or just over five copies of “War and Peace.”
For context (no pun intended), MiniMax-Text-01’s context window is about 31 times the size of GPT-4o’s and Llama 3.1’s.
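The figures above can be checked with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the words-per-token ratio, the word count for “War and Peace” and the 128,000-token window used for GPT-4o and Llama 3.1 are assumptions chosen for the calculation, not numbers supplied by MiniMax.

```python
# Rough check of the context-window comparisons.
# Every constant except MINIMAX_WINDOW is an assumption made for illustration.
WORDS_PER_TOKEN = 0.75         # common rule of thumb for English text (assumed)
WAR_AND_PEACE_WORDS = 587_000  # approximate length of an English translation (assumed)
COMPARISON_WINDOW = 128_000    # assumed context window of GPT-4o and Llama 3.1, in tokens

MINIMAX_WINDOW = 4_000_000     # context window reported for MiniMax-Text-01, in tokens


def tokens_to_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> float:
    """Convert a token count to an approximate word count."""
    return tokens * words_per_token


words = tokens_to_words(MINIMAX_WINDOW)
copies = words / WAR_AND_PEACE_WORDS
ratio = MINIMAX_WINDOW / COMPARISON_WINDOW

print(f"Approximate words per prompt: {words:,.0f}")
print(f"Copies of 'War and Peace': {copies:.1f}")
print(f"Window size relative to a 128,000-token model: {ratio:.2f}x")
```

Under these assumptions, the results line up with the rounded figures quoted above. Actual token-to-word ratios vary by tokenizer and language, so the word estimate is only a rough guide.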
The last of MiniMax’s models released this week, T2A-01-HD, is an audio generator optimized for speech. T2A-01-HD can create a synthetic voice with adjustable tempo, pitch and tenor in about 17 different languages, including English and Chinese, and clone a voice from just 10 seconds of a recording.
MiniMax has not published benchmark results comparing T2A-01-HD to other audio-generating models. But to this reporter’s ear, T2A-01-HD’s outputs sound on par with audio models from Meta and startups like PlayAI.
With the exception of T2A-01-HD, which is exclusively available through the MiniMax API and the Hailuo AI platform, the new MiniMax models can be downloaded from GitHub and the Hugging Face AI platform.
However, just because the models are “openly” available doesn’t mean they aren’t locked down in certain respects. MiniMax-Text-01 and MiniMax-VL-01 are not truly open source in the sense that MiniMax has not released the assets (e.g., training data) needed to recreate them from scratch. In addition, they are subject to MiniMax’s restrictive license, which prohibits developers from using the models to improve competing AI models and requires platforms with more than 100 million monthly active users to request a special license from MiniMax.
MiniMax was founded in 2021 by former employees of SenseTime, one of the largest artificial intelligence companies in China. The company’s projects include apps such as Talkie, an AI-powered role-playing platform along the lines of Character AI, and text-to-video models that MiniMax has released in Hailuo.
Some of MiniMax’s products have been the subject of a bit of controversy.
Talkie, which was pulled from Apple’s App Store in December for unspecified “technical” reasons, features AI avatars of public figures including Donald Trump, Taylor Swift, Elon Musk and LeBron James, none of whom appear to have consented to being featured in the app.
In December, Broadcast magazine noted that MiniMax’s video generators can reproduce the logos of British TV channels, suggesting that MiniMax’s models were trained on content from those channels. And MiniMax is reportedly being sued by iQIYI, a Chinese video streaming service that alleges MiniMax illegally trained on iQIYI’s copyrighted recordings.
The new MiniMax models arrive just days after the outgoing Biden administration proposed tougher export rules and restrictions on artificial intelligence technologies for Chinese companies. Companies in China were already barred from buying advanced AI chips, but if the new rules go into effect as written, companies will face tighter limits on both the semiconductor technology and the models needed to run sophisticated AI systems.
On Wednesday, the Biden administration announced additional measures focused on keeping sophisticated chips out of China. Chip foundries and packaging companies that want to export certain chips will be subject to broader licensing requirements unless they exercise greater scrutiny and due diligence to prevent their products from reaching Chinese customers.