AI2 says its new AI model beats one of DeepSeek’s best

Move over, DeepSeek. There’s a new AI champion in town, and they’re American.

On Thursday, AI2, a nonprofit AI research institute based in Seattle, released a model that it claims outperforms DeepSeek V3, one of DeepSeek’s leading systems.

AI2’s model, called Tulu3-405B, also beats OpenAI’s GPT-4o on certain AI benchmarks, according to AI2’s internal testing. Moreover, unlike GPT-4o (and even DeepSeek V3), Tulu3-405B is open source, which means that all of the components necessary to replicate it from scratch are freely available and permissively licensed.

An AI2 spokesperson told TechCrunch that the lab believes Tulu3-405B “underscores the U.S.’s ability to lead the global development of best-in-class generative AI models.”

“This milestone is a key moment for the future of open AI, reinforcing the U.S.’s position as a leader in competitive, open source models,” the spokesperson said. “With this launch, AI2 is introducing a powerful, U.S.-developed alternative to DeepSeek’s models, marking a pivotal moment not just in AI development, but in showing that the U.S. can lead with competitive, open source AI independent of the tech giants.”

Tulu3-405B is a rather large model. Containing 405 billion parameters, it required 256 GPUs running in parallel to train, according to AI2. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer.

AI2 tested Tulu3-405B on popular benchmarks. Image credits: AI2

According to AI2, one of the keys to achieving competitive performance with Tulu3-405B was a technique called reinforcement learning with verifiable rewards. Reinforcement learning with verifiable rewards, or RLVR, trains models on tasks with “verifiable” outcomes, such as solving math problems and following instructions.
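To make the idea of a “verifiable” outcome concrete, here is a minimal sketch in Python of what a reward check of this kind could look like. It is an illustration only, not AI2’s training code: the function names (`extract_final_number`, `math_reward`, `word_limit_reward`) and the all-or-nothing scoring are assumptions made for this example.

```python
import re


def extract_final_number(completion: str):
    """Return the last number in the model's output, or None if there is none.

    Hypothetical helper: assumes the final answer is the last number written.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return float(matches[-1]) if matches else None


def math_reward(completion: str, reference_answer: float) -> float:
    """Reward 1.0 only if the final number matches the known correct answer."""
    predicted = extract_final_number(completion)
    if predicted is None:
        return 0.0
    return 1.0 if abs(predicted - reference_answer) < 1e-6 else 0.0


def word_limit_reward(completion: str, max_words: int) -> float:
    """Reward 1.0 only if an instruction like 'answer in N words or fewer' is met."""
    return 1.0 if len(completion.split()) <= max_words else 0.0


if __name__ == "__main__":
    print(math_reward("6 boxes of 7 apples, so the total is 42.", 42.0))
    print(math_reward("I believe the answer is 40.", 42.0))
    print(word_limit_reward("Paris is the capital.", max_words=5))
```

The point of the sketch is that the reward comes from a simple programmatic check against a known answer or constraint, rather than from a human rater or a learned scoring model.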

AI2 claims that on the benchmark PopQA, a set of 14,000 specialized knowledge questions sourced from Wikipedia, Tulu3-405B beat not only DeepSeek V3 and GPT-4o, but also Meta’s Llama 3.1 405B model. Tulu3-405B also had the highest performance of any model in its class on GSM8K, a test containing grade school-level math word problems.

Tulu3-405B is available to test via AI2’s chatbot web app, and the code to train the model is on GitHub and an AI dev platform. Get it while it’s hot, and before the next benchmark-beating flagship AI model comes along.
