AI researchers at Stanford and the University of Washington were able to train an AI "reasoning" model for under $50 in cloud compute credits, according to a new research paper released last Friday.
The model, known as s1, performs similarly to cutting-edge reasoning models, such as OpenAI's o1 and DeepSeek's R1, on tests measuring math and coding abilities. The s1 model is available on GitHub, along with the data and code used to train it.
The team behind s1 said they started with an off-the-shelf base model, then fine-tuned it through distillation, a process for extracting the "reasoning" capabilities from another AI model by training on its answers.
The researchers said s1 was distilled from one of Google's reasoning models, Gemini 2.0 Flash Thinking Experimental. Distillation is the same approach Berkeley researchers used to create an AI reasoning model for around $450 last month.
To some, the idea that a few researchers without millions of dollars behind them can still innovate in AI is exciting. But s1 raises real questions about the commoditization of AI models.
Where's the moat if someone can closely replicate a multi-million-dollar model with relative pocket change?
Unsurprisingly, the big AI labs aren't happy. OpenAI has accused DeepSeek of improperly harvesting data from its API for the purpose of distilling a model.
The researchers behind s1 were looking for the simplest approach to achieve strong reasoning performance and "test-time scaling," or allowing an AI model to think more before it answers a question. These were some of the breakthroughs in OpenAI's o1, which DeepSeek and other AI labs have tried to replicate through various techniques.
The s1 paper suggests that reasoning models can be distilled with a relatively small dataset using a process called supervised fine-tuning (SFT), in which an AI model is explicitly instructed to mimic certain behaviors in a dataset.
SFT tends to be cheaper than the large-scale reinforcement learning method DeepSeek employed to train R1, its competitor to OpenAI's o1 model.
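To make the SFT idea concrete, here is a minimal sketch of how one training example might be assembled: the question, the teacher model's reasoning trace, and the final answer are combined into a single string the student model learns to imitate. The field names, the `<think>` delimiter, and the chat template are illustrative assumptions, not the s1 authors' exact format.

```python
# Sketch of formatting one SFT training example from a (question,
# reasoning trace, answer) triple. The template below is hypothetical;
# real pipelines use the chat template of the specific base model.

def format_sft_example(question: str, reasoning: str, answer: str) -> str:
    """Combine a question, a teacher model's reasoning trace, and the
    final answer into a single training string."""
    return (
        f"<|user|>\n{question}\n"
        f"<|assistant|>\n<think>\n{reasoning}\n</think>\n{answer}"
    )

example = format_sft_example(
    question="What is 12 * 11?",
    reasoning="12 * 11 = 12 * 10 + 12 = 120 + 12 = 132.",
    answer="132",
)
print(example)
```

Fine-tuning on examples like this teaches the smaller model to reproduce both the reasoning style and the answers of the larger one, which is what makes such a small dataset viable.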
Google offers free access to Gemini 2.0 Flash Thinking Experimental, albeit with daily rate limits, via its Google AI Studio platform.
Google's terms prohibit reverse-engineering its models to develop services that compete with the company's own AI offerings. We've reached out to Google for comment.
s1 is based on a small, off-the-shelf AI model from Qwen, the Alibaba-owned Chinese AI lab, which is available to download for free. To train s1, the researchers created a dataset of just 1,000 carefully curated questions, paired with answers to those questions as well as the "thinking" process behind each answer from Google's Gemini 2.0 Flash Thinking Experimental.
Training s1 took less than 30 minutes using 16 Nvidia H100 GPUs, and the model achieved strong performance on certain AI benchmarks, according to the researchers. Niklas Muennighoff, a Stanford researcher who worked on the project, told TechCrunch he could rent the necessary compute today for about $20.
The researchers used a neat trick to get s1 to double-check its work and extend its "thinking" time: they told it to wait. Adding the word "wait" during s1's reasoning helped the model arrive at slightly more accurate answers, per the paper.
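The trick can be sketched as a small decoding-loop intervention: when the model tries to end its reasoning, the stop marker is suppressed and "Wait" is appended, prompting further thinking. Everything here is a toy simulation under assumed names; `generate` stands in for a real model's decoding loop, and the marker string is illustrative.

```python
# Toy sketch of the "wait" trick: suppress the end-of-thinking marker
# and append "Wait," a fixed number of times to extend reasoning.

END_OF_THINKING = "</think>"

def generate(prompt: str) -> str:
    # Stand-in for a language model: a real model would continue decoding
    # from the prompt. Here we emit a canned continuation that stops.
    return prompt + " ...more reasoning... " + END_OF_THINKING

def think_with_budget(prompt: str, min_waits: int = 2) -> str:
    """Force extended thinking: each time decoding stops, strip the
    end-of-thinking marker, append 'Wait,' and resume generation."""
    text = generate(prompt)
    for _ in range(min_waits):
        # Remove the stop marker and nudge the model to keep going.
        text = text.removesuffix(END_OF_THINKING) + "Wait,"
        text = generate(text)
    return text

trace = think_with_budget("<think> The answer might be 42.")
print(trace.count("Wait,"))  # the stop marker was suppressed twice
```

The design point is that this intervention needs no retraining: it only manipulates the token stream at inference time, which is why it is a cheap way to trade compute for accuracy.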
In 2025, Meta, Google, and Microsoft plan to invest hundreds of billions of dollars in AI infrastructure, which will partially go toward training next-generation AI models.
That level of investment may still be necessary to push the frontier of AI innovation. Distillation has proven to be a good method for cheaply re-creating an AI model's capabilities, but it doesn't create new AI models vastly better than what's available today.