OpenAI says it has deployed a new system to monitor its latest AI reasoning models, o3 and o4-mini, for prompts related to biological and chemical threats. The system aims to prevent the models from offering advice that could instruct someone on carrying out potentially harmful attacks, according to OpenAI's safety report.
O3 and o4-mini represent a meaningful capability increase over OpenAI's previous models, the company says, and therefore pose new risks in the hands of bad actors. According to OpenAI's internal benchmarks, o3 is more skilled at answering questions about creating certain types of biological threats in particular. For this reason, and to mitigate other risks, OpenAI created the new monitoring system, which the company describes as a "safety-focused reasoning monitor."
The monitor, custom-trained to reason about OpenAI's content policies, runs on top of o3 and o4-mini. It is designed to identify prompts related to biological and chemical risk and to instruct the models to refuse to offer advice on those topics.
To establish a baseline, OpenAI had red teamers spend roughly 1,000 hours flagging "unsafe" biorisk-related conversations from o3 and o4-mini. During a test in which OpenAI simulated the blocking logic of its safety monitor, the models declined to respond to risky prompts 98.7% of the time, according to OpenAI.
OpenAI acknowledges that its test did not account for people who might try new prompts after being blocked by the monitor, which is why the company says it will continue to rely in part on human monitoring.
O3 and o4-mini do not cross OpenAI's "high risk" threshold for biorisks, according to the company. However, compared to o1 and GPT-4, OpenAI says early versions of o3 and o4-mini proved more helpful at answering questions about developing biological weapons.
The company is actively tracking how its models could make it easier for malicious users to develop chemical and biological threats, according to OpenAI's recently updated Preparedness Framework.
OpenAI is increasingly relying on automated systems to mitigate risks from its models. For example, to prevent GPT-4o's image generator from creating child sexual abuse material (CSAM), OpenAI says it uses a reasoning monitor similar to the one the company developed for o3 and o4-mini.
Still, several researchers have raised concerns that OpenAI isn't prioritizing safety as much as it should. One of the company's testing partners, Metr, said it had relatively little time to evaluate o3 on a benchmark for deceptive behavior. Meanwhile, OpenAI decided not to release a safety report for its GPT-4.1 model, which launched earlier this week.
