OpenAI announces o3 and o3-mini reasoning models that score near human level on ARC-AGI
OpenAI CEO Sam Altman announced the o3 and o3-mini artificial intelligence models, which build on the earlier o1 models, on the last day of the 12 Days of OpenAI event. The models use a "private chain of thought" that lets them plan their answers before responding, an approach called simulated reasoning (SR).
Here's What We Know
The o3 model set a record on the ARC-AGI benchmark, scoring 75.7% in the low-compute setting and 87.5% in the high-compute setting, a result comparable to human performance. It also scored 96.7% on the 2024 American Invitational Mathematics Examination (AIME) and 87.7% on GPQA Diamond, a set of graduate-level questions in biology, physics and chemistry. On Epoch AI's FrontierMath benchmark, o3 solved 25.2% of the problems, while no other model exceeded 2%.
The o3-mini model offers adaptive thinking time with low, medium and high reasoning-effort settings. OpenAI says that settings using more computation produce better results. Both models will first be made available to safety researchers for testing. o3-mini is scheduled to launch in late January, with o3 to follow shortly afterward.
Source: OpenAI