OpenAI today announced an improved version of its most capable artificial intelligence model yet, one that takes even longer to deliberate over questions, just a day after Google unveiled its first such model.
OpenAI’s new model, called o3, succeeds o1, which the company introduced in September. Like o1, the new model spends time ruminating on a problem to provide better answers to questions that require step-by-step logical reasoning. (OpenAI chose to skip the “o2” moniker because it is already the name of a mobile phone operator in the UK.)
“We see this as the beginning of the next phase of AI,” OpenAI CEO Sam Altman said in a livestream Friday. “Where you can use these models to do increasingly complex tasks that require a lot of reasoning.”
The o3 model scores much higher on several measures than its predecessor, OpenAI says, including those measuring complex skills related to coding and advanced proficiency in math and science. It is three times better than o1 at answering questions posed by ARC-AGI, a benchmark designed to test the ability of AI models to reason about extremely difficult mathematical and logical problems encountered for the first time.
Google is pursuing a similar line of research. Google researcher Noam Shazeer revealed yesterday in a post on X that the company has developed its own reasoning model, called Gemini 2.0 Flash Thinking. Google CEO Sundar Pichai called it “our most thoughtful model yet” in his own post. Google’s new model achieved a high score on SWE-Bench, a test that measures the agentic skills of models.
However, OpenAI’s new o3 model scores 20 percent better than o1 on SWE-Bench. “O3 blew it out of the water,” says Ofir Press, a postdoctoral researcher at Princeton University who helped develop SWE-Bench. “A very surprising increase, I’m not sure how they did it.”
The two dueling models show that the competition between OpenAI and Google is fiercer than ever. It’s crucial for OpenAI to prove it can keep moving forward as it looks to attract more investment and build a profitable business. Meanwhile, Google is desperate to show it remains at the forefront of AI research.
The new models also show how AI companies are increasingly looking beyond simply scaling AI models to extract more intelligence from them.