Friday, March 20, 2026

OpenAI Announces New AI Model, Codenamed Strawberry, That Solves Hard Problems Step by Step


OpenAI made the last big breakthrough in AI by increasing the size of its models to mind-boggling proportions when it introduced GPT-4 last year. Today the company announced a new advance that signals a change in approach: a model that can “reason” logically through a wide range of tough problems and is significantly smarter than existing AI without requiring much more scale.

The new model, called OpenAI o1, can solve problems that stump existing AI models, including OpenAI’s most powerful existing model, GPT-4o. Rather than coming up with an answer in one step, as a large language model typically does, it reasons through the problem, effectively thinking out loud like a person before arriving at the correct result.

“This is what we think is the new paradigm in these models,” Mira Murati, OpenAI’s chief technology officer, told WIRED. “It’s much better at handling very complex reasoning tasks.”

The new model, codenamed Strawberry within OpenAI, is not a successor to GPT-4o but rather a complement to it, the company says.

Murati says OpenAI is currently building its next major model, GPT-5, which will be much larger than its predecessor. But while the company still believes that scale will help squeeze new capabilities out of AI, GPT-5 will likely also include the reasoning technology introduced today. “There are two paradigms,” Murati says. “The scaling paradigm and this new paradigm. We expect to combine them.”

LLMs typically conjure their answers from massive neural networks fed with immense amounts of training data. They can demonstrate extraordinary linguistic and logical abilities, but they have traditionally struggled with surprisingly simple problems, such as basic math questions that require reasoning.

Murati says OpenAI o1 uses reinforcement learning, which involves giving the model positive feedback when it gets the right answer and negative feedback when it doesn’t, to improve its reasoning. “The model sharpens its thinking and refines the strategies it uses to arrive at the answer,” she says. Reinforcement learning has enabled computers to play games with superhuman skill and perform useful tasks, such as designing integrated circuits. The technique is also a key ingredient in turning an LLM into a useful and well-behaved chatbot.
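The feedback loop described here can be illustrated with a toy example. The sketch below is not OpenAI's actual training setup; it is a minimal epsilon-greedy bandit, with hypothetical "strategies" standing in for model behaviors, showing how positive and negative rewards steer an agent toward the strategy that gets the right answer more often.

```python
# Toy illustration of the reinforcement-learning principle described above:
# reward actions that lead to correct answers, penalize those that don't.
# The strategies and success rates are made up for the sake of the example.
import random

random.seed(0)
ACTIONS = ["guess", "reason step by step"]   # hypothetical strategies
values = {a: 0.0 for a in ACTIONS}           # estimated value of each strategy
counts = {a: 0 for a in ACTIONS}

def solve(action):
    """Pretend environment: careful reasoning succeeds far more often."""
    p_correct = 0.9 if action == "reason step by step" else 0.2
    return random.random() < p_correct

for _ in range(1000):
    # epsilon-greedy: mostly exploit the best-known strategy, sometimes explore
    if random.random() < 0.1:
        action = random.choice(ACTIONS)
    else:
        action = max(values, key=values.get)
    reward = 1.0 if solve(action) else -1.0   # positive/negative feedback
    counts[action] += 1
    # incremental average of rewards seen for this strategy
    values[action] += (reward - values[action]) / counts[action]

print(max(values, key=values.get))  # the agent comes to prefer reasoning
```

After a few hundred trials the estimated value of "reason step by step" dominates, so the agent keeps choosing it; the same reward-driven preference shift is, at a very high level, what steers a model toward better reasoning strategies.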

Mark Chen, vice president of research at OpenAI, demonstrated the new model to WIRED by using it to solve several problems that the previous GPT-4o model couldn’t. These included an advanced chemistry question and the following mind-bending math puzzle: “The princess is as old as the prince would be if the princess was twice as old as the prince, when the princess’s age was half their current ages. How old are the prince and princess?” (The correct answer is that the prince is 30 and the princess is 40.)
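The riddle's phrasing is ambiguous, but under the common reading (the princess is as old as the prince will be when the princess is twice as old as the prince was when the princess's age was half the sum of their current ages) the stated answer checks out. A small script verifying it, with that reading as an assumption:

```python
# Checking the article's answer (princess 40, prince 30) under one common
# reading of the riddle; the interpretation, not the code, is the assumption.
princess, prince = 40, 30

# Step 1: when was the princess's age half the sum of their current ages?
half_sum = (princess + prince) / 2          # 35
years_back = princess - half_sum            # 5 years ago
prince_then = prince - years_back           # the prince was 25

# Step 2: when will the princess be twice as old as the prince was then?
years_forward = 2 * prince_then - princess  # 10 years from now
prince_future = prince + years_forward      # the prince will be 40

# Step 3: the princess today should be as old as the prince will be then.
assert princess == prince_future
print(princess, prince)
```

Note the ages are only fixed up to a ratio (4:3); the puzzle's conventional answer picks 40 and 30.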

“The model learns to think for itself, rather than trying to imitate the way humans think,” as a conventional LLM does, Chen says.

OpenAI says its new model performs significantly better on multiple problem sets, including those focused on coding, math, physics, biology, and chemistry. On the American Invitational Mathematics Examination (AIME), a test for math students, GPT-4o solved an average of 12 percent of problems, while o1 solved 83 percent correctly, according to the company.
