On the final day of its "shipmas" event, OpenAI unveiled a new set of frontier "reasoning" models called o3 and o3-mini. The Verge first reported that a new reasoning model would arrive during this event.
The company is not releasing these models publicly yet (and acknowledges that final results may evolve with further training). However, OpenAI is accepting applications from the research community to test the systems ahead of public release (a date has not yet been set). OpenAI launched o1 (codenamed Strawberry) in September and is jumping straight to o3, skipping o2 to avoid confusion (or trademark conflicts) with the British telecommunications company O2.
"Reasoning" has recently become a buzzword in the artificial intelligence industry, but it essentially means that the machine breaks a prompt down into smaller tasks, which can produce better results. These models often show how they arrived at an answer, rather than simply providing a final answer without explanation.
According to the company, o3 surpasses previous performance records across the board. It beats its predecessor on coding tests (called SWE-Bench Verified) by 22.8 percent and outperforms OpenAI's chief scientist in competitive programming. The model nearly aced one of the hardest math competitions (called AIME 2024), missing just one question, and achieved 87.7 percent on a benchmark of expert-level science problems (called GPQA Diamond). On the most arduous math and reasoning challenges that typically stump AI, o3 solved 25.2 percent of the problems, where no other model has exceeded 2 percent.
The company also announced new research on deliberative alignment, which requires an AI model to work through safety decisions step by step. Instead of simply giving the AI model yes/no rules, this paradigm requires it to actively consider whether a user's request fits OpenAI's safety policies. The company claims that when it tested this on o1, the model adhered to safety guidelines much better than previous models, including GPT-4.