Friday, March 6, 2026

OpenAI launches GPT-5.4 in Pro and Thinking editions

Share

on Thursday, OpenAI has released GPT-5.4novel entry-level model, touted as “our most efficient and productive pioneering professional model.” In addition to the standard version, GPT-5.4 is also available as a reasoning model (GPT-5.4 Thinking) or optimized for high performance (GPT-5.4 Pro).

The API version of the model will be available with context windows of up to 1 million tokens, which is by far the largest context window available on OpenAI.

OpenAI also placed an emphasis on improving token performance, claiming that GPT-5.4 was able to solve the same problems with significantly fewer tokens than its predecessor.

The novel model delivers significantly improved benchmark results, including record-breaking results in the OSWorld-Verified and WebArena Verified computer usage tests. The novel model also scored a record 83% on the OpenAI PKBval test for knowledge work tasks.

GPT-5.4 also took the lead in Mercor’s APEX-Agents benchmark, designed to test professional skills in law and finance, according to a statement from Mercor CEO Brendan Foody.

“[GPT-5.4] specializes in creating long-term products such as slide presentations, financial models and legal analysis,” Foody said in a statement, “delivering superior performance, operating faster and at lower costs than competitive, pioneering models.”

GPT-5.4 continues the company’s efforts to reduce hallucinations and factual errors. OpenAI found that the novel model had a 33% lower risk of errors in individual claims compared to GPT 5.2, and an overall error risk of 18% lower.

Techcrunch event

San Francisco, California
|
October 13-15, 2026

As part of the launch, OpenAI has overhauled the way API version GPT-5.4 manages tool invocations, introducing a novel system called Tool Search. Previously, system prompts presented definitions of all available tools when invoking a model – a process that could consume many tokens as the number of available tools increased. The novel system allows models to search for tool definitions as needed, resulting in faster and cheaper requests in systems with many tools available.

OpenAI is also included new security assessment test the chain of thought of your models, the running commentary given by the models to show the thought process through multi-step tasks. AI security researchers have long worried that reasoning models might misrepresent their thinking tests show this can happen under the right circumstances.

OpenAI’s novel assessment shows that the risk of fraud is lower in Thinking GPT-5.4, “suggesting that the model is unable to conceal its reasoning and that CoT monitoring remains an effective security tool.”

Latest Posts

More News