OpenAI says GPT-5 stacks up to humans in a wide range of jobs

OpenAI released a new benchmark on Thursday that tests how its AI models perform compared to human professionals across a wide range of industries and jobs. The test, GDPval, is an early attempt to understand how close OpenAI's systems are to outperforming humans at economically valuable work, a key part of the company's founding mission to develop artificial general intelligence, or AGI.

OpenAI says that its GPT-5 model and Anthropic's Claude Opus 4.1 “are already approaching the quality of work produced by industry experts.”

This does not mean that OpenAI's models will start replacing people at work immediately. Despite some CEOs' predictions that AI will take over human jobs within a few years, OpenAI admits that GDPval today covers only a very limited number of the tasks people perform in their real jobs. However, it is one of the latest ways the company is measuring AI's progress toward that milestone.

GDPval is based on the nine industries that contribute the most to America's gross domestic product, including domains such as healthcare, finance, manufacturing, and government. The benchmark tests an AI model's performance in 44 occupations among those industries, ranging from software engineers to nurses to journalists.

For the first version of the test, GDPval-v0, OpenAI asked experienced professionals to compare AI-generated reports with reports produced by other professionals, and then choose the best one. For example, one prompt asked investment bankers to create a competitor landscape for the last-mile delivery industry, and the results were compared with AI-generated reports. OpenAI then averages an AI model's “wins” against the human reports across all 44 occupations.
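
The averaging step described above can be illustrated with a minimal Python sketch. It is not OpenAI's published scoring code: the function name `win_or_tie_rate`, the `(occupation, outcome)` input format, and the choice to weight every occupation equally are all assumptions made for illustration.

```python
from collections import defaultdict


def win_or_tie_rate(judgments):
    """Average an AI model's win-or-tie rate across occupations.

    `judgments` is an iterable of (occupation, outcome) pairs, where outcome
    is "win", "tie", or "loss" from the AI model's point of view.
    Hypothetical format; the article does not describe OpenAI's data layout.
    """
    valid_outcomes = {"win", "tie", "loss"}
    counts = defaultdict(lambda: [0, 0])  # occupation -> [wins_or_ties, total]

    for occupation, outcome in judgments:
        if outcome not in valid_outcomes:
            raise ValueError(f"unknown outcome: {outcome!r}")
        counts[occupation][1] += 1
        if outcome in ("win", "tie"):
            counts[occupation][0] += 1

    if not counts:
        raise ValueError("no judgments supplied")

    # Rate per occupation, then an unweighted mean over occupations
    # (assumption: each occupation counts equally in the headline figure).
    per_occupation = {occ: good / total for occ, (good, total) in counts.items()}
    overall = sum(per_occupation.values()) / len(per_occupation)
    return per_occupation, overall


# Toy data, invented purely to show the calculation.
sample = [
    ("nurse", "win"), ("nurse", "loss"),
    ("journalist", "tie"), ("journalist", "loss"),
    ("journalist", "loss"), ("journalist", "loss"),
]
per_occupation, overall = win_or_tie_rate(sample)
print(per_occupation, overall)
```

Under this reading, a headline number such as 40.6% would mean that, averaged over occupations, graders judged the AI report better than or equal to the human one in about two of every five comparisons.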

For GPT-5-high, a version of GPT-5 with additional computing power, the company says the AI model was ranked as better than or on par with industry experts 40.6% of the time.

OpenAI also tested Anthropic's Claude Opus 4.1 model, which was ranked as better than or on par with industry experts in 49% of tasks. OpenAI says it believes Claude scored so high because of its tendency to produce pleasing graphics, rather than because of sheer performance.

It is worth noting that most working professionals do far more than submit research reports to their boss, which is all that GDPval-v0 tests. OpenAI acknowledges this and says it plans to create more robust tests in the future that account for more industries and interactive workflows.

Nevertheless, the company sees the progress on GDPval as notable.

In an interview with TechCrunch, OpenAI's chief economist, Dr. Aaron Chatterji, said that GDPval's results suggest people in these jobs can now use AI models to spend time on more meaningful tasks.

“[Because] the model is getting good at some of these things,” Chatterji says, “people in those jobs can now use the model, increasingly as capabilities get better, to offload some of their work and do potentially higher value things.”

OpenAI's evaluations lead, Tejal Patwardhan, tells TechCrunch that she is encouraged by the rate of progress on GDPval. OpenAI's GPT-4o model, which was released roughly 15 months ago, scored just 13.7% (wins and ties versus humans). Now GPT-5 scores nearly triple that, a trend Patwardhan expects to continue.

Silicon Valley has a wide range of benchmarks it uses to measure the progress of AI models and assess whether a given model is state of the art. Among the most popular are AIME 2025 (a test of competitive math problems) and GPQA Diamond (a test of PhD-level science questions). However, several AI models are nearing saturation on some of these benchmarks, and many AI researchers have cited the need for better tests that can measure AI's proficiency on real-world tasks.

Benchmarks like GDPval could become increasingly important in that conversation, as OpenAI makes the case that its AI models are valuable across a wide range of industries. But OpenAI may need a more comprehensive version of the test to say definitively that its AI models can outperform humans.
