Here's How OpenAI Will Determine How Productive Its AI Systems Are


OpenAI has created an internal scale to track the progress its large language models are making toward artificial general intelligence (AGI), or AI with human-like intelligence, a spokesperson told Bloomberg.

Today’s chatbots, like ChatGPT, are at Level 1. OpenAI says it is approaching Level 2, defined as a system that can solve basic problems at the level of a person with a Ph.D. Level 3 refers to AI agents that can take action on behalf of the user. Level 4 covers AI that can create new innovations. Level 5, the final step toward AGI, is AI that can do the work of entire organizations of people. OpenAI has previously defined AGI as “a highly autonomous system that outperforms humans at most economically valuable tasks.”

OpenAI’s unique structure revolves around its mission of achieving AGI, so how OpenAI defines AGI matters. The company has stated that “if a value-aligned, safety-conscious project comes close to building AGI” before OpenAI does, it pledges to stop competing with that project and drop everything to assist it. The wording of this commitment in OpenAI’s charter is vague, leaving room for judgment by the for-profit entity (which is governed by a nonprofit), but a scale on which OpenAI can measure itself and its competitors could help define more clearly when AGI has been reached.

Still, AGI is a long way off: reaching it, if it is reached at all, would take billions of dollars’ worth of computing power. Timelines from experts, and even from within OpenAI, vary widely. In October 2023, OpenAI CEO Sam Altman said we are “five years, more or less,” away from reaching AGI.

This new rating scale, although still in development, was introduced a day after OpenAI announced its collaboration with Los Alamos National Laboratory to explore how advanced AI models like GPT-4o can safely aid bioscience research. The program manager at Los Alamos, who is responsible for the national security biology portfolio and was key to securing the OpenAI partnership, told The Verge that the goal is to test the capabilities of GPT-4o and establish a set of safety and other factors for the U.S. government. Ultimately, public or private models could be tested against these factors for evaluation.

In May, OpenAI disbanded its safety team after the group’s leader, OpenAI co-founder Ilya Sutskever, left the company. Jan Leike, a key researcher at OpenAI, resigned shortly after, saying in a post that “safety culture and processes have taken a back seat to shiny products” at the company. While OpenAI has denied that this was the case, some worry about what it would mean if the company does indeed achieve AGI.

OpenAI did not provide details on how models are assigned to these internal tiers (and declined The Verge’s request for comment). However, company leaders demonstrated a research project using the GPT-4 AI model during an all-hands meeting on Thursday and believe the project shows some new skills that exhibit human-like reasoning, according to Bloomberg.

This scale could help provide a strict definition of progress, rather than leaving it up to interpretation. For example, OpenAI CTO Mira Murati said in an interview in June that the models in its labs aren’t much better than what the public already has. Meanwhile, CEO Sam Altman said at the end of last year that the company had recently “pushed back the veil of ignorance,” meaning the models are much smarter.

The AI Sckool
