GPT-5 did not succeed

Last week, on the day GPT-5 launched, AI hype was at its highest level.

In a press briefing beforehand, OpenAI CEO Sam Altman said that GPT-5 is “something that I just don’t wanna ever have to go back from,” a milestone akin to the first iPhone with a Retina display. The night before the announcement livestream, Altman posted a picture of the Death Star, building even more hype. On X, one user wrote that the anticipation “feels like christmas eve.” All eyes were on the ChatGPT maker as people across industries waited to see whether the publicity would deliver or disappoint. And by most accounts, the big reveal fell short.

Hype for OpenAI’s long-awaited model had been building for years, ever since the release of GPT-4 in 2023. In a Reddit AMA with Altman and other staffers last October, users repeatedly asked about the release date of GPT-5, looking for details about its features and what would set it apart. One Redditor asked, “Why is GPT-5 taking so long?” Altman replied that compute was a limitation and that “all of these models have gotten quite complex and we can’t ship as many things in parallel as we’d like to.”

But when GPT-5 appeared in ChatGPT, users were largely unimpressed. The significant advances they had expected seemed mostly incremental, and the model’s key gains were in areas such as cost and speed. In the long run, that may be a solid financial bet for OpenAI, though a less flashy one.

People expected the world of GPT-5. (One X user posted that after Altman’s Death Star post, “everyone shifted expectations.”) And OpenAI did not play down those projections, calling GPT-5 its “best AI system yet” and a “significant leap in intelligence” with “state-of-the-art performance across coding, math, writing, health, visual perception, and more.” Altman said in a press briefing that chatting with the model “feels like talking to a PhD-level expert.”

That hype stands in stark contrast with reality. Would a model with PhD-level intelligence, for example, repeatedly insist that there are three “b”s in “blueberry,” as some social media users found? Would it be unable to identify how many state names contain the letter “R”? Would it incorrectly label a U.S. map with imaginary states, including “New Jefst,” “Micann,” “New Nakama,” “Krizona,” and “Mirinia,” and label Nevada as an extension of California? People who used the chatbot for emotional support found the new system austere and distant, protesting so loudly that OpenAI restored support for an older model. Memes abounded, including one depicting GPT-4 and GPT-4o as formidable dragons with GPT-5 beside them as a simpleton.

The court of expert public opinion was not forgiving, either. Gary Marcus, a leading voice in the AI industry and an emeritus professor of psychology at New York University, called the model “overdue, overhyped and underwhelming.” Peter Wildeford, co-founder of the Institute for AI Policy and Strategy, wrote in his review, “Is this the massive smash we were looking for? Unfortunately, no.” Zvi Mowshowitz, a popular AI industry blogger, called it “a good, but not great, model.” One Redditor on the official GPT-5 Reddit AMA wrote, “Someone tell Sam 5 is hot garbage.”

In the days after GPT-5’s release, the onslaught of unimpressed reviews has softened slightly. The general consensus is that although GPT-5 was not as significant an advance as people expected, it offered improvements in cost and speed, as well as fewer hallucinations, and that its switch system (which automatically routes a query on the back end to the model best suited to answer it, so you don’t have to decide) was all-new. Altman leaned into this narrative, writing, “GPT-5 is the smartest model we’ve ever done, but the main thing we pushed for is real-world utility and mass accessibility/affordability.”

OpenAI researcher Christina Kim posted on X that with GPT-5, “the real story is usefulness. It helps with what people care about, shipping code, creative writing, and navigating health info, with more steadiness and less friction. We also cut hallucinations. It’s better calibrated, says ‘I don’t know,’ separates facts from guesses, and can ground answers with citations when you want.”

There is a widespread understanding that, to put it bluntly, GPT-5 has made ChatGPT less eloquent. Viral social media posts complained that the new model lacked nuance and depth in its writing, coming across as robotic and cold. Even in OpenAI’s own marketing materials for GPT-5, the side-by-side comparison of wedding toasts generated by GPT-4o and GPT-5 does not look like a clear win for the new model; I personally preferred the one from 4o. When Altman asked Redditors whether they thought GPT-5 was better at writing, he was met with an onslaught of comments defending the retired GPT-4o model. Within a day, he gave in to the pressure and brought it back to ChatGPT, at least temporarily.

But there is one front on which the model seems to shine brighter: coding. One iteration of GPT-5 currently tops the most popular AI model leaderboard in the coding category, with Anthropic’s Claude in second place. OpenAI’s launch promotion showed off AI-generated games (a rolling-ball mini-game and a typing speed race), a pixel art tool, a drum simulator, and a lo-fi visualizer. When I tried to vibe-code a puzzle game with the tool, it had several glitches, but I did find success with simpler projects, such as an interactive embroidery lesson.

That is a big win for OpenAI, which has long been fighting the AI coding wars against competitors such as Anthropic, Google, and others. Companies are willing to spend a lot on AI coding, and it is one of the most realistic revenue generators for cash-burning AI startups.

OpenAI also highlighted GPT-5’s prowess in healthcare, but that remains mostly untested in practice, and we likely won’t know how successful it is for a while.

AI benchmarks have come to mean less and less in recent years, because they change often and some companies cherry-pick the results they reveal. But overall, they can give us a reasonable picture of GPT-5. The model performed better than its predecessors on many industry tests, but the improvement was modest, according to many people in the industry. As Wildeford put it, “When it comes to formal evaluations, it seems like GPT-5 was largely what would have been expected: small, incremental increases rather than anything worthy of a vague Death Star meme.”

But if recent history is any guide, these small, incremental increases may be more likely to translate into concrete profit than wowing individual consumers would. AI companies know that their biggest moneymaking opportunities are enterprise customers, government contracts, and investments, and that incremental progress on solid benchmarks, along with investing in coding and in fighting hallucinations, is the best way to get more of all three.

Hayden Field
