OpenAI releases cheaper and smarter model

OpenAI is releasing a lighter, cheaper model for developers, called GPT-4o Mini. It costs significantly less than the full-size models and is said to be more capable than GPT-3.5.

Building apps on OpenAI models can be prohibitively expensive. Developers who lack the resources to experiment with them can be priced out entirely, and may opt for cheaper models like Google’s Gemini 1.5 Flash or Anthropic’s Claude 3 Haiku. Now OpenAI is entering the lightweight model game.

“I think GPT-4o Mini really delivers on OpenAI’s mission to make AI more accessible to people. If we want AI to benefit every corner of the world, every industry, every application, we need to make AI much more affordable,” Olivier Godement, who leads the API platform product, told The Verge.

Starting today, ChatGPT users on Free, Plus, and Team plans can use GPT-4o Mini instead of GPT-3.5 Turbo, with Enterprise users getting access next week. This means GPT-3.5 will no longer be an option for ChatGPT users, but it will still be available to developers via the API if they don’t want to switch to GPT-4o Mini. Godement said GPT-3.5 will be retired from the API at some point, though the company is not yet sure when.

The new lightweight model will also support text and images in the API, and the company says it will soon handle all multimodal inputs and outputs, such as video and audio. With all these capabilities, the result could be more capable virtual assistants that can understand your travel plans and make suggestions. However, this model is designed for simple tasks, so no one is building Siri for pennies.

The new model scored 82 percent on Measuring Massive Multitask Language Understanding (MMLU), a benchmark exam consisting of about 16,000 multiple-choice questions across 57 academic subjects. When MMLU was first introduced in 2020, most models performed quite poorly on it, which was the point, since models had grown too advanced for earlier benchmark exams. GPT-3.5 scored 70 percent on this test, GPT-4o scored 88.7 percent, and Google says Gemini Ultra has the highest score ever at 90 percent. For comparison, the competing models Claude 3 Haiku and Gemini 1.5 Flash scored 75.2 percent and 78.9 percent, respectively.

It is worth noting that researchers are wary of benchmarks such as MMLU because the way they are administered varies slightly from company to company. This makes it difficult to compare the results of different models, as The New York Times reported. There is also the problem that a model may have seen these answers in its dataset, which essentially allows it to cheat, and there are typically no outside evaluators involved in the process.

For developers eager to build low-cost AI apps, the launch of GPT-4o Mini gives them another tool to add to their toolkit. OpenAI let financial technology startup Ramp test the model, and Ramp used GPT-4o Mini to build a tool that extracts expense data from receipts. Instead of laboriously filling in text fields, a user can upload a photo of a receipt and the model will sort everything out for them. Superhuman, an email client, also tested GPT-4o Mini and used it to build an auto-suggestion feature for email replies.

The goal is to provide something lightweight and affordable so developers can build all the applications and tools they couldn’t afford to build with a larger, more expensive model like GPT-4. Many developers would turn to Claude 3 Haiku or Gemini 1.5 Flash before paying the eye-watering compute costs required to run one of the most powerful models.

So why did it take OpenAI so long? Godement said it was “pure prioritization,” as the company was focused on building bigger and better models like GPT-4, which required a lot of “human and computational effort.” Over time, OpenAI noticed a trend of developers eager to use smaller models, so the company decided it was time to invest its resources in building GPT-4o Mini.

“I think it’s going to be very popular,” Godement said. “Both among existing apps that use all the AI at OpenAI and among a lot of apps that were previously priced out.”
