Wednesday, December 25, 2024

OpenAI Unveils GPT-4o Mini, a Smaller, Cheaper AI Model


OpenAI introduced GPT-4o mini on Thursday, its latest small AI model. The company says GPT-4o mini, which is cheaper and faster than OpenAI’s current state-of-the-art AI models, is rolling out to developers, as well as through the ChatGPT web and mobile app for consumers, starting today. Enterprise users will have access next week.

The company says GPT-4o mini outperforms industry-leading small AI models on reasoning tasks involving text and vision. As small AI models evolve, they are becoming increasingly popular with developers due to their speed and cost-effectiveness compared to larger models like GPT-4 Omni or Claude 3.5 Sonnet. They are a useful option for high-volume, simple tasks that developers may repeatedly call on an AI model to perform.

GPT-4o mini will replace GPT-3.5 Turbo as the smallest model OpenAI offers. The company says its latest AI model scores 82% on MMLU, a benchmark for measuring reasoning, compared to 79% for Gemini 1.5 Flash and 75% for Claude 3 Haiku, according to data from Artificial Analysis. On the MGSM test, which measures mathematical reasoning, GPT-4o mini scored 87%, compared to 78% for Flash and 72% for Haiku.

Chart comparing small AI models, from Artificial Analysis. The price shown is a combination of input and output tokens.
Image credit: Artificial Analysis

In addition, OpenAI claims that GPT-4o mini is significantly cheaper to run than its previous frontier models and over 60% cheaper than GPT-3.5 Turbo. GPT-4o mini currently supports text and vision in the API, and OpenAI says the model will support video and audio capabilities in the future.

“For every corner of the world to benefit from AI, we need to make models much more affordable,” OpenAI’s Product API lead Olivier Godement told TechCrunch. “I think GPT-4o mini is a really big step forward in that direction.”

For developers building on the OpenAI API, GPT-4o mini costs 15 cents per million input tokens and 60 cents per million output tokens. The model has a context window of 128,000 tokens, about the length of a book, and a knowledge cutoff of October 2023.
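At those rates, per-request costs are straightforward to estimate. Here is a minimal sketch of the arithmetic; the workload figures in the example are hypothetical, not from OpenAI:

```python
# Estimate GPT-4o mini API cost from the per-token rates cited above.
INPUT_COST_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_COST_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single API call."""
    return (input_tokens * INPUT_COST_PER_M
            + output_tokens * OUTPUT_COST_PER_M) / 1_000_000

# Hypothetical workload: a prompt filling the full 128,000-token
# context window, with a 1,000-token completion in response.
cost = estimate_cost(128_000, 1_000)
print(f"${cost:.4f}")  # → $0.0198
```

In other words, even a request that maxes out the context window costs about two cents, which is the kind of economics the company is pitching for high-volume tasks.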

OpenAI didn’t reveal exactly how large GPT-4o mini is, but said it’s roughly in the same category as other small AI models like Llama 3 8B, Claude Haiku, and Gemini 1.5 Flash. However, the company claims that GPT-4o mini is faster, more economical, and smarter than the industry’s leading small models, based on pre-launch testing in the LMSYS.org chatbot arena. Early independent tests seem to confirm that.

“Compared to comparable models, GPT-4o mini is very fast, with an average output speed of 202 tokens per second,” George Cameron, co-founder of Artificial Analysis, told TechCrunch in an email. “That’s more than 2X faster than GPT-4o and GPT-3.5 Turbo, and presents a compelling offering for speed-sensitive use cases, including many consumer applications and agent-based approaches to using LLMs.”

Separately, OpenAI on Thursday announced new tools for enterprise customers. In a blog post, the company announced the launch of its Enterprise Compliance API to help organizations in highly regulated industries such as finance, healthcare, legal services, and government comply with data logging and auditing requirements.

The company says these tools will allow administrators to audit and take action on ChatGPT Enterprise data. The API will provide time-stamped records of interactions, including conversations, file uploads, workspace users, and more.

OpenAI is also giving admins more granular control over workspace GPTs, custom versions of ChatGPT built for specific business use cases. Previously, admins could only fully allow or block GPTs created in their workspace, but now workspace owners can create an approved list of domains that GPTs can interact with.
