Meta Delivers Largest and Best Open Source AI Model Yet

Back in April, Meta announced that it was working on something new for the AI industry: an open-source model that would perform on par with the best private models from companies like OpenAI.

Today, that model is out. Meta is releasing Llama 3.1, its largest open-source AI model to date, which the company says outperforms OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet on several benchmarks. It’s also making the Llama-based Meta AI assistant available in more countries and languages, and adding a feature that can generate images based on someone’s specific likeness. CEO Mark Zuckerberg now predicts that Meta AI will be the most widely used assistant by the end of this year, overtaking ChatGPT.

Llama 3.1 is significantly more complex than the smaller Llama 3 models that came out a few months ago. The largest version has 405 billion parameters and was trained on over 16,000 of Nvidia’s ultra-expensive H100 GPUs. Meta doesn’t disclose the cost of developing Llama 3.1, but given the cost of the Nvidia chips alone, it’s a safe guess that it was hundreds of millions of dollars.
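
The "hundreds of millions" guess can be checked with rough arithmetic. The sketch below multiplies the GPU count reported above by an assumed per-chip price range. The price figures are illustrative assumptions, not numbers from Meta or Nvidia, and the result covers hardware purchase price only, not power, networking, staff, or how much of the cluster's lifetime a single training run consumes.

```python
# Back-of-envelope estimate of GPU hardware cost only.
GPU_COUNT = 16_000  # from the article: "over 16,000" H100 GPUs

# ASSUMPTION: illustrative per-GPU price range in US dollars.
# These are not figures disclosed by Meta or Nvidia.
ASSUMED_PRICE_LOW_USD = 20_000
ASSUMED_PRICE_HIGH_USD = 40_000


def hardware_cost_usd(gpu_count: int, unit_price_usd: int) -> int:
    """Return the total purchase price for gpu_count chips."""
    return gpu_count * unit_price_usd


low = hardware_cost_usd(GPU_COUNT, ASSUMED_PRICE_LOW_USD)
high = hardware_cost_usd(GPU_COUNT, ASSUMED_PRICE_HIGH_USD)
print(f"${low / 1e6:,.0f}M to ${high / 1e6:,.0f}M")
```

Under these assumed prices, the chips alone land in the hundreds of millions of dollars, which is consistent with the guess above.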

So, given the cost, why does Meta continue to give Llama away under a license that only requires approval from companies with hundreds of millions of users? In a letter published on Meta’s corporate blog, Zuckerberg argues that open-source AI models will overtake proprietary models and improve faster than them, just as Linux became the open-source operating system that powers most of today’s phones, servers, and gadgets.

He compares Meta’s investment in open-source AI to the earlier Open Compute Project, which he says saved the company “billions” by having outside companies like HP help improve and standardize Meta’s data center designs while the company built out its own capacity. Looking ahead, he expects the same dynamic to play out in AI, writing, “I believe the release of Llama 3.1 will be a turning point in the industry, where most developers will start using open-source software first.”

To help get Llama 3.1 out into the world, Meta is working with more than two dozen companies, including Microsoft, Amazon, Google, Nvidia, and Databricks, to help developers deploy their own versions. Meta says Llama 3.1 costs about half as much to run in production as OpenAI’s GPT-4o. It’s releasing the model’s weights so companies can train it on custom data and fine-tune it to their liking.

Chart: Meta

Not surprisingly, Meta hasn’t said much about the data it used to train Llama 3.1. AI insiders say they’re keeping the information under wraps because it’s a trade secret, while critics say it’s a ploy to delay the inevitable flood of copyright lawsuits.

Meta will say it used synthetic data—that is, data generated by a model rather than humans—to make the 405-billion-parameter version of Llama 3.1 an improvement over the smaller 70-billion and 8-billion versions. Ahmad Al-Dahle, Meta’s vice president of generative AI, predicts Llama 3.1 will be popular with developers as a “teacher for smaller models that are then deployed” in a “more cost-effective way.”

When I ask if Meta agrees with the growing consensus that the industry is running out of high-quality training data for models, Al-Dahle suggests that a plateau is approaching, although it may be further away than some think. “We definitely think we have a few more [training] runs,” he says. “But it’s hard to say.”

Meta’s first red teaming (or adversarial testing) of Llama 3.1 included looking for potential cybersecurity and biochemistry use cases. Another reason to test the model more intensively is what Meta describes as emerging “agent” behaviors.

For example, Al-Dahle tells me that Llama 3.1 is able to integrate with a search engine API to “retrieve information from the web based on a complex query and invoke multiple tools in sequence to perform tasks.” Another example he gives is asking the model to plot the number of homes sold in the United States over the past five years. “It can retrieve [web] search for you, generate Python code and execute it.”

Meta’s own implementation of Llama is its AI assistant, which is positioned as a general-purpose chatbot like ChatGPT and can be found in almost every part of Instagram, Facebook, and WhatsApp. Starting this week, Llama 3.1 will first be available via WhatsApp and the Meta AI website in the US, then on Instagram and Facebook in the coming weeks. It’s being updated to support new languages, too, including French, German, Hindi, Italian, and Spanish.

While the most advanced Llama 3.1 model, with 405 billion parameters, is free to use in Meta AI, the assistant will switch you to the smaller 70-billion model after an unspecified number of prompts in a given week. This suggests that the 405-billion model is too expensive for Meta to run at full scale. Spokesperson Jon Carvill told me that the company will provide more information on the prompt threshold after evaluating early usage.
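
The fallback described above can be pictured as a simple per-user weekly counter. The following is a minimal sketch of that idea, not Meta's implementation: the model identifiers, the `WeeklyRouter` class, and the limit of 50 prompts are all hypothetical, since Meta has not disclosed the actual threshold or how it is enforced.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical identifiers used only for this illustration.
LARGE_MODEL = "llama-3.1-405b"
SMALL_MODEL = "llama-3.1-70b"

# ASSUMPTION: Meta has not disclosed the real weekly threshold.
ASSUMED_WEEKLY_LIMIT = 50


@dataclass
class WeeklyRouter:
    """Route each prompt to the large model until a weekly cap is hit."""

    weekly_limit: int = ASSUMED_WEEKLY_LIMIT
    # Maps (user_id, week label) to the number of prompts seen so far.
    counts: dict = field(default_factory=lambda: defaultdict(int))

    def pick_model(self, user_id: str, week: str) -> str:
        key = (user_id, week)
        self.counts[key] += 1
        if self.counts[key] <= self.weekly_limit:
            return LARGE_MODEL
        return SMALL_MODEL  # over the cap: fall back to the smaller model


router = WeeklyRouter(weekly_limit=2)
picks = [router.pick_model("user-1", "2024-W30") for _ in range(3)]
print(picks)
```

With a cap of two, the third prompt in the same week is routed to the smaller model, and the count starts over when the week label changes.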

Meta AI’s new “Imagine Me” feature scans your face with your phone’s camera and then lets you insert your likeness into the images it generates. By capturing your likeness this way, rather than through your profile photos, Meta hopes to avoid creating a deepfake machine. The company sees demand from people who want to create more types of AI media and share it on their feeds, even if that means blurring the line between what’s discernibly real and what isn’t.

Meta AI will also be coming to the Quest headset in the coming weeks, replacing its voice command interface. Similar to the implementation in Meta’s Ray-Ban glasses, you’ll be able to use Meta AI on the Quest to identify and learn about what you’re looking at, in a passthrough mode on the headset that shows the real world through the display.

Aside from Zuckerberg’s prediction that Meta AI will be the most-used chatbot by the end of this year (ChatGPT has over 100 million users), Meta hasn’t yet revealed any usage numbers for its assistant. “I think the industry as a whole is still in the early stages of product-market fit,” Al-Dahle says. Even given how overhyped AI may already be, it’s clear that Meta and other players believe the race is just beginning.
