Wednesday, January 15, 2025

Court records show Meta executives were obsessed with beating OpenAI’s GPT-4

According to newly unsealed court filings, the executives and researchers leading Meta’s AI efforts were obsessed with beating OpenAI’s GPT-4 model while developing Llama 3. The internal messages were unsealed by the court on Tuesday in Kadrey v. Meta, one of the company’s ongoing AI copyright cases.

“Honestly… our goal has to be GPT-4,” said Ahmad Al-Dahle, Meta’s vice president of generative AI, in an October 2023 message to Meta researcher Hugo Touvron. “We have 64,000 GPUs! We need to learn how to build frontier and win this race.”

While Meta releases its AI models openly, the company’s AI leaders were far more focused on beating competitors that typically don’t release their model weights, such as Anthropic and OpenAI, and instead gate their models behind an API. Meta executives and researchers treated Anthropic’s Claude and OpenAI’s GPT-4 as the gold standard to work toward.

French AI startup Mistral, one of Meta’s biggest open-source competitors, was mentioned several times in the internal messages, but the tone was dismissive.

“Mistral is peanuts for us,” Al-Dahle wrote in a message. “We should be able to do better,” he said later.

Tech companies are currently racing to outdo one another with cutting-edge AI models, but these filings show just how competitive Meta’s AI leaders really were, and apparently still are. At several points in the exchanges, Meta’s AI leaders described being “very aggressive” about getting the right data to train Llama; at one point, one executive even told co-workers that Llama 3 was “literally all I care about.”

The plaintiffs in the case allege that Meta executives sometimes cut corners in their rush to ship AI models, training them on copyrighted books along the way.

Touvron noted in a message that the mix of datasets used for Llama 2 was “bad” and discussed how Meta could use a better mix of data sources to improve Llama 3. Touvron and Al-Dahle then talked about clearing the path to using the LibGen dataset, which includes copyrighted works from Cengage Learning, Macmillan Learning, McGraw Hill, and Pearson Education.

“Do we have the right datasets in there[?]” Al-Dahle said. “Is there something you’ve been wanting to use but can’t for some stupid reason?”

Meta CEO Mark Zuckerberg has previously said he is trying to close the performance gap between Meta’s Llama models and the closed models from OpenAI, Google, and others. The internal communications reveal considerable pressure within the company to do just that.

“This year, Llama 3 is competitive with the most advanced models and leading in some areas,” Zuckerberg wrote in a July 2024 letter. “We expect future Llama models to become the most advanced in the industry starting next year.”

When Meta finally released Llama 3 in April 2024, the open model was competitive with leading closed models from Google, OpenAI, and Anthropic, and outperformed Mistral’s open options. However, the data Meta used to train its models, which Zuckerberg reportedly greenlit despite its copyright status, is now under scrutiny in several ongoing lawsuits.
