What the RIAA lawsuits mean for artificial intelligence and copyright

Udio and Suno are not, despite their names, the hottest new restaurants on the Lower East Side. They are AI startups that let people generate impressively realistic-sounding songs – complete with instrumentation and vocal performances – from prompts. On Monday, a group of major record labels sued them, alleging copyright infringement “on an almost unimaginable scale” and arguing that the companies can only do this because they illegally ingested huge amounts of copyrighted music to train their artificial intelligence models.

These two lawsuits add to the growing pile of legal problems facing the artificial intelligence industry. Some of the most successful companies in this field have trained their models on data obtained through the unsanctioned scraping of huge amounts of information from the internet. ChatGPT, for example, was initially trained on millions of documents collected from links posted on Reddit.

The lawsuits, led by the Recording Industry Association of America (RIAA), concern music, not the written word. But like The New York Times’ lawsuit against OpenAI, they pose a question that could reshape the technological landscape as we know it: can AI companies just take whatever they like, turn it into a billion-dollar product, and claim it’s fair use?

“This is a key issue that needs to be addressed because it affects different industries,” said Paul Fakler, a partner at the law firm Mayer Brown who specializes in intellectual property matters.

Both Udio and Suno are fairly new, but they have already made a splash. Suno was launched in December by a Cambridge-based team that previously worked for Kensho, another artificial intelligence company. It quickly formed a partnership with Microsoft that integrated Suno with Copilot, Microsoft’s AI chatbot.

Udio launched only this year, raising millions of dollars from heavy hitters in the world of technology investing (Andreessen Horowitz) and the world of music (Will.i.am and Common, for example). The Udio platform was used by comedian King Willonius to generate “BBL Drizzy,” a Drake diss track that went viral after producer Metro Boomin remixed it and released it publicly for anyone to rap over.

In the lawsuit, the RIAA uses lofty language, claiming that this action is about “ensuring that copyright continues to encourage human inventiveness and imagination, as it has for centuries.” This sounds good, but ultimately the incentive in question is money.

The RIAA says generative AI poses a risk to the record labels’ business model. “Instead of licensing copyrighted sound recordings, potential licensees interested in licensing such recordings for their own purposes could generate an AI soundalike at virtually no cost,” the lawsuit states, adding that such services could “[flood] the market with ‘copycats’ and ‘soundalikes,’ thus upending the established sample licensing business.”

The RIAA is also seeking damages of $150,000 for each infringing work, which, given the immense datasets typically used to train artificial intelligence systems, is a potentially astronomical number.

The RIAA lawsuits include examples of music generated with Suno and Udio, along with comparisons of their musical notation to existing copyrighted works. In some cases, the generated songs contained similar short phrases – for example, one began with the sung line “Jason Derulo” in the exact same rhythm that the real Jason Derulo uses to begin many of his songs. Others had extended sequences with similar notation, as in the case of a song inspired by Green Day’s “American Idiot.”

This seems pretty damning, but the RIAA isn’t alleging that these particular similar-sounding songs infringe copyright. Rather, it is alleging that the AI companies used copyrighted music as part of their training data.

Neither Suno nor Udio has publicly released its training datasets. Both companies are vague about the sources of their training data, although that is normal in the AI industry. (OpenAI, for example, has dodged questions about whether YouTube videos were used to train its Sora video model.)

The RIAA lawsuit notes that Udio CEO David Ding has said the company trains on the “highest quality” music that is “publicly available,” and that Suno’s co-founder wrote on Suno’s official Discord that the company trains using “a mix of proprietary and public data.”

Fakler said the inclusion of notation examples and comparisons in the lawsuit is “bizarre,” saying it goes “well beyond” what would be necessary to provide a valid basis for the lawsuit. For one thing, the labels may not own the rights to the compositions of the songs allegedly used by Udio and Suno for training purposes. Rather, they own the copyright to the sound recordings, so showing similarity in the musical notation does not necessarily help in a copyright dispute. “I think it was really designed with optics and PR purposes in mind,” Fakler said.

Moreover, as Fakler noted, it is legal to create sound recordings that sound similar if you have the rights to the source work.

When reached for comment, a Suno spokesperson shared a statement from CEO Mikey Shulman, saying its technology is “transformative” and that the company does not allow prompts that include the names of existing artists. Udio did not respond to a request for comment.

But even if Udio and Suno used the record labels’ copyrighted works to train their models, there is a very significant question that may override everything else: is it fair use?

Fair use is a legal defense that allows the use of copyrighted material to create a substantially new or transformative work. The RIAA argues that the startups cannot invoke fair use, claiming that the Udio and Suno outputs are intended to replace real recordings, that they are generated for commercial purposes, that the copying was extensive rather than selective, and lastly that the resulting product poses a direct threat to the labels’ business.

According to Fakler, the startups have a solid fair use case, provided that the copyrighted works were only copied temporarily and their defining features were extracted and abstracted into the AI model’s weights.

“That’s how computers work – they have to make these copies and then analyze all this data to extract the non-copyrighted content,” he said. “How do you construct songs that will be understood by the listener as music and that will have various features that are commonly found in popular music? It’s extracting all of that in the same way a musician learns those things by playing music.”

“In my opinion, this is a very strong fair use argument,” Fakler said.

Of course, a judge or jury may disagree. And what is uncovered in the discovery process, if these lawsuits get that far, could have a big impact on the case. Which pieces of music were taken and how they ended up in the training set could matter, and details about the training process could undermine a fair use defense.

We are all in for a very long journey as the RIAA’s lawsuits, and similar ones, make their way through the courts. From text and photos to audio recordings, the question of fair use hangs over all of these cases and the AI industry as a whole.
