Thursday, December 26, 2024

Should artists be paid for training data? OpenAI’s vice president wouldn’t say

Should artists whose work was used to train generative AI, such as ChatGPT, be compensated for their contributions? Peter Deng, vice president of consumer products at OpenAI (the maker of ChatGPT), declined to answer when asked on the SXSW main stage this afternoon.

“That’s a great question,” he said when SignalFire venture partner (and former TechCrunch writer) Josh Constine, who interviewed Deng extensively, asked the question. Some in the crowd of onlookers shouted “yes” in response, which Deng acknowledged. “I hear from the audience that yes. I hear from the audience that yes.”

That Deng dodged the question is not surprising. OpenAI is in a fragile legal position when it comes to how it uses data to train generative artificial intelligence systems, such as the DALL-E 3 art creation tool that is incorporated into ChatGPT.

Systems like DALL-E 3 are trained on a huge number of examples – graphics, illustrations, photos, etc. – usually from public websites and datasets on the Internet. OpenAI and other generative AI providers argue that fair use – a legal doctrine that allows the use of copyrighted works to create a derivative work as long as it is transformative in nature – protects their practice of taking public data and using it for training purposes without compensating or even crediting artists.

In fact, OpenAI recently argued that it is impossible to create useful AI models without copyrighted material. “Training artificial intelligence models using publicly available online materials is fair use, supported by long-standing and widely accepted precedents,” the company wrote in a January blog post. “We believe this rule is fair to creators, essential to innovators, and critical to U.S. competitiveness.”

Artists, not surprisingly, disagree.

A class action lawsuit brought by artists, including Grzegorz Rutkowski, known for his work on Dungeons & Dragons and Magic: The Gathering, against OpenAI and several of its rivals (Midjourney and DeviantArt) is going to court. The plaintiffs claim that tools such as DALL-E 3 and Midjourney replicate artists’ styles without their express consent, allowing users to create new works resembling the artists’ originals, for which the artists receive no compensation.

OpenAI has licensing agreements with some content providers, such as Shutterstock, and allows webmasters to block its web crawler from pulling training data from their sites. Additionally, like some of its rivals, OpenAI allows artists to “opt out” and remove their work from the datasets the company uses to train its image-generating models. (Some artists have described the opt-out tool, which requires submitting an individual copy of each image to be removed along with a description, as cumbersome.)

Deng said he thinks artists should have more agency in the creation and use of generative artificial intelligence tools like DALL-E, but that he’s not sure exactly what form that might take.

“[A]rtists need to be a part of [the] ecosystem as much as possible,” Deng said. “I believe that if we can find a way to speed up the flywheel of art creation, we will really help the industry a little more… In a sense, every artist has been inspired by artists who have come before them, and I wonder how much of that will be accelerated by this.”
