Cloudflare announced plans on Monday to launch a marketplace next year where website owners can sell AI model providers access to scrape their sites’ content. The marketplace is the latest step in a larger plan by Cloudflare CEO Matthew Prince to give publishers more control over how and when AI bots scrape their sites.
“If we don’t reward creators in any way, they’re going to stop creating, and that’s the problem that needs to be solved,” Prince told TechCrunch.
As a first step in its new plan, Cloudflare on Monday launched a free monitoring tool for customers called AI Audit. Site owners will get a dashboard to view analytics about why, when, and how often AI models crawl their sites for information. Cloudflare will also let customers block AI bots on their sites with the click of a button. Site owners can block all web scrapers with AI Audit, or allow specific web scrapers if they have deals with them or believe their scraping is beneficial.
The AI Audit demo shared with TechCrunch shows how site owners can use the tool. It shows where every scraper that visits your site is coming from, and it also offers filtered views of how many times scrapers from OpenAI, Meta, Amazon, and other AI model providers have visited your site.
Cloudflare is trying to solve a problem hanging over the AI industry: How will smaller publishers survive in the AI era if people use ChatGPT instead of visiting their websites? Currently, AI model providers scour thousands of small websites for information to feed their LLMs. While some larger publishers have struck deals with OpenAI to license content, most websites get nothing, yet their content continues to be fed into popular AI models every day. This could disrupt many websites’ business models, cutting into traffic they desperately need.
Earlier this summer, AI-powered search startup Perplexity was accused of scraping websites that had used the Robots Exclusion Protocol to indicate they didn’t want to be indexed. Shortly after, Cloudflare rolled out a button to give customers the ability to block all AI bots with a single click.
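For readers unfamiliar with the Robots Exclusion Protocol, the following is a minimal sketch of how a site can signal that one crawler is unwelcome while others are allowed, and how a well-behaved crawler would check that signal. The robots.txt contents, the user-agent tokens, and the example.com URL are illustrative assumptions, not taken from any site named in this article, and this is not how Cloudflare's blocking works. The protocol is voluntary: it only states a preference and cannot stop a bot that chooses to ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site. The user-agent tokens are
# illustrative assumptions; a real site would list whichever crawlers it
# wants to turn away.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt above permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    for bot in ("PerplexityBot", "GPTBot"):
        print(bot, "allowed:", can_fetch(bot, page))
```

Because compliance is left to the crawler, a rule like this only works when the bot honors it, which is the gap the accusations described above point to.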
“It was an expression of the frustration we were hearing where people felt their content was being stolen,” Prince said.
Some website owners told Business Insider that AI bots crawl their sites so heavily that it feels as if a DDoS attack has paralyzed their servers. Having your site scraped is not only annoying; it can also increase your cloud bill and degrade your service.
But what if you want to block Perplexity bots but not OpenAI? Prince tells TechCrunch that Cloudflare customers are asking for tools that let them choose which AI models have access to their sites. Cloudflare’s new tools, launching today, will let customers block some AI bots while allowing others through.
Even large publishers that have licensing deals with OpenAI, such as TIME, Condé Nast, and The Atlantic, have relatively little insight into how much ChatGPT is scraping their sites, according to Prince. Many of them have to accept what OpenAI tells them, even though the answer determines whether publishers get a good licensing deal or not.
Cloudflare’s marketplace, which will launch next year, is designed to enable smaller publishers to strike deals with AI model providers.
“Let’s give you all the ability to do what only Reddit and Quora and the big publishers in the world have done before,” Prince said. “What if we let you set, in practice, a price for accessing and downloading content to these systems.”
While it’s a bold idea, Cloudflare isn’t sharing a fully fleshed-out picture of what its marketplace will look like. Prince says websites could charge AI model providers based on the rates at which they scrape individual websites, but it’s unclear how much they’d actually pay. He also says websites could charge a monetary fee for scraping, or simply ask AI labs to give them credit.
While AI companies may initially be reluctant to pay for content they currently get for free, the Cloudflare CEO says he believes it will ultimately be good for the AI ecosystem. Prince says the current landscape, where some AI companies never pay for content, is not sustainable.