After striking deals with Google and OpenAI, Reddit CEO Steve Huffman is calling on Microsoft and others to pay up if they want to continue collecting data from the service.
“Without these agreements, we have no say or knowledge in how our data is viewed and what it’s used for, which has put us in a position now of blocking people who have not been willing to come to terms with how we would like our data to be used or not used,” Huffman said in an interview this week. He singled out Microsoft, Anthropic and Perplexity specifically for refusing to negotiate, saying blocking those companies has been “really intrusive.”
Reddit has been ramping up its crackdown on crawlers in recent months. In early July, its robots.txt file was updated to block web crawlers it doesn’t have agreements with. That’s when people started noticing that Reddit’s results were only perceptible in Google results — where Reddit gets paid to display its data — and not on other search engines like Bing.
Huffman said Microsoft uses Reddit data to train its AI and summarize its content in Bing results “without telling us,” and that Reddit data has also been sold through Bing’s API to other search engines. In the interview, he referenced Microsoft AI CEO Mustafa Suleyman’s recent comment at a conference that public data on the internet is “freeware.”
“Microsoft, Anthropic and Perplexity have acted as if all content on the Internet is free to them,” Huffman said. “That is their true position.”
In response to the recent disappearance of Reddit search results from Bing, Microsoft’s head of search, Jordi Ribas, he said on X that “Reddit blocked Bing from indexing their site for search purposes, favoring another search engine and impacting competition from Bing and search engines based on Bing.” Microsoft spokeswoman Caitlin Roulston said separately Edge last week that “we comply with the guidelines of websites that do not want content on their sites used in our generative AI models.”
“The traditional exchange of value in search engines has changed”
Huffman pointed to OpenAI’s recent announcement of SearchGPT, which will be able to display Reddit results thanks to a deal the two companies struck earlier this year, as a model he wants to replicate. None of the content licensing deals Reddit has struck so far cover exclusive uses of its data, according to spokesman Tim Rathschmidt.
In calling for licensing agreements, Reddit joins the ranks of established media publishers (including The Verge is parent company, Vox Media) to get paid to feed their content into generative AI. “I think the traditional value exchange between search engines has changed,” Huffman said. “Search, summarization, and training are all merging, and the exchange of indexing value for return traffic is getting opaque.”
Spokespeople for Microsoft, Anthropic and Perplexity did not comment on this article by the time of publication.
