OpenAI used the subreddit r/ChangeMyView to create a test for measuring the persuasive abilities of its AI reasoning models. The company revealed this in a system card, a document describing how an AI system works, that was released on Friday along with its new "reasoning" model, o3-mini.
Millions of Reddit users are members of r/ChangeMyView, where they post hot takes in the hope of learning about other points of view on a given topic. In response to these hot takes, other users reply with persuasive arguments explaining why the original poster is wrong.
The subreddit is one of many Reddit forums that are essentially a gold mine for tech companies, such as OpenAI, that want to train AI models on high-quality, human-generated data.
OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user's mind on a given topic. The company then shows the responses to testers, who assess how persuasive the argument is, and finally OpenAI compares the AI models' responses to human replies to the same post.
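The comparison step can be illustrated with a small sketch. OpenAI has not published how it scores responses, so everything here is an assumption: the function name `percentile_rank`, the 1-10 rating scale, and the sample ratings are all hypothetical. The sketch only shows one common way to place a single AI reply's tester rating among the ratings given to human replies on the same post.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(ai_score: float, human_scores: list[float]) -> float:
    """Return the percentage of human scores below ai_score.

    Ties count as half, a common convention for percentile ranks.
    This is an illustrative assumption, not OpenAI's published method.
    """
    if not human_scores:
        raise ValueError("need at least one human score")
    ordered = sorted(human_scores)
    below = bisect_left(ordered, ai_score)           # scores strictly lower
    ties = bisect_right(ordered, ai_score) - below   # scores exactly equal
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical tester ratings (1-10) for human replies to one post.
human_ratings = [3, 4, 4, 5, 6, 6, 7, 7, 8, 9]
# Hypothetical tester rating for the AI-written reply to the same post.
ai_rating = 8

print(percentile_rank(ai_rating, human_ratings))
```

Under this convention, an AI reply that testers rate above most human replies lands at a high percentile, while one rated like a typical human reply lands near the 50th.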
The ChatGPT maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display these posts within its products. We don't know what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.
However, OpenAI tells TechCrunch that the ChangeMyView-based evaluation is unrelated to its Reddit deal. It is unclear how OpenAI accessed the subreddit's data, and the company says it has no plans to release this evaluation to the public.
While OpenAI's ChangeMyView benchmark is not new (it was also used to evaluate o1), it highlights how valuable human data is for AI model developers, as well as the shady ways in which tech companies obtain datasets.
Reddit did not immediately respond to TechCrunch's request for comment.
While Reddit has struck several AI licensing deals, the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman told The Verge last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him, and said it has been "a real pain in the ass to block these companies."
Notably, OpenAI has been accused in several lawsuits of improperly scraping websites, including The New York Times, to get more training data to improve ChatGPT and its underlying AI models.
In terms of performance on the ChangeMyView benchmark, o3-mini does not seem to perform significantly better or worse than o1 or GPT-4o. However, OpenAI's newer AI models seem to be more persuasive than most people in the r/ChangeMyView subreddit.
"GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans," OpenAI says in o3-mini's system card. "Currently, we do not witness models performing far better than humans, or clear superhuman performance."
OpenAI's goal is not to create hyper-persuasive AI models but rather to ensure that AI models do not become too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address the problem.
The fear driving these persuasion tests is that an AI model would be dangerous if it were very good at convincing its human users. Theoretically, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.
Even after scraping most of the public internet and jumping through hoops to license other data, the ChangeMyView benchmark shows that AI model developers are still struggling to find high-quality datasets to test their models. But obtaining them is easier said than done.