Wednesday, December 25, 2024

OpenAI says it is taking a ‘cautious approach’ to releasing tools that can detect text written by ChatGPT


OpenAI has created a tool that could potentially catch students who cheat by asking ChatGPT to write their papers, but according to The Wall Street Journal, the company is considering whether to actually release it.

In a statement to TechCrunch, an OpenAI spokesperson confirmed that the company is investigating the text watermarking method described in the Journal article, but said it is taking a “cautious approach” due to the “complexity of the method and its likely impact on the broader ecosystem beyond OpenAI.”

“The text watermarking method we are developing is technically promising, but it does have significant risks that we are considering as we explore alternatives. These include vulnerability to bypass by malicious actors and the potential risk of disproportionately impacting groups such as non-English speakers,” the spokesperson said.

This would be a different approach than most previous attempts at AI-generated text detection, which were largely unsuccessful. Even OpenAI itself shut down its previous AI text detector last year due to its “low accuracy rate.”

With text watermarking, OpenAI would focus solely on detecting writing from ChatGPT, rather than from third-party models. It would do this by making tiny changes to the way ChatGPT selects words, essentially creating a concealed watermark in the writing that could later be detected by a separate tool.
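OpenAI has not published how its watermark works, so the following is only a toy illustration of the general idea, loosely modeled on the "green list" schemes discussed in public research. Everything in it is an assumption made for illustration: the vocabulary, the secret key, the `is_green`, `generate` and `z_score` helpers, and the random candidate sampling that stands in for a real language model. The generator quietly prefers words that a keyed hash marks as "green", and the detector counts how many words in a text are green.

```python
import hashlib
import math
import random

# Hypothetical toy vocabulary; a real model has its own token set.
VOCAB = [f"w{i}" for i in range(1000)]


def is_green(prev_word: str, word: str, key: str = "secret") -> bool:
    """A keyed hash marks roughly half of all words 'green' for each previous word."""
    digest = hashlib.sha256(f"{key}|{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0


def generate(n_words: int, watermark: bool, rng: random.Random, key: str = "secret") -> list:
    """Pick words one at a time; with watermark=True, prefer green candidates."""
    words = ["w0"]
    for _ in range(n_words):
        # Stand-in for the handful of words a model considers likely next.
        candidates = rng.sample(VOCAB, 8)
        if watermark:
            green = [c for c in candidates if is_green(words[-1], c, key)]
            if green:  # fall back to all candidates if none is green
                candidates = green
        words.append(rng.choice(candidates))
    return words


def z_score(words: list, key: str = "secret") -> float:
    """Standard deviations by which the green count exceeds the 50% expected by chance."""
    pairs = list(zip(words, words[1:]))
    hits = sum(is_green(a, b, key) for a, b in pairs)
    n = len(pairs)
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


if __name__ == "__main__":
    rng = random.Random(0)
    print("watermarked:", round(z_score(generate(200, True, rng)), 1))
    print("plain:      ", round(z_score(generate(200, False, rng)), 1))
```

In this sketch, watermarked text should score far above zero while ordinary text should land near zero, and only someone holding the key can run the check. Because each word's color depends on the word before it, rewriting the whole text with another tool would scramble the signal.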

After the Journal’s story was published, OpenAI also updated a May blog post about its research into detecting AI-generated content. The update says that text watermarking proved to be “very accurate and even effective against local manipulations such as paraphrasing,” but proved “less robust against global manipulations; such as using translation systems, rephrasing with a different generative model, or asking the model to insert a special character between each word and then remove that character.”

As a result, OpenAI writes that this method is “easily bypassed by bad actors.” OpenAI’s update also echoes the spokesperson’s point about non-English speakers, writing that text watermarking can “stigmatize the use of AI as a useful writing tool for non-native English speakers.”
