Hospitals use a transcription tool powered by a hallucination-prone OpenAI model


A few months ago, my doctor showed off an AI-powered transcription tool he was using to record and summarize patient encounters. In my case, the summary was fine, but researchers cited by ABC News have found that this isn’t always the case with OpenAI’s Whisper, which powers a tool used in many hospitals. Sometimes it just makes things up.

Whisper is used by a company called Nabla for a medical transcription tool that is estimated to have transcribed 7 million medical conversations, according to ABC News. According to the outlet, it is used by more than 30,000 doctors and 40 health systems. Nabla is reportedly aware that Whisper can hallucinate and is “dealing with the issue.”

A group of researchers from Cornell University, the University of Washington, and other institutions found in a study that Whisper hallucinated in approximately 1 percent of transcriptions, producing entire sentences, sometimes containing violent sentiments or nonsensical phrases, during silences in the recordings. The researchers, who took audio samples from TalkBank’s AphasiaBank as part of the study, noted that silence is especially common when a person with a language disorder called aphasia is speaking.

One of the researchers, Allison Koenecke of Cornell University, posted examples in a thread about the study.

The researchers found that the hallucinations also included made-up medical conditions or phrases you might expect from a YouTube video, such as “Thank you for watching!” (OpenAI reportedly used Whisper to transcribe over a million hours of YouTube videos to train GPT-4.)

The study was presented in June at the Association for Computing Machinery’s FAccT conference in Brazil. It is unclear whether it has been peer-reviewed.

OpenAI spokeswoman Taya Christianson emailed a statement to The Verge:

We take this issue seriously and are continually working to improve, including reducing hallucinations. For Whisper use on our API platform, our terms of use prohibit use in certain high-stakes decision-making contexts, and our model card for open-source use includes recommendations for use in high-risk domains. We thank the researchers for sharing their findings.
