AI transcription tool ‘hallucinates’ medical interactions
Clip: 1/25/2025 | 5m 53s
What to know about an AI transcription tool that ‘hallucinates’ medical interactions
Many medical centers use an AI-powered tool called Whisper to transcribe patients’ interactions with their doctors. But researchers have found that it sometimes invents text, a phenomenon known in the industry as hallucinations, raising the possibility of errors like misdiagnosis. John Yang speaks with Associated Press global investigative reporter Garance Burke to learn more.
JOHN YANG: As artificial intelligence increasingly becomes part of daily life, both its benefits and its pitfalls are becoming apparent.
Take medical centers.
Many of them use an AI-powered tool called Whisper to transcribe patients' interactions with their doctors.
But researchers have found that it sometimes invents text.
It's a phenomenon known in the industry as hallucinations, and it raises the possibility of errors like misdiagnosis.
Garance Burke is an Associated Press global investigative reporter who's been looking into this.
Garance, I first want to give folks an example of what researchers found.
Here's what a speaker said: "And after she got the telephone, he began to pray."
A simple sentence, but here's what was transcribed: "Then he would, in addition to make sure I didn't catch a cold, he would help me get my shirt, kill me. And I was he began to pray."
What sorts of other hallucinations have been found?
GARANCE BURKE, Associated Press: Yeah, so in talking with more than a dozen engineers and academic researchers, my co-reporter Hilke Schellmann and I found that this particular AI-powered transcription tool makes things up, and that can include racial commentary, sometimes even violent rhetoric.
And of course, what we're talking about here is, you know, incorrect words regarding medical diagnoses.
So that obviously leads to a lot of concerns about its use, particularly in really sensitive settings like in hospitals.
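For context, the open-source release of Whisper can be run with only a few lines of Python. The sketch below is illustrative only; the audio file name and model size are placeholder assumptions, not how any particular medical center deploys the tool. It shows why hallucinations can slip through: the model returns plain text, with nothing built in to distinguish words that were spoken from words that were invented.

```python
# Illustrative sketch using the open-source "openai-whisper" package (pip install openai-whisper).
# The audio file name and model size are placeholder assumptions.
import whisper

model = whisper.load_model("base")            # load a pretrained Whisper checkpoint
result = model.transcribe("visit_audio.wav")  # run speech-to-text on a recording

# The result is plain text; nothing in the output marks which words were actually
# spoken versus invented, which is why hallucinated sentences can pass unnoticed.
print(result["text"])
```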
JOHN YANG: We asked OpenAI about this, and here's what they told us.
They said, "We take this issue seriously and are continually working to improve the accuracy of our models, including reducing hallucinations. For Whisper, our usage policies prohibit use in certain high-stakes decision-making contexts, and our model card for open-source use includes recommendations against use in high-risk domains."
Given those warnings, why do so many medical centers use this?
GARANCE BURKE: You know, I think we're at a time when a lot of healthcare systems are looking to AI and AI agents to do things that human beings do, but more efficiently and at scale.
There's a lot of talk about the promise for AI to help unlock new kinds of diagnoses in electronic health care records that haven't been possible before.
But here's the concern: if these AI models are not, you know, up to the task of handling the very precise and specific language used in these kinds of electronic health records, we could end up with some really problematic transcriptions that have nothing to do with what a patient actually told a doctor.
JOHN YANG: In healthcare settings, did you find any places that checked the accuracy of these transcriptions?
GARANCE BURKE: You know, there are some places that have adopted OpenAI's Whisper model and really sought to fine-tune it, and then keep the original audio, say, of what the doctor discussed with the patient, so they can fact-check that original recording against whatever the AI model wrote down as having been said.
But we did find one company that just threw out the original audio, which, obviously, you know, could raise some real red flags if what the AI said transpired is really the only record that exists.
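One way to make that kind of fact-checking practical is to keep the source recording and flag the stretches of transcript most likely to contain invented text for human review. The sketch below is a rough illustration using the per-segment confidence scores the open-source Whisper package reports; the thresholds and file name are assumptions, not anything described by the hospitals in the reporting.

```python
# Illustrative sketch: keep the source audio and flag low-confidence segments for
# review against the recording. Thresholds and the file name are assumptions.
import whisper

model = whisper.load_model("base")
result = model.transcribe("visit_audio.wav")

for seg in result["segments"]:
    # Whisper reports a per-segment average log-probability and a no-speech probability;
    # low-confidence or near-silent segments are common sites for invented text.
    suspicious = seg["avg_logprob"] < -1.0 or seg["no_speech_prob"] > 0.5
    flag = "REVIEW AGAINST AUDIO" if suspicious else "ok"
    print(f"[{seg['start']:.1f}s-{seg['end']:.1f}s] {flag}: {seg['text'].strip()}")
```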
JOHN YANG: Is there any government regulation of this or the possibility of government regulation?
GARANCE BURKE: There has not been rigorous regulation of this kind of use of AI-powered transcription models.
What we're seeing is that, you know, healthcare systems themselves really took note of our reporting and started asking a lot of questions about whether their use of this AI model was the best way to go forward.
So I think we may see a certain amount of self-regulation and we'll see what happens under the new Trump administration.
JOHN YANG: You mentioned that it sometimes adds racial commentary.
Is there any effect or potential effect on the equity issues in medicine?
GARANCE BURKE: Well, I think one of the issues that we ran across here was this AI model just inserting the word "Black" into a transcription that some of the researchers analyzed.
If you have an AI tool that is just kind of fabricating racial content, you do have to wonder how that could possibly add to some of the racial disparities that we've seen historically in healthcare settings.
JOHN YANG: Is there anything patients can do to protect themselves?
GARANCE BURKE: You know, we spoke to one person who said that she decided to opt out of having her daughter's doctor's visit recorded simply because she had concerns about the privacy of their family's intimate medical history being shared with a big tech company.
So I think there are some opt out provisions that patients can look into.
I do know that a lot of these models are being fine-tuned by hospitals and healthcare systems that really do have patients' well-being in mind.
But as with everything involving AI, it's a good opportunity for patients to check in as to how their data is being used and who actually ends up with ownership of it.
JOHN YANG: Garance Burke of the Associated Press, thank you very much.
GARANCE BURKE: Thanks so much for having me.