Transcript
Henry Blodget (0:00)
This message is brought to you by Apple Card. Each Apple product, like the iPhone, is thoughtfully designed by skilled designers. The Titanium Apple Card is no different. It's laser etched, it has no numbers, and it earns you daily cash on everything you buy, including 3% back on everything at Apple. Apply for Apple Card on your iPhone in minutes, subject to credit approval. Apple Card is issued by Goldman Sachs Bank USA, Salt Lake City Branch. Terms and more at applecard.com.
Thumbtack Advertiser (0:31)
Avoiding your unfinished home projects because you're not sure where to start? Thumbtack knows homes, so you don't have to. Don't know the difference between matte paint finish and satin, or what that clunking sound from your dryer is? With Thumbtack, you don't have to be a home pro, you just have to hire one. You can hire top-rated pros, see price estimates, and read reviews, all on the app. Download today.
Gordon Crovitz (0:59)
The upside here is that if the LLMs made this a high priority, it would do more than anything has ever done to reduce the spread of false claims.
Henry Blodget (1:12)
Everyone knows that AI chatbots sometimes hallucinate, or just make things up. What most people don't know is that they are often manipulated to spread misinformation. My guests today are Steven Brill and Gordon Crovitz, the co-CEOs of a company called NewsGuard, which monitors news organizations, and now LLMs, the AI chatbots, to see which are based in fact and which are based in propaganda and fiction. The results are very startling. So I wanted to talk to Gordon and Steve about these findings and, more importantly, what we can do about them. Gordon and Steve, welcome. So great to have you. You do a series of audits on the major LLMs, the AI chatbots that everybody increasingly uses to do searches for information and many other things. And you have published some startling claims about the misinformation that is often in them. Tell us about that.
Gordon Crovitz (2:14)
Thanks, Henry, thanks for having us. We just did a one-year assessment of the monthly audits that we've been doing, as you say, on the 10 largest LLMs. And what we do is we take controversial topics in the news, the kind of topics that somebody might have heard about through social media or some other place and asked themselves, "I wonder if it's really true that," for example, Volodymyr Zelensky's wife went on a million-dollar shopping spree at Cartier using Western aid. That's a popular Russian disinformation claim. We just issued a report that found that, when it came to those kinds of topics in the news, on average the 10 biggest large language models spread false information in response to prompts on those topics 35% of the time. More than a third of the time, which is terrible. Imagine going to the pharmacy and thinking you're buying 100 aspirin pills, and 35 of them are cyanide. So it's not a good result. And we understand why it's happening; we understand the nature of how the AI models were trained. But without some human intervention to mitigate the false claims and misinformation and the infecting of the LLMs that some malign actors are now doing, without that, you get this kind of result.
