Alok Jha
Hello. This is a free episode of Babbage. To listen to the show every week, you'll need to be an Economist subscriber. For more information, search online for Economist Podcasts plus or click the link in the show notes.
Alok Jha
The Economist.
Gerald Lipp
The process of breast screening is where two human readers look at the mammograms and then they make a decision on that. And if they disagree, that's the third human reader.
Alok Jha
Gerald Lipp is a radiologist in charge of screening for breast cancer in the northeast of Scotland. Gerald and his team recently wanted to find out how an artificial intelligence tool could help their service.
Gerald Lipp
We had 10,000 women go through the system, and where two humans said there was nothing seen on the mammograms, this is when the AI kicked in. And the AI then would give us a reading and say, there's something on this mammogram. And they'll have a circle on the mammogram and say, would you like to look at this area a bit more closely? One of the very first few patients we did, there was a lady named Barbara, and Barbara was a lady who was reaching her 70s. So this would have been her last time coming to the screening program. We wouldn't have seen her again. And the AI did highlight a very subtle area in the right breast, which two human readers had already said was okay, and they weren't worried about it. The expert readers, ourselves, with the help of the AI, augmented by it, looked at this area again and said, let's give her a chance. Let's call her back and take a look at this. I think I was the one, actually, who did her ultrasound and her biopsy, and found a tiny little thing, probably about 5 millimeters in size, in the breast. And when the results came back, lo and behold, Barbara was one of 11 women that we managed to pick up using the AI as an extra reader or a fail-safe where the humans had said this was normal. And in her case it was a significant cancer, because otherwise it would just have been growing in the background without her knowing.
Alok Jha
You might have heard stories like Barbara's before. In the past five years or so, computer vision tools have been shown to spot everything from fractures on X rays to the early signs of disease from retinal scans. They've even been used to help with surgeries. That list of uses of AI is set to explode in the coming years.
Gerald Lipp
I know I'm a very good radiologist already because the breast screening program shows that we are good, but the AI gives us that little edge in screening. We look at a lot of normal mammograms. You're looking at six to eight in a thousand mammograms to find cancer. That's 900 mammograms that are normal. So if I can automate and take away some of that 900 mammograms in 1000, that actually gives me more time to be patient-facing.
Alok Jha
New, bigger AI models promise to make access to healthcare easier, help doctors spot disease faster, and transform how medical research is done. But promise is one thing. Practice is another. How soon will these technologies really benefit patients? This podcast is the first of two episodes on how the world of healthcare is grappling with the potential of AI. First, we'll examine where and when patients could benefit from AI models in healthcare. And next week we'll gaze into the future. How will medical researchers and doctors use the latest AI models to understand and treat disease in completely new ways? I'm Alok Jha and this is Babbage from The Economist. Today, part one: how AI can improve the experiences of patients. To talk about the emerging role of artificial intelligence in healthcare is The Economist's health editor, Natasha Loder. She recently put together a special report on the topic. Thanks for joining me, Natasha.
Natasha Loder
Thank you for having me.
Alok Jha
Now, Natasha, artificial intelligence and healthcare is a topic that, you know, we've covered quite a lot in the past. There have been examples of useful places where it's been applied and people have talked about its potential for a while. Just right now, what is the specific focus that makes it interesting for you?
Natasha Loder
Well, I suppose it's generative AI in the last year that has been particularly creating quite a buzz around what could be possible in medicine using both the AI tools of the past and the ones that we're going to invent. And, you know, the possibilities lie beyond not just diagnostics, but also tools for giving personalized health advice. The ability to triage lots of patients very quickly, to write patient notes and summarize meetings with patients, and even to discover new insights from medical data and do drug discovery. So there's a sort of really wide range of possible uses of AI.
Alok Jha
Okay, so just tell me then, in your opinion, if you were looking at this in the broadest brushstrokes, what's the biggest problem that you see healthcare workers being able to tackle with AI sort of in the future?
Natasha Loder
I mean, fundamentally, what AI is able to do is replicate quite a sophisticated form of human knowledge. And so if you think about what are the problems writ large in healthcare, it is that there just aren't enough doctors or nurses, psychiatrists, you name a medical profession. There aren't enough of them to service the needs of the seven or so billion people on the planet.
Alok Jha
So the access problem, isn't it? That seems to be a really interesting and intriguing way that this might go. If you've got generative AIs that can replace some of the functions of clinical staff, essentially you just have more people able to access those services. That's what you're saying.
Natasha Loder
Yeah. And we're in this really difficult situation at the moment where globally our population is getting older, healthcare costs are rising, there's a real burden of chronic diseases, and there's this sort of massive shortage of healthcare workers. And so even in the short to medium term, if we can allow these technologies to be harnessed, there's actually a lot of potential for improving productivity and meeting human need, even if we don't, in the big picture of things, solve healthcare for, you know, the next 10 to 20 years. So that's the question: how much can we wring out of this technology, and how quickly?
Alok Jha
Okay, well, that's what we'll explore next in the show. Natasha, we'll come back to you a bit later on. One of the tools that's using computer vision to help diagnose patients is called MIA. It's an AI model built by the company Kheiron Medical. And it's a system that Gerald Lipp, who we heard from at the start of the show, has been using to detect cancers from mammograms in Scotland.
Peter Kecskemethy
Many people don't necessarily know that usually two radiologists are required to look at every single mammogram in breast cancer screening.
Alok Jha
Peter Kecskemethy is the chief executive of Kheiron.
Peter Kecskemethy
You can imagine it's a large volume, and a large volume has to be looked at by two doctors. Whenever they disagree, a third doctor is brought in. But even with two doctors, it's estimated and expected that more than one in five cancers will be missed. So obviously that means a lot of improvement can be made. So what can the AI do? Well, it can increase the cancer detection. We have results that show we can increase cancer detection by up to 13%. With that, we can expect that we can actually halve the number of cancers missed in breast screening.
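As a rough check on how those two figures fit together, here is a minimal sketch. It assumes, which the episode does not state, that the miss rate is exactly one in five and that "13%" means a relative increase in the number of cancers detected.

```python
# Assumptions (not stated in the episode): the baseline miss rate is exactly
# one in five, and "13%" is a relative increase in cancers detected.
baseline_miss_rate = 1 / 5
relative_detection_gain = 0.13


def miss_rate_after_gain(miss_rate, relative_gain):
    """Return the share of cancers still missed once detection rises."""
    detected = 1 - miss_rate
    improved = min(1.0, detected * (1 + relative_gain))
    return 1 - improved


new_miss_rate = miss_rate_after_gain(baseline_miss_rate, relative_detection_gain)
print(f"missed before: {baseline_miss_rate:.1%}")
print(f"missed after:  {new_miss_rate:.1%}")
print(f"after / before: {new_miss_rate / baseline_miss_rate:.2f}")
```

Under those assumptions the share of missed cancers falls from 20% to about 9.6%, which is roughly half, so the two quoted figures are consistent with each other.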
Alok Jha
Okay, well, talk to me about the technology then. But what kind of AI model is behind the tool that you're describing? How does it work?
Peter Kecskemethy
Yeah, so MIA is based on a large number of varied AI components. This means convolutional neural networks, primarily.
Alok Jha
Okay, so the input is the scan or mammogram, and the data is passed through layers of the neural network. Each one looks for specific things in the image that might make up parts of a lesion or some other problem. And once that data has passed through all the layers of the neural network, the algorithm can determine whether or not that patient could have cancer. And MIA passes images through several different networks like this before it comes to a decision.
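To make "each layer looks for specific things in the image" concrete, here is a minimal sketch of the sliding-filter operation at the heart of a convolutional layer. The tiny image and the hand-written filter are illustrative assumptions. A real network learns its filters from data and stacks many such layers.

```python
def convolve2d(image, kernel):
    """Slide a small filter over the image and record its response at each
    position (the cross-correlation form used in convolutional networks)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 3x4 grayscale patch: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-written filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

response = convolve2d(image, vertical_edge)
print(response)  # [[0, 18, 0], [0, 18, 0]]
```

The filter responds only where the dark region meets the bright one. Later layers in a network combine many such responses into progressively larger patterns.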
Peter Kecskemethy
So think about a large group of experts, like, I don't know, hundreds of radiologists sitting in a room looking at every single mammogram, and then they're making a decision collectively in order to have the best possible results. And if you think about it, the AI can find information on its own in a data set, for instance, among the breast cancer screening images that humans may not be able to process. So maybe it's different motifs, maybe it's finer tones of grayscale, et cetera. And from a practical point of view, this means the AI can have very different capability from the doctors. And that means we need to use the AI and optimize it to be the best complementary capability to the doctors, rather than trying to compete with the doctors.
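The "room full of experts" idea is commonly called an ensemble. The sketch below shows one simple way to combine several models, by averaging their scores. The three scoring functions and the threshold are hypothetical stand-ins, not the components of the system discussed here.

```python
from statistics import mean


# Hypothetical stand-ins for trained networks. Each maps an image (a grid of
# 0-255 grayscale values) to a suspicion score between 0 and 1.
def brightest_pixel_model(image):
    return max(max(row) for row in image) / 255


def contrast_model(image):
    pixels = [p for row in image for p in row]
    return (max(pixels) - min(pixels)) / 255


def average_intensity_model(image):
    pixels = [p for row in image for p in row]
    return mean(pixels) / 255


MODELS = [brightest_pixel_model, contrast_model, average_intensity_model]
FLAG_THRESHOLD = 0.5  # assumed cut-off, chosen only for this example


def ensemble_score(image, models=MODELS):
    """Average the individual scores, like experts pooling their opinions."""
    return mean(model(image) for model in models)


patch = [[10, 200], [30, 40]]
score = ensemble_score(patch)
flagged = score >= FLAG_THRESHOLD
print(f"ensemble score: {score:.2f}, flagged: {flagged}")
```

Averaging is only one way to pool opinions. Weighted votes or a further model trained on the individual scores are common alternatives.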
Alok Jha
Tell me what the model is trained on. How do you make the neural network able to recognize what a breast cancer looks like on a mammogram?
Peter Kecskemethy
So we used the longitudinal data on the whole patient. So for negatives, we wanted to make sure that for multiple years, there were no positive signs for the patients. And for positives, we wanted to make sure we account for all the different aspects of a positive cancer. And it is important that we are training the algorithm on as much of the truth as we have access to, like, biopsies and other follow ups rather than what the doctors say, because otherwise we would be just approaching what a doctor can do and wouldn't be able to develop complementary capability.
Alok Jha
So we're talking about images from mammograms, but also data from biopsies that are linked to those images, perhaps. And all of this is labeled so that the model knows which ones contain potential tumors and which ones don't.
Peter Kecskemethy
Yes, and the labeling process is one of the key elements of machine learning engineering, really understanding the domain, what constitutes a positive, what's a negative, and to understand when something is certain or not.
Alok Jha
And typically how much data is going into it. I mean, how many scans, how many biopsies, what sort of order of magnitude are we talking about?
Peter Kecskemethy
The order of magnitude is a couple of million. It's important for AI that it's not necessarily the number of cases that you need to look at, but you need to optimize for the maximum amount of information that it is trained on. So for instance, if I have an algorithm that is trained on one particular country in Europe, and if you get more of those cases, that's much less valuable than if you get different ethnicities from across the globe. From Asia, from South America, Africa.
Alok Jha
I see. So you want as diverse a set as possible around the world.
Peter Kecskemethy
Diverse in a specific way that it gives you the highest information so that you can reduce confounders.
Alok Jha
Okay, so where is MIA currently in clinical use then?
Peter Kecskemethy
It is in clinical use in Hungary and in the UK.
Alok Jha
And just talk me through how a radiologist would be using it. So, you know, a patient comes in, they have their scan, there's a mammogram available. Does that go into the model sort of in parallel with the radiologists looking at the same thing?
Peter Kecskemethy
Yeah. So once the patient is scanned, the AI will make a determination in a couple of seconds, so even before the patient leaves the room. But those results may be held back in order to support an unbiased process. It's a very good idea for the doctors to look at images without the AI's input first.
Alok Jha
So they don't get biased by it.
Peter Kecskemethy
Exactly. So that they don't get biased so that the maximum human capability is used. And then the AI then determines whether an adjustment is to be made, for instance, or not. And then it could flag up to the lead radiologist, hey, maybe cancer is missed, or your radiologists are potentially over diagnosing and things like that. It's always a human radiologist who makes the final decision. They have the authority. The AI is a tool to flag when the AI is certain. And so the doctors have a high chance of changing the patient outcome for the better.
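The reading workflow described in this episode (two human readers, a third on disagreement, and the AI's result held back until the humans have read) can be sketched as a small decision function. The field names, the outcome labels, and the confidence threshold are all assumptions made for illustration. In every branch a human still makes the final call.

```python
from dataclasses import dataclass

AI_CONFIDENCE_THRESHOLD = 0.9  # hypothetical value, chosen for the example


@dataclass
class Case:
    reader_one_recall: bool  # first human reader wants the patient recalled
    reader_two_recall: bool  # second human reader wants the patient recalled
    ai_score: float          # AI suspicion score, computed but held back


def next_step(case: Case) -> str:
    """Return what happens next to a screening case."""
    if case.reader_one_recall != case.reader_two_recall:
        # The humans disagree, so a third human reader arbitrates.
        return "third reader"
    if case.reader_one_recall and case.reader_two_recall:
        return "recall"
    # Both humans read the mammogram as normal. Only now is the AI result
    # consulted, and only a confident flag triggers a second human look.
    if case.ai_score >= AI_CONFIDENCE_THRESHOLD:
        return "human second look"
    return "routine"


print(next_step(Case(False, False, 0.95)))  # human second look
print(next_step(Case(True, False, 0.20)))   # third reader
print(next_step(Case(False, False, 0.30)))  # routine
```

Holding the AI score back until both readers have finished is what keeps the human reads unbiased. The score is present on the case from the start but is not consulted until the final branch.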
Alok Jha
Peter. Well, that's been fascinating. Thank you very much for your time.
Peter Kecskemethy
Thank you so much.
Alok Jha
There's huge potential for computer vision systems in radiology to detect a specific disease from a medical image. But AI's ability to make life easier for patients goes far beyond that, as Jason Hoskin, one of our producers, has been finding out.
Jason Hoskin
As you walk down the streets of Boston, it's hard not to be reminded of the city's role as a hub of innovation. Central to scores of bright ideas in biotechnology are two world leading institutions, the Massachusetts Institute of Technology and Harvard University.
Pranav Rajpurkar
My interest is in how we build AI doctors.
Jason Hoskin
That's Pranav Rajpurkar. He's an assistant professor at Harvard's medical school.
Pranav Rajpurkar
Five years back, if you are someone working in this space, you would have typically a modality specialty and a disease specialty. So, for example, I was interested back in the day on working on chest X rays, and particularly on chest X rays to identify signs for pneumonia. But the way the field has evolved over time, what's been increasingly clear is that you can start to think about what it takes to not solve one problem, but to solve the full range of problems that a radiologist works on. So now my group works on thinking about interpreting chest X rays and or CT scans and or MRI scans. And we don't think of it from a disease by disease perspective, but really think about it from what we call a generalist medical AI perspective, where you're trying to build these AI models that can do a lot more than just one disease detection.
Jason Hoskin
And so why has the field changed? What's the kind of technology that's enabled that?
Pranav Rajpurkar
Well, there have been three key techniques that have really created advancements. One of them is self-supervised learning. The idea of self-supervised learning is to be able to learn without requiring lots of annotations. And we've developed some innovations on that front along with several other groups in the world.
Jason Hoskin
So by annotations you mean that images are labeled in the training set. So in the breast cancer example, experts would go through and tell the model what a positive case for cancer looks like and what a negative case looks like, and it learns from that. But self supervised learning is making that process less onerous. So that's the first thing that's changed. What's the next thing then?
Pranav Rajpurkar
The second set of techniques that's been really taking off is multimodal learning. And in multimodal learning, we're interested in not just learning about a single modality from a single modality. For example, I'm not just interested in learning from chest X rays. I might be interested in learning from chest X rays and the associated clinical reports in order to understand, for example, what chest X ray matches which clinical report. Or I might have some associated clinical information, like what the patient history is. And how we feed in both the image information and that clinical context information has been something we've just gotten much better at modeling over the last few years.
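One common way to ask "which X-ray matches which report" is to map both images and text into the same vector space and compare directions. The sketch below uses hand-written three-number vectors as hypothetical embeddings. In a real system both encoders are learned, and the vectors have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors by angle: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for the example.
image_embeddings = {
    "xray_a": [0.9, 0.1, 0.0],
    "xray_b": [0.1, 0.8, 0.3],
}
report_embeddings = {
    "report_clear": [1.0, 0.0, 0.1],
    "report_opacity": [0.0, 0.9, 0.4],
}


def best_report(image_id):
    """Pick the report whose embedding points most nearly the same way."""
    image_vec = image_embeddings[image_id]
    return max(
        report_embeddings,
        key=lambda report_id: cosine_similarity(
            image_vec, report_embeddings[report_id]
        ),
    )


for image_id in image_embeddings:
    print(image_id, "->", best_report(image_id))
```

Training pushes matching image and report pairs toward the same direction and mismatched pairs apart, so the pairing itself acts as the supervision signal.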
Jason Hoskin
Okay, and what's the final thing?
Pranav Rajpurkar
The third bucket is generative AI to generate language to describe what we see in the image. And this is very close to how radiologists do it. They will write a block of text that goes through comprehensively every organ, every process, and commenting on it. And this is a technology that's really been unlocked in the last couple of years and allowed us to make these much more interesting models that can start to look like what we expect from radiologists.
Jason Hoskin
And your system incorporates all three of those things?
Pranav Rajpurkar
That's correct.
Jason Hoskin
So how well does it work?
Pranav Rajpurkar
That's a great question. So, for instance, we published a study on an algorithm called CheXzero, where we were able to achieve the same performance as radiologists on detecting diseases from chest X rays, but this time using almost no annotations that were explicitly provided by radiologists. One year later, we're at an even better milestone, where recently our group has shown that we can directly output these clinical reports. And when we give those reports to radiologists, and some of the time they come from the model, and some of the time they come from another radiologist, we ask, how often do you prefer the report that was generated by the model? And we're finding out it's about comparable at this point.
Jason Hoskin
So I think some of the stuff that you've been saying so far, you're trying to build AI doctors, the algorithms are getting as good as radiologists, this will cause a bit of fear in people, right? Are the AIs replacing the doctors?
Pranav Rajpurkar
Well, I think there is a conventional approach to thinking about this, which is that these are not going to replace radiologists. And I think that's, for the major part, just wrong. Right now, you just have a big supply shortage. So in India, the volume has gotten so high, chest X rays are not even reported by radiologists. That means you have someone who is frankly not comfortable reading chest X rays a lot of the time, and they're being pressured to be able to make these decisions, which have big consequences downstream for patients. So now think about bringing AI into the picture. If you had autonomous radiologists and you deployed those to be able to provide either preliminary reads or service in areas where there weren't experts to be able to do the interpretation, just imagine how much positive benefit that has to society. We're just not in a situation where there's an alternative. Training of radiologists globally is at a standstill. What you have an exponential curve in is the rise in the imaging demand of an aging population. This is a 5% compound annual growth rate. And so you think about this problem just getting worse and worse and worse. And the solution that we have in the horizon is going to be AI models that can do these reads.
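To put the quoted 5% compound annual growth rate in perspective, here is a minimal sketch of what it implies if the rate simply holds constant. That constancy is an assumption for the arithmetic, not a forecast.

```python
import math

annual_growth = 0.05  # the 5% compound annual growth rate quoted above


def demand_multiple(years, rate=annual_growth):
    """How many times larger demand is after `years` of compound growth."""
    return (1 + rate) ** years


# Years needed for demand to double at a constant compound rate.
doubling_time = math.log(2) / math.log(1 + annual_growth)

print(f"after 10 years: {demand_multiple(10):.2f}x")
print(f"doubling time:  {doubling_time:.1f} years")
```

At a steady 5% a year, imaging demand would be about 1.63 times larger after a decade and would double in roughly 14 years, which is the gap a flat supply of radiologists would have to absorb.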
Jason Hoskin
I can see that in areas where it's difficult to access care, an autonomous radiologist would be really useful. But people would definitely be freaked out by the idea of that, particularly when you're thinking about things like trust.
Pranav Rajpurkar
You know, I often ask this in seminars I give. I ask people, how many of you trust your doctor? And it's surprising that not a lot of hands actually go up. And it's interesting to ask, why do we trust doctors? Well, one is they have a certification that says, I'm qualified to be able to be a radiologist, and they have to renew it every few years. And so that is a system we have set up in a community that understands how we think about the licensing of human doctors. So similarly, I think we need principles for licensing these AI medical doctors.
Jason Hoskin
Okay, now, all of this futuristic technology that you've talked about relies on generative AI, and I think most people's interactions with generative AI will have been playing around with the likes of ChatGPT. You know, maybe people have used them to check out symptoms that they might have, but ChatGPT obviously isn't licensed as a medical device. Is that sort of general technology useful, though?
Pranav Rajpurkar
So I was recently trying to find out what I can do for some back pain that I had. And when I looked at Google versus what I was able to understand and collect from ChatGPT, there's a vast difference in terms of the depth of the information that it can provide you. Now, of course, that comes with the question, well, how will I validate what it's telling me is true? Well, I can always look up on independent sources what those independent sources have to say about what ChatGPT thinks about my medical concern. But this has actually not been the exception as much as I thought it would have been. Whether we like it or not, we have been, at least on the patient side, using these systems. The caveat here is also that these big companies typically don't want to be making medical recommendations, and so they tune their model in order to be able to refrain from making medical recommendations. And that's probably a good thing from a regulatory standpoint. Then they don't have to worry about being classified as a medical device. But I think it does come with a trade-off, where an early version of ChatGPT was actually evaluated in the New England Journal of Medicine for its ability to answer medical questions and turns out it actually did pretty well. That sort of gives you an insight: these systems are powerful and, frankly, there are a lot of places where having such tools is going to make an incredible difference to the quality of healthcare that can be received.
Jason Hoskin
That's fascinating. Pranav, thank you so much for talking to me.
Pranav Rajpurkar
Thanks for having me.
Alok Jha
You can see why large language models such as the one that powers the likes of ChatGPT could help to understand an illness better. After all, they've ingested pretty much all of the data available on the Internet and try to make sense of it. That includes factual websites provided by hospitals or healthcare systems, articles in medical journals, and even first hand accounts posted online by doctors or patients. That ginormous training set might also include inaccuracies or misinformation, though. What's more, the way that generative AI produces its results makes it hard to fully trust. But can those challenges be overcome? Could the conversational powers of ChatGPT be harnessed to create reliable medical bots? That's coming up.
Alok Jha
Today on Babbage, we're asking how medical AI tools can improve the experiences of patients. Natasha Loder, The Economist's health editor, who we heard from earlier, has been looking into the use of chatbots in healthcare. Natasha, we just heard that ChatGPT can in fact be a useful tool for medical advice. You're sitting here in the studio with me, so let's test ChatGPT out for some medical problems that, you know, you might be having. I have to say I'm skeptical, but you've got ChatGPT on your phone there and the latest update means that it speaks now. So why don't you give it some symptoms and see if it can work out what it thinks is wrong with you.
Natasha Loder
Okay. All right.
ChatGPT
Hi, I'm having issues right now. Could not understand what you said.
Natasha Loder
Hi, can you help me with something?
ChatGPT
Of course. What do you need help with?
Natasha Loder
Well, I've got a medical problem that I'm worried about. I've been quite short of breath today and I have a sharp pain in my chest just slightly to the left.
ChatGPT
It sounds like you're experiencing symptoms that could be serious. It's really important to seek immediate medical attention for symptoms like shortness of breath and sharp chest pain. Can you call a healthcare provider or go to an emergency room?
Natasha Loder
Yeah, I could do that.
ChatGPT
That sounds like a good plan. It's important to get checked out by a professional as soon as possible with symptoms like those.
Alok Jha
Ask ChatGPT what, yeah, he or it actually thinks is wrong with you.
Natasha Loder
What do you actually think is wrong with me though? I kind of need to know.
ChatGPT
I'm not a doctor so I can't diagnose what's wrong. But symptoms like shortness of breath and sharp chest pain could be caused by a variety of things from something less serious like muscle strain, to more urgent issues like a heart problem or a lung issue.
Natasha Loder
What kind of heart problem could it be?
ChatGPT
Chest pain and shortness of breath can be associated with several heart related issues. Here are a few possibilities. 1. Heart attack. This occurs when blood flow to a part of the heart is blocked for a long enough time that part of the heart dies. 2. Angina. This is a symptom of coronary artery disease and occurs when the heart attacks jaw or back. 3. Pericarditis. This is an inflammation of the lining around the heart determines the exact cause and appropriate treatment.
Alok Jha
Now that's interesting because there are so many guardrails out there with that version of ChatGPT that you're using that clearly it doesn't want to tell you exactly what it's doing. And it's saying it's not a doctor and it's saying you should get medical advice, which is probably sensible. Right? That's exactly the right thing to do. But it took several minutes there before it suggested that you might be having a heart attack, which, I mean, it's not funny. You'd be lying on the floor at that point, incapable of moving. How useful is that really in medical advice?
Natasha Loder
Well, it's clearly not intended to be useful.
Alok Jha
Not that version, no.
Natasha Loder
But I mean, we're a long way technologically from being in a situation where we could recommend that anyone having these symptoms could just go and safely have a conversation with a chatbot and that would be the best course of action. I mean, it is good advice as far as I can tell.
Alok Jha
Correct.
Natasha Loder
I'm not a doctor, but if you're going to dispense diagnoses, there are rules for how you do this and there are tests that you have to do and you know, there are regulations you have to follow. And that's not what this particular bit of technology has been through.
Alok Jha
I think that it's unfair to just expect the publicly available ChatGPT model to be giving us medical advice unless I think it's not terrible.
Natasha Loder
I mean, I mean, I think, look, I think the reason this is a good test is that we know that people have been googling Dr. Google, as they call it, exactly, for years. And we now know that people are not only chatgpting, if that's the way of putting it, their symptoms, we know that doctors are doing it. And we also know that people are solving medical mysteries using these tools.
Alok Jha
But despite the promise, then can you explain to me why a generalized model of the type that powers ChatGPT isn't ever going to be a medical device?
Natasha Loder
The missing bit of information we don't have is we don't know the number of people who are using these tools and getting the wrong answer or getting false reassurance that they're fine when they're not. And that's the kind of problem that we haven't even begun to solve with these tools. And that is why we can't rely on them. Also, because they're generative, they come up with a different answer each time you ask them. So, you know, who knows after the 10th time of asking that question whether it would give me the same accurate advice?
Alok Jha
And of course, you know, we've talked about hallucinations before. Generative AIs hallucinate, they make facts up. It's hard to understand how they come to their conclusions sometimes. So these are all problems for trust. And of course trust is probably the most important thing when it comes to medicine because you need to understand how whoever's treating you, giving you advice, how they come to their conclusions and what happens if things go wrong as well. So I suppose the next question is, could you make a more specialized healthcare chatbot and are people trying to do these things and what benefits would they offer?
Natasha Loder
Well, look, you can get medical chatbots. There's one called Ada Health, which is a text-based chatbot. So you type in your symptoms. It's a symptom checker. It doesn't always promise diagnoses, but it's been crafted over many years and had all the sort of data that it needs built into it so that it can essentially assess what you type in and ask questions and then kind of weigh up what the symptoms could possibly be and then give you a range of likely answers. And that's a regulated medical device. So we have that. The problem, of course, with these kinds of chatbots is they're a bit clunky. You have to go through a question and answer. It can be quite a lengthy process. You may have a cold and you may have to sit down answering lots of very different questions, some of which seem quite strange. So the question is, can we bridge the two systems? Can we bridge this accurate text-based chatbot with this amazing tool that is the conversational ChatGPT? Can we sort of have an interface that interrogates this much more accurate medical chatbot? And that is very much what people are thinking. And so the next step in the evolution could be where you have a conversation with something that sounds like ChatGPT, but kind of under the hood, you have a kind of much more solid, robust, explainable and provable AI chatbot.
Alok Jha
I mean, there are ways of making sure that the outputs of a large language model are more accurate and they're, you know, not inventing information. I mean, this is a problem across all sectors. And the solutions I've heard of are things like only allowing the large language model to access data from a particular database. So you've picked papers or a database of medical information that is correct and you ask the large language model a question, but say to it, only find your answer from that particular database. You know, it's called retrieval-augmented generation. And then another way of improving the responses from a large language model is to take a generalized large language model and fine-tune it with specific labeled information that is very highly accurate, let's say medical information. Which means that when you ask it a question, it's not a generalized LLM anymore. It's got specific talents, knowledge about certain conditions, so it's less likely to give you bad information.
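The retrieval step of retrieval-augmented generation can be sketched without any language model at all. In the minimal example below, the passages are placeholders rather than medical guidance, the word-overlap scoring is a deliberately crude stand-in for a real search index, and the prompt wording is an assumption. The finished prompt is what would then be sent to a language model.

```python
import re

# Hypothetical vetted database. The passages are placeholders, not guidance.
VETTED_PASSAGES = [
    "Placeholder passage about chest pain and when to seek urgent care.",
    "Placeholder passage about managing lower back pain at home.",
    "Placeholder passage about seasonal cold symptoms.",
]


def tokens(text):
    """Lower-case the text and split it into a set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, passages, min_overlap=2):
    """Return the passage sharing the most words with the question, or None
    if nothing in the vetted database is a close enough match."""
    question_words = tokens(question)
    scored = [(len(question_words & tokens(p)), p) for p in passages]
    best_score, best_passage = max(scored)
    return best_passage if best_score >= min_overlap else None


def build_prompt(question, passages=VETTED_PASSAGES):
    """Build a prompt that restricts the model to the retrieved source."""
    passage = retrieve(question, passages)
    if passage is None:
        return None  # refuse rather than let the model improvise
    return (
        "Answer using only the source below. "
        "If it does not contain the answer, say so.\n"
        f"Source: {passage}\n"
        f"Question: {question}"
    )


print(build_prompt("What should I do about back pain?"))
print(build_prompt("How do I file taxes?"))  # None: nothing relevant found
```

Returning nothing when no passage matches is the design choice that matters: the model is only ever asked to answer from vetted text.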
Natasha Loder
You know, ultimately this is all going to have to go through regulators, they're going to have to have some oversight over whatever people come up with. But I think there's a long way to go on the technological front before we can say, actually no, this is too difficult and it can't be done.
Alok Jha
So the AI of today does seem to have a lot of promise. But what's going on in clinics right now? I said at the start of the show that we'd look at the experiences of patients in the age of AI. But I highly doubt that what we've discussed so far is your experience when you visit your doctor or go to a hospital. It certainly isn't mine. Producer Jason Hoskin has been investigating the reason why.
Hugh Harvey
If you develop an AI algorithm that can triage something or provide an indicative diagnosis, then the minimum thing that you have to do is run it by the regulators and get a regulatory clearance to allow you to put that device on market.
Jason Hoskin
Hugh Harvey used to be the clinical director at Chiron Medical, the company that we heard from earlier that uses AI to detect breast cancer.
Hugh Harvey
Chiron was the first AI system ever to get regulatory approval. So I did that. It was the first European clearance for a deep learning computer vision system. Yeah.
Jason Hoskin
Hugh now runs a consultancy called Hardian Health. They help other AI companies comply with medical device regulations.
Hugh Harvey
It's a step that is slightly onerous on first timers in industry because you have to develop a quality management system. You have to show that any software used for a medical purpose is safe, effective and cyber secure, which I'm sure patients would agree are the minimum things they'd want to see from such devices.
Jason Hoskin
And how do they do that? How do they show that they're safe for all these things?
Hugh Harvey
So you start off with your hypothesis that your device can do X functionality. You then do a literature review to determine how well that is currently done under standard of care within the hospital system. You then develop your device in a documented process to show that you've identified risks and you've mitigated against those, and to show that you have met the user requirements as you've identified. And then you run some form of clinical investigation to show that your device can do what you say it does, and you present that data to a regulator.
Jason Hoskin
So the tools that are being deployed in clinical settings right now, they're quite narrow and rigid in terms of what they can do, but the new models, they're more generalist. And how much more of a challenge are generative AI models that by definition give a unique output every time? Are they more of a bottleneck and a challenge?
Hugh Harvey
Just to lay the ground, the background context here is that not a single large language model has received regulatory approval for use for a medical purpose. And as you say, this can be due to the fact that these systems have a propensity to hallucinate, which, despite all our best efforts, we can't control to a certain extent, and they're non-deterministic. And what that means is that if you give a large language model a certain prompt or a certain input, you will not get the same output every single time. And that is something that the regulators so far have not been happy to give regulatory clearance for. It's accuracy as well. It's very, very difficult to measure the accuracy of a general system if you have something that has potentially infinite inputs and infinite outputs. We don't know how often it's going to get it wrong. And it's not just how often it gets it wrong, it's how badly it gets something wrong. And until we can measure that, these things aren't safe to be on market as an intended medical device. It is quite hard to conceptualize an academic framework by which you would evaluate that across its entire generalizability. And I've not seen anyone do that, you know, comprehensively yet.
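Editor's illustration: the non-determinism described above comes from the way a language model picks each next word by sampling from a probability distribution. The sketch below is a toy, not any vendor's implementation: the `logits` scores and their three labels are invented for illustration, and `sample_token` is a hypothetical helper. It shows that always taking the top-scoring option gives the same answer every time, while sampling at a non-zero temperature does not.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick one token from a dict of scores.

    temperature == 0 means greedy decoding (always the top score);
    higher temperatures flatten the distribution and add variety.
    """
    if temperature == 0:
        return max(logits, key=logits.get)
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    toks = list(weights)
    return rng.choices(toks, weights=[weights[t] for t in toks], k=1)[0]


if __name__ == "__main__":
    # Invented scores for three possible continuations of the same prompt.
    logits = {"benign": 2.0, "suspicious": 1.5, "unclear": 0.5}
    rng = random.Random(0)
    print({sample_token(logits, 0, rng) for _ in range(50)})    # a single answer
    print({sample_token(logits, 1.0, rng) for _ in range(200)})  # typically several
```

Greedy decoding removes this particular source of variation, but it does not by itself address the separate point made above about measuring how often, and how badly, a general system gets things wrong.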
Jason Hoskin
Do you know of large language models and generative AIs that want to get regulatory approval?
Hugh Harvey
There are many that want to get regulatory approval, absolutely. I had three inbound emails this morning with people asking, how can we get regulatory approval? I was actually at a conference last week with a bunch of regulators and it was almost unanimously decided that current regulatory frameworks and the current evidence being generated by vendors are not sufficient to allow any of these to be used, especially in high-risk situations. You know, if you have a generalist model, you are claiming it can solve pretty much any medical problem. Well, that includes children with cancer or neonatal sepsis, which we all agree is a seriously high-risk situation. And we should be focusing on more low-risk use cases, perhaps things in the administrative realm, rather than trying to use it directly in the diagnostic realm.
Jason Hoskin
Do you think this is something that needs to be overhauled though? Because clearly they have great potential in medical settings.
Hugh Harvey
No, I would put it the other way. I would actually say the developers of large language models need to actually improve their evidence generation. If you want to make great claims, you need to provide exceptional evidence. And if we are going to unlock this apparent revolution in healthcare, surely then someone needs to demonstrate these things can be safe, effective and cyber secure, rather than say to the regulators, well, could you just lower the barrier? Another point I'd like to make is that if someone invented a drug tomorrow, it would take them 10 years to get it to market. And I don't see any reason why we should be rushing large language models into hospitals, ignoring evidence and the typical timeframes it takes to generate it.
Jason Hoskin
That's quite a useful comparison. Something else that's been put to us on the show is perhaps the regulatory and policy framework for AI should be something equivalent to licensing doctors. Do you think that is maybe a good way of thinking about it?
Hugh Harvey
I don't actually. I would disagree with that. I think comparing large language models to fully functioning human doctors is anthropomorphizing the technology, which I think we should really stay away from. I think it's rather dangerous to assume this technology is as capable as a human when it's not. Yes, large language models are broader in capability than some of the more linear models that came before them, but they're by no means at the intellectual level of a human. And so to say, well, we should use the same kind of regulatory structure that we'd evaluate humans on is, I think, a dangerous thing to do. Large language models don't have moral accountability and they don't have, you know, decades of experience like doctors do. Then again, if we were going to regulate it like we do doctors, well, it takes five to six years for a doctor to go through medical school. Then it takes them, I don't know, seven to 10 years training to become a consultant. So, okay, if you want to do that for large language models, let's wait 10 to 13 years and see your evidence at the end of it. And look, I'm coming across as quite pessimistic and I realize that, but I think we'd all agree that actually what we want to do is take our time and get the safest, most effective products to market, and that is the reason why we have these regulations in the first place.
Jason Hoskin
Hugh, thank you very much.
Hugh Harvey
Thank you.
Alok Jha
Despite the challenges around implementing generative AI systems such as large language models into healthcare, more of the narrower, more specific tools are getting into clinics, like Mia, the cancer detection model we explored earlier. So what does all that mean for patients in the future? To round off this episode, I'm once again joined by The Economist's Natasha Loder. Natasha, we've talked about diagnosis and advice and things. Where else could chatbots and these sorts of artificial intelligence models be useful in looking after patients then as they're more integrated with a health system?
Natasha Loder
So really it depends on your timescale as to sort of what the answer is. Ultimately, you know, they could be a sort of first-line service, 24/7, giving advice, triage, even diagnostics. But really we're going to start with low-risk activities such as advice and symptom checking. Where you will see more advanced AIs, more ambitious AIs, appear earlier is in places where you have no care: rural communities, developing countries, particularly where you're supporting patients that have chronic health needs in countries with very few doctors. And in those sorts of settings, the risk of having some sort of AI system supporting patients has to be set against the fact that they're getting none at all. That's how healthcare often views any kind of innovation. It's, you know, what are the risks, what are the benefits? And so in countries that have no healthcare or little healthcare available for certain populations, it may be that AI tools are really going to be a sort of quick answer. And then in the longer term for you and me, perhaps we'll be using these as personal doctors. They'll have our own personal medical information and be able to give us quite sophisticated answers to questions that we might have.
Alok Jha
Yeah. So they'll be able to take your many years of medical information, blood tests, anything else, any drugs you might have had, and give you quite specific advice in a way that you just can't do without a human doctor right now. And even then, it's limited to whatever that doctor specifically has available in terms of information or memory, in terms of how they treated you.
Natasha Loder
I mean, this is a classic problem. You can go and see your doctor and they may have a few moments to scan your recent medical history. They're not going to go back through years of notes. And yet the information in there could be highly salient to the problem you're having today. And so AIs have the potential to make medicine much more sophisticated, to reduce the amount of error, the misdiagnoses. It's extraordinary to me how we could really start to meet the needs of so many more people with this technology.
Alok Jha
Natasha, we've covered a lot of ground in this episode and looked at different ways that different types of AI models could be used in healthcare. You know, you've done a lot of reporting on this just recently, and I'm just curious where you land on this. You know, we don't want to get too excited about things just because new technology arrives, but then there's so much potential for, as you say, more access to healthcare, better healthcare. Give me a scale of how excited I should be about all of this.
Natasha Loder
Well, it's all about time scale, and looking forward in the next 10, 20 years, it's transformative. It's going to revolutionize diagnostics, treatment, patient care, kind of almost every aspect, I would think, of healthcare, including foundational models which are going to essentially absorb biology and medicine and make it more understandable and explainable. But in the short to medium term, it's less clear to me how rapidly these technologies will disseminate. And that's because there are a whole bunch of kind of ethical, regulatory and technical challenges that need to be overcome to ensure that it's deployed safely. And that's going to require regulators to actually spend some more money. They need to move more quickly, they need to put more effort into oversight of these technologies, because if we're going to introduce them safely, that's what it's going to take. But yeah, I'm very excited. I can't wait to see what's coming down the road.
Alok Jha
I think that you've managed to make that answer a very sensible one, because loads of AI people always talk about how these sorts of technologies are overestimated in the short term and underestimated in the long term. And in case anyone doesn't know what a foundation model is, we'll be talking about that in the next episode of this two-part series on AI in healthcare, one of the most important and interesting uses of AI in healthcare. I look forward to speaking to you again next week to look at how all of this is going to change the medical research side of things.
Natasha Loder
Sounds good.
Alok Jha
The story of AI in healthcare doesn't end there. To catch the second part of this series of Babbage and much, much more, you'll need to become a subscriber to Economist Podcasts plus. Just follow the link in the show notes. Thanks to Gerald Lipp, Peter Kachikamen, Pranav Rajputkar, Hugh Harvey and The Economist's Natasha Loder. You can read Natasha's full 10-page report on the promise of AI in healthcare on the Economist app. It's really worth it to get into the details of her reporting. There's a link in the show notes. Thank you for listening. Babbage this week was produced by Jason Hoskin and Lawrence Knight. Mixing and sound design is by Nico Rofast. The executive producer is Hannah Mourinho. I'm Alok Jha, and in London, this is The Economist.
Podcast: Babbage from The Economist
Host: Alok Jha
Release Date: May 22, 2024
Duration: Approximately 47 minutes
[01:07] Gerald Lipp, a radiologist overseeing breast cancer screening in Northeast Scotland, introduces the integration of artificial intelligence (AI) into mammogram analysis. Traditionally, two human readers evaluate each mammogram, with a third consulted in cases of disagreement. Gerald shares a compelling case where AI identified a subtle sign of cancer in an elderly patient, Barbara, which human readers had previously overlooked. This collaboration between AI and radiologists led to the early detection of a 5mm tumor, demonstrating AI's potential as an invaluable second pair of eyes in medical diagnostics.
Gerald Lipp:
“We wouldn’t have seen her again. And the AI did highlight a very subtle area in the right breast... Barbara was one of 11 women that we managed to pick up using the AI as an extra reader or a fail safe.”
[01:27]
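Editor's illustration: the reading workflow summarized above (two human readers, a third on disagreement, and the AI as an extra check only when both humans see nothing) can be sketched as a small decision function. This is a simplified reading of what Gerald Lipp describes, not the screening service's actual protocol; the function name, the callback parameters and the "recall"/"routine" labels are assumptions made for illustration.

```python
def screening_decision(reader_a, reader_b, third_reader, ai_flags, expert_review):
    """Return "recall" or "routine" for one mammogram.

    reader_a, reader_b: True if that human reader sees something suspicious.
    third_reader, ai_flags, expert_review: zero-argument callables returning
    True/False, consulted only when the workflow actually reaches them.
    """
    if reader_a != reader_b:
        # The two readers disagree: a third human reader arbitrates.
        return "recall" if third_reader() else "routine"
    if reader_a:
        # Both readers see something suspicious.
        return "recall"
    # Both readers say normal: the AI acts as an extra reader, and anything
    # it circles goes back to human experts for the final decision.
    if ai_flags() and expert_review():
        return "recall"
    return "routine"


if __name__ == "__main__":
    # A case like the one described: both readers say normal, the AI flags
    # an area, and the expert readers decide to call the patient back.
    print(screening_decision(False, False,
                             third_reader=lambda: False,
                             ai_flags=lambda: True,
                             expert_review=lambda: True))
```

The callbacks keep the human decision last in every branch, which mirrors the point made later in the episode that a radiologist, not the AI, makes the final call.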
Alok Jha elaborates on the burgeoning applications of AI in healthcare, from identifying fractures in X-rays to assisting in surgeries. However, Gerald emphasizes that while radiologists are proficient, AI provides an additional advantage by automating the review of numerous normal mammograms, thereby allowing more time for patient care.
Gerald Lipp:
“The AI gives us that little edge in screening... if I can automate and take away some of that 900 mammograms in 1000, that actually gives me more time to be patient.”
[03:21]
[05:02] Natasha Loder, The Economist's health editor, discusses the surge of generative AI in medicine over the past year. She highlights its capabilities beyond diagnostics, including personalized health advice, patient triage, note-taking, meeting summaries, and drug discovery.
Natasha Loder:
“The possibilities lie beyond not just diagnostics, but also tools for giving personalized health advice... discovering new insights from medical data and do drug discovery.”
[05:20]
She identifies the primary challenge AI addresses: the global shortage of healthcare professionals amidst an aging population and rising healthcare costs.
Natasha Loder:
“There just aren't enough doctors or nurses... if we can allow these technologies to be harnessed, there's actually a lot of potential for improving productivity and meeting human need.”
[06:16]
Peter Kachikamen, CEO of Chiron Medical, discusses MIA, an AI tool designed to enhance cancer detection in mammograms. Currently deployed in Hungary and the UK, MIA utilizes convolutional neural networks to analyze mammogram images, significantly increasing cancer detection rates by up to 13%.
Peter Kachikamen:
“AI can increase the cancer detection... with that, we can expect that we can actually halve the number of cancers missed in breast screening.”
[08:18]
He explains that MIA operates alongside radiologists, providing quick assessments and highlighting areas of concern without introducing bias, ensuring that final decisions remain under human oversight.
Peter Kachikamen:
“It's always a human radiologist who makes the final decision. They have the authority. The AI is a tool to flag when the AI is certain.”
[13:16]
Pranav Rajputkar, Assistant Professor at Harvard Medical School, delves into the evolution of AI in radiology. Initially focused on specific tasks like pneumonia detection in chest X-rays, his team now aims to develop generalist AI models capable of interpreting various imaging modalities and multiple diseases simultaneously.
Pranav Rajputkar:
“We're trying to build these AI models that can do a lot more than just one disease detection... a generalist medical AI perspective.”
[15:49]
He identifies three key advancements enabling this progress:
Pranav Rajputkar:
“Generative AI allows us to make these much more interesting models that can start to look like what we expect from radiologists.”
[17:21]
As AI models grow more sophisticated, questions arise about their role in replacing human professionals. Pranav Rajputkar addresses the misconception that AI will replace radiologists, emphasizing the current shortage of medical professionals and the potential for AI to alleviate this gap.
Pranav Rajputkar:
“We just have a big supply shortage... AI models that can do these reads are going to be the solution we have on the horizon.”
[19:10]
He also touches upon the trust issues surrounding AI, comparing regulatory frameworks for AI to those for human doctors. However, Hugh Harvey, former Clinical Director at Chiron Medical, argues that AI should not be regulated like human doctors due to fundamental differences in accountability and operational dynamics.
Hugh Harvey:
“Large language models don't have moral accountability and they don't have... decades of experience like doctors do.”
[39:27]
Hugh Harvey discusses the stringent regulatory requirements AI models must meet to be approved as medical devices. He highlights the challenges generative AI poses, such as unpredictability and the inability to consistently demonstrate safety and effectiveness across infinite scenarios.
Hugh Harvey:
“We don't know how often it's going to get it wrong... Until we can measure that, these things aren't safe to be on market as an intended medical device.”
[36:24]
He emphasizes the necessity for robust evidence and adherence to regulatory standards to ensure AI tools are both safe and effective for clinical use.
The podcast explores the practicality of using AI chatbots like ChatGPT for medical advice. Natasha Loder conducts a live test, revealing that while ChatGPT can provide general information, it lacks the specificity and reliability required for medical diagnoses. The conversation underscores the limitations of current AI models in handling sensitive health issues reliably.
Natasha Loder:
“We’re a long way technologically from being in a situation where we could recommend that anyone having these symptoms could just go and safely have a conversation with a chatbot.”
[28:36]
Alok Jha:
“How useful is that really in medical advice?”
[28:32]
The discussion points to the need for specialized, regulated medical chatbots that can offer accurate and consistent advice, potentially bridging the gap between general AI capabilities and clinical requirements.
In concluding the episode, Natasha Loder envisions a future where AI serves as a first-line service for advice, triage, and diagnostics, especially in underserved areas. She anticipates AI becoming integrated with personal medical records, enabling highly personalized and efficient patient care.
Natasha Loder:
“AIs have the potential to make medicine much more sophisticated, to reduce the amount of error, the misdiagnoses... It’s extraordinary to me how we could really start to meet the needs of so many more people with this technology.”
[41:27]
Alok Jha balances optimism with caution, acknowledging AI's transformative potential while recognizing the ethical, regulatory, and technical challenges that must be addressed to ensure its safe and effective implementation in healthcare.
Natasha Loder:
“In the short to medium term, it's less clear to me how rapidly these technologies will disseminate... I’m very excited. I can’t wait to see what’s coming down the road.”
[44:12]
The episode sets the stage for a forthcoming second part, which will delve deeper into AI’s impact on medical research and the future landscape of healthcare.
AI as a Diagnostic Aid: AI models like Chiron Medical's MIA are proving effective in enhancing cancer detection rates, serving as valuable tools for radiologists.
Generative AI's Expanding Role: Beyond diagnostics, generative AI holds promise for personalized health advice, efficient patient triage, and drug discovery.
Regulatory Challenges: The integration of AI into healthcare faces significant regulatory hurdles, particularly for generative models that lack consistency and accountability.
Ethical Considerations: Trust and reliability are paramount in medical AI applications, necessitating rigorous validation and oversight.
Future Integration: AI has the potential to revolutionize healthcare by increasing access, reducing errors, and providing personalized care, especially in areas with limited medical resources.
The episode "AI and Health Part One: DrGPT Will See You Now" provides a comprehensive exploration of how AI is transforming healthcare. From enhancing diagnostic accuracy to addressing the global shortage of medical professionals, AI's potential is vast. However, realizing this potential requires overcoming significant regulatory, ethical, and technical challenges. As AI continues to evolve, its integration into healthcare promises to revolutionize patient care, making it more accessible, efficient, and personalized.
Produced by Jason Hoskin and Lawrence Knight. Mixing and sound design by Nico Rofast. Executive Producer: Hannah Mourinho.