Summary

NVIDIA AI Podcast Ep. 262

Hippocratic AI’s Munjal Shah on How AI Agents Are Expanding Healthcare Capacity

Date: June 25, 2025
Host: Noah Kravitz (at NVIDIA GTC 2025)
Guest: Munjal Shah, Co-founder & CEO, Hippocratic AI

Episode Overview

In this episode, Noah Kravitz sits down with Munjal Shah, CEO of Hippocratic AI, a startup building large language model (LLM)-powered AI agents for healthcare with a strong focus on safety. The conversation explores Hippocratic AI’s novel approach to augmenting healthcare capacity through scalable virtual AI clinicians, the launch of their Healthcare AI Agent App Store, their unique technical architecture, and how AI can usher in a new era of "Healthcare Abundance." Shah provides candid insights into infrastructure challenges, clinician involvement, safety protocols, and what’s next for both the company and AI in healthcare.

Key Themes & Discussion Points

Introducing Hippocratic AI and Its Mission

[00:54] Shah introduces Hippocratic AI as a “safety-focused large language model focused on healthcare,” specifically designed to create AI clinicians—agents that reach out to patients, such as by calling post-surgery to check on their recovery, medication needs, and more.

“It’s really an agent that talks to patients and delivers care.”
— Munjal Shah ([00:54])
Founded: ~Two years ago ([01:21])
Production scale: about 1.85 million patient calls expected to be completed by the end of the month of recording ([01:27]).

Patient Experience & Response

[01:40] On reactions to receiving healthcare from an AI agent:
- Average patient rating: 8.95 out of 10
- Initial skepticism: about 30% say they do not want to talk to an AI; after a gentle rebuttal from the agent, about 15% ultimately leave and the other 85% engage.
“Within like 30, 60 seconds, when they realize this is not your grandfather’s IVR ... and is empathetic, they just talk away.”
— Munjal Shah ([01:51])

“This thing listens to every word and responds ... and that’s gold in the modern age.”
— Munjal Shah ([02:25])

The Vision: “Age of Healthcare Abundance”

[02:54] Shah describes the shift from triage and scarcity to a world where AI agents enable “clinical abundance,” offering scalable, routine care and monitoring.

“All of healthcare is premised on this idea of clinical scarcity ... Instead of, how do we get to a place where it’s infinitely abundant?”
— Munjal Shah ([02:55])
AI agents enable novel interventions, such as rapidly assessing vulnerable patients during a heatwave, which would be impossible to scale with human staff alone ([04:23]).

AI and Human Clinicians: Complementary Roles

[04:39] The current technology excels at virtual care; human clinicians continue to provide necessary physical and higher-complexity medical intervention.
- AI frees up clinicians to focus on complex, in-person care by automating repetitive, scalable patient touchpoints.

Hippocratic AI’s App Store for Healthcare Agents

[05:11] Hippocratic AI launched an AI Agent App Store, leveraging their LLM foundation (“making a new use case takes four minutes” [05:31]) and inviting clinicians nationwide to author new care scripts.

“If you worked in a concussion clinic as a nurse for the last 20 years ... why don’t you write a new script and put your script in our app store? ... Once it’s live, you’ll get paid a portion of all the revenue it makes.”
— Munjal Shah ([06:18])
Only licensed U.S. clinicians can contribute; every new use case is safety-tested and validated before deployment.

Technical Deep Dive: Inference and System Architecture

[07:00] Shah highlights the importance of inference (as opposed to only training):
- Their system uses 22 models in conjunction—one primary 400B LLM, 19 safety supervisors, and 2 “deep thinking” models for double-checking.
- Each patient interaction runs against all these models to maximize safety and clinical appropriateness.
“That’s a lot of inference. That’s 22 models of inference. It’s 4.2 trillion parameters we’re running every time.”
— Munjal Shah ([07:53])
Infrastructure scale: Their current instance takes up over 128 NVIDIA H100 GPUs just to load the models into RAM ([08:02]).
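The two figures quoted above (4.2 trillion parameters, 128 H100 GPUs) can be sanity-checked with simple arithmetic. The sketch below is only a rough consistency check, not a description of Hippocratic AI's deployment: the episode does not state the weight precision or the GPU memory variant, so `BYTES_PER_PARAM` and `GPU_MEMORY_GB` are assumptions.

```python
# Back-of-the-envelope check of the figures quoted in the episode.
TOTAL_PARAMS = 4.2e12   # from the episode: parameters run per turn across 22 models
GPU_COUNT = 128         # from the episode: H100 GPUs needed to load the instance
BYTES_PER_PARAM = 2     # ASSUMPTION: 16-bit weights (FP16/BF16); not stated in the episode
GPU_MEMORY_GB = 80      # ASSUMPTION: 80 GB H100 variant; not stated in the episode


def weights_tb(params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in terabytes (1 TB = 1e12 bytes)."""
    return params * bytes_per_param / 1e12


def cluster_memory_tb(gpu_count: int, gb_per_gpu: int) -> float:
    """Total GPU memory across the cluster, in terabytes."""
    return gpu_count * gb_per_gpu / 1e3


def min_gpus_for_weights(params: float, bytes_per_param: int, gb_per_gpu: int) -> int:
    """Smallest GPU count whose combined memory holds the weights alone."""
    total_bytes = params * bytes_per_param
    per_gpu_bytes = gb_per_gpu * 1e9
    # Ceiling division without importing math.
    return int(-(-total_bytes // per_gpu_bytes))


weights = weights_tb(TOTAL_PARAMS, BYTES_PER_PARAM)
capacity = cluster_memory_tb(GPU_COUNT, GPU_MEMORY_GB)
floor_gpus = min_gpus_for_weights(TOTAL_PARAMS, BYTES_PER_PARAM, GPU_MEMORY_GB)

print(f"weights alone: {weights:.1f} TB")
print(f"memory across {GPU_COUNT} GPUs: {capacity:.2f} TB")
print(f"GPUs needed for weights alone: {floor_gpus}")
```

Under these assumptions the weights alone need roughly 8.4 TB, against about 10.24 TB across 128 GPUs, which leaves some room for caches and activations. That is at least consistent with the "over 128" figure, though a different precision or memory size would change the numbers.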

Latency and Infrastructure Challenges

[09:25] Real-time voice communication imposes a strict latency requirement (1.5 to 2 seconds end-to-end), demanding techniques different from text-based LLM interfaces.

“Our inference isn’t a matter of trying to optimize cost per token, but trying to optimize latency.”
— Munjal Shah ([10:07])
Partnership with NVIDIA and tuning of open-source inference engines for latency, not just throughput ([10:50]).
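To make the idea of a latency budget concrete, here is a minimal sketch of how a single voice turn might be checked against the 1.5 to 2 second window mentioned in the episode. The stage names and millisecond values are hypothetical placeholders chosen for illustration; the episode gives only the end-to-end budget, not a per-stage breakdown.

```python
# Budgets from the episode, in milliseconds (1.5 s and 2 s end to end).
TIGHT_BUDGET_MS = 1500
LOOSE_BUDGET_MS = 2000

# HYPOTHETICAL stages and durations for one conversational turn.
# None of these numbers come from the episode.
example_turn_ms = {
    "speech_to_text": 250,
    "primary_model_first_tokens": 550,
    "safety_checks": 250,
    "text_to_speech_first_audio": 200,
    "network_round_trip": 150,
}


def check_budget(stages_ms: dict, budget_ms: int) -> dict:
    """Sum the per-stage latencies and compare them with the budget."""
    total = sum(stages_ms.values())
    slowest = max(stages_ms, key=stages_ms.get)
    return {
        "total_ms": total,
        "slack_ms": budget_ms - total,
        "within_budget": total <= budget_ms,
        "slowest_stage": slowest,
    }


tight = check_budget(example_turn_ms, TIGHT_BUDGET_MS)
loose = check_budget(example_turn_ms, LOOSE_BUDGET_MS)
print(tight)
print(loose)
```

The point of the sketch is the framing rather than the numbers: when every stage draws on one fixed end-to-end budget, the thing to optimize is the slowest stage on the critical path, not the cost per token.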

The “Constellation Architecture”

[11:05] Their models operate as a constellation, each specialized (e.g., an “overdose engine”) actively monitoring conversations for specific risks or medical needs.

“We literally have multiple models double-checking each other ... [Our LLM] knows how to ask all these questions and knows how to navigate assessing whether it’s actually an overdose.”
— Munjal Shah ([13:10])
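The pattern described here, one conversational model plus narrow checkers that each watch every turn for a single risk, can be sketched as follows. This is an illustrative toy, not Hippocratic AI's implementation: the function names, the `Turn` fields, the dose comparison, and the fallback reply are all hypothetical, and simple rules stand in for what would be separate models.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    """One patient utterance plus the facts a checker might need (hypothetical fields)."""
    patient_text: str
    reported_dose: Optional[float] = None    # units the patient says they took
    prescribed_dose: Optional[float] = None  # units on the prescription


async def primary_model(turn: Turn) -> str:
    # Stand-in for the large conversational model.
    return "Thanks for telling me. How are you feeling now?"


async def overdose_checker(turn: Turn) -> Optional[str]:
    # Narrow specialist: only asks "is this dose above the prescription?"
    if turn.reported_dose is None or turn.prescribed_dose is None:
        return None
    if turn.reported_dose > turn.prescribed_dose:
        return "possible overdose"
    return None


async def scope_checker(turn: Turn) -> Optional[str]:
    # Second hypothetical specialist: flags requests for a diagnosis.
    if "diagnose" in turn.patient_text.lower():
        return "out of non-diagnostic scope"
    return None


async def respond(turn: Turn) -> dict:
    # Run the primary model and every checker concurrently on the same turn.
    draft, *flags = await asyncio.gather(
        primary_model(turn), overdose_checker(turn), scope_checker(turn)
    )
    raised = [flag for flag in flags if flag is not None]
    # Any raised flag overrides the draft reply.
    reply = draft if not raised else "Let me get a clinician to follow up with you on that."
    return {"reply": reply, "flags": raised}


result = asyncio.run(respond(Turn("I took four pills this morning", 4, 2)))
print(result)
```

Running the checkers concurrently rather than one after another matters for the latency budget discussed earlier: the added delay is roughly that of the slowest checker, not the sum of all of them.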

Safety and Clinical Validation

[13:34] Instead of just dataset transparency, Hippocratic AI conducts rigorous “output testing”:
- 6,000 licensed clinicians participated in 309,000 clinical test calls, acting as patients and marking every error.
- Iterative improvement based on real-world feedback, resulting in unprecedented levels of validation.
“We have 6,000 US licensed clinicians who have now done 309,000 clinical test calls ... We call this output testing. We’ve done more output testing than anybody.”
— Munjal Shah ([14:45])

“If this is going to call my mother, she’s 81 years old, I want to know what it’s going to say ... I’m not comforted by, ‘Oh, I trained it on this.’”
— Munjal Shah ([15:12])

Clinician and Health System Reaction

[16:45] Most clinicians see AI augmentation as necessary due to acute staff shortages—especially post-pandemic.
- Rapid health system adoption: 25 health system/provider/pharma clients signed in 6-7 months; expecting 30–40 by next June, a rate unheard of in healthtech ([17:54]).
“The needs [are] recognized ... there’s a lot of pain around staffing and staffing shortages.”
— Munjal Shah ([18:18])

Roadmap: Payers, Pharma, and Global Expansion

[18:28] Plans to:
- Expand into payer (insurance) support: automating case management touchpoints
- Serve pharma: clinical trial compliance, qualification, and recruitment
- Expand internationally: launched in UAE, soon in Southeast Asia
“The whole world is short. And you know, with the aging population ... we all have no choice.”
— Munjal Shah ([19:42])

Notable Quotes & Memorable Moments

“[AI agents] speak every language, they remember every conversation. They’re clinically safe and they can take care of everybody at all times.”
— Munjal Shah ([03:08])
“We're crowdsourcing, but only from clinicians. You actually have to send us your license number.”
— Munjal Shah ([06:38])
“Latency is everything. ... In a voice conversation you have a 1.5 to 2 second budget end-to-end.”
— Munjal Shah ([09:47])
“Who knows best how to assess an AI, except the clinicians.”
— Munjal Shah ([15:22])
“We’re crossing a wood plank bridge and the planks are showing up like two seconds before we hit the next step.”
— Munjal Shah, on technical disruption ([15:35])

Key Timestamps

| Timestamp | Segment/Topic |
|-----------|---------------|
| 00:54 | What is Hippocratic AI and AI clinicians? |
| 01:40 | Patient reactions to AI-delivered healthcare |
| 02:54 | “Age of Healthcare Abundance” explained |
| 04:39 | AI’s role vs. human clinicians |
| 05:11 | Launch of Healthcare AI Agent App Store |
| 07:00 | Move from training to inference in AI |
| 08:02 | Infrastructure and scale (NVIDIA H100 GPUs) |
| 09:25 | Latency and real-time voice challenge |
| 11:05 | Constellation architecture and redundancy |
| 13:34 | Clinical testing and model validation |
| 16:45 | Clinician acceptance and health system adoption |
| 18:28 | Roadmap: payers, pharma, international expansion |
| 20:06 | Research & safety testing publications |

Further Information

Learn More: hippocraticai.com
- Details on their LLM, architecture, published safety protocols, and company values.

Takeaways

Hippocratic AI is at the forefront of using LLM-based virtual agents to safely and scalably augment healthcare, with a deeply clinician-oriented approach to safety, validation, and product development.
Their focus on “healthcare abundance” reframes AI as a tool for positive systemic change, addressing global clinician shortages rather than replacing providers.
Technical innovation in inference, latency, and modular safety architectures is key to their real-time AI agent deployments.
Open collaboration and clinician-driven customization (via the App Store) enable broad, context-aware healthcare support at scale.
The industry is responding rapidly to Hippocratic’s vision, signaling a major shift in how AI will expand access and improve care globally.

Episode Host: Noah Kravitz
Guest: Munjal Shah, Hippocratic AI
For a deeper dive, downloadable papers, and more, visit hippocraticai.com.


Transcript

[00:11]
Noah Kravitz
Hello and welcome to the Nvidia AI podcast from GTC 2025 in San Jose, California. I'm Noah Kravitz and I'm here with Munjal Shah, co-founder and CEO of Hippocratic AI, a startup building a safety-focused LLM (large language model) for healthcare. Hippocratic recently launched a healthcare AI agent app store and announced their Series B funding round and is at the forefront of the AI-powered healthcare transformation that's happening all around us. I'm excited to talk about AI and the big idea behind Hippocratic AI, the era of healthcare abundance, with Munjal. So welcome and thanks so much for taking the time to join the AI podcast.
[00:49]
Munjal Shah
I'm so excited to be here. Thank you for having me.
[00:51]
Noah Kravitz
So let's start with the basics. What is Hippocratic AI?
[00:54]
Munjal Shah
Well, you know, as you mentioned, we're a safety focused large language model focused on healthcare and we really used it to build AI clinicians. So we have agents that operate and reach out to patients and call them on the phone and say, you know, check in on them post surgery and let's look at your incision site and is it getting infected? And you know, do you have enough of your medications and do you need refills? So it's really an agent that talks to patients and delivers care.
[01:20]
Noah Kravitz
So when was the company founded?
[01:22]
Munjal Shah
We started about two years ago now.
[01:24]
Interviewer 1
Okay.
[01:24]
Munjal Shah
Yeah.
[01:24]
Noah Kravitz
And you're in production now with agents interacting with patients?
[01:28]
Munjal Shah
Yeah, we've done, as of the end of this month, we will have done about 1.85 million calls to patients all over the country.
[01:36]
Noah Kravitz
What's the reaction been like from the patients to getting health care from an AI agent?
[01:41]
Munjal Shah
You know, people always ask that question. They're always like, well, what are the patients? They. Yeah, yeah, the average. I'll give it to you in numbers and I'll give it to you in anecdotes. The average patient rating is an 8.95 out of 10.
[01:51]
Noah Kravitz
It's pretty good.
[01:51]
Munjal Shah
Yeah. And second, it's. I think there's about 30% who are like, I don't want to talk to AI. With a little bit of rebuttal, the AI goes, look, I can really help you. And I don't know when that human's going to call you back because they don't call you back a lot of times. Will you talk to me? It turns out about 15% will ultimately leave, but the other 85% will talk to it. And within like 30 seconds, 60 seconds, when they realize this is not your grandfather's IVR, like, this truly can understand you and talk to you and is empathetic, they just talk away.
[02:25]
Noah Kravitz
People just talk.
[02:25]
Munjal Shah
Yeah, yeah. I mean, and think about it. In this day and age, like, who really listens to every word you say? Like, no one ever. And I think that now what you realize is this thing listens to every word and responds and pays attention and gives you its undivided attention. And that's. That's gold in the modern age.
[02:43]
Noah Kravitz
Right, Right. Well, for what it's worth, I'm gonna go on a limb and say the listeners are hanging on your every word right now. So I wanna ask you about this idea of the age of healthcare abundance. Yeah, that's a North Star, right?
[02:55]
Munjal Shah
Absolutely.
[02:55]
Noah Kravitz
Okay, and what does it mean?
[02:56]
Munjal Shah
I think that when we think about solving our healthcare problems, all of healthcare is premised on this idea of clinical scarcity. The word triage assumes you don't have enough.
[03:06]
Interviewer 1
Right.
[03:06]
Munjal Shah
You gotta decide who to take first.
[03:08]
Interviewer 1
Right.
[03:08]
Munjal Shah
Population health uses this word called risk stratification. We gotta help those most in need. What about those almost most in need? Like, that'll be in that need next year if their condition keeps deteriorating. Oh, we can't help them because we have limited resources. And so I think we've always been thinking about healthcare as saying, you know, we don't have enough. How do we spread it around instead of how do we get to a place where it's infinitely abundant? And now, you know, we do, we have these AI agents. We have an infinite supply of them. They speak every language, they remember every conversation. They're clinically safe and they can take care of everybody at all times. And I think clinical abundance is the way to solve a lot of the world's problems in healthcare. You know, imagine if everybody has a caregiver, a clinic, you know, a care manager calling them up and seeing how they're doing and checking their blood pressure every single day. And I, I think we're beginning to enter that. And then there's some other implications of it. You know, today, when there's a heat wave, we don't call every patient at risk at the hottest two hours of the day and do a heat stroke assessment. And if they're having issues, send them an Uber to get them to a cooling center. And you couldn't do that without AI. Like, you can't get enough humans together to do that every single day of a heat wave with only like five days notice.
[04:23]
Interviewer 1
Right, right, right.
[04:24]
Munjal Shah
But now you can.
[04:26]
Noah Kravitz
This may not be the right phrase to use. But I keep thinking of the phrase last mile of delivery. What happens when the patient needs to be seen by a human or needs some kind of physical interaction beyond the phone call?
[04:40]
Munjal Shah
Yeah. You know, where we can operate today with technology is in this virtual care area. But this is where we have our human clinicians.
[04:48]
Interviewer 1
Right, Right.
[04:49]
Munjal Shah
Like, everybody's like, well, you know, how does this relate to the human clinicians? I'm like, we need the human clinicians to do all of the physical care that needs to be done. And in fact, by focusing the AI on these areas, we'll free them up to do even more of that. And so I think that given the technology we have today, I think that we can do the virtual part, but we'll leave the humans to do the physical part.
[05:11]
Interviewer 1
Right.
[05:11]
Noah Kravitz
So Hippocratic just launched Hippocratic AI's Healthcare AI Agent app store.
[05:16]
Munjal Shah
Yeah.
[05:17]
Interviewer 1
Okay.
[05:17]
Noah Kravitz
How does that work?
[05:19]
Munjal Shah
So one of the things we realized was this is one of the powers of kind of general intelligence and specific intelligence. Once you've built an LLM for healthcare, you know, then making a new use case takes four minutes.
[05:31]
Interviewer 1
Right?
[05:32]
Munjal Shah
Right. You just write a different prompt.
[05:33]
Interviewer 1
Right.
[05:34]
Munjal Shah
We're so used to software that's a specific intelligence that it takes you three to six months to make a new use case that people don't realize. Like, you can just make them quickly. I mean, Nobody goes to ChatGPT and says, what use cases do you support?
[05:46]
Interviewer 1
Right.
[05:46]
Munjal Shah
Yet people ask us that question all the time. And really the realization is, well, I support probably everyone you could think of. Just try writing it, see what happens. So then we realized, oh, we can write these in four minutes. Okay, we could write a ton of them, but we don't have all the knowledge to write them. Why don't we recruit every clinician in the country to come be an author? If you worked in a concussion clinic as a nurse for the last 20 years, and you know all kinds of little details of what to ask in a way that maybe isn't standard protocol, but is an enhancement to it.
[06:18]
Interviewer 1
Right? Yeah.
[06:19]
Munjal Shah
Yeah. Why don't you ask these questions and write a new script and then put your script in our app store. We'll validate it, we'll run it through safety testing, and then once it's live, you'll get paid a portion of all the revenue it makes. So I'm like, leverage your intellectual property and expertise and experience over the years and get paid while you sleep. And so it's. But. And from our standpoint, we're now crowdsourcing but only from clinicians. You actually have to send us your license number. We validate that you're a licensed U.S. clinician. And then we really say, hey, your use case can help millions of patients all over the country, not just the ones you personally can treat, giving you a scale of impact that you never had in your career before.
[07:00]
Noah Kravitz
So I want to ask you about inference. All the talk about generative AI LLMs, past couple of years, a lot of talk about training, the costs of training, the energy of training, the data you need for training. Now we're all talking about inference. Yeah, Hippocratic AI has been talking about inference. What is inference-driven AI healthcare?
[07:19]
Munjal Shah
I mean, in our case, you know, we trained our model differently than others. And we've always been focused on inference because our runtime is the key environment. So our model is actually 22 models. It's not one model. It's one gigantic 400B model doing the talking. And it's 19 supervising it and making sure it doesn't say anything unsafe within this scope, this non-diagnostic clinical scope. So we're not, you know, we're not a doctor, we're not diagnosing, we're not prescribing. And then there's another two deep thinking models that take 30 seconds to a minute to double-check everybody, okay, on top of all that.
[07:53]
Interviewer 1
Right? Right.
[07:54]
Munjal Shah
Well, that's a lot of inference. Yep. That's 22 models of inference. It's 4.2 trillion parameters we're running every time.
[08:02]
Noah Kravitz
Every time. Yeah.
[08:03]
Munjal Shah
And so we use up a ton. In fact, our entire instance today takes up over 128 Nvidia H100 GPUs just to load into RAM.
[08:13]
Interviewer 1
Wow.
[08:14]
Munjal Shah
Before, you know, and now that can support simultaneous conversations. But to even spin up one agent takes a ton because we use so much RAM. And so, but this is all our focus is, is this inference, this inference stack. I think people haven't thought about inference. They haven't built the infrastructure for it. In fact, this is some of the conversation I'm having with a lot of people, including the Nvidia team, but also some of the hyperscalers out there. And just saying, hey guys, you know, I don't want to buy your servers for 24 hours, seven days a week, 365, because I can only call patients during these hours.
[08:47]
Interviewer 1
Right.
[08:48]
Munjal Shah
So I'd like on-demand GPUs. Oh, I'll give you on-demand GPUs, but I'll give them to you at five times the price. I'm like, that doesn't really help me. I need them for about six hours a day, so anything over four, I'm better off having bought them all the time. But what I really need is you to sell me them on demand. And so people are starting to come up with that technology. They're starting to come up with, how do you spin up? Loading that much into RAM takes forever. So right now the time to spin up a new, like if say this, this one instance gets saturated, we get too many inbound calls from patients at the same time, it takes like 20 to 30 minutes to spin up another instance. Yeah, the whole point of an AI agent is that you don't wait on hold.
[09:26]
Noah Kravitz
Abundance always there.
[09:27]
Munjal Shah
Yeah, not abundance 30 minutes from now. And even many of the hyperscalers you have to email somebody to get more servers allocated to you. Like it's not a truly dynamic thing the way it is on the non-GPU side. And so there's a lot of new infrastructure needed to really make this happen. I think the third part of this is actually something we've been uniquely working with Nvidia on. We have a different technology problem than a lot of the other players in the LLM space. Most of them are doing text-oriented search stuff and text interactions. Well, they're going another few more seconds in giving you a response like, you know, DeepSeek takes, you know, the R1 of DeepSeek takes longer to give you an answer, but it gives you a deeper answer. Right. 30 seconds, 20 seconds in a text search is no big deal. If you give me the perfect essay so I don't have to do my homework. But in a voice conversation and all, as far as a voice, you have a 1.5 to 2 second budget end to end. And so we're really focused on latency. And so our inference isn't a matter of trying to optimize cost per token, but trying to optimize latency. And that's a very different kind of focus. And we do that everywhere. Like we're working with Nvidia on what to do on the chip level, as well as the kind of additional infrastructure that Nvidia provides. We're also looking to do that on the inference engines that we're using. We actually had to take an open-source one and tune it a different way because all the other ones are being tuned for throughput or cost per token, not for latency. And so we basically worked on lots of different elements to really get the speed we need out of this.
[11:03]
Noah Kravitz
So what's the Constellation architecture?
[11:05]
Munjal Shah
Yeah, so that's the thing I was describing. We literally have multiple models double checking each other.
[11:10]
Interviewer 1
Right.
[11:11]
Munjal Shah
And what people don't realize is the, a lot of the models now they say you can give a lot of input tokens to them. Now just put it all in there, it'll figure it out. And what Gemini is like what a million, I think it is now, million tokens. So it's like, oh, okay, no problem. But it can't reason across it all.
[11:27]
Interviewer 1
Yeah.
[11:28]
Munjal Shah
They'll show you examples of what we'll call needle in haystacks, where it'll be like, okay, it'll find that one thing. Yeah. I mean, grepping for a word is not that hard in computer science. Like we can find a word, but what you're really trying to do is reason across it. So I'll give an example. If you ask your care manager, can I have ibuprofen? And they say, sure, you can have ibuprofen, but don't take too much, that's fine. Right? Because it's an over-the-counter medication. Unless you have chronic kidney disease stage three or four, then it'll kill you.
[11:56]
Interviewer 1
Right.
[11:56]
Munjal Shah
Well, if you put the rules for ibuprofen and CKD into GPT-4 and then ask it, it'll do great. If you put in all the rules for all condition-specific over-the-counter medications and ask, it'll still do pretty good. But you'll start missing some sometimes, which is still not okay because you could kill people. But fine. If you put in the patient's medical history, the patient's last 10 conversations with you, all of those rules for over-the-counter medication disallowance and the current checklist for what you're supposed to follow with that patient and maybe a few other things and then ask it, good luck. And what it is, is we have an attention span problem. But if you have multiple models, we have these other models only focused on checking one thing at a time. So there's an overdose engine and it listens to every turn of the conversation. It's like, are we talking about drugs? Are we talking about drugs? Yes, we're talking about drugs. Okay. And then it's like, well, okay, did somebody just say a number that's an overdose relative to their prescription or relative to max toxicity of what you can have of that drug? Okay, did. And it may not seem that hard. Four pills versus two pills. But when you're talking about creams and injectables, it gets quite hard. I took a whole bunch of my testosterone cream and I rubbed it on my hand. Was that an overdose?
[13:07]
Interviewer 1
Right.
[13:08]
Munjal Shah
I don't know. How much cream was in your hand?
[13:09]
Noah Kravitz
Right. What's a little bit? What's in your hand?
[13:11]
Munjal Shah
What's a little bit? Was it a pea size? Was it a cherry tomato size? Was it an apple size? Our LLM knows how to ask all these questions and knows how to navigate assessing whether it's actually an overdose. And you cannot have. If a patient shares overdose information with a care manager in a clinical setting, you need to do something.
[13:28]
Noah Kravitz
Yeah, you may have said this at the beginning, so forgive me, but how many clinicians, doctors, and how many patients are you working with right now?
[13:35]
Munjal Shah
Couple different things. So first is to test and certify the product.
[13:39]
Noah Kravitz
Yes.
[13:39]
Munjal Shah
We basically ran a quasi, like, output testing trial. So a lot of people say, hey, tell me what you trained your LLM on so I know it's safe. I don't know who came up with this question because you have things like PubMed GPT that's trained only on PubMed, an evidence-based archive, and it still gives you stuff that's not right. Or it'll conflate two things and give you things not right.
[14:02]
Munjal Shah
So what we realized was you got to do output testing. You got to test every output. But you can't test every output of a horizontal model. There's an infinite number of permutation combinations of GPT-4. But you can, it turns out, do that for a vertical model when you roll it out one use case at a time.
[14:17]
Interviewer 1
Right, right, right.
[14:18]
Munjal Shah
I'm doing a pre-op call for a colonoscopy. I'm gonna make sure you took your bowel prep. I'm gonna make sure you've fasted the night before. I'm gonna like do all the steps. Okay. We hire a ton of clinicians to act like patients, call it up when we first make the new use case, and mark every error. And then we go back and keep improving the thing until we've done that. So we've now done that. We have 6,000 U.S. licensed clinicians who have now done 309,000 clinical test calls.
[14:45]
Interviewer 1
Wow.
[14:46]
Munjal Shah
And so we call this output testing. We've done more output testing than anybody.
[14:50]
Noah Kravitz
Yeah.
[14:51]
Munjal Shah
You know, we spent double-digit millions of dollars basically certifying the safety of the product, not by looking at its architecture or how it was trained, but looking at what it finally does in the end. And if this is going to call my mother, she's 81 years old, I want to know what it's going to say in the end. I'm not comforted by, oh, I trained it on this.
[15:13]
Noah Kravitz
It still told my mom to take 10 Advil. That's not okay.
[15:15]
Munjal Shah
Yeah, right. That's not okay. So I think these are the benefits of really having a large clinically driven thing. And I mean, who knows best how to assess an AI, except the clinicians.
[15:25]
Noah Kravitz
You've been speaking about this a little bit, but what have some of the other or some of the biggest challenges been in developing this inference-based system?
[15:36]
Munjal Shah
The analogy I draw at the company is like this space is evolving so fast. We're crossing a wood plank bridge and the planks are showing up like two seconds before we hit the next step. And when we first started the company, there was no open source. And all of a sudden open source arrived. You know, there were no optimized inference engines. And then all of a sudden those arrived. There was not a really good TTS and all of a sudden a great TTS, a text-to-speech engine, arrived. And so one of the things has been we've had to redo some work.
[16:09]
Interviewer 1
Sure.
[16:09]
Munjal Shah
Right. Because we did it and then this thing showed up and we realized there was a better way to do it. So you got to redo the work. And so I think one of the challenges has just been keeping up with kind of how things are evolving. I think the other one has just been running counter to a lot of people's kind of core thesis since they're all going after this cost per token. Our infrastructure needs are different, but we also have a different budget. It's made it a little easier, you know, because we're offsetting a very expensive resource per hour.
[16:35]
Interviewer 1
Right.
[16:36]
Munjal Shah
And so that's, that's been that. I mean, the other stuff is all normal, go to market stuff.
[16:40]
Noah Kravitz
Yeah.
[16:40]
Munjal Shah
It's new technology. People want to know more about it. You know, how do you know it's safe? That sort of thing.
[16:46]
Noah Kravitz
I asked you about, and you said this is the first question people ask about, you know, well, how are the patients reacting? What about on the clinician side? Are doctors, other caregivers excited to work with you? Are they worried about this kind of taking their role? What's, what's the, the vibe like?
[16:59]
Munjal Shah
I think there's such a shortage there.
[17:01]
Interviewer 1
Yeah.
[17:02]
Munjal Shah
Post the pandemic they really had.
[17:04]
Noah Kravitz
Yeah.
[17:05]
Munjal Shah
You know, they realized like, we got to do something else.
[17:07]
Interviewer 1
Right, Right.
[17:08]
Munjal Shah
I mean, have you ever, if you try to get a PCP here in Santa Clara county, like it's like six months.
[17:13]
Interviewer 1
Right.
[17:14]
Munjal Shah
I mean, you want to see a specialist, it's pretty bad. In fact, the other day my, like somebody was telling me that I go, come fly to New York and get your test done, because you know you will get it done in a month. Like, I'm like, really? Like that's our answer. And so, yeah, I think we have no choice. And so most people realize that and they realize there's an opportunity. And then when you start talking about this idea of abundance, they realize there's a whole bunch of things you can do that you never could do before. And I think that we've seen open arms. We signed up 25 health systems, providers, and pharma clients in basically six, seven months since we took the product GA about last June.
[17:55]
Noah Kravitz
Moving fast.
[17:56]
Munjal Shah
And in healthcare you never get that many. And we actually will sign another three this month. By next June, I bet you we'll be at like 30 to 40. That's a number that normally you're a health tech startup at like year seven or year five. Like you're way down the line. The need's recognized, the need's there. There's a lot of pain around staffing and staffing shortages.
[18:18]
Interviewer 1
Right.
[18:19]
Noah Kravitz
So what's next for Hippocratic? What's the rest of the year, the next couple years, whatever the timeframe is. What can you tell us about the future roadmap?
[18:28]
Munjal Shah
You know, we're continuing to expand. I mean, so I think there's a couple directions. So one is we're really pushing hard into the payer space and providing them. A lot of payers have large teams of what are called case managers that reach out to patients and follow up and make sure they're doing their proper treatment protocol because otherwise it ends up more expense for the payer. And by payer, I mean health insurance companies, we're doing the same for pharma. You know, they're running all these clinical trials and they're like, look, the AI can just call and make sure every day at 4pm when you're supposed to take that med, you take that med.
[18:58]
Interviewer 1
Yeah.
[18:58]
Munjal Shah
Because if you don't, you mess up the trial.
[19:00]
Interviewer 1
Yep.
[19:00]
Munjal Shah
There's also some interesting ways to use the technology for clinical trial recruitment or kind of even qualification. You could ask every patient. You know, there's a lot of soft factors in clinical trial qualification.
[19:11]
Interviewer 1
Sure, right.
[19:11]
Munjal Shah
And it's like, oh, you know, do you get a rash when you put the continuous glucose meter sensors on? Because we don't want you in the trial if you do, because it's a diabetes trial and then you're going to take it off and it's not going to work. That's not in your health record that you get a rash.
[19:24]
Interviewer 1
Right.
[19:25]
Munjal Shah
Like very unlikely. And so, you know, but if the AI could call up a whole bunch of people and ask and kind of figure it out. So I think there's some really interesting ideas like that. We're also expanding internationally. We just did our first deal in the UAE. We're about to do another set of deals in Southeast Asia.
[19:42]
Interviewer 1
Great.
[19:42]
Munjal Shah
We are seeing the whole. I actually thought it was mostly just the US short of clinical staff, maybe a little European. The whole world is short. And you know, with the aging population in so much of the planet, you know, basically we all have no choice. And so we're seeing quite a large demand all over.
[19:57]
Noah Kravitz
Munjal, for listeners who'd like to know more about the company, about any of the aspects of what we're talking about: the website, the best place to go?
[20:06]
Munjal Shah
Yeah, hippocraticai.com. We have a lot about our LLM and the architecture. We published a paper on that. We recently also put out a paper on our safety testing protocol.
[20:17]
Interviewer 1
Oh, great.
[20:17]
Munjal Shah
We actually hope it becomes a kind of a way that everybody starts testing kind of mission critical LLM stuff. And basically this output testing and how we did it in detail, how we hired the people, how we tested the cohorts, how we compared them, the human clinicians. And we hope that sets a new framework for how to do that. And then you can also just read about the company and our history and kind of what we've done and our values and things like that.
[20:42]
Noah Kravitz
Fantastic. Well, Munjal, thank you for taking the time. I was going to say to tell us about Hippocratic, but you've been. It's a short time but you know, the name is out there. People know. So it's great to get an update and the work you're doing and the approach to safety and checking. The output testing I think is really fascinating. So appreciate you taking the time to come tell us about it.
[20:58]
Munjal Shah
Thank you for having me.