Scaling Scientific R&D with AI Supercomputing Infrastructure — with Thomas Fuchs of Eli Lilly - The AI in Business Podcast

Summary

Episode Overview

Podcast: The AI in Business Podcast
Title: Scaling Scientific R&D with AI Supercomputing Infrastructure
Guest: Thomas Fuchs, Chief AI Officer at Eli Lilly
Host: Daniel Faggella (with interviewer Matthew DeMillo)
Date: May 19, 2026

This episode focuses on Eli Lilly's initiative to deploy an AI-powered supercomputing platform to transform scientific research, development, and manufacturing in the pharmaceutical sector. Thomas Fuchs provides an inside look at why AI infrastructure is now a strategic priority, how decades of experimental data—including negative results—are driving new scientific advances, and what measurable impact supercomputing brings to drug discovery and development.

Key Discussion Points & Insights

1. The Strategic Role of AI Supercomputing in Pharma

[03:01–04:34]

AI now touches every part of the pharmaceutical value chain—from discovery and development to manufacturing and even finance.
Eli Lilly leverages AI to enhance small molecules, large molecules, and genetic medicines, as well as clinical operations via LLMs (for regulatory and medical writing) and digital twins in manufacturing.
Computation is fundamental: supercomputing is necessary to handle large models and extensive data sets, and to foster new scientific exploration.

Notable Quote:
“It’s really a scientific instrument. It’s like a telescope for an astronomer to look further back in time and into space or a new microscope. And it allows all our PhDs and machine learning scientists to just have a much, much wider horizon.”
— Thomas Fuchs [07:45]

2. Legacy Infrastructure Meets Modern AI Needs

[04:34–08:12]

Pharmaceutical enterprises are capitalizing on their long history and legacy data, but infrastructure modernization is required for AI capabilities.
Eli Lilly’s new supercomputer—Nvidia DGX SuperPod B300—is described as “enormously powerful” and is poised to be the most powerful in the industry.
Clear business value and impact metrics are a must for every AI project:
- From financial gains (e.g., fraud detection) to scientific advances (e.g., better molecule prediction), every AI initiative is assessed for measurable benefit.

Notable Quote:
“Just one of these new B300 GPUs is as powerful as 7 million of the supercomputers we had in ‘89. So it shows you how great they are. And we actually going to have a thousand of these.”
— Thomas Fuchs [06:40]
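
The scale implied by this quote can be worked out with back-of-envelope arithmetic. The sketch below is illustrative only: it takes the two figures from the quote at face value (they are not independently verified here), assumes naive linear scaling across GPUs, and uses hypothetical names throughout.

```python
# Back-of-envelope scaling of the figures quoted above.
# Both constants come from the quote and are taken at face value.
SUPERCOMPUTERS_1989_PER_B300 = 7_000_000  # "7 million of the supercomputers we had in '89"
PLANNED_B300_GPUS = 1_000                 # "a thousand of these"


def total_1989_equivalents(per_gpu: int, gpu_count: int) -> int:
    """Naive linear scaling; ignores interconnect and utilization losses."""
    return per_gpu * gpu_count


total = total_1989_equivalents(SUPERCOMPUTERS_1989_PER_B300, PLANNED_B300_GPUS)
print(f"{total:,} 1989-era supercomputer equivalents")
```

Real clusters do not scale perfectly linearly, so this is an upper bound on the comparison under the quoted figures rather than a measured number.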

3. Integrating Collaborative AI Science: The Role of TuneLab

[08:12–10:04]

Eli Lilly’s TuneLab platform provides a controlled hub for sharing models and datasets with external biotech partners, academia, and industry.
Only validated, effective, and secure models/data are shared. Partner and co-developed models will join this ecosystem.

Notable Quote:
“We have very, very stringent and precise process metrics internally to actually measure performance, validate them and test them... And then in the future, models we co-develop together, not only with startups and industry, but also with academia.”
— Thomas Fuchs [09:25]

4. Rethinking ROI: Beyond Speed and Cost

[10:04–13:50]

Drug development has traditionally been slow and expensive (10–15 years; $1.5–2 billion per drug). AI can help, but once a candidate is in trials the biology has to play out and cannot be “accelerated.”
The key ROI from AI isn’t just speed or cost—it’s the ability to do better experiments, ask deeper questions, and ultimately develop safer, more effective drugs.
AI helps manage the virtually infinite chemical search space: “more potential drug-like small molecules than atoms in the known universe.”
Immediate ROI is seen in manufacturing, where AI optimizes processes (e.g., accelerating the drying step for active pharmaceutical ingredients, which got millions of doses to patients faster).

Memorable Moment:
“For every molecule that worked, we had millions that failed. It failed because it didn’t bind or it was toxic... and all these never get published. And that’s that humongous space of these negative results. That helps us then to build better molecules.”
— Thomas Fuchs [15:52]

5. The Value of “Negative Data” in Science

[13:50–17:50]

Negative data—failed experiments—was traditionally ignored but is now seen as a “truth reservoir” that AI can leverage to avoid known pitfalls and design better drugs.
Distinction: Negative data (failed outcomes) vs. false data (erroneous or hallucinated results).
Internal data practices use LLMs to orchestrate research, but the most critical scientific models are tailored to molecular complexity—not language.

Notable Quote:
“…If you trained… AI on only the positive outcomes, on the successful results that could get published… that’s a tiny fraction of what would happen… And that’s why of course in our case, we can tap into decades of these results.”
— Thomas Fuchs [15:45]

6. Human–Machine Collaboration & the Limits of Current AI

[17:50–21:30]

There’s an evolving recognition that language models alone can’t capture scientific complexity; specialized models are needed.
The human element is still critical in research; AI augments, rather than replaces, scientists for the foreseeable future.
Fuchs questions current LLM limitations and suggests future AGI will require internal world models and fundamentally new architectures.

Notable Quote:
“We don’t see robots that took millions of years to use our digits. Language [is] much, much younger, mostly used for communication… and again, our language is just not enough to even describe [biology], but makes it exciting.”
— Thomas Fuchs [19:15]

7. Measuring Scientific Value: Real Metrics for AI Supercomputing

[21:30–23:04]

Classic ML projects (fraud, sales, process efficiency) have clear metrics.
For scientific R&D, metrics include:
- Number of candidate drugs co-developed with AI.
- Lab experiments avoided/pruned.
- Improved predictions (in silico, in vitro, in vivo).
- Enhanced discovery of rare disease therapies via nucleotide/RNA models.
The goal: Measurable improvement in scientific output, not just faster or cheaper processes.

Notable Quotes & Memorable Moments

“[The supercomputer] is really a scientific instrument. It’s like a telescope … or a new microscope. It allows our PhDs and machine learning scientists to just have a much, much wider horizon.”
— Thomas Fuchs [07:45]
“For every molecule that worked, we had millions that failed… and all these never get published. And that’s that humongous space of these negative results. That helps us then to build better molecules.”
— Thomas Fuchs [15:52]
“The complexity of a single cell goes far beyond human language can even describe. So you would constrain yourself if you constrained yourself to language-based models.”
— Thomas Fuchs [15:25]
“We are far from AI going rogue and doing everything on its own. These are really tools that help our research teams to co-develop what they do.”
— Thomas Fuchs [21:51]

Key Timestamps for Important Segments

| Timestamp | Topic |
|-----------|-------|
| 03:01 | Defining the supercomputer’s strategic role in AI transformation |
| 05:38 | What the Nvidia DGX SuperPod B300 brings to business and R&D |
| 09:08 | Criteria for sharing models/data via TuneLab with external partners |
| 11:32 | Time, cost, and the deeper ROI of AI in drug discovery |
| 13:50 | Leveraging “negative data” for scientific gains |
| 15:15 | Security, LLM orchestration, and negative vs. false data |
| 17:50 | Understanding AGI limits, human-AI collaboration |
| 21:30 | Measuring real scientific impact and ROI of supercomputing |

Final Takeaways

AI supercomputing is now as vital a scientific tool as any lab instrument, expanding researchers’ ability to test, validate, and discover.
Decades of both positive and negative experimental data uniquely position established pharma companies to build better AI models than startups or newcomers.
AI’s true ROI in pharma is not just reducing cost or accelerating timelines, but enabling better experiments, higher quality candidates, and faster, more scalable manufacturing.
Eli Lilly’s approach integrates robust IT modernization, deep scientific rigor, and a collaborative, secure AI ecosystem that leverages both internal and external expertise.

For business leaders:
This episode makes clear that scalable AI infrastructure is no longer “just IT”—it’s becoming central to competitive strategy in research-driven industries, with measurable scientific and commercial impacts.


Transcript

[00:13]
A
Welcome everyone to the Emerj AI in Business Podcast. Today's guest is Thomas Fuchs, Chief AI Officer at Eli Lilly. Thomas joins Emerj's Matthew DeMillo to explain how Eli Lilly is building an AI-ready supercomputing platform to strengthen discovery, development and manufacturing. He describes how large-scale compute lets researchers work with bigger models, use decades of experimental results and explore a wider range of potential molecules. He also outlines where this is already changing scientific work, from cutting down unnecessary lab experiments to improving early prediction of molecular properties and speeding key manufacturing steps. Just a quick note for our audience that the views expressed by Thomas Fuchs on today's program do not reflect those of Eli Lilly or its leadership. Do you sell AI products or services? Emerj gives you access through trusted content and real conversations. Learn how leading AI brands like Nvidia and Google Cloud work with Emerj to reach Fortune 500 AI buyers. Download our media kit at go.emerj.com/partner. That's go.emerj.com/p-a-r-t-n-e-r. Now the conversation with Thomas.
[01:38]
B
Thomas, thank you so much for being with us on today's show.
[01:41]
C
Thank you for having me, Matt. It's really a pleasure to be with you.
[01:44]
B
Absolutely. We're seeing pharmaceutical leaders really pushing AI in a lot of different spaces. We've seen things like digital twins at places like Pfizer. I think these are moving from not just what's front-facing for customers, but now the thought is moving more towards the infrastructure of the very company itself. And a lot of pharmaceutical enterprises have proven that AI can generate value in research and development in very, very, very incremental ways. But scaling those gains has exposed a much harder problem: most legacy infrastructure was never designed to support that large-scale parallel scientific discovery. For Eli Lilly, and it's a great pleasure having you on the show to give us an inside look at this, you're launching a new AI supercomputing platform meant to solve for performance, security and organizational alignment all at the same time, while ensuring the technology can support real scientific rigor and faster experiments. We're talking today about how Lilly approached those different challenges and why AI infrastructure is becoming a core strategic capability rather than a background IT decision. But just to start off, how is Eli Lilly defining the strategic purpose of the supercomputer within its broader digital and AI transformation roadmap?
[03:01]
C
So Matt, as you know, of course we are in very interesting times, because AI by now is in everybody's language and everybody's mind. I do have a PhD in machine learning from a time when it was not cool yet, so it's very, very nice to see the world change. And to that end, of course, it touches everything in the pharmaceutical value chain. You already mentioned discovery, which is of course a big part. And there we place bets for AI on small molecules, large molecules and genetic medicines. But the beauty is it of course goes far beyond that. If you think in the clinical space, you would have large language models for medical writing or answering regulatory questions, and so forth. And then in manufacturing we're also building digital twins, but in these cases it's digital twins of manufacturing processes or machines or robots and so forth. And then you go into discovery and finance. In all these areas, AI plays a big role. And to that end, if you really want to level up in these spaces, you of course need the compute to drive all of that, to drive the exploration, to build larger models, to build meaningful models and so forth, and take advantage of all the data you already have. And Lilly is a very old company. We are 150 years old this year. And so there are at least decades of data we can bring to bear to really be at the forefront of building foundation models, frontier models in the discovery space, or very dedicated physical models in manufacturing. And the supercomputer is going to allow us to do that in all these spaces.
[04:35]
B
Yeah, the supercomputer itself and your partnership with Nvidia, which we're going to get to, I think really is part and parcel of these trends that we're seeing, especially from, you know, those century-old Fortune 100 enterprises, that there's a certain way of going about this where we can take the benefits that we have from this legacy infrastructure, the data that's there, but also not just have it stand on stilt technology, really get to the heart of these infrastructure systems and modernize them for the capabilities and business goals that we'll need today. And we've had folks from Microsoft come on the show and talk about their partnerships with Nvidia. I've noticed there's always a system at the heart of it. And you read the name off of this and it sounds straight from a sci-fi novel. The system you're using is the Nvidia DGX SuperPod B300 system architecture. Those are fancy words, but what does that mean incrementally for the business and the advantages that you're bringing in modernizing the legacy system?
[05:38]
C
So what it means, it's just a fancy name for an enormously powerful computer chip. It is a graphical processing unit, which is of course very good at linear algebra. And all these neural networks are based on linear algebra. And they can just calculate these neural networks, or whatever needs to be calculated when you actually do forward and backward passes through these very deep, very complex systems, very, very fast. And Lilly always was at the forefront of actually building supercomputers. We had one in the 50s, in '56, from IBM, and we had the Cray-2 supercomputer in the industry in '89. And just one of these new B300 GPUs is as powerful as 7 million of the supercomputers we had in '89. So it shows you how great they are. And we're actually going to have a thousand of these. So it's going to be the most powerful supercomputer in our industry. But you're right, we don't do that for machine learning's or AI's sake. You have to bring value to the business. So all the projects we embark on in these areas have a very, very clear value definition, how it's going to serve, for example, discovery or development or manufacturing. And then they are associated with metrics we can measure, so we can track very precisely if we're making progress towards the goal. These could be a financial benefit in fraud detection applications. But often they are of course focused on how we can serve patients better. And that means: can we predict properties of molecules better, can we design new molecules that work better towards a specific target, can we accelerate some of the development pipelines with AI? And then we are tracking against these metrics so we can really see what impact that compute has. But I think what's even more important actually than all these things is that if you have an appliance like that, it's really a scientific instrument. It's like a telescope for an astronomer to look further back in time and into space, or a new microscope.
And it allows all our PhDs and machine learning scientists to just have a much, much wider horizon. While in the past you could just build very small models on small data, now you can think much, much bigger in these areas. And that in the past always led to breakthroughs, as we had for example in computational pathology or in other areas.
[08:12]
B
Yeah, I think as we get farther down into compute and into deeper areas of compute, more powerful compute, that telescope metaphor is coming back over and over again. I had a recent interview with some folks really doing some fascinating stuff in quantum. And that tends to be, you know, we're looking into places where microscopes is another metaphor. We're looking into spaces more deeply than we ever have before. And that applies to our computing power as well. Staying in your partnerships, and with respect to the point you had made, that yes, we have a legacy system, yes, we go back 100 years, but we've been doing this every couple of decades with the latest technology when we see a threshold within the market. You're also working with TuneLab to operate as a collaboration hub. I'm just interested in the criteria that determine which models or data sets are made accessible to external biotech partners.
[09:08]
C
So first of all, they have to be, of course, good models. So we have very, very stringent and precise process metrics internally to actually measure performance, validate them and test them. So we have to use them ourselves, we have to believe in them, they really have to bring value. Second of all, everything has to be secure and protected, and we make sure that everything we do, also on the AI side, is safe and effective and accretive. So when all these thresholds are passed, then of course it depends what interest there is. And then we decide what to put on TuneLab, and it's not only our own models. So TuneLab of course goes further. By now there are also models from partner companies, and then in the future, models we co-develop together, not only with startups and industry, but also with academia. So we do a lot of open source work and then work with academic partners in the US and beyond to build models for the whole ecosystem.
[10:05]
B
Absolutely. And that kind of articulates the two directions we're seeing with R&D in terms of driving this computing power, driving this technology into these spaces, in that we're at the point where R&D, especially for the pharmaceutical space, can develop more molecules and proteins to make cures for rarer diseases than there are grains of sand on the planet. All at the same time, you have the larger business infrastructure, the thresholds that we've heard so often for decades, that it takes, you know, 1.5 to $2 billion to bring a drug to market, and you're not going to bring more drugs to market than there are grains of sand on the planet at that speed, especially when it still takes 10 to 15 years for that process to really reach fruition. We hear again and again from technology that those numbers do stand to change from these technological developments, from driving AI into these systems. But all at the same time, we have R&D folks come on the show and say, you know what, maybe it's not really about time and cost savings, but maybe having better experiments, maybe asking deeper questions, and that's the real ROI of the AI that we're driving into these systems. Just wondering how you're thinking about the changing timelines for molecule discovery and optimization compared to previous benchmarks, or really what is the ROI between the two.
[11:32]
C
It clearly depends on the area where you apply AI. In discovery, your sentiment is certainly correct. The goal is not to accelerate that dramatically, because at one point, as soon as you of course have it in trials, biology has to play out. You cannot accelerate biology, so you cannot shrink the whole process that drastically. What you can do is de-risk it by actually coming up with better molecules, safer molecules, and also of course come up with new ones you didn't think about, as you said. So if you think about just the combinatorial explosion, if you think of all potential drug-like small molecules, that's more potential combinations than atoms in the known universe. So currently, with the very manual approaches that are still used, you're just scratching the surface. And AI of course will help you to sift through that enormous combinatorial space much, much more effectively. And that does benefit patients drastically, because you can hopefully find molecules for targets you couldn't address in the past and again reduce timelines, hopefully reduce, for example, also animal testing with AI, and just find molecules with better properties. But again, in our case we're going to use the supercomputer far beyond that. And that is, for example, in manufacturing, where you have immediate return on investment. We had a large project where we accelerated, for example, the drying process for medicines, for APIs, something that seems very mundane, but that helped to get millions of doses to patients faster, and that's immediate return on investment. So globally seen, for all our research and engineering efforts, it's a very simple bet because it just brings so much benefit. On the discovery side, of course, it's still early days, right? While in the model space you see a tapering off of the basic models, in the discovery space it's still a wide-open field.
And that of course makes it very interesting from the algorithmic side, from the whole design, test, make and analysis loop. And I think that's also why in these areas, when you have a lot of data and produce a lot of data as we do, you can actually really move the needle drastically in a way nobody else could.
[13:50]
B
Absolutely. And almost needless to say that in these efforts, in driving technology into these spaces, where we end up with more molecules and cures than there are grains of sand on the planet, we have what gets called positive data, which is the data we know is reliable and gets us those strong outcomes. And then we have this space, it's a relatively new term, called negative data, where we kind of threw that data out. If we put it in the wrong place, it might lead to some bad results. But what I found very interesting, especially in what your team gave me on how this supercomputer is going to work, is you guys are actually leveraging a lot of that negative data. It makes sense to me. I come from a bit of an academic discipline, had a philosophy teacher in college who was very much like, you would be surprised at the amount of truth in BS if you know how to look at it the right way. And it seems a little bit like that's the discipline. But of course, for a lot of pharmaceutical leaders, enterprise leaders at home, that negative data can kind of be not just a no man's land, but a very dangerous place, especially where you're trying to drive security and drive away from hallucinations. If you can put a better definition of negative data for the audience, or how you look at it, feel free, but I'm very interested in how you're leveraging it, and any advice for leaders in ensuring that data against security and hallucination concerns.
[15:15]
C
You're bringing up very good points. So first, on the security side, in discovery, we are using large language models mostly just to orchestrate work in an agentic way. That's very useful. But the models that are really driving the design are different models and not language models. The complexity of a single cell goes far beyond what human language can even describe. So you would constrain yourself if you constrained yourself to language-based models. They go beyond these: molecular models, diffusion models, generative flow models that can come up with new molecules. Then what we often mean when we say negative data is negative results. So for example, again, if you trained, let's say, a scientist or a chemist just based on all literature and all publications, you would only train that AI on the positive outcomes, on the successful results that could get published. And that's of course a tiny fraction of what would happen. For every molecule that worked, we had millions that failed. They failed because they didn't bind or they were toxic or for any of these reasons. And all these never get published. And that's that humongous space of these negative results. And that's why of course in our case we can tap into decades of these results. So we know a lot of things that do not work. And that helps us then to build better molecules. And that's also one reason why language models will never be able to actually be a good scientist or solve these problems. You have to go beyond that.
[16:53]
B
Yeah. I've been in this seat for about three years and it's amazing how much the conversation has changed from the initial, I call it the ChatGPT explosion of generative AI, when it seemed like, oh, there's gonna be no speed limits on this. Oh, human beings might be obsolete in a couple years. It is amazing, especially in the last year of 2025, and we're recording this right at the top of 2026, that we went to: oh no, there's gonna be this speed limit. Oh, no, we're gonna need humans in this space for a much longer time. And we don't know if there's ever going to be any kind of handoff. Which is not to say that there aren't handoffs being made or there aren't places that we need to get human beings out of for much, much more proficient systems that deliver better results. But also very curious, you know, we talk about negative data, and again, this is a slightly newish term for the audience. What's the difference between negative data and false data, especially as you're trying to drive those pharmaceutical results?
[17:51]
C
So in that framing, negative would be a negative outcome of an experiment. You set up an experiment, design a molecule, see if it helps for a disease, and then it doesn't work. So you know, that is a negative result in contrast to a positive result. If you think about false data, that's of course prevalent in models like language models or sometimes diffusion models, where you predict something and you have hallucinations, of course, of the language. It's actually a bad term if you call it hallucination or confabulation. Actually, the model is doing exactly what it was designed for. It predicts the next word with the highest probability. And as soon as you ask something which isn't in Wikipedia or in a hundred books, then it just can go off the rails and just tells you something else. It's like a human, if you ask them and they don't really know. What we wanted was a language model that was more like a calculator, always correct instead of just coming up with an answer. And that's an inherent limitation of the current technologies. And that's why many of my colleagues and I think that to actually get to AGI, you have to go beyond that. You have to have models that think in a thought vector space or have an internal world model where you can actually do predictions you can check against. That's very different. Also, we humans mostly do not think in language. Language is not that old. It's 200,000 to 300,000 years old. That's why we don't see robots that took millions of years to use our digits. Language is much, much younger, mostly used for communication. And then again, if you think about a cell, and that brings us back, we are so far from a digital twin of a cell because the complexity needed billions of years of evolution. And again, our language is just not enough to even describe it, but that makes it exciting. And that's also why it's worthwhile to throw the compute and all that work of these fabulously talented people at it.
Because at the end that can help us to lessen human suffering.
[20:04]
B
Yeah, you're opening up a very real can of worms there in AGI. And we have another podcast that our CEO and head of research, Dan Faggella, hosts that really tackles that. But all to say, I think AGI especially, because it's a bit of an event horizon, it's hard to tell what that looks like, because it's going to be, in so many words, smarter than us. And it's just hard for us to imagine what that looks like. But that also might be the point where we need, at least for pharmaceutical results, humans in the loop bridging those systems between deterministic and probabilistic systems, as you say, trying to get the LLM to act a little bit more like a calculator and being that bridge to that deterministic automatic result. I was speaking before about ROI, and definitely something we're hearing more and more, especially from R&D guests on the program in the life sciences space and without, is that we're starting to move away from time and cost being the single filters of our understanding of ROI. And yes, time and cost are important. We want to reduce those as much as we can, but maybe not at the expense of doing better work, of having a deeper understanding of the systems. Just very interested in what metrics Lilly will use to measure this supercomputer's ROI and scientific impact over the next three years.
[21:30]
C
Given that paradigm, again, in manufacturing or commercially there are very hard metrics. Like in classic machine learning projects, you can just measure how much fraud you reduce, how much you increase the efficiency of the salesforce, how much faster you are with your submissions and so forth. In the discovery space, the ultimate goal is of course to have candidates that were co-developed with AI. As you mentioned before, we are far from AI going rogue and doing everything on its own. These are really tools that help our research teams to co-develop what they do. So you have to have candidates at the end. Until you get there, of course, metrics are: how can you more efficiently prune down the number of experiments you can do? Can you reduce the number of wet lab experiments with the same outcome? Can you better predict all these properties of the molecules and medicines? First of course in silico, then in vitro, in the cell cultures, and then afterwards in vivo and so forth. And there are very, very hard metrics on all of these you can measure for these projects, and they are very broad. So we talked a lot about molecular design, but it's also true for genetic medicines, which is a fabulous space of course, because you can now finally start to target very rare diseases, for example in the pediatric space. Genetically driven hearing loss is an example, and Lilly worked on many, many different areas. And these new tools like nucleotide language models or RNA-based models help us to address that.
[23:05]
B
Absolutely. And I think that definitely sets a way for listeners at home to think about, you know, these initiatives going forward, especially where they're getting into the very heart of their enterprise infrastructure and those lines are blurring between, hey, what's the IT space here and what's the fundamental way we do business, and ensuring that very, very powerful technologies are working appropriately in those spaces. Thomas, very, very fascinating stuff that you've brought on today's show. Thanks for giving us an inside look at these new systems.
[23:35]
C
Thank you for the thought-provoking conversation. It was really a joy, Matt. Looking forward to...
[23:56]
A
I think there are three key takeaways from our conversation with Thomas. First, large-scale compute is becoming a scientific instrument that expands what research teams can explore and validate in discovery and development. Second, tapping decades of experimental results, including negative outcomes, strengthens model performance and helps teams narrow in on better capability candidates earlier. And finally, AI-driven compute is already improving day-to-day scientific work, from reducing unnecessary lab experiments to accelerating key steps in manufacturing. Do you sell AI products or services? Emerj gives you access through trusted content and real conversations. Learn how leading AI brands like Nvidia and Google Cloud work with Emerj to reach Fortune 500 AI buyers. Download our media kit at go.emerj.com/partner. That's go.emerj.com/partner. For further executive-level analysis and to join our network of leaders delivering workflow impact with AI, visit emerj.com. On behalf of the team at Emerj, we'll see you on the next episode.