Transcript
Sama Bali (0:05)
Welcome to Reshaping Workflows with Dell Pro Max PCs and Nvidia, where innovation meets real-world impact in high-performance computing.
Logan (0:20)
Hello, welcome. We have another exciting episode of Accelerating Workflows with Dell Pro Max with Nvidia. My name's Logan, I'm your host. You've seen me for several episodes now, so you're probably very used to my face. And we have kind of taken you on a journey so far. We've talked about Dell Pro Max, we've talked about the Nvidia RTX Blackwell launch announcements with John. We've talked a little bit more about, you know, the different products that are coming out within Dell Pro Max, specifically the 14 and 16 Dell Pro Max Premium and then the 16 and 18 Dell Pro Max Plus. But today we're going to do something a little bit different. We're going to talk more about some software and some AI, which is right up my alley. So I have one of my favorite people with me, I have Sama from Nvidia. So, Sama, take a few seconds, introduce yourself, give everyone kind of your background, what you do at Nvidia, and then we'll hop right into it.
Sama Bali (1:17)
Thank you, Logan. So excited to be here. Hi everybody, I am Sama Bali. I lead AI Solutions Product Marketing at Nvidia. So my job is to take everything that we are building around Nvidia AI and show how that correlates with all the great things that we're doing with our entire line of Nvidia GPUs.
Logan (1:34)
That's fantastic. So in the spirit of AI, you know, coming off of GTC, obviously a lot of announcements, a lot of great stuff that's coming out from Nvidia, partners, et cetera. But one of them that, you know, has garnered a lot of attention has been around Nvidia NIMs. So if you haven't heard the term, you probably will in short order. So let's start simple and set the context. Sama, what exactly is an Nvidia NIM, what does NIM actually stand for, and then we'll build upon it from there.
Sama Bali (2:10)
So we've got NIM microservices. That's the official branding for it. You saw at this GTC, we celebrated the one-year anniversary of these NIM microservices. Essentially, NIM stands for Nvidia Inference Microservices, and then we have the word microservices after it as well, so it's kind of repetitive, but we go with NIM microservices. What we realized soon was that since 2023, there's been a lot of interest in developing these AI models, these AI applications, and we wanted to make it simple, simple for everybody, to build these AI application systems. You heard a lot about agentic AI and the building of these agentic AI autonomous systems at GTC as well. But at the heart of these are those AI models. And we also wanted to make sure that we're democratizing the use of AI models so that everybody, and that includes application developers, right? People who know how to code, who know how to interact with different kinds of industry-standard APIs, have the ability to now add AI to their applications. And that's why we created NIM microservices.

So think of NIM microservices as a standardized way to deploy and run AI models as containerized microservices. Now I'm going to break this down for you. When we say containerized, these microservices are essentially pre-built containers that include Nvidia software like Triton Inference Server, TensorRT, and CUDA libraries, along with that AI model. So our job here is to make sure that we've got a NIM container for every AI model out there, along with the goodness of Nvidia inference services. Which means that when you're deploying these NIM microservices of an AI model instead of that AI model itself, one, the application developer does not have to do any fine-tuning between the AI model and the GPU that you're running it on, right? We've made it really, really easy: you point to it and you're good to go. So we have really reduced the time spent, and we're also making it easier, where application developers do not need those specific AI skills of figuring out how to run an AI model. And then once you're running these on Nvidia GPUs, you're getting better throughput. In some instances we are seeing more tokens getting generated, because these are already fine-tuned to run on Nvidia GPUs. So that's the containerized part.

And then we said microservices. We're making sure that you have the ability to move to the latest model as soon as you can, right? We're seeing newer versions of AI models come out every few months. As an app developer who's using AI for inference in their applications, you want to make sure that you have the ability to easily swap to the latest AI model out there. Hence why we're also producing these as microservices: you have the ability to easily swap out the image of an older model with a newer version without really stopping your application workflow.

So these are NIM microservices. You'll see we work with all kinds of partners out there. So we've got NIM microservices for open-source models, for partners' proprietary models, even NIM microservices for models which are produced by Nvidia too. We do aim to have day-one support for all models out there, which means that we host them on our website called build.nvidia.com. You can go prototype on the website itself, and then in a few months we make each one of these generally available, which means we've done the proper QA testing for it.
That means it's ready to be downloaded locally onto your Dell Pro Max PCs.
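To make the prototyping flow Sama describes concrete, here is a minimal sketch of calling a hosted NIM from build.nvidia.com through its OpenAI-compatible API. The specific model ID and the NVIDIA_API_KEY environment variable are illustrative assumptions; the actual model IDs and API keys come from the build.nvidia.com catalog.

```python
import os

from openai import OpenAI

# Hosted NIM endpoints on build.nvidia.com speak the standard
# OpenAI-compatible API, so the stock OpenAI client works unmodified.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],  # key generated on build.nvidia.com
)

# The model ID below is illustrative; pick any NIM from the catalog.
response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Summarize what a NIM microservice is."}],
    max_tokens=200,
)

print(response.choices[0].message.content)
```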
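And once a NIM container has been downloaded and is running locally, for example on a Dell Pro Max with an Nvidia GPU, the same client code applies with only the base URL changed. That shared API surface is also what makes the model swapping described above low-friction. A sketch, assuming a NIM container is already serving on its default port 8000:

```python
from openai import OpenAI

# A locally running NIM container exposes the same OpenAI-compatible
# API, by default on port 8000, so only the base_url changes.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-used",  # local NIMs typically don't enforce a key by default
)

# Because every NIM presents the same API surface, swapping the container
# image for a newer model version usually means changing only this model
# string; the application code around it stays the same.
response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",  # illustrative; match the container you pulled
    messages=[{"role": "user", "content": "Hello from a local NIM."}],
)
print(response.choices[0].message.content)
```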
