
NVIDIA’s Sama Bali joins to explain how NIM microservices help developers build AI apps quickly, with less effort and no machine learning background.
Loading summary
Sama Bali
Welcome to Reshaping Workflows with dell Pro Max PCs and Nvidia, where innovation meets real world impact in high performance computing.
Logan
Hello, welcome. We have another exciting episode of Accelerating Workflows with Dell Pro Max with Nvidia. My name's Logan, I'm our host. You've seen me several episodes now. You're probably very used to my face. And we have kind of taken you on a journey so far. We've talked about Delpro Max, we've talked about Nvidia RTX Blackwell launch announcements with John. We've talked a little bit more about, you know, the different products that are coming out within Delpro Max, specifically the 14 and 16, Delpromax Premium and then the 1618, Dell Pro Max Plus. But today we're going to do something a little bit different. We're going to talk more about kind of some software and some AI, which is right up my alley. So I have one of my favorite people with me, I have Sama from Nvidia. So, Sama, take a few seconds, introduce yourself, give everyone kind of your background, what you do at Nvidia, and then we'll hop right into it.
Sama Bali
Thank you, Logan. So excited to be here. Hi everybody, I am Sama Bali. I lead AI Solutions Product Marketing at Nvidia. So my job is to take everything that we are building around Nvidia AI and then how that correlates with all the great things that we're doing with our entire line of Nvidia GPUs.
Logan
That's fantastic. So in the spirit of AI, you know, coming off of GTC, obviously a lot of announcements, a lot of great stuff that's coming out from Nvidia Partners, et cetera. But one of them that, you know, has garnered a lot of attention has been around Nvidia nims. So if you haven't heard it the term, you probably will in short order. So let's kind of start simple, set the context. Sama, what exactly is an Nvidia nim? What does that stand for? And then we'll build upon it from there. So that's what is an Nvidia nims. And then what does NIM actually stand for?
Sama Bali
So we've got NIM Microservices. That's the official branding for it. You saw at this gdc, we celebrated one year anniversary of these Nim microservices and essentially they stand for Nvidia Inference Microservices. And then we have the word microservices ahead of it as well. So it's kind of repetitive. But we go with NIM Microservices. What we realized soon was since 2023, there was a lot of interest with development of these AI models, these AI applications. And we wanted to make it simple, simple for everybody to build these AI application systems. You heard a lot about agentic AI and building of these agentic AI autonomous systems as well at gtc. But the heart of these are those AI models. And we also wanted to make sure that we're democratizing this use of AI models so that everybody, and that includes application developers, right? People who are just, who know how to code, who know how to interact with different kinds of APIs. Industry standard APIs have the ability to now add AI to their applications. And that's why we created NIM Microservices. So think of NIM Microservices as a standardized way to deploy and run AI models as containerized microservices. Now I'm going to break this down for you. When we say containerized, right? So these microservices are essentially pre built containers that include Nvidia software like Triton Inference Server, TensorRT, CUDA libraries, along with that AI model. So our job here is to make sure that we've got a NIM container for every AI model out there. So, and then along with the goodness of Nvidia Infant Services, which means that when you're deploying these NIM Microservices of an AI model instead of, instead of that AI model itself, one, the application developer does not have to do any fine tuning between the AI model and the GPU that you're running it on, right? We've made it really, really easy that you point to it and you're good to run altogether. So we have really reduced the time spent and then also making it easier where application developers do not need to have those specific AI skills of figuring out how to run an AI model. But then once you're running these on Nvidia GPUs, you're getting better throughput. You're getting, in some instances we are seeing more tokens getting generated because these are already fine tuned to run on Nvidia GPUs. So that's the containerized part. And then we said microservices. So we're making sure that you have the ability to move to the latest model as soon as you can, right? We're seeing newer versions of AI models come out every few months. As an app developer who's using AI for inference in their applications, you want to make sure that you have the ability to easily swap to the latest AI model out there. So hence why we're also producing these as microservices. So you have the ability to easily swap out the image of an older model with a newer version without really stopping your application workflow. So these are Nim Microservices. We've got the, you'll see we work with all kinds of partners out there. So we've got NIM models for open source models for our proprietary models, even Nim microservices for models which are produced by Nvidia too. So we do aim to have day one support for all models out there, which means that we host them on our website called build.Nvidia.com you can go, you can prototype on the website itself, and then in a few months we make each one of these generally available, which means we've done the proper QA testing for it. That means it's ready to be downloaded locally onto your Dell Promax PCs.
Logan
So, I mean, you kind of nailed everything like, I mean, I think you take like five or six questions all wrapped into one, which is great, and why you're one of my favorite people because you're so comprehensive and know your product line so well. So two things that I heard about Nvidia NIMS that stood out is one is kind of the containerized, easy to deploy, but then also up to date, real time with everything kind of wrapped around it, ready to go. And you made an interesting point is that, you know, by no means am I, you know, data scientist or anything like that. And I have personally ran into a few times where I have fine tuned a model or I have done some stuff with stable diffusion where, hey, there's a new thing out and I have to then go back and redo everything. So if you were looking kind of across, you know, AI developers, software developers, where's the real target market for NIMs? Let's say two big target markets or people who should be really interested if you're, you know, within this use case that should be looking at NIMs, I.
Sama Bali
Would say, number one are application developers, right? These are people who are creating applications. They know how to code with APIs. They know the usual app development workflow may or may not have that AI expertise, right? And at this point in time, almost every industry, almost every business function needs an AI augmented application. So NIM microservices are definitely designed and targeted towards application developers because, because you can just point an API to a NIM model and then continue with your application workflow. So that's the biggest target for us, we want to make sure that everybody, every developer out there has the ability to use AI in their applications. And I would say a secondary target and again a key target here is AI developers as well. Because in a lot of enterprises we are seeing, AI developers are the ones who are doing a lot of these fine tuning of models, figuring out which goes where, trying different kinds of models based on the use cases that they're going after. So AI developers, you know, by default become, become a big target for us as well.
Logan
I love that. So let's, I mean let me, let me kind of bring it with a good example here is that you know, where you should be think if you're an ITDM or you're, you know, a technology matter within a company, large or small is think about it like this is app development. Let's just very simple, you know, example whether you're an ISV and you're selling a software or B you actually have an app in the app store, whatever it is, think application development all the way across and you're like, hey, I want to do, I want to put a chatbot or I want to put you know, a non player character that references some underlying models in a rag database. Right. Is that for you to do that? Yeah, you have the person designed that can make the ui, but you might not have the data scientists on staff to be able to do that. What NIMS does is allow you with, and I don't want to because this will be my next question, but very simply put in a few little things and then you're off to the races or you're a little bit more mature organization, you've got data scientists you allow to free up their bandwidth of not doing well, I'm going to say recursive tasks of constantly optimizing. You take that out to allow them to focus on, you know, what they do best. So with that, how easy and how do you deploy a nim? Like where, like let's say, let's start from a technical standpoint. Where do we go and what do we do?
Sama Bali
Step one for trying these different kinds of NIM models is going to build.Nvidia.com, this is our one stop shop where we host all of the NIM models. So and it's a great website to go just try it around because you'll see we recommend models by different kinds of business functions or you know, functionalities of applications that you're trying to build. So it could be vision, language models, reasoning models and so on and then we also have models listed by industry. So if you are in, let's say gaming healthcare, we've got nims, you know, segregated by. So it's easier for you to find the right kind of AI model, you know, in terms of what you're trying to achieve with your AI application. So that's step one is going to that website. That's where you have the ability to go test out as many different kinds of models, right? This is hosted on Nvidia side. You don't have to pay for anything. You can easily go try a bunch of different models, you know, query a few and see how the results look like. Once you find the model for your specific use case, you can prototype right on the website itself. We've given you the ability you can have your application running locally on your workstation itself while pointing at a new model which is still hosted on Nvidia build Nvidia.com itself. Now let's say you say, okay, I did my prototyping, it was great, I want to continue. That's when you can, you have a few more options. You have the ability to then download that name onto your workstation. Now we've also made it easier for you to download these by introducing the Nvidia Developer program. So this is a free program. To join you, just sign up with your email address. It does not have to be a business or an EDU email address. It can be any generic email address of yours. But what that gets you is free access to all of these NIM models where you can download these NIM models. And as long as you are in that phase of experimentation, prototyping, you know, trying to learn from different kinds of models, it's, it's completely free of cost. So you can easily download that, prototype your application, test out your application, and then once you're ready to deploy that particular application along with the nim, you can use the Nvidia AI enterprise license. That gets you access to a lot of our security support structure. We want to make sure that once you're deploying your application into production, and that can be on a workstation or in the data center or cloud of your choice, you definitely are covered, especially from that support structure. Because especially for enterprises, we are seeing that going from prototyping to deployment and production is a big journey, right? And we've got the Nvidia AI experts who can definitely get you guidance and the right support structure to get you going to production side as well. So that's what the journey looks like. And Another key benefit of NIM microservices here is that these are optimized for all of our Nvidia GPUs, right? For our entire breadth of our GPU portfolio. So you can get started with prototyping and building your application locally on a Dell Promax system. But then once you're ready to, let's say, deploy it out, let's say you're deploying it to four GPU Dell Promax system now, where it is being used by a team of four or six, you can easily move that to a different point without any kind of changes. Right? Because these NIM Microservices are optimized for all GPUs, and the same goes for that. Once you're ready to deploy it for, let's say, your entire enterprise and you want to host it within your data center again, these NIM microservices can easily be moved. There is no extra optimization work required because we've done that hard work for you.
Logan
That's fantastic. So you mentioned you know, Nvidia AI Enterprise, which is the key kind of, I would almost say, you know, the hub and spoke of kind of everything. AI deployment from Nvidia every. All roads lead to Nvidia AI Enterprise. At the end of the day, if you are interested in NIMs, by default we're going to Nvidia AI Enterprise. Or are there other ways to access NIMS for a deployment in a sense outside of Nvidia AI Enterprise, or is it just through Nvidia AI Enterprise?
Sama Bali
I would say that if you are just trying to experiment and prototype with different kinds of NIM models initially we've got the Nvidia developer program. It's a free program, there are no strings attached to it. That's where we see most of our developers and that includes AI developers, data scientists and app developers kind of getting started with the process. Because you know, you have. There are so many AI models, you've got a very specific use case for yourself. You've got to try the different kinds of models to figure out, you know, this one fits the bill perfectly for me. I will say that we've got the developer program to kind of get you started to help you experiment, test, prototype. Once you're ready for deployment, then yes, you should get the Nvidia AI Enterprise license because we want to make sure that from a security support perspective, you are 100% covered in terms of your production deployment.
Logan
Yep, absolutely agree. So wanted to make Hammer that home. I knew the answer, but it's worth saying but There are other roads into with NIMs, specifically around Nvidia AI workbench that ties in with that. Let's talk a little bit about Nvidia AI Workbench in nims. As we know with Nvidia AI Workbench, one of my favorite tools. Tons of projects, free to get started, free to use, very customizable, lots you can do. It's a good place where I like to say, get kind of your training wheels around AI, where a lot of the heavy lift has been taken out for you, but you still get to learn a lot and do a lot. But there's the ability to integrate nims. So let's talk about two. Well, let's start with one is let's talk about kind of the PDF to podcast kind of blueprint for NIMS on workbench that we're going to be showing off, that we showed off at gtc. Can you tell everyone a little bit about that and you know, how it kind of functions with NIMS within Nvidia AI or within AI Workbench.
Sama Bali
So let me first start by explaining what are blueprints and then I'll talk a little bit about Workbench and then I'll talk about, you know, how we've integrated all these pieces together. So while once we introduced, you know, NIM microservices, these are the heart of your AI systems, we then started to get feedback from our customers that great, you've given us these optimized, you know, AI models to run on Nvidia GPUs. They were still struggling to build, you know, a lot of the common use cases. And by that I mean like a PDF extraction system, which is more like a RAG system in healthcare. They wanted like a virtual screening application applications. So for media entertainment, we were hearing about, you know, how they are analyzing tons and tons of these videos. They wanted a way for AI to, you know, search for a specific thing, summarize a report and things like that. We introduced Nvidia AI blueprints. These, think of these as reference AI workflows, right? We are giving you a complete architecture system. We're giving you exactly all the components that you need to build a specific AI use case. And that goes from building a lot of your agentic AI systems as well, where you can have things like what Logan mentioned, like PDF to podcast, where you can now build a research assistant for yourself who is listening to this tons and tons of data and then creating like a summarized audio version for you. So we, so these blueprints really are really There to make sure that you have an easy step by step guidance to build a lot of these AI applications. By the same time you have the ability to customize these based on your data, based on your specific needs. So think of blueprints as hellofresh. So if you are in the US, you know about HelloFresh, right? That, that's a. It's not, it's like a, it's like a food delivery system where they get you all the ingredients, they get you how to build that specific recipe, but you have the ability to add a few spices of your own so that it's to your taste and liking. So that's exactly what we have as AI blueprints, right? And each one of them has these NIM microservices embedded in them so that you have the best options out there to build your AI use cases at the same time. Shifting gears to AI Workbench. This is our development platform. It integrates with all kinds of ides. It gets you that one stop shop to manage your GPUs and to build a lot of these AI projects. And again the key part here that you have the ability to build your AI project and then move it around from let's say you started on workstation to data center to collaborate with more developers, move it to the cloud, move it back to the workstation as well. So that gives you the freedom to really build a lot of these AI projects and do in a one stop shop development environment. It is also a free tool. You can download it easily on Windows, Linux, even macOS if you want. Now with Workbench, since we were seeing a lot of traction with a lot of developers who wanted to build these AI projects starting on their workstations itself, we have created a bunch of projects as Logan mentioned. Each one of these projects, whether you're building a RAG application, whether you're building a virtual assistant, whether you're building, you're fine tuning a model itself. A lot of these projects already have NIMs embedded in them. So by that I mean we've created these again step by step instructions for you. You can also go ahead, clone a specific project and then, you know, fine tune and make your own changes within that as well. But then recently we've also added the ability to actually create these blueprint inspired applications. So you have the ability to use your workbench and then as part of it, follow that step by step guidance. It's all available on our GitHub pages and then create a PDF to podcast application where you can feed in a document of sorts, and it will create a summarized audio for you. The same goes for a different number of projects where all of the blueprints that are coming out from Nvidia and our partners will soon make it available that you can build these applications through the AI workbench environment, which I know a lot of developers enjoy, and then deploy it to their point of wherever they would like to deploy that.
Logan
That's fantastic. Let's talk. Let's go back to NIMS a little bit where, you know, if you go to just Nvidia.com, go to the US site and then go to AI under artificial intelligence, or you just Google Nvidia Nims, you're going to get to kind of a landing page where you're going to see all of the ones that are currently available. So, couple of quick questions, and I, and I want to call this out because I recently learned this, is that there are some NIMS that will run on a Dell Pro Max and there's some that will not. And within that there are some. There's a kind of a keyword where it's kind of run anywhere versus, you know, ones that are a preview. So if someone is going to that page and they have a Dell Pro Max or, you know, a legacy Precision workstation, which ones are they going to want to gravitate to?
Sama Bali
I would gravitate, definitely. Okay, let me take a step back. If you're just trying to try out these different kinds of, you know, NEM models, I would say go try all of them, right? It doesn't matter what the tag is. It's only when you want to download these locally onto your Pro Max system. That's when you should look for the tag which says run anywhere. Those have been an easy way to say that in the software world that these are generally available, right? That means that we've done, we've made sure that all the security components, all the software QA testing, everything is done on our end. And now it's generally available for, you know, everybody to download these on the system, use it, add it to their AI applications. I would say look for that run anywhere tag. Now, within that as well, a key component is to ensure that your AI model fits onto your GPU memory size. Right? We've got. Let's talk about llama 3 itself, right? It can go anywhere between 8 billion parameters to 70 billion to 405 billion parameters as well. Now, you have to make sure that that model you are selecting actually fits onto the GPU memory size. So 8 billion parameters will easily fit. I have a blog which talks about how you should be calculating the GPU memory size based on the precision, the parameters of a model as well. And I will send it to you, Logan, so that can be added here. But that's a key component of understanding the model size to the GPU memory and how you play with that as well. So look for that, run anywhere tag. Those models are all generally available, but at the same time, every model NIM model comes with a model card. It's linked on the website itself. You can see how much GPU memory is required. We also keep updating the list of GPUs that it has been tested on and it has been, you know, required amount of memory for it. So that's another page to go. Figure out how much GPU memory is required before you download that model.
Logan
Yeah, that makes total sense. I mean, that's a thing that you're going to see a little bit. And you know, as I've learned kind of more about nims is they are going to. They're a little more beefy, to be honest, than just the traditional model. But it is, you're also. It's like you kind of. I'm going to try, I'm going to do a bad analogy. I have not planned this, but it's, you know, you're going to dinner, right? And you see, you go to an all you can eat. Where I grew up was like a, you know, a small golden corral buffet. It wasn't very expensive. But there's other places where you are a hundred dollars a plate, but you get multiple courses, finer things. So it is, in terms of the GPU size, it requires a lot more because it has the optimizations, it has the security, it has everything that you will eventually need running and going for you. But that comes at the cost of the. Not a cost, but it comes at the expense of your GPU memory. And you know how much vram, which, you know, if you followed along and you've watched some previous episodes, that is not so much of a problem with the new Blackwell GPUs. But that's a conversation we've already been through, so we won't even go there. But you know, when I look at the models that you know are available for nims, you know, there's a good depth and breadth, right? There is traditional like chat, you know, querying, you know, models. There's some stability AI like sdxl, just for image and content generation. There's other ones that are Vision, like why not? I know why you're adding these models because they're obviously popular and stuff like that. But what's kind of the process on, Is it customer request? Is it, you know, the ask of the market? Are you looking to see which ones are most popular being downloaded from hugging face? How does a model ultimately end up as a nim.
Sama Bali
I think I mentioned in the beginning as well, we tend to work with all kinds of model providers. So proprietary, open source, you know, our own data scientists who are building Nvidia models. So we tend to work with everybody and our aim is to have every AI model as a name, microservice available. We try to do that generally on day one itself. But at the same time we are definitely looking at more and more of the customer feedback in terms of, you know, where, where, what kinds of applications are they trying to build? And it's. And you'll find everything possible. Like we've got a model for everything. You're trying to build a digital human, you're trying to build a weather forecasting application, you're trying to build a medical imaging AI system. So we literally have an AI model for almost every AI application that you're trying to build. But yes, we definitely look at customer demand requests constantly, are working with every kind of model provider out there so that our customers have the ability to build the best AI applications with the best inference and performance possible.
Logan
I love that. So one thing, and I always like to tie it back to use cases is, you know, we've talked about some of the blueprints, we've talked about that, but not necessarily that, but like in the wild use cases from real customers, you don't have to name names, we can keep it all proprietary, but maybe two or three real customer use cases that you've, you know, you've seen, you've learned about just to give people. And I think that's the thing around AI that I won't say is confusing, but it's the art of the possible. Right? Like I usually hate that term because it's like such a corporate term. But AI is so new, it's like what is possible? Because until you dream it up, we don't know what is possible, we don't know what is next. Right. Like agentic rag six months ago, eight months ago was like, what is this? This is not possible to do this. And now it's very possible to do this. So maybe two or three really good use cases that you've seen just to help people get the their idea wrapped around how NIMs can help accelerate their AI, their workflows, definitely.
Sama Bali
So I'll start with the typical ones. For now we've got a mega store where you go and buy a lot of your home furnishings. Like if you had to go paint your house, where would you go? That kind of an organization, right? They have deployed NEM Microservices to kind of help uplift and upskill a lot of their employees. So they're deploying these so that they can easily go find things, easily ask questions, get answer to these. So think of this as, you know, an employee chatbot of sorts which is deployed organization wide. We've got a major retailer, they make everything from cosmetics to shampoos to body wash to everything. They are using NIM Microservices as part of their, of creating their brand images, right. So they're, they're creating new marketing campaigns as soon as possible by deploying these NIM microservices within their applications, specifically be used by their marketing department. We're seeing a lot of weather companies now building these weather forecasting models using NIM Microservices because I think 2024 was definitely a year where we saw a lot of devastation because of it. So we're seeing more. A lot of these organizations now using these NIM Microservices just so that, you know, we can be prepared better. A lot of universities and health, health organizations are now using NIM Microservices to help predict then the best way to cure a lot of the diseases we have currently to best predict, you know, better vaccines out there to help mitigate a lot of these really harmful diseases that are going around us. So those are some of the ones which we are constantly seeing. We're seeing a telecom organization use it, we're seeing energy customers also use to optimize their resources using NIM Microservices. So the use cases are definitely endless. I would recommend going back and seeing our GTC keynote to see all kinds of use cases happening with NIM Microservices.
Logan
I love it. And you know, as you can see from that answer, it's not necessarily one industry, it's not necessarily one use case. I mean, we went from vaccine creation and genomics all the way to weather prediction and analysis to chatbots to, I mean, the world is kind of your oyster, right? And I think that that is the really interesting, I think the really big takeaway from NIM Microservices. Right? And let's be, let's be honest, if you've got a data scientist, you're doing your thing, use AI enterprise, test out a nims. Hey, it's great. But really for those that don't, that don't have a surplus of data scientists to do kind of this back end pre work or have the cycles to continue to update, optimize all of that, or if you have an organization of truly just software developers that, you know, they're great clickety clack, you know, APIs, but they're not a machine learning engineer. They don't have that skill set, they've never done it. You don't want to go out and pay for it. NIMS is a great solution for that, to inject AI really into any application. And that's kind of the key that I think people get when you think about nims. And to be honest, I am guilty of it. And it's because it is kind of a weird concept. You're like, hey, I'm containerizing and doing all this and I can't tell you how many times Sama and I were like. So I was like, where does it live? Does it live on the GPU or is it an API call? Like, is it what is happening? And once you understand it, really think about like the hellofresh, right? You're basically allowing to bring it in. You're basically bringing in the new microservice. Everything's kind of been done for you. You can create it that way. It's perfect. You want to optimize, change, do things a little different. Absolutely, yeah. You know, add your spice of life. It's all great. But it really helps those. And I think folks that are watching this podcast and ultimately the webinar that have been on the fence about AI and they're like, either one, I don't have the resources, I don't have the technical knowledge, like, or I'm just scared to jump in. NIMS are the easiest way to start, in my humble opinion. It's a couple lines of code, it's.
Sama Bali
You'Re ready, you've got the developer program, go go wild, right? Like you can go try download as many as you want. There is no restriction around it.
Logan
Exactly. And you make it free. And that's the best part about Nvidia is like, you're right, you go to the developer portal, you sign up and then, yeah, you can test iterate. As long as you're not going to production, it kind of is your heart's desires, right? This has been great. So I'm like, you know, we're getting kind of up with it towards the End of the episode. So I want to take, you know, the last kind of question is, what are maybe two or three key takeaways that you want everyone watching this, whether being a customer or someone in sales or, you know, folks that are just out in the ether, to walk away remembering about nims.
Sama Bali
Okay, I'll start my top three, right? The first one is that running these Nim Microservices, especially locally on Dell Promax Systems, is going to significantly speed up your development and testing process, right? You as a developer does not have to continue to wait for a data center or a cloud resource just to go try out a few AI models, right? You have the ability to download first. You have the ability to go try out these different kinds of AI models on Build Nvidia. Com. You can prototype with it right there as well. But then you have the ability to test it, create that application, prototype your entire application. And then once you're ready to move, that's when you can go ask for that data center resource. Because we are at this particular point in time where data center, cloud resources, it's getting very difficult to get your developer resources just for testing, experimentation, even learning to some point, right? It is, it is involving tech. We have to continue learning new techniques, figure out new models. So you don't want your developers to be sitting idle. You want them to be continuing to push that boundary of innovation. So NIM Microservices on your Dell Promax systems get you that best combination where they have the ability to go test, experiment, learn, prototype, and then once they're ready for production, that's when you can bring in your, your data center, your cloud resources as well. And then once you're ready to move, that's another benefit where with Nim Microservices you have that version control and the ability to reproduce that same consistent environment across your local workstations and data center itself. That means you are streamlining that entire transition process, you're reducing the deployment time, and you're minimizing that risk of error as well. And then the third key point are cost savings, right? As an IT decision maker, I want to make sure that one, I have a streamlined, standardized process across all of my resources. But then you're also avoiding any kind of unnecessary cloud computing resources, especially when your developers are just trying to test out a few things locally. Running these Nim models on workstations is not going to cost you at all accessing them. We've also made it free, so there's going to be significant cost savings for you in the larger run.
Logan
I love it. So I mean really it comes down to test to your heart's desire. Go play around. 2. When you're ready, we allow you to do that locally or whether that's in the cloud. And once you're ultimately ready to launch and support whether you want to put it, it comes with the support, the documentation, the know how to get it done. I think Nims. Nims for President 2025 maybe. I don't know. I will say. Well, with that Salma, really appreciate the time as always. It's always fantastic talking to you. So today, you know, I want to wrap up with, you know, AI can be a bit daunting at times. You know, as I went on my personal journey with AI, trying to learn, you know, at first it was a lot of late nights, a lot of reading Reddit, but really if you think about nims, it is almost, dare I say, a cheat code to kind of get started. That helps you avoid a lot of that initial confusion, helps you accelerate a lot faster. It helps you almost eliminate the need for someone who's maybe not skilled as maybe not an AI developer being able to bring AI into your, whether it be your application or whether it's inside your company infrastructure very quickly. So I hope you enjoyed. Obviously there'll be a lot more about nims. We'll probably definitely have to do a follow up, but with that, Sama, once again, thank you so much.
Sama Bali
Thank you for having me.
Logan
Of course, of course. Anytime. This is Logan with accelerating workflows with Dell Pro Max and Nvidia signing off. And we'll see you on the next one. This podcast was produced in partnership with Amaze Media Labs.
Podcast Information:
In this insightful episode of "Reshaping Workflows with Dell Pro Max and NVIDIA RTX PRO GPUs," host Logan Lawler engages in a deep conversation with Sama Bali, the Head of AI Solutions Product Marketing at NVIDIA. The discussion centers around NVIDIA's innovative NIM (Nvidia Inference Microservices) Microservices and their transformative impact on making AI accessible and efficient for developers and businesses alike.
Logan initiates the conversation by seeking clarity on what NIMs are. Sama Bali provides a comprehensive explanation:
[02:10] Sama Bali: "NIM Microservices stand for NVIDIA Inference Microservices. They are a standardized way to deploy and run AI models as containerized microservices. This approach simplifies the integration of AI into applications by providing pre-built containers that include essential NVIDIA software, such as Triton Inference Server, TensorRT, and CUDA libraries, along with the AI model itself."
Key Points:
Logan delves into who stands to benefit the most from NIMs. Sama identifies two primary target groups:
[06:53] Sama Bali: "Number one are application developers who are creating applications and may not have specific AI expertise. Number two are AI developers who fine-tune models and need efficient tools to optimize their workflows."
Key Points:
The conversation transitions to the practical aspects of deploying NIMs. Sama Bali outlines the deployment journey:
[09:17] Sama Bali: "Step one is visiting build.Nvidia.com, where all NIM models are hosted. You can prototype directly on the website. Upon deciding to proceed, you can download the desired NIM models to your Dell Pro Max workstation via the free NVIDIA Developer Program. For production deployment, the NVIDIA AI Enterprise license provides enhanced security and support."
Deployment Steps:
Logan highlights the synergy between NIMs and NVIDIA AI Workbench, prompting a detailed explanation from Sama:
[15:21] Sama Bali: "NVIDIA AI Blueprints are reference AI workflows that provide complete architectural guidance for specific use cases, such as a PDF to podcast application. These blueprints incorporate NIM Microservices, allowing developers to build and customize AI applications effortlessly within the AI Workbench environment."
Key Points:
When discussing compatibility, Sama Bali advises:
[20:20] Sama Bali: "For deploying NIMs locally on Dell Pro Max systems, look for the 'Run Anywhere' tag on the NVIDIA website. Ensure that the AI model fits within your GPU's memory capacity by referring to the model card provided for each NIM."
Key Considerations:
Logan encourages Sama to share practical applications, to which Sama responds with a variety of industry examples:
[26:06] Sama Bali: "A major home furnishings retailer uses NIMs to deploy employee chatbots, enhancing customer service and internal support. A cosmetics company leverages NIMs for rapid brand image creation in marketing campaigns. Weather organizations employ NIMs for sophisticated forecasting models, improving preparedness and response."
Highlighted Use Cases:
Logan inquires about how new models are incorporated into the NIM ecosystem. Sama Bali elaborates:
[23:59] Sama Bali: "We collaborate with various model providers, including proprietary and open-source developers, to ensure a diverse range of NIMs. Customer feedback and market demand heavily influence the addition of new models, catering to a wide array of AI applications across different industries."
Key Points:
As the episode nears its conclusion, Sama Bali summarizes the core benefits of NIM Microservices:
[30:53] Sama Bali:
- Accelerated Development: "Running NIMs locally on Dell Pro Max systems significantly speeds up development and testing, reducing dependency on data centers or cloud resources."
- Seamless Transition to Production: "NIMs offer version control and consistent environments across local and production deployments, streamlining transitions and minimizing deployment errors."
- Cost Savings: "Standardized processes and the ability to conduct extensive local testing without incurring cloud costs lead to substantial financial efficiencies."
Additional Insights:
In this episode, Logan Lawler and Sama Bali effectively demystify NVIDIA's NIM Microservices, showcasing how they revolutionize AI integration across various applications and industries. By providing a streamlined, cost-effective, and user-friendly approach, NIMs empower developers and organizations to harness the full potential of AI without the complexities traditionally associated with its deployment.
For those interested in leveraging AI to enhance their workflows, exploring NVIDIA's NIM Microservices on a Dell Pro Max system presents a compelling and accessible pathway to innovation.
Notable Quotes:
Sama Bali [02:10]: "NIM Microservices are a standardized way to deploy and run AI models as containerized microservices, making it easy for application developers to integrate AI without deep expertise."
Logan [06:53]: "Where's the real target market for NIMs? Application developers and AI developers who need efficient tools to integrate and optimize AI models."
Sama Bali [30:53]: "Running these NIM Microservices locally on Dell Pro Max systems significantly speeds up your development and testing process."
Subscribe: To stay updated with the latest advancements in high-performance computing and AI innovations, subscribe to "Reshaping Workflows with Dell Pro Max and NVIDIA RTX PRO GPUs."