NVIDIA’s Agentic AI for Container Security with Amanda Saunders and Allan Enemark - Software Engineering Daily

Summary7 min read

Podcast Summary: NVIDIA’s Agentic AI for Container Security with Amanda Saunders and Allan Enemark

Podcast Information:

Title: Software Engineering Daily
Host/Author: Software Engineering Daily
Episode: NVIDIA’s Agentic AI for Container Security with Amanda Saunders and Allan Inemark
Release Date: January 30, 2025
Description: Technical interviews about software topics.

Introduction

In this episode of Software Engineering Daily, host Gregor Vand engages in an in-depth discussion with Amanda Saunders, Director of Enterprise Generative AI Software at NVIDIA, and Allan Inemark from NVIDIA’s Morpheus Cybersecurity SDK team. The conversation centers around NVIDIA's innovative "blueprints"—reference workflows leveraging agentic and generative AI for various applications, with a particular focus on vulnerability analysis for container security.

Guest Backgrounds

Gregor Vand introduces the guests, highlighting their extensive experience at NVIDIA. Amanda Saunders has been with NVIDIA for a decade, evolving with the company's transition from graphics to machine learning and now generative AI. Allan Inemark has spent seven years at NVIDIA, moving from data visualization to cybersecurity, utilizing his background in industrial design to contribute to the Morpheus team.

Understanding NVIDIA Blueprints

Amanda Saunders provides an overview of NVIDIA blueprints, describing them as composite packages that integrate NVIDIA’s libraries, SDKs, microservices, and open-source tools into customizable reference workflows. These blueprints serve as starting points for developers, enabling them to rapidly build applications tailored to their specific needs.

“They are reference workflows. They take all the libraries, the SDKs, the microservices, even things from the open source and from the ecosystem around AI and simulation and everything like that. And it packages it up into something that can be taken, composed, customized to meet the needs of whatever company is taking them on.”
— Amanda Saunders [05:00]

Blueprints Beyond Vulnerability Scanning

When asked about other blueprint topics, Amanda mentions that NVIDIA has developed around 14 blueprints, categorized into four main areas:

AI: Including customer service agents and digital humans.
Omniverse: For digital twin platforms and simulations.
Bionemo: Targeted at the healthcare marketplace to leverage AI in digital biology.
Isaac Groot: Focused on robotics platforms.

Deep Dive: Vulnerability Scanning Blueprint

The conversation shifts to the specific blueprint for vulnerability scanning in container security. Allan Inemark explains the challenges of managing the vast number of Common Vulnerabilities and Exposures (CVEs), which are continuously growing and pose significant threats to containerized applications. Traditional methods of vulnerability analysis are time-consuming and labor-intensive, prompting NVIDIA to develop an AI-driven solution.

“Right. So if you're releasing thousands of containers and then you have tens of thousands of CVEs and each one is a real difficult thing to mitigate. It's hard to put out secure software.”
— Allan Inemark [09:57]

Leveraging Morpheus SDK and Agentic AI

Allan elaborates on NVIDIA’s Morpheus SDK, a cybersecurity AI framework designed to handle large-scale data processing efficiently using GPU acceleration. The blueprint utilizes Morpheus to create an event-driven, streaming pipeline that automates the mitigation of CVEs by leveraging large language models (LLMs).

“Cybersecurity is a data problem. And you know, there's so much data that it's hard to get grips on. You can't store it all. And if you have stored it all, it's hard to kind of munch through it.”
— Allan Inemark [10:08]

Model Selection and Embeddings

The blueprint employs NVIDIA’s NIM (NVIDIA-Integrated Models) microservices, specifically utilizing the Llama 3.17B model for its balance between performance and capability. Amanda highlights the flexibility of blueprints in accommodating new models as they are released.

“The blueprint today built on Llama 3.170 B because we think that works really well. But three months down the line, six months down the line, maybe we'll update it with some of the later models.”
— Amanda Saunders [21:42]

Allan further explains the role of embeddings in translating documents into a format understandable by the model, enabling efficient querying and context understanding.

“The embedding space is sort of how you translate a document to something that the model can understand.”
— Allan Inemark [22:43]

Technical Architecture: Plan and Execute Pipeline

Gregor Vand probes into the architectural choices behind the blueprint, focusing on the "plan and execute" paradigm. Allan describes this structure as mirroring the workflow of security analysts, breaking down tasks into parallelizable segments handled by different agents powered by LLMs.

“You're making an application, you're more of an agentic system application. Right.”
— Allan Inemark [39:36]

Real-World Reception and Community Engagement

Both Amanda and Allan discuss the positive reception of the vulnerability scanning blueprint within NVIDIA and among external partners. The open-source nature of the blueprints encourages community contributions, enhancing the tools through collaborative refinement.

“Developers have been really excited to have a starting point... It's just a lot. And so I think just having that starting point for people to then say, okay, well, I followed the blueprint, I did these things, but actually for my use case, if I make these changes, it works better.”
— Amanda Saunders [31:09]

Future Directions and NVIDIA’s Role in Security

Addressing NVIDIA’s future in the security landscape, Amanda clarifies that NVIDIA does not aim to become a cybersecurity solution provider. Instead, the company focuses on empowering cybersecurity professionals with advanced AI tools.

“We are an AI company. We understand AI, we're excited about it, we've got a lot of experience in it. So what we want to do is bring AI to cybersecurity providers, to cybersecurity problems.”
— Amanda Saunders [40:45]

Allan adds that NVIDIA's approach is to offer versatile frameworks like Morpheus and blueprints that can be adapted to a wide range of cybersecurity challenges, fostering innovation across various domains.

Personal Reflections and Advice

Towards the end of the episode, Amanda and Allan share personal insights on career development. Amanda emphasizes the importance of continuous learning and adaptability, while Allan reflects on the necessity of staying curious and open to evolving technologies.

“You can learn anything. As somebody who doesn't have a traditional technology background... as long as you stay open and keep learning, there's no telling where you can go.”
— Amanda Saunders [42:23]

“What stuff is able to be done right now? So on some level, just be open to it and stay curious.”
— Allan Inemark [43:01]

Conclusion

Gregor wraps up the discussion by encouraging listeners to explore NVIDIA’s build.Nvidia.com platform to access and experiment with the blueprints. Both guests express enthusiasm for the ongoing development and community engagement surrounding NVIDIA's AI-driven cybersecurity tools.

“Go to build.Nvidia.com that's their starting point. We've got all the latest models on there. We've got these blueprints that you can go and explore, and then your journey just starts there.”
— Amanda Saunders [45:11]

Final Thoughts

This episode highlights NVIDIA’s commitment to advancing cybersecurity through AI, offering developers and security professionals robust tools to tackle complex vulnerability management challenges. By providing customizable blueprints and fostering an open-source community, NVIDIA empowers users to build secure, efficient, and scalable containerized applications.

Notable Quotes:

Amanda Saunders [05:00]: “They are reference workflows. They take all the libraries, the SDKs, the microservices, even things from the open source and from the ecosystem around AI and simulation and everything like that. And it packages it up into something that can be taken, composed, customized to meet the needs of whatever company is taking them on.”
Allan Inemark [10:08]: “Cybersecurity is a data problem. And you know, there's so much data that it's hard to get grips on. You can't store it all. And if you have stored it all, it's hard to kind of munch through it.”
Amanda Saunders [21:42]: “The blueprint today built on Llama 3.170 B because we think that works really well. But three months down the line, six months down the line, maybe we'll update it with some of the later models.”
Allan Inemark [22:43]: “The embedding space is sort of how you translate a document to something that the model can understand.”
Amanda Saunders [31:09]: “Developers have been really excited to have a starting point... It's just a lot. And so I think just having that starting point for people to then say, okay, well, I followed the blueprint, I did these things, but actually for my use case, if I make these changes, it works better.”
Amanda Saunders [40:45]: “We are an AI company. We understand AI, we're excited about it, we've got a lot of experience in it. So what we want to do is bring AI to cybersecurity providers, to cybersecurity problems.”
Amanda Saunders [42:23]: “You can learn anything. As somebody who doesn't have a traditional technology background... as long as you stay open and keep learning, there's no telling where you can go.”
Allan Inemark [43:01]: “What stuff is able to be done right now? So on some level, just be open to it and stay curious.”
Amanda Saunders [45:11]: “Go to build.Nvidia.com that's their starting point. We've got all the latest models on there. We've got these blueprints that you can go and explore, and then your journey just starts there.”

Loading summary

Transcript67 lines

[00:00]
Amanda Saunders
Docker Container vulnerability analysis involves identifying and mitigating security risks within container images. This is done to ensure that containerized applications can be securely deployed. Vulnerability analysis can often be time intensive, which has motivated the use of AI and ML to accelerate the process. Nvidia blueprints are reference workflows for agentic and generative AI use cases. One of the most prominent blueprints is focused on vulnerability analysis for container security. Amanda Saunders is the Director of Enterprise Generative AI software at Nvidia and Alan Enemarck works on Nvidia's Morpheus Cybersecurity SDK team. Amanda and Alan joined the podcast with Gregor Vand to talk about blueprints and their application to vulnerability and container security.
[00:49]
Alan Inemark
Gregor Vand is a security focused technologist and is the founder and CTO of Mailpass. Previously, Gregor was a CTO across cybersecurity, cyber insurance and general software engineering companies. He has been based in Asia Pacific for almost a decade and can be found via his profile at Vand HK.
[01:22]
Gregor Vand
Hi Amanda and Alan, welcome to Software Engineering Daily.
[01:25]
Amanda Saunders
Great to be here.
[01:27]
Gregor Vand
Yeah, great to have you both here today. We're super privileged to have you from Nvidia and we're going to be talking all about blueprints and obviously we'll get into what they even are and especially in the context of security today. So as we often kick off, I think we don't often get to hear from people from Nvidia. So I'd love to hear from both of you kind of your paths to Nvidia and what you're doing there now.
[01:52]
Amanda Saunders
Absolutely, I'll kick off. My name's Amanda Saunders and I'm the Director of Enterprise Generative AI Software here at Nvidia. So my focus is on NIM Microservices, which are packaged AI models that are optimized to run on our infrastructure and blueprints. I've been at Nvidia 10 years, so I've had roles from everything when we started with graphics through machine learning and data science and now obviously into one of the most exciting topics that we get to see which is generative AI. So it's been a really fun ride.
[02:22]
Gregor Vand
Awesome. And Alan?
[02:23]
Alan Inemark
Yeah, I'm Alan Inemark. I'm part of the Morpheus team here at Nvidia and this is kind of the team working around cybersecurity cybersecurity issues and yeah, been it feels a bit unreal to say it, but at Nvidia for about seven years now, joined on as a data viz person on Rapids which is we're still working very closely with. But yeah, actually my original background is in design, industrial design, actually. And then kind of meandered my way to data viz and then to more product and now cybersecurity.
[02:56]
Gregor Vand
Awesome. Yeah. I mean, just before we kind of dive in, I guess both of you have been at Nvidia relatively long time now. I mean, can you just describe maybe very briefly, what's it felt like to kind of, I guess, Ahmad, I think you said 10 years ago, you've been there for 10 years. Yeah. How does that feel? Because, I mean, I think from the outside looking in, it's sort of. I mean, I was quite aware of Nvidia, given I like gaming and just keeping up with technological advancements generally. And then I think for most of the world, sort of Nvidia has only become a word that you even understand maybe one year ago, maybe maybe two. Yeah. How's it kind of felt from the inside?
[03:29]
Amanda Saunders
Yeah, I think, I mean, from the inside. You know, when I first joined, the only questions I got were about gaming. Even though I've been focused on enterprise since I joined. So there was a lot of, oh, so tell me about your gaming GPUs and things like that. And now the questions are obviously very different. It's all about AI and it's all about what we see the future in. So I think that's a very cool evolution. You know, the other thing I would say is, as somebody who's sort of watched the work that this company has done grow from graphics, and particularly gaming graphics into AI, I think what's just really fascinating is that these building blocks, it's not like we started something totally new. It's all building on top of that original mathematics of graphics that then transitioned seamlessly into AI. And again now this generative AI wave that we're seeing. So it's really fun to watch people as they see gaming, go to graphics, go to machine learning, data science, to where we are now to the point where we're actually bringing that AI back into gaming. And so it's kind of a full circle exercise, which I think is pretty cool.
[04:36]
Gregor Vand
Yeah, that's a fun loop. So let's dive into blueprints. So it's interesting. Blueprints had crossed my radar from someone else in Singapore, but admittedly I hadn't really gone in depth with them sort of until we were going to do this episode. And the more I've got into them, I'm just more and more impressed and kind of fascinated by kind of what they are. So maybe can we just talk about what are Nvidia blueprints and I guess why this initiative?
[05:01]
Amanda Saunders
Yeah, here, I'll jump in here on really, this was again, a journey. Like everything we do here at Nvidia, we've been building software, we've been building libraries, SDKs, now microservices. All of this is to package up software that can run on our accelerated infrastructure, on our GPUs, on our networking, so that customers can get the most out of it. Developers can build these applications. And as we built these building blocks, customers, developers, they were really excited. But what they started to say is, well, can you show us how to use them? Can you give us recipes, can you give us guidelines that help us as a starting point? And so we didn't want to build end applications, we wanted to build reference workflows. And that's what Nvidia blueprints are. They take all the libraries, the SDKs, the microservices, even things from the open source and from the ecosystem around AI and simulation and everything like that. And it packages it up into something that can be taken, composed, customized to meet the needs of whatever company is taking them on. So I think that's really the why. And then they're continuing to evolve to solve some of the most common use cases that we're seeing out there and bringing on new blueprints from not only Nvidia but also our partners to just really help people take advantage of this software as quickly as they can.
[06:22]
Gregor Vand
Yeah, and you know, we're going to be diving into sort of what are those blueprints specifically today around vulnerability scanning. And could you also maybe just before we do that, just. Yeah. What kind of. Maybe some other blueprints topics that sort of blueprints cover that come to mind?
[06:36]
Amanda Saunders
Yeah, I think, you know, as of recording, we have about 14 that are available. As I said, I think we introduced five at CES a couple of weeks ago. So we're continually adding to them. There's sort of four main categories that we're really building these out today. The first one is AI, very common, lots of things that we see there, whether it's customer service agents or digital humans. Obviously, as you said, the security blueprint we're going to get to dive into fits in there. Omniverse, which is our digital twin platform. Lots of blueprints around simulation and how we can advance those use cases. Bionemo, the healthcare marketplace is a really important one for us and we want to help those digital biology developers take advantage of all the new techniques in AI and then the latest ones that are going to be coming out are for Isaac Groot, which is our robotics platform. So as we see that start to grow again, how can we make it faster? How can we get developers started more easily? That's where we're going to start adding blueprints.
[07:35]
Gregor Vand
Nice. And yeah, I mean, just to kind of, I guess describe if I'm looking at just the build.Nvidia.com site and like, what does that blueprint kind of look like? You know, I've got kind of three tabs here, I've got experience, so sort of actually being able to just try out what sort of results that might come back if I was to use this blueprint. I've got the blueprint card, which is kind of giving me architecture, and then I've also got nim. Nim? Did you say Nim or Nim? How do you.
[08:05]
Amanda Saunders
Yep, Nim. So these are Nim microservices and they are just packaged, optimized AI models.
[08:11]
Gregor Vand
Awesome.
[08:11]
Amanda Saunders
Again, making that run really well on our software and really easy.
[08:15]
Gregor Vand
Yeah, I think we'll be sort of talking a little bit more about NIM and which NIMs, I guess are being used in this specific blueprint. So I think let's now just sort of dive into the one that we're talking about today, which is vulnerability scanning. Again, we've had a few episodes on in the past just about this topic in general, usually focused on maybe a particular product. So I mean, just kind of for listeners, maybe. I don't know, Alan, maybe we could just kick off with just simply talking. Very high level one is vulnerability scanning in the first place. For those that are familiar.
[08:45]
Alan Inemark
Yeah, absolutely. Cybersecurity, broad, broad topic. Lots of different areas of interest. This one in particular is around CVEs or, you know, common vulnerabilities and exposures. And that's basically this kind of registry, right. Where if there's a known security vulnerability, it's kind of registered there and available for everybody to look up and kind of mitigate on their own. Right now the issue is there are so many CVEs that are registered. I think there was 40,000 last year. It's just going up every year. And you know, this is, in our case, if you're trying to really secure software, it's very difficult to kind of manage all that and kind of keep track of it. And you know, in our case, this particular blueprint is for container security. Right. So if we have a container image, you're trying to release a container, you need to make sure that the CVEs it's scanned for are somehow mitigated or not actually a problem. And then you have this many. It kind of, you could see how it doesn't scale very well. Right. So if you're releasing thousands of containers and then you have tens of thousands of CVEs and each one is a real difficult thing to mitigate. It's hard to put out secure software.
[09:58]
Gregor Vand
Yeah. So this uses the microservices. It also uses, I believe, Nvidia Morpheus. And I think that's a product or a sort of. Could you maybe just speak to what is Morpheus in the first place?
[10:08]
Alan Inemark
Yeah, yeah. Like we call it cybersecurity AI SDK. Right. Which is sort of novel in itself. Usually when you think of cybersecurity products, it's like part of some sort of service or some company that's providing services. But what we've kind of found to be most useful is to release this capability out as SDK, which means that you can build the tools yourself with this framework. Right. So this blueprint, like we said, is very composable. It's sort of built using the Morpheus SDK and that sort of is an event driven pipeline. And so the reason we kind of developed this framework in the first place is ultimately like cybersecurity is a data problem. And you know, there's so much data that it's hard to get grips on. You can't store it all. And if you have stored it all, it's hard to kind of munch through it. And so really the way we're trying to approach to that problem is to use it as like streaming, inline, event driven. So you can kind of manage things line speed. Right. And so having GPUs enables that kind of for the first time because you're able to process data rapidly and quickly in line speed and do ML on it, you can do ETL on it. That's a very modular kind of pipeline system. And so as you can see, we're, you know, all kinds of capabilities through the Morpheus SDK. And a while ago, back to CVEs, right. It was sort of our own pain point where it was our own morphase container. And this was like during the holidays or something. And we got like a cve, it was like critical and nobody was around because they're all on break. And so whenever kind of like leads was all right, I'll handle this myself. And he's going through it and Munging through it and like to do a good CVE mitigation, you have to kind of open up the container, you have to see if the CVE is valid. You have to again, go and see. All right, well in this case it's like a version. Is that the versions that we're using? Sure. Okay. And then is this an actual function call and whatever. Long story short, like hours and hours later, it turned out to be like, not actually a vulnerability. Right? Because in this case there was something with some Java runtime and we're not even using the Java runtime. But like to even get to that point, you have to dig through everything. You have to kind of be an expert in code base. And at that point he's like, there has to be a better way. So make something, you have a bunch of integers, figure it out. So that's how this whole thing started to use this Morpheus SDK. And then we were like, okay, these LLMs at the time were starting to get really big. So we were like, well, how can we take these LLM capabilities and push it further in the cybersecurity stack? Right? And so that sort of started the prototype of what eventually became this blueprint and which was released out to the wild world, basically.
[12:43]
Gregor Vand
Yeah, the CVE problem as you call it, it's not solved by any means in the sense that, okay, it's great that we have CVEs that people report something that is in theory a problem with a particular piece of software or hardware. But the critical problem is knowing what is actually critical or sort of even moderate. And there are some, not exactly false positives, but there are also some there that are kind of not really a problem for anyone, but they're still there because yes, you have to, in theory, you have to report things.
[13:12]
Alan Inemark
It is a very messy space because sure, you can have tools that do the security scans for you, right? But the issue is they don't have access to the full context of what your software is and what you're actually using it for. Right? It's like a pretty naive scan, right? It's like, is this version and this software there? And so that's sort of what you traditionally would have this security analyst or application analyst do. And oftentimes it's not their full time gig. That's like something they do on the side as part of some group right? Ever, like they meet every Tuesday. And so it's not something that, you know, is terribly pleasant to do because it's not even the software that you're Writing yourself, right? You're sort of just, you know, you just want to deliver secure software. And so you have to go through all this, you have to ask the right people. Like I said, it's a very kind of often tedious process where you're aggregating tons of information from different sources. You have to even validate that it's a cve. You have to ask like, how's this stuff used? So you have to be an expert in a lot of different domains, but then you also have to summarize a lot of information and then go out and find it. Right. And so that's sort of how we approach this problem in the first place is, hey, this kind of workflow that this person is running for is like, it seems like it's something that can be kind of automated in a sense of like creating a workflow using this morpheus and these LLMs, right? Kind of that plan execute style LLM pipelines that we can then use to really like accelerate the timeline. Right. Because some of these, it could take a long time to kind of figure out and like just collect all the information. And also like, you know, you just don't have enough people to do it. Do it.
[14:48]
Amanda Saunders
Well, yeah, I think that's what makes this such a great use case for AI. Humans do it. We don't like doing it. We're not particularly good at it because we get bored, we get distracted. We have other things that are on our plate. And so by leveraging, you know, an LLM to take not all the work out, the security analyst still has to come in at the end of the day and they have to have their expertise, but taking out the tedium that they don't need to do and doing that a lot faster and probably a lot better, to be honest, because it is an LLM, it doesn't get bored. I think that's what really sort of helps us identify a great use case for AI.
[15:25]
Gregor Vand
I completely agree. I was in a role a couple of years ago where we were building it was attack surface management. So we were looking at CVEs more from the outside, but exactly. We had a person who would be tasked with sort of trying to translate, okay, take the CV what's reported from the cve, but let's try and change this into effectively natural language for the end recipient. It's got to be very contextual, like this person might not be technical, so that can't be technical language. And I think that's, as you've just said, Amanda, that's exactly what LLMs are fantastic for being able to be applied to.
[15:58]
Alan Inemark
Yeah, and it's sort of the magic sauce for these LLMs too is also they're able to ingest so many different types of information modalities, right. If you look at our architecture diagram, it's like we are ingesting stuff from the actual code repository. We're ingesting stuff from SBOM web searches, like general documentation. I mean it does this all stuff. Morpheus is really good at that. But you're able to kind of feed that all into, you know, using VDBs and kind of good prompting and stuff like that into something that LLMs can just digest. Usually you'd have to even do the pre processing back to more data analytics days to get it in a way that you could do. Traditional ML, for instance, is like, that's a huge project, it's going to always break, right? And so the fluidity that these elements enable you to do just kind of widens the amount of context you can feed them. And you don't need that level of like API specificity to kind of make it be useful still. Right? That's sort of at superpower. And so from there you can then more naturally like describe tasks for it. Right. And again, something that mimics how person analyst driven workflow would work, right? Like in our system you have one LLM call that's setting it up to do a task list, right? The first thing you do is like, all right, well I have the cve. Like what are the tasks do I need to do to figure this out? Okay, well first I'll check the version numbers and then second, after that, does version numbers match? Is like, I'll do what as a deep function calls for this, right? And after that it's like, all right, is this actual function the vulnerable aspect that is described in cve? Yes or no. And then to the benefit of our systems, we can do all that in parallel instead of sincerely, like a person would do. And then we kind of make an ultimate judgment on it with kind of a summarization, right? So based on all these individual tasks that these agents have gone out and concluded on and reasoned through through serial reasoning, are you ultimately vulnerable or not? And then kind of present that to the security analyst as a kind of a summarized view, which is useful in an outright saying like, yeah, you are vulnerable. You aren't vulnerable for ABC reasons. But we also found that super useful is kind of generating a very big report that is the full kind of reasoning process. And it's kind of Marked down nicely so that they can go and verify the sources or like indicate that, oh yeah, this reasoning is accurate or maybe it's like a bit off here. Let's go tweak that.
[18:23]
Gregor Vand
Yeah, and I'd like to maybe just step back one piece, which I mean we're just talking about sort of actually the architecture and sort of some choices around what's being used here. Because I think that's where a blueprint to me comes into its own, where some of these quite hard choices have been made for me in a really good way. You talked about plan and execute LLM pipeline. Could you just describe that architecture a little bit and actually why that was applied to this blueprint?
[18:50]
Alan Inemark
Yeah, absolutely. Essentially something that best matched how the analysts actually approach the problem. Right. And kind of mimics how you can break up a task into something that you can parallelize and then you can kind of connect this like reasoning to. So that's sort of the main structure of it as well as what we're finding out is while this is a cyber specific use case, the benefit of a blueprint is like it's not, it's extensible. So a lot of people are saying, hey, you know what, this kind of general architecture, if I ingest different types of data, I do different prompting, I can apply it to different problem spaces as well. And so we have a lot of solution architects, which are folks that run out and kind of interact directly with partners. They're doing a lot of very interesting stuff with this as well. That is like cybersecurity adjacent. Right. But they're using the similar sort of agentic architecture that this blueprint. Like you said, we sort of got a lot of the complicated stuff sorted out and laid out and for you and like, hey, how do you connect these pieces together? What framework should I use and say, well, hey, here's a really good reference. It's much better than starting with like a blank page, Right?
[19:58]
Gregor Vand
Exactly. I mean even the choice of models and both, you know, from the sort of pure LLM side and the embedding side, I believe is llama 3.17 terabytes was the model of choice. Could you just speak to sort of why that model and how do you even go about choosing a model for this kind of thing?
[20:19]
Alan Inemark
Yeah, that's been a fun process. We've gone through the gamut of so many different models here, which again shows the ability of Morpheus and blueprints to be very extensible. We started with GPT 3.5. We were experimenting with some Lora tuning to make it more cyber specific. And then essentially as we were prototyping this out, NIMS became available. And that just made our life so much easier because then it kind of handled all that stuff for us. And it's like, look, you just hit an API and we're like, awesome. We don't even have to spin something up. So we just ran with that and we found that specifically 70B was sort of the sweet spot with number of parameters. If you do more like we have some 405s out there, it doesn't do that much more improvement. And then if you do smaller, the agentic actions are as good. It's not as good as a tool usage. So it's a little bit of trial and error. But then we kind of, with that, we sort of dialed it into 7B being the most robust. And in the case of a blueprint, like you could mix and match models if you need to, if you want to go like, just add a little bit of extra performance or, you know, tweaking it for your use case. But in the case of a blueprint, it was just a little less complexity and easier to deploy if we just kept it the one model which is pretty much good at everything.
[21:33]
Amanda Saunders
Yeah. I think one of the cool things about this, I mean, and you know, this Gregor models, it's almost weekly that we're seeing new models come out as well.
[21:42]
Gregor Vand
Exactly.
[21:43]
Amanda Saunders
So I think that was the Great thing is llama 3.170b is an incredible model. It's so powerful. And Meta continues to innovate on what they're putting out there. So as new things come out, we can put it in and we can test it. And because it's this containerized model, it's really easy to swap them in and out and then say, okay, well, the prompting has to be adjusted a little bit or things like that to get the same performance. But it does really simplify this down. So, yeah, the blueprint today built on Llama 3.170 B because we think that works really well. But three months down the line, six months down the line, maybe we'll update it with some of the later models, depending on what new features come out in the models themselves.
[22:24]
Gregor Vand
Yeah, I mean, I'm going through that process at the moment with a product that I'm working on, which is. It's both that model as well as the embedding model. So I guess, yeah. Just also speaking about the embedding side, again, I never want to assume anything from our listener Base sort of what they perhaps know and don't know. Could you maybe just speak to very briefly, what is embedding and why is there such a specific model for embedding as well?
[22:43]
Alan Inemark
Yeah, sure. I mean, the embedding space is sort of how you translate a document to something that the model can understand. Right. There's a couple of different ways you can give the model a context, and embeddings are sort of the best way to do that. There's a lot of performance stuff you can do by creating an embedding. So, like, this is a really good example of, for this blueprint, we've made the embedding process in the VDB process like inline. Right. And there's a lot of different libraries, rapids associated that are good for VDBs and, you know, utilizing them in very accelerated ways. Definitely check those out. In our case, it made sense to do it inline because again, for ease of deployment and when you're turning a code base into embedding, you kind of. All right, is this the latest version of the code base is up to date? Yes. You know, if we do it live, then that's not a problem. If you were to take this into like full actual production, maybe not the best methodology. You want to make separate that out. Right? You want to have it. So that's like another process. I mean, still could use Morpheus, it still can use the same embeddings, but it's run on like a nightly basis because you could have, you know, again, your thousands of different code bases and that you don't want to do every time. And it's up to you to define your kind of process of like how often you want to update it and mimicking that. But piping it into the LM would be the same way. Right. So that's sort of like where we draw the line between this is a blueprint versus what somebody would take into production. Right. And kind of being mindful of it. And again, taking the ability of the fact that the blueprint is pretty extensible, it's not too hard to decompose.
[24:15]
Gregor Vand
Yeah, absolutely.
[24:17]
Alan Inemark
Are your software deployments secure by design? Lately, secure by design and shifting left principles have been hot topics in the software industry, pushing development teams to make security a foundational part of software development. Today's sponsor, Bitwarden supports developers in securing every phase of the development lifecycle with end to end encrypted credential management. This ensures software is built on secure principles to prevent data leaks and unauthorized access. Try Bitwarden Secrets Manager built specifically for developers to safeguard infrastructure and machine secrets. Or Bitwarden Password Manager for everyday logins and other sensitive information. Start a free trial today@bitwarden.com talked a.
[24:53]
Gregor Vand
Bit about sort of how all these processes are sort of how the data sort of makes its way into the system, if you want to call it that, and sort of at a higher level how that's going to be queried. There's a repo with all the information, incredibly well documented.
[25:08]
Alan Inemark
So really thank you very hard on the documentation. I'm a little bit of a nerd when it comes to good docs, so thank you. We work very hard to look at the index. It's huge, right? Because we want to.
[25:21]
Gregor Vand
I mean, yeah, I was going to say from such a large company, I was not expecting such good documentation. So within a repo itself, sometimes you can go find it somewhere else. But yeah, this is awesome. And ARC diagrams in there, et cetera. And if I kind of look at the main one that talks about the key components and on the left side we've got the ingestion of things, SBoM, et cetera. And then on the right hand side that's kind of all the bits that are then doing the figuring out, I guess is mostly nim. We've got NIM Checklist Generation, we've got NIM Task Agent, NIM Summarization, NIM Justification. So I guess a question I've got is NIM walk through are these sort of API endpoints or these things that can be self hosted or how do these work?
[26:07]
Amanda Saunders
Yeah, I'll jump in on that one. So yeah, NIM is essentially a containerized optimized model. So Nvidia, as I said, has loads of libraries and SDKs that we've been working on for years. And rather than asking developers out there to go package this up themselves, we thought why don't we do it for you? Put it into a container that either you or your IT person can go deploy and then we build standard APIs on top of them. So we use whatever's common in the industry and we try to be compatible with that, or we invented ourselves, if it doesn't already exist. But we have these standard APIs. So from a development standpoint, all you're doing is accessing APIs. This is something we're very used to, very used to developing with these days. But it's totally managed and controlled in your own environment, so you can take it with you. You can run it anywhere, from PCs, workstations, all the way up into the cloud, that's really the benefit there. And then, like you said, the blueprint has a number of new minutes. So this one, we're talking Llama 3.170 B. But if there are multiple NIM, one for embedding, it's the same deployment mechanism. So you're getting this really common way of deploying whatever the model is, no matter if it's an LLM retrieval model, speech avatar, you name it. So I think it's just a really powerful tool for both developers as well as those IT teams who support them. And that's what makes being able to do these blueprints and give the building blocks that Alan's been mentioning that are extensible, give those options to the developers. I think that's the beauty of nim, the secret of nim, as it were.
[27:52]
Gregor Vand
Right? Yeah. And again, as sort of documented in the docs, for this blueprint, as I imagine for most, it's very clear that they can be Nvidia hosted or they can be self hosted. Right. What's the platform like for the Nvidia hosted? Where does someone go, I guess, to kind of spin up an Nvidia account, for example, like for that side of things.
[28:13]
Amanda Saunders
Yeah. So we host the NIM for prototyping on build.Nvidia.com so hopefully easy to remember, so everyone can find it. So on there you can go, you can test out all the different nim, you can run the blueprints, as you said, experience them without even doing anything coding. But then we also have options to download and deploy them yourself. We also have another option called Launchables, where you can actually deploy onto another infrastructure. So to answer your exact question, if you're using the APIs that are hosted on build.Nvidia.com, you're actually accessing those models running on DGX, which is our most powerful infrastructure that's out there. So you're getting the best of the best, and then you can take them, you can deploy them anywhere, move them into production and really build real applications on top of the same software, just wherever your GPUs are hosted.
[29:04]
Gregor Vand
Yeah, that's super nice, being able to get to test out the Ferrari of GPUs, even if you maybe can't afford them for your own projects at specific times. But yeah, that's really nice. I mean, looking at kind of how this has been received in sort of the real world. What have either of you seen in terms of. So let's start with this blueprint, specifically, sort of what has been the reaction and sort of who has given those reactions, I guess. And then we might also just talk about any of the other blueprints that have received some kind of attention.
[29:32]
Alan Inemark
Yeah, I mean, I can start with. I mean, we're deploying this internally within our own teams. Security folks, Right. Because if anything, we should be using our own things that we're putting out there. And I mean, the feedback is if you work with security people, they can be very kind of conservative and skeptical, right? It's like, oh, somebody's just throwing another tool that I have to figure out how to use. So while there's a little bit of hesitation at first, once we were able to present it to them in line to their work stream, it's not like they have to do something new. It's sort of part of their process. They're like, oh, wait, this is great, right? Oh, this summarizes everything. And you give me the sources. Okay, I can start trusting it, right? And then very quickly it goes from like a. I don't know if I want to use this to like, starting to ask for more features, right? Like, hey, can it do this? And can we make sure it's accurate? Like, that's sort of one of the important parts of using anything like this. Like, you gotta have good domain experts at your disposal and kind of point them. We have also worked with some partners, you know, that are working with deploying these blueprints themselves and they're working really closely. I feel like some of the best feedback you can get is if folks start contributing back to your code, Right. Because it means they're starting to get vested in it, right. They're starting to grok it and, and understand it and they want it to improve. Right. So that's sort of. It's the worst when you don't get any feedback, right. But when you do get feedback, it makes people like, it shows that people are invested, right? So the blueprints, we're continuing to improve them. We're continuing doing commits and updating it to latest models and any sort of new features. So that's sort of the feedback. We're listening to it and then we're improving the stuff based on that.
[31:10]
Amanda Saunders
Yeah, I think, just in general, I think developers have been really excited to have a starting point. I think Alan sort of mentioned earlier, it's hard to start with a blank page generative AI. It's coming fast and furious. We all can think of things that we would love an agent to do for us, but then actually going about Building that and making all those decision points, it's a lot. And so I think just having that starting point for people to then say, okay, well, I followed the blueprint, I did these things, but actually for my use case, if I make these changes, it works better. And that's been, I think, the best part of feedback and the thing that we were really striving to do. Everyone's going to have their own changes and they're going to evolve over time. We didn't want to try to build the final solution. We just wanted to give a development starting point. So far that's been the feedback. But like Alan said, the more people use it and the more they contribute back and let us know what other use cases and what other areas I think will be really impressive and will help us. And then I think the next interesting one that we're hoping to see, and we're starting to see a little bit, especially internally, is sort of daisy chaining blueprints. How can you connect these things? How can you take pieces from one and add it to another? Our digital human is a great example of that. Almost any agent can have an avatar in front of it. So if you want a CVE avatar, you can build that. So I think that's going to be a really cool thing to watch when people multiple blueprints and build their mansion or whatever it is.
[32:49]
Alan Inemark
Yeah. I mean, just to chime in too, it's pretty impressive to see Nvidia basically one of the things that Nvidia is really good at, and I'm slightly biased, but the way it kind of makes you can buy in as little, as much as you want into the system. Right. Hardware wise, Software wise. Right. It's like you can go full everything like Nvidia up and down the line, or you can just say, you know what? This one piece is what I'm interested in. Right. And so we're sort of doing that a lot with the software now where if you want a Nim like this is just you can build out your own framework and you just hit the API with nim. Okay, great. Right. Because I don't want to do optimization or blueprints. Great. I'm going to take this and customize it. I'm going to change blueprints together, right. And build some crazy, crazy big system out of these things. But it's sort of. It's all available and very easily enabled. There's no real gatekeeping to this stuff. It's pretty out there. That's another reason why a lot of these Blueprints are released on GitHub, so it makes it kind of just easy to use in a sense, what we're going for here. Right.
[33:55]
Gregor Vand
I think that comes out in, I don't know, all aspects of what, at least from an outsider's perspective, I've seen from Nvidia, which is just this sort of overall openness to try and help people use its stuff. It is complicated stuff. And so it's sort of. Exactly. I feel like blueprints has, to me anyway, shown how invested Nvidia is in actually having it being used in the real world, as opposed to just sort of saying, well, we're going to produce these incredible GPUs and then buy them, and that's kind of where we finish. So, I mean, to me, this is just a really great initiative in terms of any of the other blueprints. I mean, I don't know. Which would you say has had the most reception? I don't know if it is. Perhaps it is this one. I don't know. I'm just curious if any have kind of.
[34:36]
Amanda Saunders
We had a standout, again announced a couple weeks ago at CES was a PDF to podcast. So taking PDF data, processing it, having speakers, multiple speakers, be able to come out in sort of a format. Gregor, we're not coming for your job. Don't worry.
[34:53]
Gregor Vand
I have used, I mean, just to, I guess to name another, like NotebookLM. Obviously I've tried that out. Yeah, I was impressed, but I was also not too worried about my job.
[35:02]
Amanda Saunders
Exactly. And so the way I like to think of this one, and I think, you know, we all sort of touched on it and even the CVE blueprint covers. This is. This is another one of those use cases where how can I take masses, amounts of data and summarize it and structure it in a specific way? We happen to choose podcast, but it could be any structure. And I think that's a very common task for us humans, especially as the world continues to grow and data continues to grow and we want to absorb and learn as much as we can. These types of tools, I think, have just endless amounts of use cases. So that one's been really popular. I'm going to, you know, it's only been out for a couple of weeks. We've had thousands of people going to it and testing it out. So it'll be interesting to see what comes out of those initial development efforts there.
[35:53]
Gregor Vand
That's really fun and. Exactly. Just sort of going back to, I guess, this sort of talking about reception and contributing back et. Cetera. Can you maybe just speak to. I mean, there's a repo there. I've seen, like, a lot of people have forked it. How do you sort of class this from a sort of open source point of view?
[36:06]
Amanda Saunders
Yeah, I think we want to make blueprints. Alan said it well, we want to make them open and available. You know, this is a lot of. This is stuff that we do internally. We're using it ourselves and we had the same problem. And so rather than just us doing it and us learning from it, how can we put it out there into the community? And we're going to continue to evolve these blueprints, we're going to continue to learn from these blueprints. We love seeing them forked and being used in other ways. I think that's great. And I think one of the big next steps is opening that up to the ecosystem. There are incredible partners out there like LangChain and Llama, Index and Crewai and Daily, even some of the MLOps partners like weights and Biases, who have expertise in this, who have blueprints that not only include Nvidia and Nim and the great work we do there, but include their own libraries, SDKs, services. And so being able to create a hub of recipes for the community, I think is just really, that's what we're trying to do and be as open as possible.
[37:14]
Alan Inemark
Yeah, I kind of want to just maybe wrap it up in a little bow tie here with like, maybe we, like, noticed that we haven't talked that much about cybersecurity specific things. And it's sort of, in a sense, what we do here, Nvidia, is like, we take all these amazing frameworks, we build out new frameworks, and then we apply them to like a multitude of use cases, right? We're not trying to do one specific product domain or anything. You can see, like, we're all over the place with interesting problems we're trying to solve. And we're sort of, we're not prescriptive, like how we think they should be solved, right? It's like we're going to give all these tools and all these capabilities out there. And like Morpheus sort of started that way, right? It's like, well, what are these GPU acceleration capabilities? How can we do new things with them to solve cybersecurity problems, right? Hey, these LLMs are out there. How can we use them to solve cybersecurity problems? And so, like, you know, that same thing can be said for all kinds of different domains in the tech Space and enterprise space. Right. And so that's sort of what we like to do, is put out all this really interesting, useful frameworks and then see what people can build with them. Right. We're not trying to build the end thing ourselves, we're just trying to find how people can come up with really great and interesting solutions to their specific use cases. Because I mean, at the end of the day, folks are their own domain experts, right. They know their problem space better than anybody else. And so being able to enabling you to build for your own problems and solve them is something we like to do.
[38:48]
Gregor Vand
I'm glad you brought that up because I did want to kind of just come back, so to speak, to kind of the security part of this. And I think to me, even it was a little bit sort of almost surprising to sort of learn that Nvidia was looking at security full stop. So how might we sort of, I mean, you've sort of, I guess summarized it there, but you know, how might we see Nvidia evolve in the future from a sort of security standpoint? I mean, I liked that the blueprint, this one specifically you mentioned, sort of it actually came from like an internal pain point of how do we solve our own container CVE munging and half of the best products in the world come from people trying to solve their own problems. So can we expect to sort of see more maybe outward facing security, I don't know, products or I don't know, GPUs that are almost designed with this use case in mind. I'm curious how that looks.
[39:36]
Alan Inemark
Yeah, I mean there's so many cybersecurity specific use cases that are big problems and you can solve them a lot of different ways. Yeah, we're continuing to just see what we can do with the Morpheus SDK and with Genai and essentially trying to get this sense of. There's been a shift here where we're instead of thinking about you're making an application, you're more of a agentic system application. Right. And so we're sort of internalizing that shift a lot and seeing how can we up level that kind of capability to a wider audience. So you don't necessarily need to be like some hardcore dev person. But like blueprints is like it's a good starting point for you to kind of take advantage of this agentic AI stuff. So continuing to do that. And then there's, there's always graph, which is something that is like super powerful but always kind of gets like pushed back a bit. There's some amazing graph capabilities within Rapids and that we're working around the cybersecurity space. But yeah, we're just sort of a lot of applied research that we're doing in Morpheus Tide.
[40:46]
Amanda Saunders
Yeah. And just to make a. Make a point on that one is, you know, we don't want to be a cybersecurity solution provider. You're not going to see Nvidia up there competing in that space. We are an AI company. We under. We love AI, we're excited about it, we've got a lot of experience in it. So what we want to do is bring AI to cybersecurity providers, to cybersecurity problems, to things like that. So I think just to, you know, I don't want to confuse people to say, hey, Nvidia, they're diving into the cybersecurity space. No, we really. We're focused on AI and where it can help and whatever those problems are. And cybersecurity just happens to be one of the great ones that we can help with.
[41:25]
Gregor Vand
Someone came up with a phrase the other week saying sort of the future, or at least 2025, looks more like it's not software as a service, but it's service as software. I think that's kind of a good way to think about it, which is sort of solving slightly more specific problems in a very deep way, whether it's for a specific enterprise or maybe a group of companies, but sort of not this sort of blanket, one tool fits all. And I feel like this is a very good example of this where again, CVE remediation has many different contexts. Am I remediating from a external attack surface point of view? Am I remediating it for internal container point of view? I would want very different results based on that. So I think this definitely leans into that way of thinking. Just as we wrap up, I always like to sort of ask guests just a couple of kind of questions. The main one I like to ask is knowing what you know now. If you could sort of tell yourself anything at the start of your career, what would you tell yourself?
[42:23]
Amanda Saunders
I'll jump in on that one. I would tell myself, you can learn anything. As somebody who doesn't have a traditional technology background to now, having been in technology for a long time, but been in Nvidia 10 years, there are things I have learned that I never realized I could and would never have bet on. But I think as long as you stay open and keep learning, there's no telling where you can go. So that's what I Would have comforted myself that there's nothing that you really can't just learn.
[42:51]
Gregor Vand
Yeah, I love that. I was thinking this morning, I was on the bike this morning and thinking, yeah, it's just about so long as someone knows how to learn, then possibilities are endless. So I like that a lot. Alan, what about you?
[43:02]
Alan Inemark
It's hard to top that. I mean, if you're working in video, it's kind of. You have to, right? The amount of different and new things I've had to learn every couple of years and just not out of, like, necessity, but out of just, like, curiosity, right? Because it's like, wow, that's neat. Like, how far does that go? You just gotta be open to that and adaptable to it. Even as I'm getting a little bit older and I'm like, you know, getting a little bit more grumpy about, ah, this new thing came out. Do I have to learn it or not? You start getting a sixth sense of, like, if you've been in technology place long enough where like, no, no, there's a thing. This one has a thing to it, right? And so this, this AI stuff, I mean, yes, very hyped, but, like, there's a thing there and it's kind of like we're. Sometimes it's easy to do. Very jaded about this tech stuff, but like, sometimes it's also good to take a break and pause and like, really, this is like, kind of magic. Like, it really is magic. Like, I mean, I was thinking about Star Trek the other day and how, you know, before they were like, oh, iPads, you know, they're little pads, right? You're like, oh, they're so futuristic. And now like, oh, they look clunky, you know. And then I'm like, oh, that's so funny. The tech looks. And then I was like, talking to the computers and like, you'll just magically make a thing like, that's never gonna happen. And now I'm like, they're doing that right now. I'm like, huh. So, yeah, it's like, it's pretty magic, right? What. What stuff is able to be done right now? So on some level, just be open to it and. Yeah, stay curious.
[44:34]
Gregor Vand
Definitely. Yeah. I mean, I can identify with that. There was a phase where JavaScript frameworks were a dime a dozen, and I think my learning capacity just kind of tanked at that point. I was like, yeah, this is difficult. This is really difficult to get excited or even think about learning. And then, as you say, things evolve. And luckily that phase went by and now we're here. And again, my learning kind of enthusiasm has just rocketed. I couldn't agree more with that. Look, it has been, again, such a privilege to have you both join Software Engineering today just again, as a recap, where's the best place for a developer just to head to kind of get stuck in?
[45:12]
Amanda Saunders
Go to build.Nvidia.com that's their starting point. We've got all the latest models on there. We've got these blueprints that you can go and explore, and then your journey just starts there. We'll give you the notebooks, you can start running, and hopefully you'll have your application going in no time.
[45:29]
Gregor Vand
Awesome. Well, thank you again. And yeah, hope we get to catch up again in the future.
[45:33]
Amanda Saunders
Thanks so much.
[45:34]
Alan Inemark
Thanks.