
Live from GTC, F5’s Director of Product Management Greg Dalle pulls back the curtain on the critical infrastructure layer powering modern AI factories, from intelligent LLM routing to next-gen traffic management.
B
Welcome to Reshaping Workflows with Dell Pro Precision and NVIDIA, where innovation meets real-world impact in high-performance computing.
A
This is Reshaping Workflows, GTC 2026. I'm Logan, your host. You know me. I'm here with Gregory of F5. So, Greg, before we get started, I'm going to hand you the mic. Tell everyone your role, kind of what you do, and then give a brief overview of F5.
B
Sure. So, Logan, Greg Dalle here. I'm Director of Product Management for AI Solutions. And so what we do is we're bringing solutions for traffic management and application delivery to those big AI clouds. And so we are presenting a bunch of new things this week at GTC. And so it's not as sexy as the kung fu robots, but we are actually showing how to optimize AI factories and AI clouds in general, running traffic management, security, and trying to optimize the number of tokens that these factories can generate.
A
All right, the podcast audience, some are very technical and know exactly what you just said. Some, maybe on the ITDM side, understand, but maybe not at that level. So we say AI factory, right? We're thinking cloud, we're thinking data center, we're talking multiple GPUs kind of clustered together. But let's talk about one thing: you said you're kind of helping the speed and the orchestration of that. What exactly does that mean? And what is F5's responsibility in that?
B
Yeah, so basically, let's focus on inferencing, right? And so you get a request and it comes from a user somewhere on the Internet. You need to route it to the proper spot and to the proper nodes where the models are running. And so we optimize from that point up to the worker nodes where the models run, where the applications run. And we do load balancing. That's where we started about 30 years ago. But we optimize that for AI. We do LLM routing, intelligent AI load balancing. So, for example, being able to send the traffic to the right servers based on the load of the GPUs.
A
So load balancing makes sense, especially if you're handling like, you know, multiple requests across distributed. But you mentioned, you know, there's a couple, you said not as, maybe as sexy as robots, which, I mean, maybe it is, maybe it isn't, depending on who's listening. But you said you had several announcements, maybe give or, you know, new product things that F5 came out with this week. Maybe give me your top two.
B
Yeah, so, and you'll have to come to my session tomorrow. But we're going to talk about the AI grid in particular and how to do this distributed inference and routing the traffic from the user to the application. So that's one thing. And we'll show the integration with LLM routing in particular for that. But also very basic stuff like DNS, which is becoming cool again for AI. So that's one thing. The other thing that we are talking about a lot is AI guardrails. Right. And I think security is obviously becoming more and more important as people really adopt AI. And so we had the acquisition a few months ago of a company called CalypsoAI, and so we're presenting that this week with some demos on this.
A
So, I mean, security has been kind of a big part of GTC, right? I mean, with NVIDIA launching NemoClaw and some other things. So from a security standpoint, what does F5 provide? Like, what are some of the core services that you offer?
B
Yeah, so there are multiple layers to the security. It starts very basic in terms of segmenting traffic, making sure that if you have different parts of your organization, if you're a bank and you have people doing trading versus financial analysis, et cetera, that you can segment the traffic and isolate it. That's the very basic layer of security. Then you go into the network security, so firewall, DDoS protection. Obviously you need to do that at scale. So one thing we do is we run our software on NVIDIA DPUs, BlueField-3 for now, BlueField-4 in the future. And that lets us accelerate. So if you take stuff like DDoS, you want to be able to make sure that it doesn't clog your servers. And so we run that at the edge of the servers on the BlueField DPUs. And then we go up the layers and we have the guardrails, and there it's really about looking at what's going on in the inference requests and responses. So we have red teaming in particular, we mitigate, we remediate attacks. And that's, yeah, the whole stack of security.
A
That's amazing. So tell everyone who's listening, where can they learn more about F5? You know, website, you know, socials, all that?
B
Yeah, the easiest is it's very simple. F5.com/ai.
A
Well, thank you, Greg, really appreciate it. Check it out. Logan, GTC. We'll see you on the next one.
This podcast was produced in partnership with Amaze Media Labs.
Date: March 20, 2026
Host: Logan Lawler
Guest: Greg Dalle, Director of Product Management for AI Solutions, F5
This episode was recorded live at NVIDIA GTC 2026 and features a discussion with Greg Dalle from F5 about the critical role of application delivery, load balancing, and security in the era of AI-powered data centers (“AI factories”). Host Logan Lawler explores F5’s approach to traffic management and the newly-announced security solutions specifically built for high-performance, GPU-accelerated workflows. The conversation demystifies how advanced infrastructure—powered by Dell Pro Precision workstations and NVIDIA RTX GPUs—transforms both scalability and security for modern AI systems.
Timestamps: 00:19–01:11
"We are actually showing how to optimize AI factories and AI clouds in general, running traffic management, security, and trying to optimize the number of tokens that these factories can generate."
— Greg Dalle (00:33)
Timestamps: 01:11–02:11
"We optimize from that point up to the worker nodes where the models run, the application run. And we do load balancing. ... But we optimize that for AI. We do LLM routing, intelligent AI load balancing."
— Greg Dalle (01:33)
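To make the idea of GPU-load-aware routing concrete, here is a minimal Python sketch. It is an illustration only, not F5's implementation: the `Worker` fields, the `pick_worker` function, and the node and model names are all hypothetical, and a real system would pull load metrics from live telemetry rather than a static list.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    model: str              # model served by this node (hypothetical label)
    gpu_utilization: float  # 0.0 (idle) to 1.0 (saturated); assumed metric
    queued_requests: int    # requests already waiting on this node


def pick_worker(workers, model):
    """Return the least-loaded worker serving `model`, or None if none does."""
    candidates = [w for w in workers if w.model == model]
    if not candidates:
        return None
    # Prefer the lowest GPU utilization; break ties on queue depth.
    return min(candidates, key=lambda w: (w.gpu_utilization, w.queued_requests))


workers = [
    Worker("node-a", "llm-large", 0.92, 7),
    Worker("node-b", "llm-large", 0.41, 2),
    Worker("node-c", "llm-small", 0.10, 0),
]

chosen = pick_worker(workers, "llm-large")
print(chosen.name)  # node-b
```

The request is first matched to nodes serving the requested model (the routing step), and only then sent to the least busy of them (the load-balancing step).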
Timestamps: 02:11–03:23
"We'll show the integration with LLM routing ... and then very basic stuff like DNS, which is becoming cool again for AI. ... The other thing we are talking about a lot is AI guardrails."
— Greg Dalle (02:29)
Timestamps: 03:23–04:42
"We run our software on NVIDIA DPUs, Bluefield 3 for now, Bluefield 4 in the future. ... We have the guardrails and there it's really about looking at what's going on in the inference request responses. ... We mitigate, we remediate attacks."
— Greg Dalle (03:37)
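The layers described here (traffic segmentation, network-level protection, then guardrails on the inference request) can be pictured as a chain of checks. The Python sketch below is a simplified illustration under stated assumptions, not how F5 or BlueField DPUs implement any of this: the token bucket stands in for DDoS-style rate limiting, and the tenant set, `BLOCKED_PHRASES`, and `check_request` are hypothetical names invented for the example.

```python
import time


class TokenBucket:
    """Simple rate limiter: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical guardrail rule; real guardrails are far more sophisticated.
BLOCKED_PHRASES = ("ignore previous instructions",)


def check_request(tenant, allowed_tenants, bucket, prompt, now=None):
    """Run the three layers in order and report the first one that rejects."""
    if tenant not in allowed_tenants:          # layer 1: segmentation
        return "rejected: segment"
    if not bucket.allow(now):                  # layer 2: rate limiting
        return "rejected: rate"
    if any(p in prompt.lower() for p in BLOCKED_PHRASES):  # layer 3: guardrail
        return "rejected: guardrail"
    return "accepted"
```

Ordering the cheap checks first means that traffic from the wrong segment or from a flood is dropped before any prompt inspection runs, which mirrors the point about keeping attack traffic from clogging the servers.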
On why AI security is vital right now:
"Security is obviously becoming more and more important as people really adopt AI."
— Greg Dalle (02:29)
On F5’s historical transformation:
"We do load balancing. That's where we started about 30 years ago. But we optimize that for AI."
— Greg Dalle (01:33)
On leveraging hardware acceleration for security:
"If you take stuff like DDoS...we run that at the edge of the servers on the Bluefield DPUs."
— Greg Dalle (03:37)
Timestamps: 04:42–04:50
"The easiest is it's very simple. F5.com/ai."
— Greg Dalle (04:50)
This episode offers an insider’s look at how F5 and its partners are powering the next generation of AI-enabled data centers. From intelligent load balancing for complex inference workloads to multilayered, hardware-accelerated security, F5 aims to be the connective tissue that keeps large-scale, mission-critical AI deployments both efficient and secure.
Listeners gain practical insight into how Dell Pro Precision workstations with NVIDIA RTX GPUs fit into this picture, supporting the dynamic, always-on world of enterprise AI. If you work in IT, AI, or infrastructure—or just love geeking out about how real-world AI systems stay performant and safe—this is a must-listen episode.