
Latent AI CEO Jags Kandasamy shares how shrinking models and edge computing are redefining AI for defense, energy, and beyond.
A
Welcome to Reshaping Workflows with Dell Pro Max PCs and Nvidia, where innovation meets real-world impact in high-performance computing.
B
Welcome back to another episode of Reshaping Workflows with Dell Pro Max and Nvidia RTX GPUs. And let me tell you, folks, you're in for a good episode today. I've known this guy for quite a while. You know, I've been doing AI stuff for a while, about a year and a half at Dell, and I think Latent AI was kind of the first partner and company that I worked with. Very interesting, but I don't want to steal all of his thunder. Jags, please introduce yourself and tell everyone about how wonderful you are.
A
Thanks, Logan. Thanks for having me here, and great to be back on the podcast. Hey everybody, Jags Kandasamy. I am the co-founder and CEO of Latent AI. At Latent AI, we believe that AI needs to be available everywhere for it to make sense, which means models need to be smaller. Models need to be running on any and every hardware that's available out there. So we have built a set of tools, a pipeline, that helps you design your model, train your model, and optimize your model for various different hardware, from sensor to server. That's the way we put it, right? So that's basically what we do.
B
Okay, so I already know this, so I'm going to tee up a question. So with models, right? I mean, whether it's a large language model, a vision model, you know, a photo generation model, every single model seems to get bigger, and so do GPUs. Like, for example, with the Ada architecture from Nvidia, the 6000 was 48 gigs. Now the Blackwell 6000 is 90 or 96 gigs. It keeps getting bigger. But what you said was kind of the opposite. You have pipelines, you know, processes, patents, tools to shrink those models to be able to fit on every device, from a small GPU to a Jetson to a sensor. So, not saying you're different, but you are a little different. Why do you take that approach? Why do you think that that is the way?
A
The primary reason AI came into existence is the amount of data that we started generating over the last decade and a half, right? Like, almost two decades now. Since the advent of the Internet, and then mostly mobile devices and IoT, the amount of data is growing incredibly, exponentially, on a daily basis, right? If you were to collect all this data and start to send it up to the cloud and process it there, you are not going to have enough bandwidth to do this. You can have 5G, and then you can have 6G. You can have all the Gs that you want. But there is a law of induced demand that comes in, right? The more bandwidth you open up, the more devices come online, the more data they pump up, and they start to clog the pipes. On the other side, there are decisions that need to be made in milliseconds and need to be made closer to where the data is being generated, where the ADC, the analog-to-digital conversion, happens. Sensors. Like, we work very closely with military customers. Some of that decision making has to happen in real time in the field. You may or may not have Internet connectivity. Think about industrial, think about medical, health care, oil and gas industries. All of these have remote locations. You are not going to have the time or the bandwidth all the time to be sending data up to be processed, and then to get the result and then act on it. Right?
B
Yeah, it makes total sense. Let's kind of give, for those that are listening, an example. Right. So I totally get what you're saying, right? I mean, there's massive amounts of data out there. I mean, data is the new, what do they say? Oil or gold? Like, it is the new thing. It is the new valuable thing. But you're right, from a bandwidth perspective, like with a garden hose, you can only fit so much water through that hose. And if you make a bigger hose, you'll just fit more water. You'll be able to add even more water, and more water will come with it. So I get the point. Maybe give an example. It doesn't have to necessarily be, you know, one of your customers or anything proprietary. But one where what Latent AI did for them would not have been possible without the skills and the pipelines and the patents and everything you bring with the LEAP program to be able to quantize a model. Give us a good example so people can kind of understand and get it in their head.
A
I'll give you a real-world example. This is a company that does satcom. Okay. They are satellite providers, and they work with remote installations for oil and gas. And these are unmanned operations in most cases. Okay. Now, if I were to do surveillance in these remote locations, the only way I can do it is to set up cameras and stream that data through satcom to a central location for somebody to monitor. Just imagine how expensive that will become using a satellite connection. So we are helping them run quantized models, smaller models, at the edge, at the distributed remote locations that are doing surveillance monitoring. If anybody is entering a certain perimeter that we've set up, if they're coming into that location, then we can alert. We can audibly alert the intruder, stating that they're coming into a location that is restricted. At the same time, sending a blurb of data through satcom is faster than uploading a video. By sending data stating that, hey, somebody's intruding, you can alert local authorities or trigger whatever other mechanisms you need, shut down operations or whatever remedial action you need to take. You can initiate that without waiting for the video to come in. So now you've got intelligence working at the edge for you. So here we help the customer pick, like, you know, one of the Dell NativeEdge products to run this on, right?
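The bandwidth argument in this answer can be made concrete with a small sketch. Everything specific here is an assumption for illustration and not Latent AI's implementation: the rectangular restricted zone, the `Detection` fields, the JSON alert shape, the site name, and the 2 Mbit/s video bitrate are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    x: float           # object centre, metres from a site origin (hypothetical)
    y: float
    confidence: float


# Hypothetical restricted zone as an axis-aligned rectangle:
# (x_min, y_min, x_max, y_max) in metres.
RESTRICTED_ZONE = (0.0, 0.0, 50.0, 30.0)


def in_zone(det, zone=RESTRICTED_ZONE):
    """Return True if the detection's centre lies inside the zone."""
    x_min, y_min, x_max, y_max = zone
    return x_min <= det.x <= x_max and y_min <= det.y <= y_max


def build_alert(det, site_id):
    """Serialize a compact alert to send instead of streaming video."""
    payload = {
        "site": site_id,
        "event": "intrusion",
        "label": det.label,
        "confidence": round(det.confidence, 2),
    }
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


def video_bytes(seconds, bitrate_bps=2_000_000):
    """Bytes needed to stream video for `seconds` at an assumed bitrate."""
    return seconds * bitrate_bps // 8


det = Detection(label="person", x=12.5, y=8.0, confidence=0.91)
if in_zone(det):
    alert = build_alert(det, site_id="well-07")
    print(len(alert), "alert bytes vs", video_bytes(10), "bytes for 10 s of video")
```

Under these assumed numbers the alert is well under a kilobyte while ten seconds of video is about 2.5 MB, which is the gap the satcom example relies on.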
B
I love the use case, it's a great example. But I know people out there are probably asking, and we'll tie this kind of back to LEAP and stuff like that: when you quantize a model, at the end of the day you're shrinking it and you're removing some parameters, saying this is important, this is not. Sometimes, if done incorrectly, you actually get a less accurate but smaller model. So what value do you provide in that quantization piece to be able to, for example, have a model that, let's say in that oil and gas remote pipeline example, can tell if someone shows up looking like Logan Lawler versus Jags? Like, we want to know if Logan's there versus Jags. Logan's okay, Jags is not. How do you get that model to a point where it's still accurate and fast enough to do its intended purpose?
A
Yeah, great question, and thanks for teeing that up, Logan. So quantization is the concept where we take a number that is expressed in 32-bit floating point and reduce it to integer 8, and sometimes below 8 as well. Right. You can go to FP16 or integer 8 and below. So, you know, neural networks have these parameters, billions and billions of parameters, that are expressed in floating point, and you take them and shrink them down to integer 8. When you do that, automatically the accuracy of the model will drop. The logic that we have brought in, and this is IP that my co-founder Sek worked on at SRI, Stanford Research Institute, for a DARPA initiative, is how do you maintain the accuracy of the model while shrinking it? So now you've got the computation reduced while still maintaining that accuracy. That is the core competency, the core IP, that we bring to the table. That could be the easiest solution. But now, when you come to customers, there are so many design parameters that they need to consider. Your choices may not be my choices. Right. Your use cases will have different choice points than mine. So what we've done is, you know, you guys have given us a bunch of hardware that we have in our lab in Princeton. We have different hardware like that. We take different model architectures, we tune them, run them on hardware, and measure performance. How fast does it run? How much memory does it take? How much power does it draw? What is the relative accuracy of the model? What is the size of the input data? All these things. We measure about 19 or 20 data points per run that we do. And we have run over 200,000 hours of edge compute and collected 12 terabytes of information. So we are sitting on top of that data. This is what helps our customers decide what is the right model architecture, what is the right model configuration, for their particular use case. That's what makes the choice points easier and makes it faster to deploy.
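To make the float-to-integer idea in this answer concrete, here is a minimal sketch of naive symmetric int8 quantization in plain Python. It is a textbook scheme chosen for illustration, not Latent AI's accuracy-preserving method, which the conversation does not describe in detail. The weight values and function names are made up.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the integers back to approximate float values."""
    return [v * scale for v in q]


# Hypothetical 32-bit float weights from one layer.
weights = [0.82, -1.37, 0.05, 0.0, 2.54, -0.61]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(q)
print(f"scale={scale:.5f}, max round-trip error={max_error:.5f}")
```

Each weight now fits in one byte instead of four, and the round-trip error is bounded by half the scale. That rounding error, accumulated across many layers, is the source of the accuracy drop mentioned above.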
B
So Jags, you know, interesting point. Let me kind of expound upon the example a little bit. Right. Let's say you have that oil and gas pipeline. It's in the middle of Alaska. There should be no one there, obviously. In that use case, I'm probably not so worried about the accuracy of the model, per se. I want it to detect quickly, because no one should be there. However, you put that pipeline, let's say, in the middle of Missouri, where I'm from, where there are activities, farms, cows, etc. Like, yes, you want it to be quick, but you're more worried about accuracy, because you don't necessarily want to pick up, you know, a cow or a child. You're looking for something more nefarious. How does Latent AI help with that choice, ultimately, of which model? Because every model is a tad bit different, right, in terms of its strengths and its skill sets.
A
That's absolutely right, Logan. So this is where, when I told you we are sitting on top of 12 terabytes of data. This is where the customer can choose. I want my model to be this accurate. I don't worry about power because it's going to be wall plugged, but I need it to be this fast. Right. Like, what are the levers that you can push? Those are the levers that we expose to the customer that information. You can slice and dice it, you know, seven ways to Sunday if you want, however you want. And whatever is your priority, you can align to that and the customer can pick based on that. And also a factor here is your hardware. We should never forget about hardware because AI does not run in the ether, it runs on hardware. Right? And if you, let's say you have 100 miles of pipeline and you need to put a sensor or a processing unit every mile, you know, think about it like you start with an expensive piece of hardware because yeah, you want that accuracy and you want that speed, but you start to scale it up, it's going to become very expensive. So you have to consider that as well. So like, okay, what is the best value prop that I can get with my hardware? And this is the speed that I need, this is the accuracy that I can live with. All of that can be compressed into that model selection that you're doing.
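The "levers" described here amount to filtering benchmark records by constraints and then weighing hardware cost at scale. The sketch below shows that idea under stated assumptions: the `BenchmarkRun` fields, the three sample records, and all prices are invented for illustration and do not come from Latent AI's dataset.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRun:
    model: str
    hardware: str
    accuracy: float       # relative accuracy, 0..1
    latency_ms: float
    power_w: float
    unit_cost_usd: float  # hypothetical hardware price per site


# Invented sample records standing in for measured benchmark data.
RUNS = [
    BenchmarkRun("detector-large-fp32", "gpu-server", 0.95, 12.0, 250.0, 9000.0),
    BenchmarkRun("detector-small-int8", "edge-module", 0.91, 18.0, 15.0, 600.0),
    BenchmarkRun("detector-tiny-int8", "microcontroller", 0.80, 45.0, 2.0, 60.0),
]


def pick(runs, min_accuracy, max_latency_ms, max_power_w=None):
    """Return the cheapest run that meets every constraint, or None."""
    ok = [
        r for r in runs
        if r.accuracy >= min_accuracy
        and r.latency_ms <= max_latency_ms
        and (max_power_w is None or r.power_w <= max_power_w)
    ]
    return min(ok, key=lambda r: r.unit_cost_usd) if ok else None


# Wall-plugged deployment: accuracy and speed matter, power is not capped.
choice = pick(RUNS, min_accuracy=0.90, max_latency_ms=20.0)

sites = 100  # one unit per mile along 100 miles of pipeline
print(choice.model, choice.hardware, choice.unit_cost_usd * sites)
```

With these made-up numbers the large model also meets the targets, but at 100 sites it would cost 15 times more than the smaller quantized one, which is the scaling concern raised in the answer.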
B
Makes sense. I mean, I think that's what makes Latent AI so unique. If you haven't, you should definitely check them out at latentai.com. But I think what makes you all so unique, and I hear this from customers all the time, is this: well, hey, I have this use case, what do I need? And a lot of times it is guessing. I mean, I won't say guessing, let's say an educated guess on what hardware you need, what sensors you need. But the way that you approach it is more like, hey, let's look at the use case, and let's start with the model, and then we work backwards from the ultimate goal, saying, hey, we want it to be fast, low power, accuracy doesn't really matter. Okay, well, we need a sensor. Okay, well, what's the power draw on this? And everything kind of sums up to the end goal, which I really like, where you're not necessarily overbuying. From Latent AI all the way down to the hardware, you're buying stuff that makes sense for that use case, which I absolutely love, because not a lot of people are doing that. It's more like, here's the server, have fun, you know. Yeah. So one question I have for you, and I don't think I've ever asked you this: obviously there are some cornerstone models you have, like, for computer vision, use YOLO, use others. What is Latent AI's stance on new models? Because I know that there are the mainstays that are the tried and trues. But how often does Latent AI go out and look at the market to say, hey, this XYZ model is making some noise? I'm just thinking out loud, I don't know which one, but let's say one is making enough noise. Hey, we're hearing from our customers, we're going to add this and start gathering data on it. Or do you try to bring in as many models as you can to make the best decisions for your customers, regardless of what they're doing?
A
Look, the way we approach model architecture is basically by looking at the architecture that underlies what things are being built on, right? Like, if you think about your YOLOs, your EfficientDets, all the regular computer vision models, they are all based on deep neural networks. Those architectures we are easily supporting. Then you've got your LLMs, and they're all transformer based. So we look at those foundational architectures and ask, can we support that through our pipeline? Once we get that support going, then bringing these open source or foundational models in and running them through the process becomes easier. So do we have support for transformers? Yes. But with the pace that LLMs are moving at, we are basically taking that as a customer requirement and a customer request, because, I don't know, you might have a Llama version 3 that you're using for a particular use case, that you have certified on, and that's what you want to use. So you bring that to us and we can work with that. Somebody comes with another Mistral model and they want that to be done. Like, we can work with that.
B
Right?
A
Like there is only so many buttons that I can push as a startup. So we from an LLM perspective we are working on a customer requirement, but from a foundation perspective we support the transformer architecture easily.
B
Okay, I mean, that makes sense. So now, I mean, I know it's not late-breaking news, I think it's been formally announced. What we've discussed so far is kind of your core competencies, you know, around model quantization and maintaining accuracy, and the LEAP program helping you select the right hardware and model based on the input factors. But now Latent AI has launched an agentic product. So let's just start with, one, what is it called? And then two, give us the 90-second, not sales pitch, but 90 seconds on it, and then we'll dive into it more.
A
Absolutely. We were very innovative in the name choices. We ran a process and we settled on Latent Agent.
B
Latent Agent Shelley. I mean, that was probably many focus groups.
A
Exactly.
B
That's so funny.
A
So, Latent Agent. If you think about it, you know, we are engineers, and we started to build this product for engineers, right? And you've met most of the team members, right, like PhDs from MIT to Stanford to everywhere, and the deep ML compiler engineers that we have. So when we built the tool, our tool looked like a 747 cockpit to begin with. Knobs and switches everywhere you turn, right? You could do surgery on models, and you could go to the depths and analyze things. All of that could be done. So that's where we started many years ago. And one of the goals for us was to abstract that complexity up and up and up, abstract it away, so that it becomes easier to use the tool and easier to build and deploy models. That is the North Star that we are always chasing. Okay? So when agents became a thing, we were watching that space closely and seeing how this foundational-model-to-agent transition happened, and we saw our opportunity and we jumped in right there. So what we have done is taken the 747 cockpit and changed it into a car cockpit, an automatic car. Not even a stick shift, an automatic car. So what we have provided is this agentic interface, a natural language interface, for any software developer, not an AI engineer, not an ML engineer, any software developer, to use. They can design their model, interact with the 12 terabytes of information to extract which is the right model for them, train the model, and extract the object file that they can plug into their application and deploy.
B
Okay, all right. So many questions. So many questions. Okay, so I know, I mean, I've met Sek and I've met, you know, a lot of the very smart people. You have the 747. I mean, I'd say it's more like a 787 cockpit. There are a lot of, you know, dials and stuff like that. But now you've kind of scaled it back. You've controlled it. And it's designed specifically for software engineers, you said, not ML engineers, not neural network or machine learning engineers. Why the choice of just software developers and, you know, coding, versus exposing more of what you do on the core competency side of it to an ML engineer, a neural network engineer, an AI developer?
A
Right. You and I are old enough to remember the early days of the Internet.
B
Yes, I do. Dial up. Yeah. Oh yeah.
A
And remember all the things that we had to code on our own, all the little services that we had to build on our own. Think about that. To think about today, if you were to build a web application, how easy it is.
B
It's a lot easier.
A
So many things that are readily available that you could just plug and play and you can do a lego brick.
B
Like WordPress, you just. Yeah, easy.
A
Exactly, exactly. So for me today, we are in the days, early days of the Internet in the AI world.
B
I agree.
A
Okay. And for us to distribute AI inference across different applications and different devices and make it ubiquitous, not just in the cloud, but ubiquitous across the landscape, you need every software developer to be thinking about incorporating AI into their workflow. So that is the reason that we went ahead and built this specifically for developers. And then also from a TAM perspective, right, from a market size perspective, you are talking about probably 500,000 ML engineers, you know, highly qualified ML engineers. If you were to err on the cautious side, maybe 300,000 to 500,000 engineers around the world today.
B
Right.
A
But software engineers, there's a bunch of them. 30 million is one of the rough counts.
B
It makes sense. Makes total sense.
A
Right.
B
And you're right, at the end of the day, true software developers are not AI experts. And I can't tell you how many conversations I've had with Nvidia about that. This is going to be kind of the new wave. Right. You're not teaching a software developer how to fine-tune a model or train a model. You're not doing that. You want to teach them how to insert, via coding or whatever, an AI application or an API call or whatever it is, to be able to take advantage in their software of some sort of AI inference at the end of the day. So it makes total sense. So your application is obviously, probably, chat based. Give me a couple of the workflows that, if you're a software engineer, you could start and then ultimately complete within your AI agent.
A
Absolutely. So the first thing is that you register on latentai.com and request a license key. Right. And then we have provided the agent orchestrator as part of the VS Code IDE. So you can install the agent through VS Code, and you can interact with our agentic workflow orchestrators through VS Code, and it'll generate all the code, everything, from there, and you can literally deploy from there too. And the other thing that we are providing, Logan, is that we give you access to the hardware in our lab for you to compile, cross-compile, for it, so you don't have to have a Jetson or an RTX on your machine to compile for that. We'll compile it for you, and then we provide you the executable that you can integrate into your application.
B
Okay, that's pretty awesome.
A
Yeah. And then the roadmap is of course to make that hardware available in a cloud form so that users can also test it out there. Let's say, for example, you're building a use case and you have not decided on which hardware you want to go with. You got your accuracy right, you've got your speed right, and you've got your power budget all figured out and then you're like choosing, okay, do I go with hardware A, B or C? Normally customer would have to buy each one of those boards or machines and then they'll have to test it out and see how it performs in terms of like, you know, rack space or, you know, the size, weight, wherever that they need to fit it. So instead of that, can we allow them to test it out and actually do the performance benchmarking on a real hardware for their particular use case? So that is the next roadmap item that we want to tackle.
B
I mean, that's super powerful, right? I think that really aligns with what you all do as a company. It's not just telling you which model, not just optimizing, but being able to tell you, hey, which hardware. And it makes perfect sense. Right? I'll give you a bad analogy. It's like, hey, you know, you want to buy a car, you have the loan, and you have multiple cars to choose from, and you're like, I think this one. But I don't get to test drive it. So I don't get to see how many miles per gallon it gets. I don't get to see how fast it goes. It's just, here you go, have fun. So I mean, that makes total sense. I don't want to think about the complexity or the logistics of that process, depending on how many people are on the platform versus all the hardware options and all that. So we'll save that for another talk. But anyway, I love this. And here's the thing. I've played with different agentic tools and agentic workflows, where some are very, very good and some are very, very not great. And the difference is that the good ones are very well boxed and defined, in the sense that, hey, this agent is not trying to do everything, it is trying to do a very simple linear task that it's very good at. The ones that aren't so good are the ones that say, oh, I'm going to create an AI edge agent, and then I'm going to fine-tune your model, and I'm going to do this, and I'm going to do this. All these things never work super well. So I already know the answer, but is Latent Agent more the linear, defined use case? Or is Latent Agent trying to do everything?
A
Yeah, we tried to bake cookies with this and we failed. So we stuck to our lane. You know, one thing is that when you're a startup and you're focused on AI, you have customers that will come in and tell you, hey, can you solve this problem for us? Can you solve that problem for us? Right, things will pull you in 10,000 different directions, and you don't want to be distracted there. Similarly, when we started the agentic work, we were very, very clear: this is exactly what we are doing, this is exactly the problem that we are trying to solve. And that North Star, right, was very clear: how do we enable a software engineer to build ML models? So one of the things that my guys reported is this, right: a non-ML developer, a software engineer, built a model, and as they were testing the model, the agent found a memory leak, fixed it, and gave the object file back. So this is what I'm saying about narrowing it down. When you build a C object file, right, because we can compile into C, not just Python. In C, if you don't properly handle your memory, your pointers, you're going to run out of memory pretty quickly and you're going to blow up the application. The agent was able to identify that and fix it. That's the narrow focus that we have gotten to, and we are ensuring that we stay within that, which is great.
B
Which I already knew the answer, so that was kind of a softball. Now my question, though. Obviously it's been battle tested. Well, I mean, the first part of the question is, that's a great example. Have you seen it do other things, in a good way, that were not expected? Like, where you've heard a customer say, hey, I was trying to build this, compile this, you know, Anthropic or whatever, and it did this, and I was like, whoa. Has it done anything like that? And if not, that's okay. I'm just curious.
A
No, I think the data is still fresh. Right. Like there's a lot of.
B
It's not 12 terabytes of data yet.
A
Not yet. Right. Like, but we are getting, I think in the first six weeks we had over 80,000 interactions.
B
Wow. Okay.
A
Right. So there's a lot of, lot of data there. Most of the customers that have expressed they have jaw drop experience. Right. Like holy. Right, like this is really good. Like we didn't expect this sort of engagement from an agent. Right. Very, very narrow, very specific and helping you solve a particular problem. Right. So I think that that has been a pretty, pretty good experience for us so far.
B
It's awesome. So gotta ask and you don't have to share, but you have use case 1 design for software engineers being able to compile ultimately in the agentic framework, what is kind of the next workflow or the extension of the current workflow or what's the new workflow that you think you're going to try to tackle next?
A
Okay, so we've solved this. You have to think about what comes before that workflow. Data is the new oil. Everybody's sitting on data. Data that is not context aware is just dumb data. So we have built an assistive labeling product. This came as a requirement from a customer, our U.S. Navy customer. And we have taken this to the commercial side as well. So we can literally label data 80 times faster than a human can. Right. We are not saying that you take it and label everything by system. No, it's more about how you keep a human in the loop who is verifying and validating the data, and labeling it at a much faster pace. So what that allows you to do is get your data, label it, prepare it faster, and bring it into the training process. So that is the next stage. Right. The labeling part, and then bringing your own data to this training process, those are the two things that we are addressing right now.
B
Okay. Not released in the process.
A
In the process, in the process and then the last mile at the end of the tail. Right. Like this is where, you know, we work with you guys in deploying this to the Department of Defense using the Dell tough book. How do we enable rapid retraining of model at the edge instead of bringing all the new Data, the drift information, everything back to the enterprise, into the cloud to retrain. How do we push this information onto a laptop at the edge, retrain it and redeploy that new information onto your assets immediately?
B
I love it. I gotta ask some questions about that. I know that we're getting close to the end and I want to be respectful of your time. But with the labeling, it's interesting, because I haven't done anything high end, but I've done some image generation fine-tunings and stuff like that for a couple of animation studios. And I absolutely hate labeling data, because you can use, like, BLIP and other things that will contextually read it, but it's trash, it's not very good. And if you hand label it, I mean, that really comes down to what kind of person you are. Because I'm not a very creative person in an artistic sense. And I did a little test where I gave context to what the image was and then an artist did it, and it was night and day, right? So I guess my question is, one, what are you labeling? Is it video, images, you know, PDF documentation? And then what is the process for the human in the loop to ensure the accuracy and the reinforcement training?
A
Great question. So currently we are focused on vision, because that's where most of our customer base is. So think of videos, think of images and stuff, right? So let's say you have 10,000 images in your folder. We can ingest all those 10,000 images, identify objects in each one of them, and create a plot, a graphical plot, clustering similar-looking objects together. So you will see a plot with, let's say, green dots all grouped together, red dots grouped together, and others grouped together. So what this means is, if you have, let's say, a potted plant, simple example, a potted plant in your image data set, all the potted plants will be in one cluster. You have a bicycle, all the bicycles will be in another cluster. So now visually you can see them grouped. You can look at that and quickly sample those. Okay, this is all potted plants. So I can select them en masse and label them potted plant, then select this other group en masse and label it bicycle. Right? So that's how easy it is. Then you can also do something similar to a Google search, right? You pick a bicycle and say, I like this bicycle, I want you to find all the bicycles in my data store. It'll go search and bring them to you and say, all right, these are all the bicycles that I found. Do you want me to label them as bicycle? Yes, done.
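The cluster-then-bulk-label workflow described here can be sketched in a few lines. This is an illustration under loud assumptions, not the product's algorithm: the 2-D embedding vectors, the greedy clustering rule, the 0.9 similarity threshold, and the image ids are all invented, and real systems would use high-dimensional embeddings from a vision model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def cluster(embeddings, threshold=0.9):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of image ids; the first id is the seed
    for image_id, vec in embeddings.items():
        for members in clusters:
            if cosine(vec, embeddings[members[0]]) >= threshold:
                members.append(image_id)
                break
        else:
            clusters.append([image_id])
    return clusters


def find_similar(query_id, embeddings, threshold=0.9):
    """'Find more like this one': ids whose embedding is close to the query's."""
    q = embeddings[query_id]
    return [i for i, v in embeddings.items()
            if i != query_id and cosine(q, v) >= threshold]


# Toy embeddings (hypothetical): similar objects point in similar directions.
EMBEDDINGS = {
    "img_001": (0.98, 0.10), "img_002": (0.95, 0.20),
    "img_003": (0.10, 0.99), "img_004": (0.15, 0.97),
    "img_005": (0.99, 0.05),
}

clusters = cluster(EMBEDDINGS)

# Human in the loop: a reviewer samples each cluster and names it once,
# and that name is applied to every member en masse.
reviewer_names = ["potted plant", "bicycle"]
labels = {}
for members, name in zip(clusters, reviewer_names):
    for image_id in members:
        labels[image_id] = name

print(clusters)
print(find_similar("img_003", EMBEDDINGS))
```

The speedup comes from the reviewer making one decision per cluster instead of one per image, with `find_similar` covering the search-by-example step.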
B
Got it. Makes sense. So it's very, it's very vision, very object based versus descriptive, like in the artistic sense. But yeah, I mean that makes sense because that's. That makes sense.
A
That is coming. Another one, another layer. Okay, so we are using two VLMs.
B
Okay.
A
One VLM will describe what's in the scene and it'll extract those objects. We'll run the other VLM to verify and see which one, like what is the ground truth versus, you know, comparing the two inferences. Then this becomes the feeding pipe into the process that I just explained to you, the clustering and manual labeling. So you get reinforced data labeling process through this. So you get descriptive as well as label data with the context that you're coming out.
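A minimal sketch of the two-model cross-check, assuming each VLM's output has already been reduced to a list of object names. The routing rule here (agreement is accepted, disagreement goes to a human reviewer) is an assumption for illustration; the conversation does not spell out how disagreements are handled.

```python
def cross_check(describer_objects, verifier_objects):
    """Compare two models' object lists after normalizing case and whitespace."""
    a = {o.strip().lower() for o in describer_objects}
    b = {o.strip().lower() for o in verifier_objects}
    agreed = sorted(a & b)        # both models report the object
    needs_review = sorted(a ^ b)  # only one model reports it
    return agreed, needs_review


# Hypothetical outputs from the two VLMs for one image.
vlm_a = ["Bicycle", "potted plant", "dog"]
vlm_b = ["bicycle", "potted plant", "cat"]

agreed, needs_review = cross_check(vlm_a, vlm_b)
print(agreed)        # ['bicycle', 'potted plant']
print(needs_review)  # ['cat', 'dog']
```

Only the disputed objects reach the reviewer, so the agreed set can feed the clustering and labeling step directly.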
B
Man, I mean, I'm probably getting ahead of myself, but I wanna show this off at Supercompute 25 on the GB, probably GB10 or GB300, but we'll get into that later. So we'll leave that for another day. But Jags, I know we're kind of up against it, man. Always a pleasure talking to you. Can you go ahead and tell everyone where to find you on the Internet, and then give one more quick overview of Latent AI? Pretend someone just joined in and you have one minute to tell them everything about Latent AI, and then we'll go ahead and close it out and get you back to work.
A
Absolutely. So you can reach me at jags@latentai.com, and always visit the website, latentai.com, to find all the latest information. So if you're a customer that is just now starting to think about what to do with AI from a computer vision perspective, if you want to do quality assurance or surveillance or whatever, think of us, right? We can help you even if you don't have AI engineers. We can empower your software engineers and help them build AI models, rapidly deploy them, and get to production faster. That's basically what we help you do. We do this for the Department of Defense. We are heavily active in the U.S. Navy, Air Force, Army, Special Forces. We are doing a lot of cool things over there. And that is technology that we are bringing back to the commercial side, and we would like to help you.
B
I love it. And they're all really great people too. I mean, that's just the cherry on top. So Jags, really appreciate the time. As always, great catching up. Love learning about Latent Agent, and we'll definitely make something happen for Supercompute 25. So with that being said, you know, I think just to wrap this episode up: GPUs get bigger and bigger and models continue to get bigger and bigger. You don't necessarily have to follow that trend, right? Yes, in some use cases it makes sense, but at the end of the day, using something like Latent AI, where you're condensing down, focusing on the core competencies within the model, being able to right-size your hardware to fit the computational speed and accuracy requirements, especially when you're first getting started and you don't have an unlimited budget, really makes a lot of sense. So definitely take a moment, check them out at latentai.com. And with that, this is Logan from Reshaping Workflows. Until next time, keep your AI running locally on your Dell Pro Max, your Nvidia RTX GPU, and I'll catch you on the next one.
A
This podcast was produced in partnership with Amaze Media Labs.
Podcast: Reshaping Workflows with Dell Pro Max and NVIDIA RTX GPUs
Host: Logan Lawler (Dell Technologies)
Guest: Jags Kandasamy (Co-Founder & CEO, Latent AI)
Date: November 6, 2025
This episode dives into how Latent AI is transforming the deployment and efficiency of artificial intelligence at the edge—delivering intelligence closer to where the data is created. Host Logan Lawler interviews Jags Kandasamy, who explains why shrinking AI models and optimizing them for diverse hardware—from powerful servers to remote sensors—is crucial in modern workflows. The conversation covers Latent AI’s innovative approach to model compression, hardware-software co-design, real-world use cases (like oil & gas surveillance via satellite connections), and the recent launch of Latent Agent, a tool that empowers software—not necessarily AI—engineers to harness AI in their applications. The episode also offers a behind-the-scenes look at data labeling and future workflow innovations.
Timestamps: 00:48–03:40
Data Explosion & Bandwidth Limitations:
AI’s rise is fueled by massive, exponentially growing data from IoT, sensors, & mobile. Yet, sending all this to the cloud is impractical:
“If you were to collect all this data and start to send it up to the cloud and process it there, you are not going to have enough bandwidth... The more the bandwidth you open up, the more devices come online and more the data that they pump up and they start to clog the pipes up.”
— Jags Kandasamy (02:08)
Real-Time Decisions at the Edge:
Critical decisions, especially in industrial, military, and healthcare, need to happen where data is generated, not after cloud round-trips.
Timestamps: 04:28–05:56
“So we are helping them run quantized models and smaller models at the edge, at the distributed remote locations that are doing surveillance monitoring… Sending a blurb of data through Satcom is faster than uploading a video.” – Jags (04:28)
Timestamps: 05:56–08:50
“The logic that we have brought in… is how do you maintain the accuracy of the model while shrinking it?... That is the core competency, core IP that we bring to the table.” – Jags (06:50)
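To make the accuracy trade-off in this quote concrete, here is a minimal sketch of plain symmetric int8 quantization in Python. It is a generic textbook scheme, not Latent AI's proprietary method, and the function names and sample weights are illustrative assumptions. It shows where the loss comes from: every weight is rounded to one of 255 levels, so small values can collapse to zero.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale."""
    largest = max(abs(w) for w in weights)
    scale = largest / 127.0 if largest else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


# Illustrative weights; the small 0.003 entry is lost to rounding.
weights = [0.5, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(q)          # [50, -127, 0, 90]
print(max_error)  # at most half of one quantization step (scale / 2)
```

The rounding error per weight is bounded by half a step, but those errors accumulate across layers, which is why a shrunken model needs extra work to hold its accuracy.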
Timestamps: 09:35–10:54
“You have to consider that as well… what is the best value prop that I can get with my hardware? … All of that can be compressed into that model selection.” — Jags (09:35)
Timestamps: 12:35–13:51
“There is only so many buttons that I can push as a startup… we are working on a customer requirement, but from a foundation perspective we support the transformer architecture easily.” – Jags (13:38)
Timestamps: 14:23–20:53
“We’ve taken the 747 cockpit and changed it into a car cockpit—an automatic car. ... Any software developer ... can design their model, ... train the model and extract the object file, ... and deploy.” – Jags (15:05)
Timestamps: 22:24–23:48
“[W]e tried to bake cookies with this and we failed. So we stuck to our lane. ... That North Star ... how do we enable a software engineer to build ML models? That was very clear.” – Jags (22:24)
Timestamps: 25:22–27:01
“We can ingest all those 10,000 images, identify objects in each one … and visually … cluster similar looking objects together.” – Jags (27:59)
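The clustering step in this quote can be sketched as follows. This is a simplified illustration, assuming each detected object has already been turned into an embedding vector by some vision model; the greedy cosine-similarity grouping, the `threshold` value, and the toy two-dimensional vectors are all assumptions made for the example, not the algorithm Latent AI uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_cluster(embeddings, threshold=0.9):
    """Group indices whose embedding is close to a cluster's first member."""
    clusters = []
    for index, vector in enumerate(embeddings):
        for cluster in clusters:
            representative = embeddings[cluster[0]]
            if cosine(vector, representative) >= threshold:
                cluster.append(index)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append([index])
    return clusters


# Toy embeddings: objects 0 and 1 look alike, as do objects 2 and 3.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
clusters = greedy_cluster(embeddings)
print(clusters)  # [[0, 1], [2, 3]]
```

Once similar objects are grouped, a person can label a whole cluster at once instead of labeling each image individually.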
Timestamps: 26:26–27:01
On Edge AI:
“You need every software developer to be thinking about incorporating AI into their workflow.” – Jags (17:40)
On Quantization:
“When you do that [quantize], automatically, the accuracy of the model will drop. The logic that we have brought in ... is how do you maintain the accuracy of the model while shrinking it?” – Jags (06:50)
On Agent Focus:
“We tried to bake cookies with this and we failed. So we stuck to our lane.” – Jags (22:24)
On Democratizing AI:
“Software engineers, bunch of them—30 million is like one of the rough counts ... But software developers are not AI experts ... This is going to be kind of the new wave.” – Logan (18:23 & 18:33)
On Platform Flexibility:
“You’re not necessarily over buying—you’re buying kind of, from Latent AI all the way down to the hardware, you’re buying stuff that makes sense for that use case, which I absolutely love.” – Logan (10:54)