
Jeff Nielsen
I'm super excited to talk to Ramin today. As the CEO of Liquid AI, he's got these fully new ways of thinking about AI, this underlying technology of liquid learning models. I think it's going to revolutionize the way that, frankly, all businesses could be using AI a few years from now. Hey everyone, this is Digital Disruption. I'm Jeff Nielsen, and joining us today is Ramin Hasani, co-founder and CEO of Liquid AI. Ramin, super excited to have you here. Just jumping into it: whoever came up with the names Liquid AI and liquid neural networks, from a marketing perspective, I love them. Can you walk us through, though, what the technology is and what makes it different? Hopefully not at the PhD level.
Ramin Hasani
Yeah, definitely. The short story is that we started looking into how we could bring insights from biology and physics into machine learning, because we wanted to see what mathematical operators we could find in biology that don't exist, or that we aren't using right now, in the space of neural networks. We started this project in 2015, and everything started in Vienna, at the Vienna University of Technology, where I started my Ph.D. The professor I was working with, together with my current CTO at the time, started looking into the brain of a little worm. It's a very, very tiny worm, but a very popular one. It has won us four Nobel Prizes so far, and it shares 75% genome similarity with humans. That's why it's really useful for understanding how a nervous system, or cells in general, behave.
Jeff Nielsen
So that's what you mean by popular?
Ramin Hasani
Yeah, yeah. An award-winning worm among neuroscientists. So that's why I had to say that. The body of the worm is transparent, so everything that happens inside it you can see under the microscope, and that makes it a great model organism. So we started looking into the data of the nervous system of the worm. The reasoning: in the tree of evolution, humans split from this worm 600 million years ago. So if you think about it, these are somehow our ancestors. We wanted to understand, at the core, what the operations are. If you understand how a nervous system works at the level of the worm, maybe you can take that and scale it into better, more sophisticated learning systems. That was the motivation. We started on C. elegans and the mathematical operators we were learning from it, and this new type of neural network that we designed, I called it liquid neural networks. My professor at the time, Radu Grosu, wanted to call them regulatory neural networks, because these are input-dependent systems: they adapt, or regulate, their dynamics as they go forward based on the inputs they receive. That was the operator that inspired us, how neurons exchange information with each other, and I called it liquid. That was 2017, when we discovered this thing. And we showed that with 12 worm-inspired neurons, you can drive a car, or rather drive a robot, a small mobile robot.
And then we showed that with 19 neurons, with a convolutional neural network on top, you could do camera-based autonomous driving, full blown. That was what we did at MIT. In 2017 I joined MIT, together with my current CTO, Mathias Lechner, at the lab of Daniela Rus, who is the director of MIT CSAIL, along with another co-founder of ours, Alexander Amini, who was a scientist at MIT at the time. As four co-founders, we continued our journey of scaling this type of liquid neural network to real-world applications. We started with robotics and autonomy, and then we scaled it into modeling time series: systems that can work really well on sequential data. The data could be coming from sensors mounted on a robot, or it could be video, audio, text, anything. We started seeing promise in this technology, applying it in different domains. And that was the beginning of how liquid neural networks became a thing. Right now some students of ours at MIT are doing their PhDs in it, basically, which is kind of a new way of doing AI.
Jeff Nielsen
Sorry, so they're doing their PhD in this?
Ramin Hasani
Working with this specific model, yeah, exactly. It's a continuation: how can we extend our understanding of its characteristics to build better and better AI systems, specifically for real-world applications? Daniela's lab is a robotics lab, so the applications were always centered around the real world and the impact we could have on it.
Jeff Nielsen
Where things are today, that's so amazing. My mind is still blown that, as you said, this transparent worm has informed this entirely different way of doing things. When you compare this to a more traditional neural network, what's different about this one? Aside from the fact that it's worm-influenced, it sounds like, with 12 or 19 neurons, it's in some ways a lot simpler of a model. What are the implications of that, and what makes it a more attractive model to use?
Ramin Hasani
Yeah, so you see, when we were focusing on robots, we were talking about resource-constrained environments, right? On the hardware side you don't have access to a massive amount of compute; you have a Raspberry Pi running on a robot, and that's the brain of the robot. So what you try to do is build an expressive system that fits into that small footprint: how can we maximize the expressivity of a system under those resource constraints? You have two objective functions: one is the quality of the model, the other is the efficiency of the model. And that's where biology is very useful; it can give you that expressivity. When I say expressivity, I mean the models can map input data to output data better than other kinds of systems. A system that models data better means higher quality, or a lower error rate, on the benchmarks you measure. You train on a training set and then test on a test set, and the performance on the test set, which comes from the same distribution as the training set, shows you how well your model generalizes, and therefore how expressive it is. That's just a top-level view of what's going on. In reality, where we saw these models meet machine learning was recurrent neural networks: models that have feedback mechanisms. That's one of the fundamental properties of our models. There is a category of feedback systems in machine learning which we call recurrent neural networks. They're not just computing forward, input to output. They receive an input, they have an internal state, they think, and then they generate an output. That becomes a recurrent neural network.
And this has been around for decades in the realm of artificial intelligence. But liquid neural networks are an instantiation, a type of recurrent neural network, coming from the mathematics we use for describing physical processes; nervous system dynamics, for example, are a physical process. The mathematics we use there is differential equations, as a tool to describe those things. That became the novelty: bringing in continuous-time systems that predict the next step of a system over a delta t, the kind of systems that describe our reality. These are ordinary differential equations. Physicists' lives are full of differential equations, because this is how we describe how a physical system progresses in time: the complexity of how you go from time zero to time t when you want to describe the behavior of a physical system. That kind of process is modeled by a differential equation, and we used that to describe the behavior of neurons and synapses. That became the base of this new type of AI system, and as a result, new operators got added to the class of recurrent neural networks; those are the liquid operators. Now, in terms of applicability: when we talk about machine learning today, we're talking about larger-scale models, and we have evolved into the generative AI field. Why? Because we realized that the larger you make these models, and the more you scale them, the better they get. At the scale I was talking about, 19 neurons navigating a car, that's a very simple application.
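To make the input-dependent dynamics he describes concrete, here is a minimal sketch of a liquid time-constant style neuron layer in Python. The specific gating function, the weights, and the simple Euler integration are illustrative assumptions for this sketch, not Liquid AI's actual parameterization:

```python
import numpy as np

def ltc_step(x, u, W_in, W_rec, b, tau, A, dt=0.01):
    """One Euler step of a liquid time-constant style neuron layer.

    The state x evolves by an ODE whose effective time constant
    depends on the input u (the "liquid" part):
        dx/dt = -(1/tau + f) * x + f * A
    where f = sigmoid(W_in @ u + W_rec @ x + b) is an input- and
    state-dependent gate. (Illustrative form, not the exact paper.)
    """
    f = 1.0 / (1.0 + np.exp(-(W_in @ u + W_rec @ x + b)))  # gate in (0, 1)
    dxdt = -(1.0 / tau + f) * x + f * A
    return x + dt * dxdt

# Tiny demo: a 4-neuron "liquid" layer driven by a 2-d input signal.
rng = np.random.default_rng(0)
n, m = 4, 2
W_in, W_rec = rng.normal(size=(n, m)), rng.normal(size=(n, n))
b, tau, A = rng.normal(size=n), np.ones(n), np.ones(n)

x = np.zeros(n)
for t in range(100):
    u = np.array([np.sin(0.1 * t), np.cos(0.1 * t)])  # changing input
    x = ltc_step(x, u, W_in, W_rec, b, tau, A)
print(x.shape)  # (4,)
```

Note how the same input reaching the layer at a different internal state produces different dynamics; that is the "regulatory" behavior he mentions.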
But if you want to do more sophisticated tasks like generative AI, and do language modeling or multimodal language modeling, where the input could be video, audio, and text, you cannot really do that at a worm level of intelligence. You need to scale this thing to much larger instances. One of the hardest things about scaling this technology was that in the academic domain we had optimized it for small neural networks, and if you want to scale this mathematics, you run into a lot of trouble. Consider that at CERN in Switzerland we had some of the first supercomputers. Why do physicists use supercomputers? For modeling the physical processes that happen at the level of atoms and so on. When you want to model physical processes at scale, you need supercomputers, because differential-equation-based mathematics is very hard to run at scale. The larger you make these systems, the more compute they need; forget about running them efficiently. So we had to make another breakthrough: taking these operators and building an efficient version of them, so that you can actually scale them to billions or maybe trillions of parameters. That was a Nature Machine Intelligence paper we published in 2022, and it was actually the start of the whole process of why we thought about building this company.
So in November 2022, after we published this paper, people wrote about it. This was us solving a fundamental problem: going from a differential equation that didn't have a known solution into the solution space of that differential equation, which allowed us to bypass the computational complexity of that mathematics and turn it into something we could run very efficiently and potentially scale for the first time. Then Quanta Magazine wrote something in January of 2023, and my inbox was full of VCs. Everybody was saying, oh my God, this is a new type of neural network, they're so powerful; as we showed on benchmarks, they do everything really well at small scale, and now there's the potential to scale them. Everybody was getting excited: if 19 neurons can drive a car, what would happen if you put billions of these neurons next to each other? That became the thesis of the company. In March of 2023, four co-founders, we started Liquid AI out of Daniela Rus's lab.
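One way to see the efficiency idea behind the 2022 result he describes: if the gate in a liquid-style ODE is treated as constant over a step, the equation becomes linear in the state and can be advanced in closed form, with no iterative ODE solver. This is a toy illustration of that principle under those assumptions, not the actual formula from the Nature Machine Intelligence paper:

```python
import numpy as np

def closed_form_step(x, f, tau, A, dt):
    """Advance dx/dt = -(1/tau + f)*x + f*A exactly over dt,
    treating the gate f as constant during the step.

    Because the ODE is then linear in x, it has an exact solution:
    the state decays exponentially toward the equilibrium
    x_eq = f*A / (1/tau + f). No iterative solver is needed.
    """
    k = 1.0 / tau + f                # effective decay rate
    x_eq = f * A / k                 # equilibrium the state relaxes to
    return x_eq + (x - x_eq) * np.exp(-k * dt)

# One large closed-form step matches many tiny Euler steps.
x0, f, tau, A = 0.0, 0.5, 1.0, 1.0
exact = closed_form_step(x0, f, tau, A, dt=1.0)

x = x0
for _ in range(10000):               # fine-grained Euler integration to t = 1
    x += 1e-4 * (-(1.0 / tau + f) * x + f * A)
print(abs(exact - x) < 1e-3)  # True
```

The point is the cost profile: the closed-form expression is one evaluation per step, while a naive ODE solver needs many sub-steps for the same accuracy.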
Jeff Nielsen
It's an amazing story, and I'm curious, Ramin, because there are so many different directions you can take this thing, as you said. And I'm glad to hear you've got PhD candidates and PhDs working on it. If I'm hearing you correctly, there's everything you can do with 19 neurons, where it sounds like there's still quite a lot to unpack, and then there's the scaling question. So where to from here? Are you focusing on applications at both scales, or is there an area you're focused on more?
Ramin Hasani
Yeah, definitely. That's a great question. The mission of our company right now is to build very powerful AI systems with the two objectives I mentioned; nature had the same objectives. You build the most intelligent system given the resources and the scale available to you. Right now that means we are building foundation models: general-purpose models that can do general tasks and communicate with humans in natural language, in the form of text, audio, and vision. Even when they tackle raw signals, we want the human element associated with them, so it's important to have that kind of functionality added. We identify ourselves as a foundation model company. With the models we design, we try to maximize performance while being mindful of energy consumption, both on the training side and at test time, after you obtain the models. This is possible because of the breakthroughs we've made on the architecture side, the learning algorithm side, and the data curation side; a lot of research goes into it. I cannot tell you that you can just swap transformers for liquid neural networks and voila, you have a crazy better AI system. There's a whole game we have to play, and that game is an infrastructure game. For example, ever since Google invented transformers and they became mainstream, tens of thousands of repositories have contributed to building the infrastructure for scaling that technology.
But for liquid foundation models, which are the products of this company right now, we had to build everything, from the infrastructure up, from scratch ourselves, because this was a new technology and we didn't have that much of a developer effect. We had some impact, but not as much as transformers, because we hadn't talked about this technology publicly for a while. So, as I said, there are two objective functions, and that means at every scale we build, wherever it's possible to host a foundation model, we want the best quality foundation model. Today that means the edge. When I say edge, it could be a laptop, a mobile phone, a humanoid robot, an autonomous car, a satellite, or a network point inside an IoT device. In all these hardware-constrained places, you would be able to put a foundation model, an intelligent system, and we want the best quality model there. What we managed to enable in the one year and ten months of the company's life is that at this edge scale, we are very confident we have the best quality models at every size. You can put liquid foundation models on a Raspberry Pi that runs a robot, or on a mobile phone and have an offline interaction with these models. Having efficiency in mind, and being able to host something directly on the edge, allows you to have everything done privately on the user's side. That's already a super value you can provide to clients. The idea of sovereignty of AI comes out of this: you want to own your own intelligence.
This is where Liquid AI would help you develop the best models in the cheapest possible form. That's the impact on the client side. For an enterprise that purchases a license, access to our technology and models, the impact is that they can host foundation models essentially for free. Why for free? Because you're not calling an API anymore, and you're not hosting them in a data center; you're running them directly on the edge. If it's running on the edge, the only cost you're bearing is the battery of that device, and that's basically free. So imagine providing intelligence at the highest level on the device. This would be a new way of serving these models, and for the first time it addresses some of the business challenges around generative AI as well: the hosting cost of foundation models as a whole is very, very high, and you want to reduce it. The ideal scenario is, can we reduce it to zero, which would mean hosting directly on a device. Again, there are limits to how far you can enhance the intelligence of a system under a given hardware constraint, but we believe that what liquid neural networks and liquid foundation models enable us to do is put the maximum amount of intelligence on a given device. We have high confidence in that.
Jeff Nielsen
And we're still at that abstract layer, Ramin, of how we use this. To me it's so interesting, and I'm sure there are 10,000 use cases for this. But from an applicability perspective, where are you seeing the most compelling use cases? Where are you seeing people knocking on your door asking whether this would be applicable? What sorts of sectors, and if you're able to tell us, what use cases specifically are the most compelling?
Ramin Hasani
Yeah, absolutely. On a phone, for example: think about Apple Intelligence. The idea of Apple Intelligence in the perfect world doesn't have a server model, because Apple Intelligence has two components: the on-device computation, one small model, and one large model on the cloud. Now, the more of the generative AI applications you care about that you can run on the phone, the better. This could be summarization, text summarization, image understanding, document understanding, translation, composition; a lot of applications you can do with generative AI on the edge. If you can lift all that complexity onto the edge model, you're winning, because the cost is just reduced; you can host it right there. But you need reliable models, and today small models are not that reliable for that kind of job. That's from the pure generative AI point of view. The next wave of this is agentic behavior. A lot of discussion is going into agentic flows and agentic behavior. What you want is a model that can push a button in a reliable way. There's a bunch of buttons you want to give it access to, and it selects among them: book my travels, put something on my calendar, and eventually, think Jarvis. That's the eventual form of intelligence we're thinking about. You can do that somehow with the cloud models, but can we do it in a private way, and how much of that weight can we lift onto the edge? That's the balance we're thinking about putting in place.
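The "push a button" behavior he describes can be sketched as a tool-dispatch loop: the model picks one of a fixed set of tools, observes the result, and loops. Everything here, the tool names and the keyword-matching stub standing in for a real on-device model, is hypothetical scaffolding for illustration:

```python
def book_travel(query):            # hypothetical "buttons" the agent may press
    return f"booked travel: {query}"

def add_calendar_event(query):
    return f"added event: {query}"

TOOLS = {"book_travel": book_travel, "add_calendar_event": add_calendar_event}

def model(prompt, observation):
    """Stub policy: a real edge model would map prompt + observation
    to a tool choice; here we fake it with keyword matching."""
    if observation is not None:
        return ("done", None)       # one action, then stop
    if "travel" in prompt:
        return ("book_travel", prompt)
    return ("add_calendar_event", prompt)

def agent_loop(prompt, max_steps=5):
    observation, trace = None, []
    for _ in range(max_steps):
        action, arg = model(prompt, observation)
        if action == "done":
            break
        observation = TOOLS[action](arg)   # environment feedback loops back
        trace.append(observation)
    return trace

print(agent_loop("book my travel to Boston"))
# ['booked travel: book my travel to Boston']
```

The reliability he emphasizes lives in that `model` step: the whole loop is only as good as the model's ability to select the right button from the observation it gets back.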
And when you think about agentic behavior, you want systems that can take action. One quality you want to improve, and one thing models are good at today, is instruction following and being able to do reasoning tasks. When you want agentic behavior, you want a constant loop of information coming back to your system. The system interacts with a user or an environment, information comes back, and based on that you might take an action. The action might not be optimal, so you want to optimize it on the go. This brings in the ideas of test-time compute and all the things that, for example, o1-style models enable. Those are things you can also do with small-scale models; you don't necessarily need larger models, because not everybody is solving the hardest mathematical problems in the world on a daily basis. There are use cases you can solve with a smaller foundation model. So far, anything that needs to be lifted into a private setting, where you want to use agent workflows and increase your profit margin for generative AI and agentic AI use cases, that has been the discussion with our company. The sectors have been consumer electronics, naturally; robotics; financial services; and biotech. Our technology is really good at what we call in machine learning credit assignment: taking a long sequence of data and working out how the elements of the sequence relate to each other. Like DNA data, for example. DNA is a sequence of information, and you can process it with our type of technology. DNA is one instance; the other thing is data modality.
That's something we don't have an issue with. As long as you have sequential data, such as time series data, we can work with it. For finance, we can combine time series data with language data, and then you can build more complicated predictors, more complicated systems that can provide financial advice or financial portfolio optimization. There's a lot you can do with this, and you can also do fraud detection. With some of our clients in the financial sector, we are working on foundation models built with these liquid foundation models, or LFMs, as opposed to GPTs. We are building the best transaction foundation model you can build: it processes transactional data, whatever the customer has done as a sequence of events, maps that against news coming in, and then tries to find anomalies. So the use case there is fraud detection. The use cases vary across industries. For robotics, you can do control and data generation; for generative AI, these models are really good at synthetic scenario generation. You can build simulators where you generate synthetic scenarios to improve the quality of a robotic system that is taking an action or doing a sophisticated unstructured task. This could be helping a patient, performing surgery, all sorts of human-plus-machine-in-the-loop interactions. That's where generative AI could be really helpful in the robotics space, and consumer electronics as a whole is about whether we can bring the intelligence onto the edge. Those are the places where we've been active.
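As a toy stand-in for the transaction foundation model described above, here is what sequence-based anomaly scoring looks like with a simple rolling z-score instead of a learned model; the data, window size, and scoring rule are all invented for illustration:

```python
import numpy as np

def anomaly_scores(amounts, window=20):
    """Score each transaction by how far it sits from the rolling
    mean of the preceding `window` transactions, in std deviations."""
    amounts = np.asarray(amounts, dtype=float)
    scores = np.zeros(len(amounts))
    for i in range(window, len(amounts)):
        hist = amounts[i - window:i]        # only past transactions
        std = hist.std() or 1.0             # avoid division by zero
        scores[i] = abs(amounts[i] - hist.mean()) / std
    return scores

rng = np.random.default_rng(1)
txns = rng.normal(50, 5, size=200).tolist()  # ordinary spending pattern
txns[150] = 5000.0                           # inject one fraudulent spike
scores = anomaly_scores(txns)
print(int(np.argmax(scores)))  # 150
```

A real transaction model would learn the sequence structure (and, as he says, fuse it with news and other modalities) rather than using a fixed statistic, but the shape of the task, scoring each new event against the sequence that preceded it, is the same.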
Jeff Nielsen
I feel like I could talk to you for an hour about any one of those specific use cases. It's amazing to hear about the breadth and the value created from each of them. Ramin, what are the coolest use cases you've come across so far, or maybe the ones that surprised you the most, where you thought, wow, when we first came up with this technology, I never thought it could do something like this?
Ramin Hasani
I think the most surprising one so far is more on the exploration side of things, with some of our bio partners. Our company's headquarters is in Boston, and Boston is one of the biotech hubs, so we have access to a lot of biotech companies. We started talking to them when we built the first version of our DNA foundation models on top of our technology, and we did a one-to-one comparison. To tell you what these DNA foundation models can do: they can process a sequence of data, take an instruction about that sequence, DNA sequences in this case, and then generate a new sequence based on the information you provide, or the design goal you have, for a DNA sequence that would turn into a protein. You can then fold this sequence into a protein, and that protein could be a drug; basically a drug discovery process. You can use this for drug discovery, which is one of the coolest applications of these things in healthcare. What I observed is that even with a small model, below one billion parameters, we achieved comparable error rates in a one-to-one comparison against a GPT used for generating high-confidence proteins, proteins that would reliably fold into something biologically meaningful. We saw that for the first time we could design new structures that match biological proteins existing in the real world, but with a different DNA sequence. That opened a new opportunity for drug discovery. The protein structures coming out of a liquid foundation model, as opposed to a GPT, were novel to an extent that was practical.
You could take action on them: you can take these things and test them for whether they're going to be a drug candidate, a candidate for the next generation of drugs. I think it's one of the most fascinating things I have seen from generative AI at that small size. Today you're talking about the GPT-4s of the world being much larger, trillions of parameters, and here is a sub-billion-parameter model able to generate high-confidence proteins that might actually turn into an actual drug. That was super fascinating for me. That's something I hadn't seen before.
Jeff Nielsen
And if I'm understanding that correctly, we're not just talking about doing it faster. It sounds like it can help with the hypothesizing or the exploration itself. Is that fair?
Ramin Hasani
Yes, it is, because it's just a better learning system. As I told you, the objective functions are efficiency on one side and expressivity on the other; you want a very expressive model, and they are consistently outperforming transformers. So in some sense, what we're seeing in action is that there's potential for a new wave of AI systems built on top of this new foundation model, which is not a transformer system; it's a liquid foundation model, an LFM. That's super exciting, and we are exploring the horizontal game, as you see; there are so many places you can deploy this technology. But at the same time, we are a tiny startup. We are 50 people right now; we cannot capture everything, so we have to have some focus at the moment. We are working with some of the people in consumer electronics and on the edge, as I mentioned, but we are not stopping scaling these models, because we are on the verge of something: we just recently closed a financing round led by AMD, and that round is going to allow us to scale this technology into regimes that were not possible before. There are a lot of uncertainties, a lot of questions we have to answer. So far we have scaled these models up to less than 100 billion parameters, and now we want to see what happens if you scale them further. Do they scale? So far it looks promising; they're going in the right direction. But we have to see how far we can push the boundaries, because the larger you make them, the more capacity these models have for encapsulating knowledge. That's where we're going.
Jeff Nielsen
It's so cool. And from an efficiency perspective, as you get bigger and bigger, if it's 10 times or 100 times more efficient than the transformer model, whatever the number is... I think many of us know that compute is a huge issue now, and just the sheer power and electricity required to execute this stuff at scale. It sounds like there are potentially some really cool gains to be made that in some ways unbend the curve needed to make this whole technology work.
Ramin Hasani
On the development side of these models: when you want to build a liquid foundation model from scratch, as opposed to a GPT, it's going to be around 10x more efficient. The reason is that the computation scales linearly with the amount of data the system processes, as opposed to GPTs, where the computation grows quadratically in complexity. That's why they're really hard to scale. There is something called context length, the working memory of a foundation model, and memory is such an important aspect of learning; learning and memory are very intertwined. Whenever you want to do something beyond humans, you want to extend the working memory beyond a human's, and that's where things become interesting. That means millions of tokens. Can we really scale to that? When you start scaling transformers into those regimes, their computation scales quadratically, while ours scales linearly. That means the longer the information they process, the bigger the gap becomes: 10x could become 1,000x, depending on the context length you want to use them at. That's a fascinating fact, and the reason it's possible is that the form of computation is different. I told you at the very beginning that we discovered new operators that work efficiently under hardware constraints, and that's where the magic comes in. We have mathematical operators that are extremely good at efficient memory computation, and they scale really nicely with the sequence length and the amount of data they need to process. That's the development side.
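The linear-versus-quadratic gap he describes can be seen with a back-of-the-envelope calculation. The constants here are made up; only the growth rates reflect the discussion:

```python
# Per-sequence compute as a function of context length n:
# quadratic for transformer self-attention (every token attends
# to every other token), linear for a recurrent/liquid-style model
# (one fixed-size state update per token).

def attention_cost(n):       # pairwise token interactions: O(n^2)
    return n * n

def recurrent_cost(n):       # one state update per token: O(n)
    return n

for n in (1_000, 100_000, 1_000_000):
    ratio = attention_cost(n) / recurrent_cost(n)
    print(f"n={n:>9,}  quadratic/linear cost ratio = {ratio:,.0f}x")
```

At a thousand tokens the ratio is a thousand-fold; at a million tokens it is a million-fold, which is why a modest advantage at short context can become enormous at long context.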
On the deployment side, again, you can control how much memory you want to have at test time, when you want to run the model. As the computation gets larger, the amount of memory and computation that happens on a GPT scales linearly: the more information GPTs process, the more memory they have to accumulate. That restricts their usability over long periods on constrained hardware. This is something we overcome with the liquid foundation model, which is roughly 10x lower on this kind of cost, and they scale linearly. That's one of the nice things about them. And as I told you, the gap would be between ten and a thousand times, depending on where you are on the memory side.
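The deployment-side point can be sketched the same way. A transformer's key/value cache accumulates one entry per processed token, while a recurrent-style model carries a fixed-size state no matter how long the stream is. All numbers below are assumed for illustration, not measurements of any real model:

```python
# Illustrative sketch (assumed sizes, not measured) of inference
# memory: a transformer's key/value cache grows with every token,
# while a recurrent/liquid-style model keeps a fixed-size state.

def kv_cache_bytes(tokens: int, layers: int, d: int, bytes_per: int = 2) -> int:
    # keys + values: one d-dimensional entry per token per layer
    return 2 * tokens * layers * d * bytes_per

def recurrent_state_bytes(layers: int, d: int, bytes_per: int = 2) -> int:
    # one state vector per layer, independent of sequence length
    return layers * d * bytes_per

layers, d = 32, 4096  # illustrative model shape
for tokens in (1_000, 1_000_000):
    kv = kv_cache_bytes(tokens, layers, d)
    rec = recurrent_state_bytes(layers, d)
    print(f"{tokens:>9,} tokens: KV cache {kv / 2**20:,.0f} MiB "
          f"vs fixed state {rec / 2**20:.2f} MiB")
```

The cache grows in direct proportion to tokens processed, while the recurrent state does not, which is the hardware-constraint advantage described above.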
Jeff Nielsen
And if I'm hearing you correctly, the bigger the scale, the bigger the delta: the more you scale, the more efficient your model becomes relative to the alternatives. Right? Which is so crazy, and makes me very, very excited about what you're developing, and speaks to the value of being able to scale this thing out with the organizations you're working with. Ramin, is this in production at this point? Is it exploratory? Where are we? Is this ready for the market, or is it a little ways out? What should organizations who hear about this and say, wow, I'm dreaming up use cases for this, know? How far away is this?
Ramin Hasani
So early versions of these things are ready. We are testing them with a bunch of enterprises, early adopters of the technology, where POCs are now getting completed. We are getting into the phase where the technology is going to be productionized. As I told you, one of the challenges we had to overcome, and still one of the places we are constantly improving, is infrastructure. The quality of the models we develop is phenomenal now, but you need a serving stack and a customization stack so that clients can take and buy this software and start using it in many different applications. So far it has been good. We have early versions of the products getting tested with the early clients we have in the sectors I mentioned: consumer electronics, a bit of e-commerce, and also financial services and biotech. There are early adopters of the technology, and partnerships are getting built to take it to the next stage. We decided to go to market with a focus on enterprises rather than consumer-facing, because we were not ready for rapid feedback from consumers; we wanted control over how we are building this thing and to make sure it actually generates value. Once the value is proven with enterprises, we would also have a consumer-facing play. At the moment, we just put our technology out there. You can test these models on our own playground at Liquid AI, and you can test the technology on Perplexity Labs; we partnered with them to host our models. And you can use our API on Lambda Labs, for example.
These are just places where we want to give people early exposure to the technology. We're not monetizing any of these; they are freely available to everybody to test. And on the business side, we are 100% focused on enterprises right now, so they can contact us today to become a development partner.
Jeff Nielsen
I love it. It's such a cool model, and such an interesting approach: you go play with it, and let's figure out at scale what we can do with this broadly. There's one piece of the value, Ramin, we haven't talked about yet today that I've heard you talk about in the past, which is the white box, right? The explainability piece, which I thought was super cool. Can you tell me a little bit about the philosophy behind that? Why is it important, and what does it look like in practice with your model?
Ramin Hasani
Absolutely, absolutely. There is a field in electrical engineering called control theory. Control theory, as the name implies, is the theory of how we control things, right? The way we designed cars, engines, airplanes, everything around us, every machine we built, stems from it. It's the most fascinating field for building machines and engineering. Control theory has a certain form of mathematics that allows you to design controllers for systems, say an autopilot of an airplane: it is designed to fly that airplane autonomously. Now, we have 200 years of knowledge on how this mathematics works. Obviously you want to design safety-critical systems using mathematics, the way humans design engines; we want full transparency into the design of systems. The mathematics of liquid foundation models is informed by control theory, and that allows us to use those operators, those 200 years of knowledge, to understand how foundation models come up with decisions. That means that today, instead of designing full-blown black boxes, which transformers basically are, matrix multiplication systems, you have something you can analyze. If you want to really understand a transformer, you open it up, you look at the behavior of a certain neuron, and you hope you start understanding a little bit of the behavior in there. The process of really understanding how these things work, specifically at scale, is very ad hoc and very complicated.
Anthropic is actually doing a lot of work on the interpretability of GPT-based models, with the Claude models. Dario was saying that something like 25% of their organization is focused on interpretability, on figuring out what happens inside the black box so they can open it. What we thought was: okay, let's take this first-principles thinking. Inherently, the mathematics with which we design our neural networks is rooted in control theory, therefore we have the tools to understand the systems, and therefore this kind of technology is really well suited to safety-critical applications. That's why I called it white-box intelligence. On the development side, I can pause my learning process while I'm training the model. At every instance of the training process, I can pause the system, take an instance of my model, and look at it. Every layer is doing the job of a controller in a control system, so I can look at it with the mathematics of control systems and really understand how my system works. From that instance I can direct the behavior of the system; I have a lot more control over its design. That's the development side. On the testing side, when you want to understand how a network came up with a decision, or if the network hallucinated or made mistakes, how can we go and do root-cause analysis? That's something you can enable with a purely first-principles, control-theory way of thinking.
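The "pause and inspect" idea has a concrete analogue in the published liquid time-constant (LTC) formulation, where each neuron's effective time constant shifts with its input, and every quantity in the dynamics is an explicit, inspectable number. The single-neuron sketch below uses simple Euler integration; the gate `f` and all constants are illustrative choices, not Liquid AI's production model:

```python
import math

# Minimal liquid time-constant (LTC) neuron sketch:
#   dx/dt = -(1/tau + f(x, I)) * x + f(x, I) * A
# The effective time constant 1 / (1/tau + f(x, I)) changes with the
# input I: this is the input-dependent ("liquid") dynamic described
# in the conversation. All constants are illustrative.

def f(x: float, i: float, w: float = 1.0, b: float = 0.0) -> float:
    """Input- and state-dependent conductance (a sigmoid gate)."""
    return 1.0 / (1.0 + math.exp(-(w * i + b - x)))

def step(x: float, i: float, tau: float = 1.0, A: float = 2.0,
         dt: float = 0.01) -> float:
    """One Euler step of the LTC state equation."""
    gate = f(x, i)
    dx = -(1.0 / tau + gate) * x + gate * A
    return x + dt * dx

x = 0.0
for t in range(500):
    i = 1.0 if t < 250 else -1.0   # input switches halfway through
    x = step(x, i)
    # Because the dynamics are explicit, the state and the effective
    # time constant can be read off at any step of the run:
    if t in (249, 499):
        tau_eff = 1.0 / (1.0 + f(x, i))
        print(f"t={t + 1}: x={x:.3f}, effective tau={tau_eff:.3f}")
```

The state stays bounded between 0 and the amplitude `A`, and the printed effective time constant visibly changes when the input flips, which is the kind of white-box behavior analysis the passage describes.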
Jeff Nielsen
And you know what's so amazing to me about that? As you said, it's inherent in the model, right? You don't have to go back in and retrofit it or reverse-engineer it and say, okay, how can we have all the smartest minds in the world break apart what we already did? It's right there to begin with. And I don't know if this is a coincidence or not, but the image I saw is that transparent worm. It's just like the worm: you've got the model where you can see all the things it's doing. It's super, super cool. I did want to ask you: you talk about all the amazing work you're doing scaling this out, and it gets more valuable at scale. In the world according to Ramin, if things go your way and it grows the way you want it to, where do you see this technology in three to five years, in terms of applicability, in terms of what it can do? Do you have some sort of long-term vision? I hesitate to call it a master plan, but some view of where all of this is going?
Ramin Hasani
Well, absolutely. Great question. I've been thinking about this a lot. In an ideal world you want to have the right type of intelligence. One of the things we started as a slogan of our company is machine learning done right. The base of AI systems should be something that gives you the power that the ChatGPTs of the world are giving you, but at the same time does not consume the entire power of a country; you want to do it in a sustainable way. I feel like this approach is so fundamental, and it comes from nature. Nature gifted us billions of years of evolution, and there's a lot to be discovered from it; this is just the tip of the iceberg. We just looked into some element of it, and it enabled us to design better learning systems. When I think about future AI systems, I can see this becoming the platform, the base for AI systems of the future. As we move toward agentic systems, machines that we give control to take action in the real world, you want the technology to be trusted, to be understandable, to be a hundred percent controlled by us. The only way you can do that is if the base of the intelligence is no longer a black box but a white box, a system you have a lot of control over, and at the same time you're not spending that much energy to develop it. So in the near future I can see liquid foundation models deployed on any devices that we own.
A device could be anything we have, even porting them onto satellites; as edge devices, I would consider them there. That would be the short term. In the long run we would have larger instances of these models. They could be hosted, but they would not consume that much energy, and at the same time they would comfortably give us a really good experience working with foundation models that come from nature, for us, basically. Something like that.
Jeff Nielsen
So when I look at your technology, you've got effectiveness, you've got efficiency, you've got explainability. Is there anything left for these more traditional models to compete on? Is there a world in your dream where this just becomes the new platform for AI? In a world where this is working well at scale, do we even need all the other stuff that exists now? Or could it be that this is the better way and it replaces all of it?
Ramin Hasani
I mean, there is that potential. We need to unlock that scale to see what happens there. But I would say we can cohabit with the other models as well, because a lot of energy has already gone into building these large foundations. For example, Claude is really good at coding; GPT is really good for general-purpose questions; the o1 series are good models for reasoning. The base model is one thing, and there are also technologies like RAG, retrieval-augmented generation. What I want to say is that there is always a space for specialized models. There's no one model ruling them all; there's always space for everything. But then again, today, in places where you do have resource constraints, and if you want to make money off of generative AI, this is the type of technology you want to use, because you really have to work the efficiency angle. That being said, a lot of groups have also started looking into new architectures and new algorithms, massively going after how to make transformers more efficient. There is also this belief in what transformers can do. Transformers are brilliant. Why? Because they are doing unconstrained computation, which means unbiased computation. What does that mean? You don't put any biases into the architecture; you just say matrix multiplication, and you scale matrix multiplication. What can be more elegant than that? It's a very simple thing that you can scale, which is very nice, but it has some shortcomings. And maybe nature has gifted us these operators that we are using, and those operators are absolutely necessary for building the next generation of controllable and reliable AI systems. That's the place where we want to go.
The more reliability you expect from an AI system, the more you want to use a liquid foundation model as opposed to a GPT, basically.
Jeff Nielsen
Well, and that's exactly why this is so compelling to me. As you said, if you're looking to get ROI, if you're looking to make money off of AI, to me it's a no-brainer. I quickly wanted to ask you about quantum; we talked very briefly about it. As we scale, you mentioned supercomputers. Is there a quantum play here? Is that something you're exploring? How does that come into the future of this scale?
Ramin Hasani
We are definitely looking at it as a field. I'm not stressing about it for our technology just yet, but I can tell you that form of computation is absolutely incredible, and I'm looking forward to when it is practically ready for us. There are quantum inspirations we can take; you can do quantum-inspired machine learning first. But I'll tell you one live story. In 2016, at the largest AI conference in the world, NeurIPS, Neural Information Processing Systems, held in Barcelona that year, IBM was showcasing their quantum computer; it was hanging from the wall, basically, just this thing on display. I was so fascinated, I asked the guy: what do you think, when can we have this commercially available to us? He told me 2035. And I feel like that is kind of true; I feel like that would be the time when the true value of quantum comes out, unless we have massive breakthroughs coming along. So my timeline for having reliable quantum computers that we can use commercially would be around there, but it is absolutely promising. And this is one of those places where scaling computation is not just a software game; it's hardware, it's the medium and the form factor as well.
Jeff Nielsen
But you're not waiting for quantum. It's 10 years out? Great, maybe then. But you've got work to do in the meantime.
Ramin Hasani
Yes, yes. We can work with the computers that are here. On the hardware side, though, there are ways to specialize hardware for the type of software we are designing. So far we have been adapting our technology to the existing hardware, but there are ways to co-design new hardware systems, and that's another exciting area I think we are going to explore in the near future.
Jeff Nielsen
So Ramin, I mean, this is clearly an area of passion for you. And you know, you talk about the, you know, the path you've taken over, you know, close to 10 years to get here. You know, is this something, you know, you had a passion for, you know, since you were a kid? Where did this come from? And you know, could you ever in your wildest dreams have foreseen that, you know, this is where it was going to go?
Ramin Hasani
Not this way. One of my superpowers is that I can focus so much. I've been a scientist all my life, and I got interested in intelligence itself, in understanding how intelligence works. That was the year 2014. And I thought, okay, I'm on the verge of becoming a scientist in this field, where I would just go and crack some aspect of intelligence and win a Nobel Prize, basically. That's the path I chose.
Jeff Nielsen
No big deal.
Ramin Hasani
Yeah, that's what I thought. And then in 2022 we solved that equation, the one I told you about, in this Nature Machine Intelligence paper that got us to the point where we could scale these models. It was an equation describing the behavior of neurons that hadn't had a closed-form solution since 1907. Two English scientists, Hodgkin and Huxley, actually cracked the behavior of the neuron: they described it very closely, published their paper in 1952, and won a Nobel Prize in 1963. My thinking was: this equation was known not to have a closed-form solution yet, but in 2022 I cracked it, together with our teams at MIT. And I thought, okay, this is the path I'm going to go down, right? But then the life-shifting experience was exposure to venture capital, seeing how things are in the real world. And we thought, okay, maybe this goes beyond just the scientific aspect of it; we can actually take this technology and bring value by scaling it properly. We deeply thought about this among the four co-founders, one of them being one of the greatest mentors of all time, one of the greatest roboticists in the world, Professor Daniela Rus. She's phenomenal, I think one of the top 10 most powerful scientists in the world, and it has been a privilege and a humbling experience working with her and with my colleagues. I usually refer to my CTO as the smartest man on the planet; I feel dumb when I'm talking to him, I can tell you. Mathias is absolutely incredible, and I have the same feeling about Alexander, my other co-founder, and the rest of the people we have at Liquid AI. We have been fortunate to attract a group of talented people into one space, and this group of people is just phenomenal. As I said, the joy of interacting every day with people who are smarter than yourself is something you cannot replace with anything else. Even if nothing else happens and we fail as a venture, just the experience of working with these people has been the most important pleasure we have.
Jeff Nielsen
So in your, in your kind of world of hopes and dreams, is scaling this in enterprises the definition of success and Nobel Prize is a cherry on top? Or is Nobel Prize what you're going for and everything else is the cherry on top?
Ramin Hasani
I don't know. Personally, internally, I felt I did enough on the scientific side when I cracked that equation, and now it's the base of this company as we go forward. I thought, okay, this is good enough. If something happens in the future, like the Nobel Prize, good; if it doesn't, fine. That's not something I'm actually focused on at the moment. And the main focus of the company is not just the enterprise play, either. I want this in the hands of every single person in the world. We want to bring value, bring intelligence into people's hands in whatever form is possible. In the future we won't just have mobile phones and laptops in front of us; this could be glasses, other kinds of devices we would wear, maybe even internal chips. You never know; the future is very interesting. But again, intelligence is always tied to the substrate you're hosting it on, and that's something that interests me from the business point of view as well. When I think about the future of intelligence, I'm not just thinking about the largest form of intelligence you could possibly put in a data center, but the one that is very, very intelligent yet you can put in the hands of people, and you can own it. That's much more fascinating for me.
Jeff Nielsen
Well, and that's another kind of echo of the animal kingdom too, right? There's the brain and the body, and you're looking at whether the right brain is on the right body. It's so cool. I could talk for a long time about this, but I know we're just about at time. Ramin, is there anything else you wanted to talk about today?
Ramin Hasani
No, no, I think we covered so much. Thank you so much for having me. This was a pleasure chatting with you.
Jeff Nielsen
Hey, the pleasure is all mine, Ramin. Thanks so much for joining today. This has been such an enlightening conversation. That guy's gonna win a Nobel Prize. I think he's gonna win a Nobel Prize. That's my prediction.
Podcast Summary: Digital Disruption with Geoff Nielson
Episode: CEO of Liquid AI Ramin Hasani Says a Worm is Changing the Future of AI
Release Date: March 17, 2025
In this captivating episode of Digital Disruption, host Jeff Nielsen from Info-Tech Research Group welcomes Ramin Hasani, the co-founder and CEO of Liquid AI. The conversation delves deep into the innovative technologies developed by Liquid AI, particularly focusing on how insights from biology are revolutionizing artificial intelligence.
Ramin Hasani shares the genesis of Liquid AI, drawing inspiration from biological systems, specifically the nervous system of the C. elegans worm. He explains:
"We started looking into how we can bring in insights from biology and physics into machine learning because we wanted to see what are the mathematical operators that we can find in biology that doesn't exist or we are not using them right now in the space of neural networks." ([00:54])
The C. elegans worm was chosen due to its transparency and genetic similarity to humans, making it a valuable model for understanding nervous systems. This biological foundation set the stage for developing Liquid Neural Networks.
Liquid Neural Networks (LNNs) represent a novel approach to AI, inspired by the dynamics of the worm's nervous system. Ramin elaborates:
"Liquid neural networks are an instantiation, or a type, of recurrent neural network that comes from the mathematics we use for describing physical processes." ([06:33])
These networks utilize differential equations, a staple in modeling physical systems, to govern neuron interactions. This approach contrasts sharply with traditional neural networks, offering enhanced expressivity and efficiency.
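The differential-equation form referenced here is, in the published liquid time-constant work by Hasani and colleagues, roughly a state equation whose time constant is modulated by the input. A hedged sketch of that form (symbol names follow the paper's conventions as best the editor recalls them):

```latex
% Liquid time-constant (LTC) state equation (after Hasani et al.);
% x(t): hidden state, I(t): input, f: a learned nonlinearity with
% parameters \theta, \tau: base time constant, A: bias parameter.
\frac{d\mathbf{x}(t)}{dt}
  = -\left[\frac{1}{\tau} + f\big(\mathbf{x}(t), \mathbf{I}(t), \theta\big)\right]\mathbf{x}(t)
  + f\big(\mathbf{x}(t), \mathbf{I}(t), \theta\big)\, A
```

The bracketed term is the input-dependent effective time constant: the same signal that drives the neuron also reshapes how quickly it responds, which is what distinguishes this formulation from a fixed-weight recurrent cell.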
A significant distinction between LNNs and traditional models like Transformers lies in their computational scalability and efficiency:
"Our computation scales linearly. Meaning the longer information they process, the better the gap becomes between like 10x could become like a thousandx at use." ([34:58])
Unlike Transformers, which scale quadratically and become computationally intensive with larger contexts, LNNs maintain linear scalability, making them vastly more efficient, especially for extended sequences of data.
Ramin discusses the breakthrough achieved in 2022, where Liquid AI overcame the computational challenges of scaling LNNs:
"In 2022, we published a paper in Nature Machine Intelligence that allowed us to bypass the computational complexity of differential equations, enabling us to scale these models to billions or even trillions of parameters." ([13:48])
This advancement has positioned Liquid AI to compete robustly in the AI landscape, offering models that are not only powerful but also energy-efficient.
Liquid Neural Networks have a broad range of applications across various sectors:
Edge Computing: Deploying AI models on devices like smartphones, robots, and IoT devices without relying on cloud-based APIs.
"We are building the best quality foundation models today on the edge... You can host liquid foundation models on a Raspberry Pi that works with a robot." ([19:49])
Biotech: Utilizing LNNs for drug discovery by designing DNA sequences that translate into meaningful proteins.
"We can design new structures that match biological proteins... which could be a drug candidate." ([26:42])
Financial Services: Enhancing fraud detection and portfolio optimization through sequential data processing.
"In the financial sector, we can build predictors that provide financial advice and detect anomalies for fraud detection." ([20:18])
Robotics and Autonomy: Improving control systems and synthetic scenario generation to enhance robotic interactions and surgeries.
Ramin emphasizes the versatility of LNNs in handling different data modalities, making them suitable for complex, real-world applications.
One of the standout features of Liquid Neural Networks is their inherent explainability, rooted in control theory:
"The approach, the mathematics of liquid foundation models are informed by control theory mathematics. That allows us to understand how foundation models come up with decisions." ([38:33])
Unlike Transformers, which operate as black boxes, LNNs provide transparency into decision-making processes. This "white box" nature is crucial for deploying AI in safety-critical applications, where understanding and verifying AI behavior is paramount.
Liquid AI is transitioning from experimental stages to production, collaborating with early adopters across multiple industries. Ramin outlines their strategic focus:
"We are 50 people right now. We are working with enterprises in consumer electronics, e-commerce, financial services, and biotech to implement our models." ([35:44])
Additionally, Liquid AI offers platforms like Liquid AI Playground and partnerships with Perplexity AI and Lambda Labs to provide early access to their models, fostering broader experimentation and feedback.
Looking ahead, Ramin envisions Liquid Neural Networks as the foundation for future AI systems:
"I can see liquid foundation models to be deployed on any devices we own... in the near future, we could port them onto satellites and other edge devices." ([43:20])
He emphasizes the importance of sustainable AI development, aiming to balance intelligence with energy efficiency. Furthermore, Liquid AI is exploring the potential of co-designing specialized hardware to further enhance the capabilities and deployment of LNNs.
Ramin shares his personal journey from a scientist focused on understanding intelligence to leading a pioneering AI company:
"Exposure to venture capital and the real-world application of our technology shifted my focus from purely scientific endeavors to bringing tangible value through scaling." ([51:58])
He highlights the collaborative environment at Liquid AI, teaming up with esteemed scientists like Professor Daniela Rus and leveraging the collective expertise of his team to drive innovation.
The episode sheds light on Liquid AI's groundbreaking approach to artificial intelligence, blending biological insights with advanced mathematical frameworks to create efficient, scalable, and explainable AI models. Ramin Hasani's vision positions Liquid AI as a formidable player in the AI revolution, promising transformative impacts across various industries while maintaining a steadfast commitment to sustainability and transparency.
Notable Quotes:
"Liquid neural networks are an instantiation, or a type, of recurrent neural network that comes from the mathematics we use for describing physical processes." — Ramin Hasani ([06:33])
"Our computation scales linearly... That's one of the nice things about them." — Ramin Hasani ([34:58])
"We can design new structures that match biological proteins... which could be a drug candidate." — Ramin Hasani ([26:42])
"Instead of just designing black boxes... we have a white box intelligence." — Ramin Hasani ([38:33])
"I can see liquid foundation models to be deployed on any devices we own." — Ramin Hasani ([43:20])
This detailed summary encapsulates the essence of the episode, providing listeners with comprehensive insights into Liquid AI's innovative journey and the transformative potential of Liquid Neural Networks in reshaping the future of artificial intelligence.