
Kevin Ball
Python is the dominant language for AI and data science applications, but it lacks the performance and low-level control needed to fully leverage GPU hardware. As a result, developers often rely on Nvidia's CUDA framework, which adds complexity and fragments the development stack. Mojo is a new programming language designed to combine the simplicity of Python with the performance of C and the safety of Rust. It also aims to provide a vendor-independent approach to GPU programming. Mojo is being developed by Chris Lattner, a renowned systems engineer known for his seminal contributions to computer science, including LLVM, the Clang compiler, and the Swift programming language. Chris is the CEO and co-founder of Modular AI, the company behind Mojo. In this episode, he joins the show to discuss his engineering journey and his current work on AI infrastructure and the Mojo language. Kevin Ball, or KBall, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow KBall on Twitter or LinkedIn, or visit his website, kball.llc.
Chris Lattner
Chris, welcome to the show.
Thank you for having me. Excited to be here.
I must say I'm so excited to hang out with you because you have such a history in an area that is fascinating to me, which is language design, but in particular language design for the current age that we're in around AI and machine learning. So excited to dig in, but let's first have you introduce yourself a little bit to our audience, who you are, your background and what brought you here.
Sure. Well, so I have, I guess, two epochs to my professional career. In epoch one I'm known for CPU programming stuff and developer tools: I'm known for building the LLVM compiler, which underlies a lot of programming languages including C and Rust and Swift and tons of different things, which is really, really cool. Through that journey I worked at Apple and built a bunch of cool technologies, including the Swift programming language, which was a nights-and-weekends project that kind of scaled and went beyond. So that was very exciting, and I did a whole bunch of stuff in GPU programming and other things like that at Apple. Around 2016 I got really interested in AI. That was before LLMs, before ChatGPT, back in kind of a different era. But I saw the potential of the technology, and I've been on kind of this hero's journey ever since, trying to figure this stuff out, make it easy to use, and make it so that everybody can actually have access to it and it's not just locked up in big companies. And so that brings me to Modular, where we're building Mojo and Max and a whole bunch of cool technologies to help fix AI compute.
Yeah, let's maybe do a quick high level of what Mojo and Max are before we dive into the guts. Like, what were the motivations behind building this?
Yeah, so before we say what they are, to your point, let me tell you why we were forced to build them. If you zoom out, AI is not new, right? Depending on how you count, it's decades old. But the modern deep learning revolution is within the last 10 years, and it's been amazing how it exploded onto the scene, with all the different fits and starts and new things added along the way. Well, along the way, it got really built into this technology from Nvidia called CUDA. CUDA is the thing that powers so much of deep learning. It is what enables Nvidia to be the titan of the industry right now, reigning supreme across the entire world. We, I think, rightly owe a debt of gratitude to Nvidia for building CUDA. It was designed for a completely different world, a completely different use case, to enable physics and games and all kinds of things that really had nothing to do with AI. But it was the right technology to catalyze AlexNet and a whole bunch of really early deep learning technologies that took off. Building on top of that, we got things like TensorFlow, and then we got PyTorch and other technologies built on top of CUDA. And when you play it forward, AI was changing so much, with so many new research papers and so much innovation happening all the time, that we as a software industry just put more and more stuff on top of CUDA and PyTorch and other technologies, and just kept accumulating, kept building higher. Well, today a lot of that foundation is not actually stable. It's very rickety. A lot of things don't work. Nvidia is an amazing company, but they kind of get too much of the lion's share of credit for what they're doing right now, and a lot of people are unhappy with that. And so what we set off to do at Modular back in the day is we said, okay, well, let's actually fix this.
But fixing it can't be done by adding another layer of duct tape and baling wire on top of the current stack. Let's really fundamentally challenge the status quo. Let's do something much more difficult. Let's go build a replacement for CUDA. Let's go build a full-stack ecosystem where we can actually build into AI. Let's design it for GenAI, not with GenAI as an afterthought. Let's make it portable, let's make it scale, and let's really tackle a lot of these assumptions. Now, given my background, I understand fully well what it takes to make a language successful. This is not a three-month sprint. It means doing more than just building out a parser. It means a whole tools ecosystem, community, and many other things. And so the only reason we said yes to this and decided to embark on it is because we think the stakes are high. We think that there's a whole new form of computer. People are struggling today, but as we look ahead into the future, I only see more hardware, more innovation, more AI, yes, but also more applications for this kind of compute. And this is where I don't see anybody else out there that's really swinging as hard as we are and trying to actually make a difference here.
Yeah. So let's maybe take a look at what that starts to look like. When I was trying to bring myself up to speed really quickly on what Mojo is, I looked at this and I was like, oh, this kind of feels like a Python Rust mashup. Is that a fair description? Like, how would you describe the language Mojo?
Well, if you look at Mojo, you can see many different things depending on who you are and what perspective you're coming from. So first, superficially, the syntax looks like Python, and there's a lot of Python DNA that is very intentional. The entire AI world is really revolving around Python. Python is one of the most well understood and well loved languages out there. I'm very familiar with building languages, and we can argue about curly braces and different syntax things, but from my perspective, Python won, and it won for a lot of really good reasons. It's beautiful, it has a lot of advantages, and I understand tastes may vary, but we decided, hey, let's make the surface-level syntax be Python family. Now, your point is also really good: Rust is a huge influence. Mojo has a whole borrow checker. It's not the same as Rust's, but it learns a lot from it and then takes the next step forward. It has other type system features, it learns a lot from Swift, and it learns a lot from many, many other communities. C++ even still has some good ideas. I mean, one could argue that C++ has too many ideas, and Mojo's very different in that respect. But really, I look at this as saying that our goal here is to bring the best ideas together in a novel way and then try to solve the problem in a way that only we can do with this combination of techniques. And the problem we want to solve here is basically GPUs. How do you program a GPU? How do you make stuff go fast? How do you enable code to span GPUs and CPUs? How do you enable these crazy AI ASICs that people are talking about? These are problems that other programming languages aren't really trying to solve in a coherent way. This is a big thing. And so you'll see many influences, because I think that taking good ideas from wherever they are is really important. But many of the details end up being quite a bit different.
I like that framing as a raison d'être for the language: this is about solving GPUs and making it seamless to bridge that CPU-GPU boundary. Can we maybe dive into some of the language features that you've chosen and look at them from that lens? For example, the ownership borrow checker. That is very cool, a very big part of Rust, also relatively novel, and very different for someone coming from Python, where you've got a garbage collector and things like that. What's the driver for that, and how does that really help you solve the GPU problem?
Yeah, well, so actually it's funny, you asked two questions. You asked what is an example killer feature, and then you asked about the ownership model, and they're actually totally unrelated. You need an ownership model to have a memory-safe systems programming language. And so, like Rust, we have no garbage collector; we use static memory checking. Memory safety is really important for any modern language, so yes, we have to have an ownership model. We could talk about why Mojo's is better than Rust's or things like that, but that's not the differentiating feature that allows us to tackle hardware. That's table stakes these days. The more interesting things are when you look at what Rust doesn't do, right? Rust does not have a very powerful comptime model. Mojo has comptime, which is, I think, most known from Zig, and which allows you to write arbitrary code and have the same code run at runtime and at compile time. You can think of this as like C++ templates or macros; there are many different systems in other languages that try to solve the problem of how to get code to run when the program's being built. Zig most notably, but also Mojo, takes that way farther than languages that use templates by unifying it with the host language. Now, why is that important? Well, it turns out that when you're programming a GPU, GPUs are very complicated. Okay, fine, but they're also really weird. GPUs are all about performance. People care about cost and they care about latency when using AI, particularly with GenAI and deep learning. But then it turns out that the memory hierarchy is really fragile. If you get it exactly right, you can get super high performance. If you get something slightly wrong, well, you fall off a cliff with performance.
And so techniques like auto tuning become very important. And reconfigurability, when you start merging AI operators together is really important. And being able to span across multiple different kinds of hardware is really important. And so what you want is you want to be able to write code that's generic, but not just generic over float versus double. You want to be generic over how many threads are in a warp. And many, many, many other parameters end up mattering a lot to efficiency on a device, but they don't affect the structure of the computation. And so this is something that Mojo is really, really excellent at. I can give you another example. We could talk compiler tech, but why don't you ask questions about this first?
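To make the idea of being generic over hardware parameters a little more concrete, here is a minimal sketch in plain Python, not Mojo. It assumes a hypothetical `make_tiled_sum(tile)` factory: the tile width is fixed once when the kernel is built, it changes how the work is chunked but not the structure or the result of the computation, and a tiny autotuner times each candidate and keeps the fastest. In Mojo such a parameter would be resolved at compile time; the closure here only imitates that, and all names are illustrative.

```python
import time


def make_tiled_sum(tile):
    """Build a sum kernel specialized for one tile width (hypothetical helper)."""
    if tile <= 0:
        raise ValueError("tile must be positive")

    def tiled_sum(values):
        total = 0.0
        # Walk the input one tile at a time; `tile` was fixed when the kernel was built.
        for start in range(0, len(values), tile):
            total += sum(values[start:start + tile])
        return total

    return tiled_sum


def autotune(candidates, sample, repeats=3):
    """Time each specialized kernel on sample data and return the fastest tile width."""
    best_tile, best_time = None, float("inf")
    for tile in candidates:
        kernel = make_tiled_sum(tile)
        start = time.perf_counter()
        for _ in range(repeats):
            kernel(sample)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_tile, best_time = tile, elapsed
    return best_tile


data = [float(i) for i in range(10_000)]
candidates = [32, 64, 128, 256]

# Every specialization computes the same answer; only the chunking differs.
results = {tile: make_tiled_sum(tile)(data) for tile in candidates}
best = autotune(candidates, data)
print(results, best)
```

Which tile width wins depends on the machine, which is the point of tuning per target rather than hard-coding one value.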
I kind of want to dig into that. So just to make sure that I'm understanding: one of the big things that you're diving into, following that lineage starting from C++ templates, Zig's comptime, and things like that, is essentially how we can build a lot of our genericism, our parameterization, into a pre-compile step, so that it happens at compile time based on the architecture you're targeting, rather than being dynamic at runtime and incurring a cost at that point. Is that right?
That's right, yep. So let me give you an analogy just to make it more accessible, because I assume not everybody's already a guru at programming GPUs. I'm a compiler guy, right? In compilers, you often have parsers, and one common technology is called a parser generator. You use a domain-specific language; there's Lex and Yacc and ANTLR and many of these different things. What you do is you write the grammar for the thing you're trying to parse, and then you run a program that gives you Rust code or C code or whatever it is you're using, right? So that is run at the compiler's compile time. You're building the code so that you can then link it into your application, and you don't run it on the fly. Okay. Now, GPUs have similar kinds of problems. In a GPU you have what's called a tensor core. A tensor core is the thing with all the floating-point operations, so it's really, really important for performance. But makers of GPUs keep changing them. Even Nvidia, if you just zoom into their pretty important ecosystem: the tensor core on the A100, the tensor core on the H100, and the tensor core on the B100, just three different generations of their hardware, are actually quite different. And so what we want to be able to do, and what the world needs, is to write software that's abstracted from that hardware, so you can write things on tiles. Okay, well, to do this you actually need to do quite a lot of calculation about which index should be where, and which element goes into which slot of the tensor core, and exactly how do I lay this stuff out, and if I'm charging through memory, what order do I do this in, et cetera. All that is actually very static by the time the code runs. And it's complicated. You need data structures. I mean, it's not just like C++ templates, where you can add two integers together or something.
You need to be able to actually have trees and talk about nested hierarchical structures. And so the cool thing about Mojo is that you can just write normal code. You can use a list or an array or a string, or whatever data type you want, at comptime, that is, while the compiler is running. Mojo has a very clean division between code that runs at runtime and code that you can use at compile time, and when you're writing these algorithms, it's the same code, and that becomes very powerful.
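As a rough illustration of the kind of index calculation described here, the following Python sketch precomputes a tile layout once, using ordinary data structures, before any "kernel" runs. The function names and the 4x4 matrix with 2x2 tiles are assumptions chosen for the example; real tensor core layouts are hardware-specific and more involved, and in Mojo this table would be built at compile time rather than at import time.

```python
def build_tile_layout(rows, cols, tile_rows, tile_cols):
    """Map each (row, col) to its offset in a tile-major buffer. Computed once, up front."""
    if rows % tile_rows or cols % tile_cols:
        raise ValueError("matrix shape must be a multiple of the tile shape")
    tiles_per_row = cols // tile_cols
    tile_size = tile_rows * tile_cols
    layout = {}
    for r in range(rows):
        for c in range(cols):
            # Which tile the element belongs to, and which slot inside that tile.
            tile_id = (r // tile_rows) * tiles_per_row + (c // tile_cols)
            slot = (r % tile_rows) * tile_cols + (c % tile_cols)
            layout[(r, c)] = tile_id * tile_size + slot
    return layout


def to_tiled(matrix, layout):
    """Scatter a row-major matrix (list of lists) into a flat tile-major buffer."""
    flat = [None] * len(layout)
    for (r, c), offset in layout.items():
        flat[offset] = matrix[r][c]
    return flat


# The layout is static: it depends only on shapes, never on the data itself.
LAYOUT_4X4_TILE_2X2 = build_tile_layout(4, 4, 2, 2)

matrix = [[r * 4 + c for c in range(4)] for r in range(4)]
tiled = to_tiled(matrix, LAYOUT_4X4_TILE_2X2)
print(tiled)  # [0, 1, 4, 5, 2, 3, 6, 7, 8, 9, 12, 13, 10, 11, 14, 15]
```

Because the table depends only on shapes, a different tile shape for a different hardware generation means rebuilding the table, not rewriting `to_tiled`.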
This reminds me a lot of early in my career, when I worked on a high-performance compiler and was dealing with folks in the scientific computing community. One of the reasons that people there were writing code in Fortran rather than C, for example, was that the memory model of Fortran allowed the compiler to manipulate how things were laid out much more powerfully than it could in C, because you didn't have the same pointer access to wherever. So you could optimize for, in that case, the CPU memory hierarchy and make sure you were getting all of your cache hits. Here it sounds like you're doing something at that same level. You have some form of constraint, and I do want to dig into what constraints you had to place on the memory model to be able to do this, but you have some form of constraint that allows people who write Mojo code to not worry about it, and have it compile into things that are optimally laid out for the memory of these different GPU kernels.
Yeah, that's exactly right. So is the question, how does it work?
Well, yeah. So the question is then what trade offs, if any, did you need to make in terms of the programming language, like what a developer can do with this thing in order to get yourself the set of guarantees that you need to be able to do that type of optimization?
As far as I know, there's no trade-off, there's no downside. It strictly makes the language simpler, more consistent, more powerful. It does make the compiler quite different from previous-generation languages, right? But again, I'm fond of Mojo, of course, but I'll say nice things about other people's systems. If you go look at Zig, Zig's a relatively very simple language, but it has very powerful metaprogramming and generics capabilities. They made other decisions in other parts of the language. They decided not to have a borrow checker, because that was part of what they were going for. But powerful metaprogramming doesn't have to come with complexity. In a previous journey, I built the Swift programming language. Swift has many good things about it and some challenging things about it. One of the bad things about Swift is that it has accreted many different language features that are non-composable and non-orthogonal. It has many things that got added over time, and they don't quite fit together in the right way, and so you get more and more and more features. The benefit of having a powerful metaprogramming system from the beginning is that many of those things that became features in Swift instead become features in a library. And so with Mojo, what we've been able to do is make sure the language is much smaller, much more orthogonal, and much more consistent. I think this is strictly better. I don't see any trade-off there. It doesn't come at a complexity cost, and it doesn't come at a compile-time cost. By the way, our compiler is way faster than Rust or C++, particularly for highly parametric code. C++ templates, for example, are really, really bad for compile time. And so we can express very powerful things. I don't see a trade-off.
That is really interesting, and I like introducing this concept of composability and orthogonality. I'm used to thinking about it in the context of non-language software architecture, right? How do I compose things? How does that play into language design? What was it that you needed to do in the language to achieve that?
Yeah, well, so we're all learning, right? I've worked on some cool things in the past, but I still don't know everything. And if I did, I would be very bored, because I love to learn things. Through an epoch of my career, I always looked at Python and said, what is that thing? You know, I'm a systems person, I want stuff to go fast, right? So what use does Python have? I never really took it seriously, because it was just a scripting language for kids or something, right? When I dove into the AI world, I was forced to come to understand what this Python thing was really about. And what I came to realize is that Python, despite its interpreter and its implementation details and things like this, has a really beautiful software library ecosystem that is, I think, maybe unmatched anywhere else in terms of the number of systems it can compose and the power that it gives to library developers. To me, that really opened my eyes. And this flows directly into Mojo, by the way, because what I saw is that you want to keep the language simple and make it so that people can learn it rapidly, ideally without having to retrain. Which is why we just say, okay, well, Python has this feature, awesome, so do we. You don't have to relearn everything to start with Mojo. If you know Python, you already know almost all of Mojo. But then you really focus on giving library developers superpowers. We were just talking about tensor core programming. Well, how does that work? Mojo, the compiler, doesn't know anything about a tensor core. What we've enabled is for GPU super-experts to go build abstractions in the library that use all this fancy comptime metaprogramming stuff, so they're super, super efficient. And then you get full access to the hardware, you don't get an overhead, and the language is simple.
So to me that's a beautiful thing: if you can keep the language simple, if you can give library developers superpowers, and then you can make the whole system easy to learn, right? That's really what I see as our North Star right now.
Yeah, that makes sense.
Kevin Ball
This episode of Software Engineering Daily is brought to you by Capital One. How does Capital One stack? It starts with applied research and leveraging data to build AI models. Their engineering teams use the power of the cloud and platform standardization and automation to embed AI solutions throughout the business. Real-time data at scale enables these proprietary AI solutions to help Capital One improve the financial lives of its customers. That's technology at Capital One. Learn more about how Capital One's modern tech stack, data ecosystem, and application of AI/ML are central to the business by visiting capitalone.com/tech.
Chris Lattner
What do you think it is that makes the difference between, you know, Python is an interesting example, right, because it's been adopted by such a wide range of different people. It's been adopted in the data science community, the machine learning community, communities that were not really like software engineering communities and you kind of see that. And then it's also used for very serious software engineering. So what is it about the language that enables that sort of library superpower such that it can span those vastly different ways of approaching software?
Yeah. Oh, so I can say good things and I can say less good things about Python. The good thing I'll say is that it's very easy to learn, because it is that universal connector language. It's almost like the duct tape for software. It has been able to span across many disciplines and has made disciplines become cross-disciplinary, particularly as technology evolves. And so AI has been really good for Python, because it was there, and I think Python got a huge boost because of AI. That duct tape aspect, which I don't mean in a negative way, being the universal superglue, maybe that's a different way to say it, I think is extremely powerful. Also being easy to learn, being taught. There are many, many different things that Python has benefited from. Now, the challenge is that people naturally start using languages for things that they're not great at. And so Python without types sometimes becomes a challenge when you scale your application.
Kevin Ball
Right.
Chris Lattner
And so this Python performance becomes a challenge when you need things to go fast. What has happened with Python, on the flip side, is that you get Python, Python, Python, until suddenly you're like, oh, it's slow. Now I have to use C or Rust or some other go-fast language. And now I actually get kind of the worst of both worlds, which is that I have some of Python and some of Rust. All of Rust can be a beautiful thing, all of Python can be a beautiful thing, but Rust and Python are very different things. And so what Mojo is designed for is this: Mojo, by the way, is a go-fast language today. That's really what its strength is. It's not a fast Python. It's faster than Rust, by the way. What we've really focused on is making it so that you can integrate Python and Mojo code directly, and making it super easy to do that. So now, instead of saying I have to take my Python code and rewrite it in a completely different language, and then have a team that has completely different skills, with tabs and colons on one side and curly braces on the other, you can say, okay, these are actually very similar. Your go-fast language and your Python language are very similar, and we've found that that's very, very nice. We're working on features right now where we're basically saying there's zero FFI. You can just use Mojo code in Python, and you can already use arbitrary Python code in Mojo. These features make it really, really nice to say, okay, well, if you're in a Python world, having a go-fast language that's convenient and easy to use and right there is actually quite unusual.
Yeah, the FFI is an interesting thing to dig into a little bit because I think to your point, one of the things Python did well is they said, okay, we have all these fast machine learning libraries and other things that have been written in C. It's really easy to build a set of Python bindings over those. And then suddenly everybody in Python world can be using their notebooks and their interactive flow and still accessing the power of that. I mean, one thing in Mojo, it sounds like you don't need that for performance, but another thing that's going on there is like, you have all this existing infrastructure that exists. What is the interop story for Mojo? If, say, someone has an existing PyTorch or TensorFlow type of package, they don't want to have to worry about rebuilding their C in Mojo. What does that look like.
Well, so Mojo, I think you framed it really well there, which is that you have two really different goals. On the one hand, in Mojo, you could theoretically rewrite the entire world and have everything be pure Mojo. That's very nice, because you get the benefits of the type system, which includes traits and all the modern language features that you'd come to expect from a high-performance systems language, powerful metaprogramming, all these great things that come with a well-designed language. But the flip side is pragmatism: you don't want to rewrite the world, right? And so Mojo is also very pragmatic, and I think that's very, very important. Mojo can directly talk to Python, so you can import an arbitrary Python package and go to town. This means you get all the wonderful things from the data science ecosystem and many, many more, and that just works. Now, the flip side of that is that if you import a Python package, you get the Python interpreter and you get the Python packaging ecosystem; you get the full Python experience. So it comes with that. You can also directly talk to C, C++, these kinds of things. You can directly import and call malloc or file I/O or whatever if you want to go do that. Being able to talk directly to C code is super, super important for lots of very pragmatic reasons. Now, what we've seen in our community is that there are lots of people that like to build lots of cool stuff. We've got people that build UI libraries, and they wrap an existing UI toolkit, which makes tons of sense. But then we see other people that say, hey, wow, Mojo has other fancy features around SIMD programming and things like this, and they're able to get better performance using Mojo than they are using NumPy, like 10x and things like this. And so then suddenly it's very fun to say, hey, wow, if I write some for loops in Mojo, I get 10x better performance than calling NumPy.
That's actually pretty fun. And so it's not my job to tell people what is the right way to do things. I think that what I'd love to see is I'd love to see a vibrant Mojo ecosystem evolve, but I don't want it to be kind of like religiously driven. I want it to be very pragmatic and very focused on outcomes and this kind of a thing.
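For readers who have not called C directly from a higher-level language, the sketch below shows the general shape of it using Python's standard `ctypes` module. It is an analogy only: it is not Mojo's interop mechanism, which the conversation does not spell out. It assumes a POSIX system where the C standard library can be located, and it calls `strlen`, `malloc`, `memset`, and `free` from libc.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If find_library returns None, CDLL(None) on POSIX
# falls back to the symbols already loaded into the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signatures so ctypes converts arguments and results correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
libc.memset.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None

# Call a C function on a Python bytes object.
name_length = libc.strlen(b"mojo")

# Allocate raw memory with malloc, fill it, read it back, and release it.
buffer = libc.malloc(8)
if not buffer:
    raise MemoryError("malloc returned NULL")
libc.memset(buffer, ord("A"), 8)
contents = ctypes.string_at(buffer, 8)
libc.free(buffer)

print(name_length, contents)
```

The signature declarations are the binding layer; there is no borrow checker or garbage collector watching the `malloc`/`free` pair, so the caller is responsible for it.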
So something that you said there leads me in another direction that I was wondering about. You developed Mojo originally to try to solve the GPU problem and make it easy to program ML stuff, but you've also highlighted that it's able to access whatever, it can wrap other things, it's quite fast, and it sounds like it doesn't have some of the same compile-time trade-offs that, for example, people trying to build in Rust are encountering.
Which I think would also be C++. But Rust is more extreme, perhaps.
Well, and modern language features without maybe needing that. So I'm kind of curious, what niches are you seeing people using Mojo in beyond that sort of core ML space that you were originally looking at?
So this is the fun thing about languages: they can be used for anything, assuming they're designed to scale. We've seen people building AI frameworks. At Modular we're very focused on AI inference, but we've seen people take that and do training with it. Like I said, we've seen lots of games and other things like that, and people playing with graphics and visualization types of things, where, again, you want a go-fast language. UI libraries, I think, are pretty niche; people are playing with this, but I don't think it's quite as serious. And there are a lot of people working on data structures and algorithms and the lower-level components that systems programmers like to perfect. So there are lots of different applications. We're really focused on making sure it's really great for GPUs, because GPUs are the new frontier, right? It's the thing that's underserved; it's the huge problem. And you were asking about what makes Mojo interesting. The second part, besides powerful compile-time metaprogramming, is some basic compiler nerdery, and if you want, we can talk about that. I'll try to keep it high level, but we have a whole new compiler infrastructure that I've been building for many years now called MLIR. Using a powerful new compiler backend is what enables us to have really good compile times and things like this, and the metaprogramming system really builds on it. But it also allows us to talk to lots of hardware. That's something that's very different, and there aren't other widely used languages that actually do this. And so this is another reason we had to build Mojo.
Yeah, so I would like to nerd out on this, because I love to geek out on this stuff, and I think once again you have such a great background for diving into it. So maybe first let's start very high level. I think a lot of stuff today is built on LLVM, which you obviously are very familiar with. Maybe give people who aren't aware the very high level of what that structure looks like, so that we can give them the context for what this new structure with MLIR brings to the table, or how it shifts that world.
Yeah. So let me go back in time. 25 years ago, I was a student at the University of Illinois, right? And at the time, gosh, 25 years ago, Java was the cool thing, and virtual machines were the hot new technology. They weren't new, but it was taking over the world, and everybody assumed that Java would be the thing that unites all of compute. Now, Java is a beautiful system, and the Java virtual machine is a beautiful system, for a lot of different reasons. But it struggled because it was really designed for just-in-time compilation. It was really about, okay, you load an app off the Internet, you JIT it, and then you execute the code. And so for larger-scale applications, having to compile it all before you started running it was actually a bottleneck. And so LLVM came onto the scene. Our initial idea, which lasted about 3 microseconds, by the way, was to make a better Java compiler thing that didn't have a just-in-time component as heavyweight as what Java did. Now, when we played it forward, the way I architected and built LLVM out was to make it a very generic and composable ecosystem. It's a modular design: the pieces within the compiler are designed to be composable. And so what LLVM evolved into is the universal connector for CPUs, basically. It has one of what's called an IR, an intermediate representation. It's well documented; you can go read the spec if you want. If you look at it, it's like a compiler person's take on C. You've got integers and pointers and floats and structs, and you've got SIMD vectors, so you have some of the basics there, but it's really C.
And the way that LLVM and then the family of languages around it evolved is they said, okay, well if you take something like C++ or you take Rust, or you take Swift, or you take one of these other higher level languages, it's somebody else's problem to figure out how to map a class down to basically C in the case of C++. Okay, figured that out. Like, what Clang does, and I built the Clang C++ compiler along with a big team, is it says, okay, well there's vtables and so we can lower a vtable into your code and LLVM can represent that. And that was fine. Now more modern languages like Swift and Rust and things like this really benefit from higher level optimizations. And so you really want to be able to do devirtualization, you really want to do monomorphization of templates and things like this. You don't want to do that on a syntax tree. And so LLVM is still an amazing thing. It's widely used for lots of different reasons. But as languages evolved, there became a need for higher level IRs and higher level representations of the program. And so Swift has its thing called SIL, Rust has its thing, everybody started building their own thing. And so what this drove is a couple of exciting opportunities, because languages evolved and that was very powerful and yay, go technology. But it also drove fragmentation. And so the cool thing about LLVM is that it united so much energy. Like, the chip providers could say, hey, if I just add support for my chip to LLVM, then I get all of software, I get, you know, Linux, I get a web browser, I get, like, all this stuff just by adding it back into LLVM. I get Swift, I get Rust, I get Julia, I get all these beautiful things that come just by adding LLVM support. That's cool. And what that did was it enabled a factorization of industry effort. And so people weren't reimplementing the same stuff 50 times over.
They could get high ROI, high impact for their work and they got this massive software ecosystem. Now as languages evolved, you got this fragmentation. And so the Swift people have their thing, the Rust people have their thing, the Julia people have their thing, et cetera. Well, guess what? AI people didn't notice. And so what happened is when AI came on the scene, they started building all their own stuff, and so their own intermediate representations. You got TensorFlow graphs, and then TensorFlow had TensorFlow Lite, and then you had TorchScript and you had Glow and then you had XLA and then you had, like, this explosion of compilers. ONNX. Like, all these things got built. And as all these things got built, you know, what do compiler people do? They go and build a compiler, right? Well, now what you're doing is you're fragmenting the talent ecosystem. And I love compiler engineers, I know many of them, they're lovely people, but there's not very many of them. And so if you take a scarce resource...
You might say you know most of them.
I probably know most of them, like over half, probably, of the people who represent as compiler nerds. And so you take a scarce resource and you divide it and you get this really unfortunate problem where many of these technologies, they're made by really well meaning good people and they're very talented, but there's like two people on every project. And so they're well meaning, they're good, they're very talented, but if you only have two people, there's only so much you can get done. And they're trying to tackle these new spaces like heterogeneous compute, GPUs, crazy custom accelerators, like, all this stuff. And it became very untenable. And so MLIR was a project that I built, again, out of need, when I was at Google. And at Google, my day job was I was responsible for getting the Google TPU to work, get the software ecosystem to scale, get it launched in Cloud, get TensorFlow to talk to it, get it integrated into PyTorch, these very big, you know, audacious problems. But I was also in charge of all the CPU and GPU performance. Google also had many unannounced ASICs that were both data center and edge and many other things. And so all of these things had different compilers that are all built for their one little niche. And so what I realized is that the LLVM architecture didn't make any sense here. You can't say, hey, Google TPU, which has huge matrix multipliers as a primitive, it's not C-like. That is not the right abstraction. And what I learned from Swift and from many other systems since then was, okay, well, there isn't one right answer. What you actually need is the ability to build domain specific compilers and build them very efficiently. And we need to refactor the ecosystem. We need to get compiler engineers to talk to each other again. We need to get it so that we get compounding interest out of our investment.
And so what MLIR is, fundamentally, is a way of writing domain specific compilers with very high leverage, so you don't have to reimplement all the basic compiler stuff like a constant folding pass. You can instead really focus on your domain and get great leverage out of that. And so this is something that has kind of taken over our ecosystem. I mean, it's widely used, there's lots of AI things that use this. But also, on your last show that I heard, quantum computing is massively into MLIR, because it's a very important domain. I've used it for chip design, and so making it so you can actually synthesize Verilog and then synthesize hardware underneath that. That's very different. That's not C. And so, like, these kinds of applications are really enabled by this.
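For readers who have not met the term, the "constant folding pass" mentioned above is one of the basic transformations every compiler ends up needing. The following is a minimal Python sketch of the idea on a toy expression tree. The `Const`, `Var`, `BinOp`, and `fold_constants` names are hypothetical and invented for illustration; this is not MLIR's or LLVM's API.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # "+" or "*" in this toy IR
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Var, BinOp]


def fold_constants(expr):
    """Replace operations whose operands are both constants with their result."""
    if isinstance(expr, BinOp):
        # Fold the children first so constants can propagate upward.
        lhs = fold_constants(expr.lhs)
        rhs = fold_constants(expr.rhs)
        if isinstance(lhs, Const) and isinstance(rhs, Const):
            if expr.op == "+":
                return Const(lhs.value + rhs.value)
            if expr.op == "*":
                return Const(lhs.value * rhs.value)
        return BinOp(expr.op, lhs, rhs)
    # Constants and variables are already as simple as they can be.
    return expr


# x + (2 * 3): only the right-hand side can be computed at compile time.
tree = BinOp("+", Var("x"), BinOp("*", Const(2), Const(3)))
folded = fold_constants(tree)
print(folded)
```

The point made in the conversation is that a shared infrastructure lets each domain-specific compiler reuse this kind of pass instead of rewriting it.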
Yeah, no, I love that. So just to flesh back out or replay this a little bit, what LLVM, and particularly the LLVM IR, did was kind of create a choke point for when things looked mostly C-like, and were mostly compiling to target CPUs. Everyone could, if they had a new language, build a little layer that compiled down to this C-like IR. And on the flip side, if they're building new hardware, they just need to make sure that there's a plugin that knows how to compile from that IR to their chip's machine language, what have you. And the breaking point was, oh, new hardware, heterogeneous computing, all these different things like that. That IR does not sufficiently express what we want to express. It doesn't allow people to do it. So in some ways, if I understand it, MLIR is almost a higher level IR that allows you to express many more concepts than could be expressed within C and then also build sort of reusable abstractions at that level, optimizations, and then there's another layer for people to target. Now, two questions that I have coming into this. Actually, I'm going to start with the sort of downwards path. So from MLIR, you have this intermediate representation, you're able to do whatever you're doing. What is below that? Does that get lowered? Or is that able to then go straight to machine language? What does that path look like?
So it depends on the compiler you're building, but typically you end up with LLVM.
Okay, so this is kind of another layer on top, essentially. Yeah, which makes sense. I mean, compilers are probably the deepest abstractions known to man, right?
That's right. So if you're building an AI compiler, which is where I'll focus, but there's also lots of other cool stuff you can do. But if you just look at AI compilers, typically you're starting with tensors, and tensors are gigantic multidimensional arrays. And then you need to lower it down into, I have a 4 by 4 matrix multiplication, things like this, or I have SIMD operations on a CPU, things like this. And so that mapping process is very complicated, but once you get down there, then LLVM is amazing. It's really good at doing register allocation and scheduling and the core things in code generation. And so it's quite good at that. And so it's about how do we solve these higher level things. Yep.
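To make the lowering step described above concrete, here is a small Python sketch of breaking one large matrix multiplication into fixed 4 by 4 block operations, which is roughly the shape of primitive the conversation mentions. It is only an illustration under stated assumptions: the function names are hypothetical, matrix sizes are assumed to be multiples of the tile size, and a real AI compiler would also handle layouts, memory hierarchy, and edge tiles.

```python
TILE = 4  # size of the small matrix multiply we pretend the hardware provides


def matmul_reference(a, b):
    """Plain triple-loop matrix multiply, used to check the tiled version."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


def matmul_tile(a, b, c, i0, j0, p0):
    """Accumulate one TILE x TILE block product into c (the 'primitive')."""
    for i in range(i0, i0 + TILE):
        for j in range(j0, j0 + TILE):
            acc = c[i][j]
            for p in range(p0, p0 + TILE):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc


def matmul_tiled(a, b):
    """Lower a large matmul into a sequence of TILE x TILE block operations."""
    n, k, m = len(a), len(b), len(b[0])
    assert n % TILE == 0 and k % TILE == 0 and m % TILE == 0
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, TILE):
        for j0 in range(0, m, TILE):
            for p0 in range(0, k, TILE):
                matmul_tile(a, b, c, i0, j0, p0)
    return c


# Two 8x8 integer matrices, so the comparison below is exact.
a = [[(i + j) % 5 for j in range(8)] for i in range(8)]
b = [[(i * j) % 7 for j in range(8)] for i in range(8)]
print(matmul_tiled(a, b) == matmul_reference(a, b))
```

The outer three loops are the "mapping" decision; the inner `matmul_tile` is the part that would be handed to a lower level such as LLVM or a hardware instruction.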
Okay, so that's awesome. And I love that the next question is kind of at that MLIR layer you highlighted, that you're trying to make it useful not just for GPU programming, but for kind of programming any number of these sort of accelerator type heterogeneous computing problems. So what is the abstraction that this MLIR operates at? Like, what is it that makes that work?
So MLIR does not. So this is where Mojo comes in. But let me express why this is. So MLIR solves a compiler construction problem, and so it's really good at making it so compiler experts can build a new compiler quickly. But it does not solve design, it does not solve any specific abstraction problem you might have. Let me just give you an example of this. MLIR has many, many, many different ways to map tensors onto hardware. And if you go check out mlir.llvm.org, you know, nice plug, you can go read about these things. None of them actually work very well. And so you need... just because something can represent different levels of abstraction. And by the way, the ML in MLIR doesn't stand for machine learning, the ML stands for multi-level. And so it's about progressive lowering, it's about representing things at the right abstraction level to do the optimizations you want to do. I sometimes joke that as a compiler engineer, that which you can represent, you can transform. And so a lot of the power in compilers is getting things at the right level of abstraction so you can do things with them. But you still have a burden of having to decide what those abstractions are. And MLIR really doesn't help you with that. And so what's been happening is the industry at large has been stabbing around in the dark, and there's a million different projects and none of them have really gotten traction. And so this is why Mojo has to come into this world. Because first of all, there aren't languages, really. Actually, I'm going to blow your mind in a second and tell you the other language that targets MLIR, but generally people have been integrating these things and trying to get them to talk to PyTorch or TensorFlow or things like this. And so that's been a huge challenge on its own. Does that make sense?
I think so. So I guess then a couple of different things that I'd wonder is, so multi layer is cool, are those layers well defined right now or is that also flexible?
They're completely flexible. And so MLIR has a concept of a dialect. And so you can define a dialect for your hardware, for LLVM itself, for your programming language, for quantum computing. It's domain specific and you get to own the domain. This is what I'm saying, where MLIR is very powerful and very awesome, but the power gets reattributed back up to the person holding the hammer. Because now you can go...
It's very powerful in some ways, right? It's saying, like, for example, I have a new piece of heterogeneous hardware or a new accelerator, I can build a domain within this that understands how to translate from some of the higher level MLIR into the way that this is going to work optimally on my device. And now it can get lowered into LLVM IR and go.
Exactly. And so now here's the thing, right? So Mojo exists because the entire industry was not doing it, right?
Right. You need a proof point, you need an example for people to see and follow.
For years people have been successfully taking MLIR and they've been integrating it into existing systems and upgrading things, and it's had huge impact. I'm very, very, very proud of MLIR. And it's got a massive ecosystem and lots of different people use it. It's very, very exciting. But there is no language, or there is no really defined programming language, that uses it. There's a lot of domain specific languages or EDSLs or things like this, but there was nobody that was actually taking this and using it in a way that expresses the full power of generic heterogeneous hardware back to programmers. It was more about, like, try to get PyTorch to go fast. And some of those sort of work, but most of them did not. Right now I have to be careful, because Mojo is very uniquely designed for MLIR. But there's one very brand new, very exciting new programming language that now directly goes to MLIR as well. This is a new LLVM project that is just graduating to the next level of maturity, the LLVM Fortran compiler.
Interesting. Okay, okay.
And so now you brought up earlier, let me connect the dots here. Right? Well, so in this case it's kind of an accident that the folks working on Fortran wanted to build a new compiler for Fortran because all the old ones were the wrong thing for a variety of reasons that I'm not the expert on. But now to your point, Fortran has like parallel arrays and it has higher level abstractions that have to be lowered. And so they said, hey, well, this MLIR thing's great because it allows me to represent my Fortran domain specific constructs. By the way, modern Fortran has objects, it has all kinds of.
It's come a long way since I was working on it, early 2000s. Yes.
And so it's easy to make fun of Fortran because it's from the 1960s or 50s or something like that, but actually Fortran has evolved into a very modern language with a massive community of its own. Maybe not as many web apps built in Fortran, but if you look at this, it's the power of MLIR that allows people to tackle these problems and solve them, and in a great way. Now, what Mojo does. So coming back to this, now that you get some idea of the wackiness of MLIR and both the power, but then also the curse that comes with having to make design decisions, right? What Mojo says is, hey, well, the problem with a lot of these systems and compilers and these PyTorch thingies and, like, all the different chip people building stuff, is that they are functionally taking power away from the programmer and locking it up inside the compiler. And so what Mojo does is it says, hey, you know what's important? Library design. Let's take tensor lowering, let's take all this magic out of the magic compiler that a few people are building and let's put it back in user space, let's put it into the Mojo code, let's enable people to tackle Tensor Cores or their hardware or whatever else it is, and let's give that power back to the programmer and let's teach a new generation of people how to actually deal with all this cool stuff. And the way that we do that, in my opinion, is through software engineering. It's one of these old things that is not magic, but it's actually a lot of hard work and design. It's about building libraries, building abstractions, building ecosystems that are layered the correct way and putting that in a way that it's accessible. So you don't have to be the hardware expert, you don't have to be the compiler engineer. I love compiler engineers, by the way, but it's like redemocratizing all this technology instead of trying to lock it up into libraries written in assembly or into compilers themselves. And so that's what makes Mojo very exciting.
And so that's why having metaprogramming is so important to me is because now you can allow people to write just, you know, just code that has the power of traditional compilers. And that is, I think, very profound. And you know, we're, we're using it to go solve problems for AI and GPUs and things like this. But I think this is a very powerful thing that as, as time plays out, because these things take time, I think the world will discover and be able to use in new ways.
This is fascinating. And this is a domain that I like to bring up every now and then and geek out on, which is just, like, I feel like DSLs in particular are massively underutilized. They're incredibly powerful. There's some interesting examples that I... So for a long time I was very much in the web space and I saw Babel arise. And Babel is essentially a user space compiler that allows you to write language features, DSLs, whatever, or do other different things that all compile to JavaScript. And this was utilized for things like building out JSX for React and things like that. And I looked at that, I'm like, oh, this is going to create a thousand flowers. And, like, it basically stopped with JSX. Like, there's not very many people writing lots of domain specific language extensions there. It's still incredibly useful tooling. It lets you do, you know, a bunch of different things. But I'm kind of curious how, like, it feels like language design, which is what we're talking about here, empowering people to build languages on top of languages. Another world that did a lot of this was, like, the Ruby world, which was very into DSLs and built DSLs and things like this. Yep, like, that's a hard problem. There aren't very many people who are good at language design either. So how do you make those flowers bloom?
So when you talk about DSLs. So let me also give another plug. So I'm writing this series of blog posts called Democratizing AI Compute. And so I go through what went wrong with a whole bunch of different systems including OpenCL and AI compilers and CUDA and many other things in the system. And the one that's publishing this week is about TritonLang and domain specific languages that are embedded into Python. Okay, and so what is a domain specific language? Well, they're actually all around us, right? There are things like SQL or regular expressions or even HTML as a domain specific language, right? And so these are quite powerful. There's a subset called an embedded domain specific language, which is where you don't invent syntax, but you hack the compiler, you hack the Python interpreter, you hack it so that it looks like Python, but it's not. And so the blog post lays out that these are both very powerful and very exciting and very nice, but they're also very cursed, because it looks like Python but it's not a language. And so now you're beholden to some weird compiler, some weird system that is, again, using Python to hack it. So they don't have to do all the work of building a language, but you don't get the same quality result. Mojo is completely different. So with Mojo, what we're doing is we're saying Mojo is a language. It has an LSP, it has a debugger, it has a code formatter, it has a compiler, obviously. It is a language, it is full-fledged. It's the hard thing to do, so it's a different energy state than a DSL, but what it enables you to do is build very powerful libraries. And so the canonical example of this, something that is a language and enables you to build powerful libraries, is Python, right? And there's a lot of very powerful Python libraries. You can look at PyTorch or TensorFlow or things like this in the AI space, but gajillions of other ones, right?
And so we're in that space, which is enable people to build powerful libraries and then allow even more people to build on top of them. And to me that's really what the sweet spot is: I shouldn't have to build an EDSL, I shouldn't have to go hack a compiler, I should just be able to write a struct and use types, and then other people who don't want to know how it works can build on top of it. And this is the miracle of software engineering, which is basically it's collaboration, right? It's about ecosystems, it's about people working together, it's about the power of, again, having two people is amazing, but having two people build something that's used by 20 people, that's used by 200 people, that's used by 2 million people, right? I mean, this is where software leaps get made. And so what my goal is with Mojo and also with Max together is to make it so you have the ability to unite the ecosystem, unite all these people that have different levels of expertise and get them to work together. Because I'll tell you, I don't know the latest Blackwell Tensor Core goofy layout nonsense, because it's not compatible with Hopper and blah, blah, blah, like, all that kind of stuff other people do. So once they build the library, I can build on top of it. That's power, right? And I don't want to have to, like, switch between different systems every second Thursday because I have a slightly different use case.
So let's follow that thread a little bit and talk about what does the Mojo community look like today? Who's developing it, what's governance like? What does the library community look like?
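The "looks like Python, but it's not" point about embedded DSLs can be illustrated with a small tracing sketch in plain Python. The `Node`, `wrap`, and `trace` names below are hypothetical and do not correspond to Triton, PyTorch, or any other real library; the sketch only shows the general mechanism of operator overloading recording a graph instead of computing a value.

```python
class Node:
    """A recorded operation; using + or * on it builds a graph, not a number."""

    def __init__(self, op, args=()):
        self.op = op
        self.args = args

    def __add__(self, other):
        return Node("add", (self, wrap(other)))

    def __mul__(self, other):
        return Node("mul", (self, wrap(other)))

    def __repr__(self):
        if not self.args:
            return str(self.op)  # leaf: a named input or a literal
        return f"{self.op}({', '.join(map(repr, self.args))})"


def wrap(value):
    """Turn plain Python literals into leaf nodes."""
    return value if isinstance(value, Node) else Node(value)


def trace(fn, *names):
    """Call fn with symbolic inputs so its arithmetic is recorded as a graph."""
    return fn(*(Node(name) for name in names))


def kernel(x, y):
    # Reads like ordinary Python arithmetic.
    return x * y + 1


print(kernel(2, 3))             # normal Python execution
print(trace(kernel, "x", "y"))  # the same source, captured as a graph
```

The same `kernel` source means two different things depending on who calls it, and anything the hypothetical `Node` class does not overload (for example, an `if` on a traced value) falls outside what this sketch can capture. That gap is one way to read the "cursed" half of the description above.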
Yeah, great question.
What's going on there?
Yeah, so Modular is driving Mojo, and so we're funding it, we're paying for it, we're building all this stuff out, we are open sourcing it over time. And so we have an open source standard library. And the standard library is about half of Mojo, by the way. I didn't dive into it, but, like, int and float are not part of the compiler, they're part of the library. Again, this is: build powerful, composable, orthogonal language features so you can put all of it into the library. So yeah, int is...
So what is the language feature that int and float are built on top of?
They're built on top of... they're all structs. And so, like, Int is a struct, struct Int, and then it has a dunder add method, just like you do in Python. And then you have the ability to use inline MLIR. And so you basically say, okay, well, it's turtles all the way down. Zero cost abstractions, turtles all the way down until you bottom out at the low level compiler guts that you're talking to.
Fascinating. So you built Int in the library using inline MLIR to define how the operations work.
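Since the comparison above is to Python's dunder methods, here is a Python-only sketch of the pattern: an integer type defined in library code whose `__add__` bottoms out in a single low-level primitive. This is not Mojo syntax and not the actual Mojo standard library; `Int` here is a hypothetical Python class, and `_primitive_add` is an invented stand-in for the inline MLIR operation the real implementation is described as using.

```python
def _primitive_add(a, b):
    # Hypothetical stand-in for the low-level operation the library bottoms out in.
    # In the design described above this would be an inline compiler operation.
    return a + b


class Int:
    """A library-defined integer: an ordinary type wrapping a primitive value."""

    __slots__ = ("_value",)

    def __init__(self, value):
        self._value = value

    def __add__(self, other):
        # The + operator is just a method call defined in library code.
        return Int(_primitive_add(self._value, other._value))

    def __eq__(self, other):
        return isinstance(other, Int) and self._value == other._value

    def __repr__(self):
        return f"Int({self._value})"


total = Int(2) + Int(3)
print(total)
```

The design choice being described is that nothing about `Int` is special to the compiler; only the primitive at the bottom is.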
That's exactly right. And it's all open source. You can go check it out. It's super cool. And so for our community, we have Discourse for an online forum, and so there's a bunch of discussion there. We have Discord, which is for the chat stuff. And so we have a bunch of folks there. I have no idea. It's like 20,000 people in the Discord, something like that. And so they're all kind of doing different things. We've made the commitment, we're open sourcing the entire Mojo implementation, but we're doing that in steps. And so Mojo is still being developed. I have some battle scars from Swift, where we spent way too much time arguing with people online about things. And so the short version of the evolution of Swift is it was secret for four years and then it got announced and launched, and then it was proprietary for a year, but public. And then it was open, like open governance, open everything, since 2015-ish, something like that. And I think that went too fast. With Swift, it was great to have community. I love open source, by the way. I've written a few million lines of code that's open source, and others that are not. But it's about the right time in the right place. And having a small coherent team driving the core architecture, I think, is extremely important for the early phase of a project. And so we're opening things up over time and we've made a public commitment. I think we'll exceed expectations there. So.
Got it. Well, and as you stated, you're keeping the core language, which is what you are driving with that dedicated team, small. A huge amount is available for anyone who wants to by writing libraries.
Yep, that's right. And Modular keeps open sourcing more stuff over time. And the other thing I'll say is about Max, if you zoom out from Mojo. So Mojo allows you to program CPUs and GPUs and make stuff go fast. And this is very powerful, and it integrates with C and Rust and all this ecosystem, right? Max is that next level up. It's the AI solution. And so it enables you to write models. And so we have our own kind of... It's in the space of PyTorch, where you can now define your own models and actually build on top of this. And so now you don't have to know how a matrix multiplication works or something like that. You just say, give me one. And you get very fancy compiler techniques that do automatic code fusion and things like this. And so you get very high performance from the Max, it's called a graph compiler, right? And so the way that works, and the way that's really novel, is that it builds on top of the power of Mojo. And so all this metaprogramming stuff, the ability to have the MLIR, like, all of these very nerdy compilery things, enable us to build the next level of system that's way more powerful. And it's also an open box. And so just now we're documenting how to do this stuff. And so we're teaching people how to program GPUs. Well, this is a huge thing because the code you write in Mojo with Max together, yeah, you can use it to solve AI, which is awesome and very, very exciting for lots of different reasons. But we're really about enabling heterogeneous compute, enabling people to use the accelerator, enabling people to build and scale across hardware. And all this stuff is free. And so this is a really big deal. And while some of these technologies are young, like, they're very, very powerful. And so I love to see, to your point about community, like, what people are doing with this and can do with this. So that's...
And I talked with people last week, I was talking with a whole bunch of people that are talking about, you know, robotics and all the applications. And you know, there's apparently this Python ROS thingy, and moving chunks of that to Mojo would make it so much faster and better and solve all these problems. I'm like, wow, I know nothing about that. That sounds really cool. So that's what I love to see.
Yeah, no, I love that. Well, I just realized, I mean, I feel like we could keep going for hours, but we are getting close to the end of our time. Is there anything that we haven't talked about yet that you think would be really important to leave folks with?
Yeah, I mean, the thing I'd love to say is that I get very frustrated with the industry right now, because if you look at the powers that be out there, you've got these massive companies doing AI, right? You've got these very well funded AI research labs that are making the news all of the time, right? And you have all of this investment in AI, and you have the hardware companies like Nvidia that are saying, like, okay, well, CUDA, CUDA, CUDA, right? And so what I see happening out in the world is I see all these people that are telling us all that AI is too hard, give up, air quotes, only we can do it, right? And whether it be the big LLM company of the day that's saying, like, just use our endpoint, or it's the hardware company saying you have to buy all of our hardware and only from us, or the big tech company that's saying, like, hey, you know, all this stuff is too complicated, just use our cloud service or something, right? But it's all a lie. The stuff is complicated because the software is complicated and because this was all cobbled together and it doesn't make any sense, but the technology is not that complicated. What I strongly believe in is as we democratize this stuff, as we make it actually easy, if we make it so that people have power back over the software, suddenly we can have another wave of innovation and we can take back power from all these, like, overfunded, overly powerful ecosystem players. And we as a community can rise up and we can achieve a lot of the stuff that they tell us is impossible. Because, I mean, I am empowered when I talk to new college grads, right? And so I'm an old dog at this point. New college grads know all of this stuff. They know how models work, they know how, sometimes, GPUs work, they know all this stuff. But the problem is that we have this fragmented talent ecosystem.
And what I really believe in is if you get the people able to work together and collaborate and build stuff, like, the whole world will continue to change in a much more powerful and positive way than, okay, please just use our endpoint because you're not smart enough to know how AI works, which I don't believe. So this is a big part of the mission. Also, as a nerd, right, all the hardware out there is so exciting. It's all lacking software. And so we're trying to unlock that. This will be a big part of our storyline, particularly as we get into late this year. But I think that's a huge opportunity. And the way I look at hardware today is that everyone's, you know, crying about Nvidia GPUs and the prices and all this kind of stuff. But when I look ahead 5 years, 10 years, I look into the future, I know deep down in my soul that hardware is going to be even more weird than it is today. Right. Physics, we're not in the age of Moore's Law. Innovation is not dead. New algorithms, new research, all this stuff's going to continue to happen. And so what we need is the ability to scale into this. And this is what I think we're trying to do. And I'm very excited about that.
No, I 100% agree. We are in an age of increasing weirdness and heterogeneity in the hardware world. It's also a golden age of hardware innovation. There's so much going on because we're suddenly in a place where you just can't get enough compute. And so everybody's trying to innovate and change the boundaries. And we need programming languages that expose that to us as software developers.
Yep. And programming languages fundamentally unite communities. Right. And so that's where, again, what I love to see is different people with different backgrounds, different perspectives, different use cases that can collaborate and solve problems together. This is software, right? This is what's powerful. So if you're interested, please do check out our webpage. We do have a ton of stuff. We're open sourcing things all the time. We have major new releases coming out regularly. There's all kinds of new capabilities. Please join our community. Read the blog if you want to learn about the history of AI compute and why the software is so screwed up. Trust me, I have decided that instead of telling every individual person, I should write it down and scale this a little bit. And so if you wonder, you know, why didn't OpenCL win or something, I'm happy to teach you about it. So awesome.
Software Engineering Daily: Mojo and Building a CUDA Replacement with Chris Lattner
Episode Information:
In this episode of Software Engineering Daily, host Kevin Ball engages in an insightful conversation with Chris Lattner, the CEO and co-founder of Modular AI. Lattner, renowned for his pivotal contributions to the development of LLVM, the Clang compiler, and the Swift programming language, delves into his latest endeavor: the creation of Mojo, a programming language poised to revolutionize AI and GPU programming.
Notable Quote:
Kevin Ball [01:34]: "I'm so excited to hang out with you because you have such a history in an area that is fascinating to me, which is language design, but in particular language design for the current age that we're in around AI and machine learning."
Lattner outlines his professional trajectory, dividing his career into two main epochs. The first epoch centers on CPU programming and developer tools, highlighted by his work on LLVM and the Swift programming language at Apple. The second epoch marks his shift towards AI around 2016, well before the surge of Large Language Models (LLMs) like ChatGPT. This transition underscores his vision to democratize AI compute and address the complexities inherent in GPU programming.
Notable Quote:
Chris Lattner [02:56]: "We think the stakes are high. We think that there's a whole new form of computer. People are struggling today, but as we look ahead into the future, I only see more hardware, more innovation, more AI."
The genesis of Mojo and Max stems from the limitations of existing GPU programming frameworks, primarily Nvidia's CUDA. While CUDA has been instrumental in advancing deep learning, it introduces complexity and creates a fragmented development stack. Lattner emphasizes the need for a language that marries Python's simplicity with the performance and safety of languages like C and Rust, while also offering vendor independence.
Notable Quote:
Chris Lattner [03:05]: "Let's really fundamentally challenge the status quo. Let's go build a replacement for CUDA. Let's go build a full stack ecosystem where we can actually build into AI."
Mojo is designed to be syntactically familiar to Python developers while incorporating Rust-like features such as an ownership model and a borrow checker to ensure memory safety without a garbage collector. This design choice facilitates ease of learning and adoption among Python users, a critical factor given Python's dominance in AI and data science.
Notable Quote:
Chris Lattner [06:08]: "Our goal here is to bring the best ideas together in a novel way and then try to solve the problem in a way that only we can do with this combination of techniques."
Ownership and Borrow Checker:
Comptime Metaprogramming:
Integration with MLIR:
Notable Quote:
Chris Lattner [08:28]: "Memory safety is really important for any modern language. And so, yes, we have to have that, but that's not the differentiating feature that allows us to tackle hardware."
While Python excels in ease of use and has a vast ecosystem, it falls short in performance and type safety for large-scale applications. Rust offers both performance and type safety, but with a steeper learning curve. Mojo aims to bridge this gap by combining Python's simplicity with the performance and safety characteristics of Rust, so that developers get higher performance without fragmenting the development stack.
Notable Quote:
Chris Lattner [20:40]: "Mojo is designed for is Mojo is a go fast language today. That's really what its strength is. It's not a fast Python. It's like it's faster than Rust, by the way."
Mojo emphasizes seamless interoperability with Python, enabling developers to integrate Mojo code directly into Python projects without the overhead typically associated with foreign function interfaces (FFI). This integration allows developers to leverage existing Python libraries while harnessing Mojo's performance benefits.
Notable Quote:
Chris Lattner [22:06]: "Now your GoFast language and your Python language are actually very similar. We found that that's actually very, very nice."
A significant portion of the discussion centers around Mojo's compiler infrastructure. Lattner highlights the limitations of LLVM IR in handling the diverse and evolving landscape of AI hardware. To address this, Mojo utilizes MLIR, which offers greater flexibility and supports multiple abstraction layers, enabling more efficient compilation for heterogeneous computing environments.
Notable Quote:
Chris Lattner [34:03]: "MLIR is fundamentally a way of writing domain-specific compilers with very high leverage so you don't have to reimplement all the basic compiler stuff like a constant folding pass."
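To make the "constant folding pass" mentioned in the quote concrete, here is a minimal sketch in plain Python. It does not use MLIR or any Modular API; the `Const`, `Var`, `BinOp`, and `fold_constants` names are hypothetical and exist only for this illustration. The idea is simply that a compiler can evaluate operations whose operands are already known at compile time, which is the kind of generic pass a domain-specific compiler would rather reuse than rewrite.

```python
from dataclasses import dataclass
from typing import Union
import operator


# A toy expression tree (hypothetical types, not MLIR operations).
@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Var, BinOp]

# Operators this toy pass knows how to evaluate at "compile time".
_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(expr: Expr) -> Expr:
    """Recursively replace operations on two constants with their result."""
    if isinstance(expr, BinOp):
        lhs = fold_constants(expr.lhs)
        rhs = fold_constants(expr.rhs)
        if isinstance(lhs, Const) and isinstance(rhs, Const):
            return Const(_OPS[expr.op](lhs.value, rhs.value))
        # At least one side is unknown until run time; keep the operation.
        return BinOp(expr.op, lhs, rhs)
    # Constants and variables are already in their simplest form.
    return expr


# (2 * 3) + x: the multiplication is folded, the addition is kept.
folded = fold_constants(BinOp("+", BinOp("*", Const(2), Const(3)), Var("x")))
```

Here `folded` is the tree for `6 + x`: the multiplication disappears, while the addition stays because `x` is only known at run time.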
Modular AI is committed to open-sourcing Mojo progressively, ensuring a vibrant and collaborative community. The language's standard library, built entirely in Mojo, exemplifies its design philosophy of powerful, composable, and orthogonal language features. The community is encouraged to contribute via forums like Discourse and Discord, fostering an ecosystem of innovation and shared expertise.
Notable Quote:
Chris Lattner [47:46]: "Mojo, what we're doing is we're saying Mojo is a language, it has an LSP, it has a debugger, it has a code formatter, it is a lang, has a compiler, obviously it is a language, it is a full-fledged."
Lattner expresses optimism about the future of hardware innovation, emphasizing that as hardware becomes more heterogeneous and specialized, languages like Mojo will be crucial in managing this complexity. By democratizing AI compute and providing developers with the tools to harness advanced hardware without being bogged down by its intricacies, Mojo aims to catalyze the next wave of AI advancements.
Notable Quote:
Chris Lattner [53:50]: "What we need is the ability to scale into this. And this is what I think we're trying to do. And very excited about that."
The conversation between Kevin Ball and Chris Lattner highlights the emergence of Mojo as a transformative language in the AI and GPU programming landscape. By addressing the performance limitations of Python and the complexity of CUDA, Mojo promises to streamline AI development, foster innovation, and democratize access to powerful compute resources.
Final Notable Quote:
Chris Lattner [54:39]: "Programming languages fundamentally unite communities."
For those interested in exploring Mojo further, Modular AI continues to develop and open-source the language, inviting developers to participate in its growing ecosystem and contribute to the future of AI compute.
Additional Resources: