EP 221 George Hotz on Open-Source Driving Assistance - The Jim Rutt Show

Summary

The Jim Rutt Show: EP 221 George Hotz on Open-Source Driving Assistance

Release Date: February 6, 2024

Introduction

In Episode 221 of The Jim Rutt Show, host Jim Rutt engages in an in-depth conversation with George Hotz, a renowned hacker and entrepreneur known for his pioneering work in open-source self-driving car technology. This episode delves into Hotz's journey from his early days in hacker circles to founding Comma AI, an open-source self-driving car company. The discussion covers various facets of autonomous driving, contrasting approaches with industry giants like Tesla and Waymo, and touches upon legal considerations and Hotz's other ventures, including Tiny Grad.

George Hotz: From Hacker to Autonomous Driving Innovator

Background and Early Achievements

Jim Rutt introduces George Hotz, highlighting his selection as a teenager for the Johns Hopkins Center for Talented Youth (CTY), a program Rutt describes as extremely selective. Rutt humorously notes that Lady Gaga also took part in CTY.

At [01:12], Hotz responds succinctly that he cannot remember meeting Lady Gaga during his time at CTY.

Hacking Milestones

Hotz first gained notoriety at age 17 as the first person to break the carrier lock on the iPhone, a move that resonated with advocates of open systems against Apple's closed ecosystem. He later got into a dispute with Sony over hacking the PlayStation 3, which was settled out of court and cemented his reputation in the tech community.

Transition to Autonomous Driving

Joining Google's Project Zero

Rutt mentions Hotz's stint at Facebook and his recruitment into Google's Project Zero—a team of elite white hat hackers tasked with uncovering vulnerabilities in critical systems. However, Hotz notes that his focus shifted away from security to artificial intelligence (AI):

[02:52] George Hotz: "Project Zero kind of led me to AI. I'm thinking like why am I looking for these vulnerabilities myself? How do I write software that looks for vulnerabilities?"

Founding Comma AI

The conversation transitions to Hotz's motivation to start Comma AI, an open-source self-driving car system. Hotz initially planned a contract to build software for Tesla that would replace the Mobileye chip. When the contract fell through, he decided to build an Autopilot clone independently and sell it to car manufacturers. Building the clone took a couple of months; selling it to car companies, he says, proved impossible.

Open-Source Self-Driving Systems vs. Industry Giants

Philosophical Differences

Hotz emphasizes a fundamental shift from traditional autonomous driving approaches, which rely heavily on specialized hardware and high-resolution mapping. Comma AI focuses on leveraging existing car camera systems and software to create a more flexible and scalable solution.

[06:20] George Hotz: "There's one system that can drive cars and it's human beings. Self-driving cars, most of the ones you see like Cruise and Waymo, are really fancy remote control cars."

Critique of Current Systems

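Hotz critiques the reliance on lidar and high-precision maps used by companies like Waymo and Cruise, noting that Tesla's Elon Musk has long rejected lidar as well. He argues that these systems are economically unsustainable and technologically fragile, relying too much on centralized infrastructure that can fail, unlike Comma AI's decentralized approach. He also dismisses the objection, which Rutt attributes to Gary Marcus, that no practical training set can cover every corner case.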

[12:52] George Hotz: "That's absurd. There's a lot of criticisms of self-driving cars, but it's definitely not that one... How does that explain the human?"

Comparison with Tesla and Waymo

Hotz contrasts Comma AI's approach with Tesla’s and Waymo’s:

Tesla: Uses powerful onboard computing with extensive data collection but still faces issues like phantom braking and lane misalignment.
Waymo/Cruise: Operate under "level four" autonomy within tightly controlled environments, relying on remote operators to handle exceptions, making them economically unsustainable in Hotz's view.

[33:26] George Hotz: "Tesla has positive unit economics and Waymo has hilariously negative unit economics."

Technical Insights into Comma AI's OpenPilot

Behavioral Cloning and Simulation Training

Hotz explains the challenges of behavioral cloning—training a model to mimic human driving by learning from data. Without corrective mechanisms, small errors accumulate, leading to significant deviations from intended behavior. Comma AI addresses this by:

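Adding Corrective Measures: An early version added a lane-line detector that nudged the steering toward the lane center to maintain stability. Hotz says it took many years, but lane lines were eventually removed from the system.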
Training in Simulation: Utilizes a proprietary simulator that reprojects real-world driving data with slight perturbations, allowing the model to learn corrective actions without relying on human intervention during training.

[16:59] George Hotz: "We refer to it as behavioral cloning... If you don't train in simulation, if you train just as a supervised learning, behaviorally cloned problem, you're going to have no corrective pressure."

Data Acquisition and Diversity

According to Hotz, Comma AI has the second-largest driving dataset in the world after Tesla: tens of millions of miles uploaded by 10,000 weekly active users across diverse geographies. He argues that this variety makes the model more robust than competitors whose data comes from a few specific cities.

[19:10] George Hotz: "This is a massively diverse set... Waymo has all the same streets in Scottsdale, Arizona... We have everywhere in the world."

Installation and Usability

Implementing Comma AI's OpenPilot is user-friendly, requiring minimal hardware modifications. Users simply connect a device via a Y-splitter to the car's existing camera system, enabling advanced driver assistance features without invasive changes.

[21:29] George Hotz: "Most new cars... have a camera mounted right behind the rearview mirror. And there's one plug that connects to it... Takes about 15 minutes."

Legal and Regulatory Considerations

Liability and Safety Standards

Hotz addresses concerns regarding legal liability and safety standards. Comma AI adheres to ISO 26262 standards, ensuring that their system cannot render the car uncontrollable. Responsibility for driving decisions remains with the human operator, who must maintain attention and control.

[43:41] George Hotz: "We limit the maximum amount of torque the system is capable of applying to the wheel... It’s on you."
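
A torque cap of the kind Hotz describes can be pictured as a simple clamp on the steering request. The sketch below is illustrative only: the function name, the units, and the default limit are assumptions, not values taken from OpenPilot.

```python
def clamp_torque(requested: float, max_torque: float = 1.0) -> float:
    """Limit a steering torque request to the range [-max_torque, +max_torque].

    The units and the default limit are hypothetical; a real system would
    use per-vehicle values.
    """
    if max_torque < 0:
        raise ValueError("max_torque must be non-negative")
    return max(-max_torque, min(max_torque, requested))


# A request far above the cap is cut down to the cap, so a human driver
# can always overpower the system at the wheel.
print(clamp_torque(5.0))
```

Keeping the cap low is what lets the driver stay in control even if the software asks for something wrong.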

Handling Malfunctions and User Compliance

The system is designed with multiple redundancies to prevent mechanical failures from causing accidents. Additionally, driver monitoring ensures users remain attentive, alerting them if they become distracted.

[44:42] George Hotz: "We have the best driver monitoring in the world... we force respect through effective monitoring, not coercion."

Tiny Grad: Simplifying Machine Learning Frameworks

Introduction to Tiny Grad

Towards the end of the episode, Hotz introduces Tiny Grad, a machine learning framework developed to compete with established platforms like TensorFlow and PyTorch. The key differentiator is its minimalistic design, comprising only 5,200 lines of code, which enables easier adaptability and deployment across various hardware environments.

[55:54] George Hotz: "Tiny Grad is a machine learning framework... it's 100x simpler. The code base is only 5,200 lines."

Applications and Future Goals

Tiny Grad is already in use within OpenPilot for running models on devices, and its simplicity allows for easy porting to new hardware accelerators. The long-term vision includes developing machine learning ASICs (Application-Specific Integrated Circuits) to optimize performance further.

[57:27] George Hotz: "It's used in OpenPilot to run the model on the device... the long-term goal of Tiny Grad is to build machine learning ASICs."

Vision for the Future

Beyond Self-Driving Cars

Hotz articulates a broader vision where solving self-driving cars serves as a stepping stone toward general-purpose robotics. The ultimate ambition is to create artificial life forms capable of performing complex tasks autonomously, such as cooking and cleaning.

[46:08] George Hotz: "Our goal is to solve self-driving cars as a jumping off point to general-purpose robotics... a $25,000 robot companion that comes home and cooks for you."

Commitment to Open-Source and Ethical Development

Emphasizing transparency and user control, Comma AI maintains an open-source ethos. Hotz is committed to resisting external pressures, such as patent trolls, ensuring that the company's innovations remain accessible and ethically developed.

[52:25] George Hotz: "I'm legit willing to do it... I don't want to oversell anything. Buy it or don't buy it, that's up to you."

Conclusion

This episode offers a comprehensive look into George Hotz's approach to autonomous driving through Comma AI's open-source framework. By challenging the conventional methods employed by industry leaders and advocating for a more decentralized and user-friendly system, Hotz presents a compelling case for the future of self-driving technology. Additionally, his work on Tiny Grad underscores his commitment to simplifying and democratizing machine learning, paving the way for broader innovations in AI and robotics.

Notable Quotes:

George Hotz at [02:52]: "Project Zero kind of led me to AI. I'm thinking like why am I looking for these vulnerabilities myself?"
George Hotz at [06:20]: "There's one system that can drive cars and it's human beings."
George Hotz at [12:52]: "That's absurd... How does that explain the human?"
George Hotz at [19:10]: "This is a massively diverse set... We have everywhere in the world."
George Hotz at [34:06]: "Tesla and Comma both have businesses where we sell things to consumers at a profit."
George Hotz at [43:41]: "We limit the maximum amount of torque the system is capable of applying to the wheel... It’s on you."
George Hotz at [55:54]: "Tiny Grad is a machine learning framework... it's 100x simpler. The code base is only 5,200 lines."

For more insights and detailed discussions, listeners are encouraged to tune into The Jim Rutt Show and explore Comma AI and Tiny Grad through their respective websites.

Transcript

[00:00]
Jim Rutt
Howdy, this is Jim Rutt. And this is the Jim Rutt Show. Listeners have asked us to provide pointers to some of the resources we talk about on the show. We now have links to books and articles referenced in recent podcasts that are available on our website. We also offer full transcripts. Go to jimruttshow.com, that's jimruttshow.com. Today's guest is George Hotz. George is an interesting fella. He was one of those smart kids selected for the Johns Hopkins Center for Talented Youth. An extremely selective program, was about one in a thousand or something for talented youth. He participated in that as a teenager. And I always like to point out this is just so interesting and curious. Another person that participated in the CTY Johns Hopkins program was Lady Gaga. Who would have known? Lady Gaga could have been a mathematical physicist if she hadn't decided to be a pop singer. In fact, Lady Gaga and George are more or less of the same age. Did you happen to meet her while you were there at CTY?
[01:12]
George Hotz
Not that I can remember. I actually don't know what her real name is.
[01:15]
Jim Rutt
Yeah, I don't either. I'm sure the Google does. Anyway, moving on from his precocious youth, still relatively precocious Youth at age 17, George made a name for himself in hacker circles as the first person to break the carrier lock on the iPhone. I remember reading that story when it happened, I go, good, fuck them assholes, right? You know, I was an Apple II guy because Apple II is very open system, you know, under the Wozniak doctrine essentially, right? But then when they came out with the Mac, I've been anti Apple ever since. You know, they're closed systems, horrible exploiters, you know. So when somebody breaks Apple's lock on something, oh, I always say that's good. So that was George. I remember reading that story when it came out. A few years later, he got in trouble with Sony for hacking the PlayStation 3. What did you do, break the copy protection on the game? Something like that. Whoa, whoa, whoa, whoa, whoa, whoa, whoa.
[02:04]
George Hotz
Allegedly. Maybe.
[02:05]
Jim Rutt
Allegedly. So they claimed, blah, blah. Anyway, settled out of court. So nothing actually happened. Never. Whatever, whatever. But anyways, a pretty smart kid. Anyway, he then went to work for Facebook for a relatively brief period of time. And then as I was tracking down his bio in various bits and places, I saw something else was quite interesting. He got recruited into Google's Project Zero. Most of you probably haven't heard of it, but if you were a hacker, you would probably know it. Back in the day, at least, it was amongst the most elite white hat hackers in the world, and their job was to probe broadly distributed technologies or ones that were in critical pieces of infrastructure and find so-called zero days. So actually why don't you tell us what the hell a zero day is and what you all were doing. No point in me telling it. You know a hell of a lot better than I do.
[02:52]
George Hotz
So a zero day is. Yeah, just an exploit in a piece of software. It's a zero day because it's a previously unknown exploit. I haven't thought about this stuff in a long time. I haven't done security almost for 10 years now. Project Zero kind of led me to AI. I'm thinking like why am I looking for these vulnerabilities myself? How do I write software that looks for vulnerabilities? This turns out to be extremely hard. Yeah, that's been one of the recurring themes in my life. How do I make stuff that can automate any work I'm doing but specifically in that case finding exploits.
[03:24]
Jim Rutt
Yeah, so I always say the world was built by lazy people, people looking. I didn't feel like humping them buckets of water up the fucking hill to irrigate my little field. So let me invent a little paddle wheel thing to send it up the hill. Right. I'm convinced of that. That much of the progress of the world was driven by laziness or at least trying to get non-sweat ways of doing things. And he has various other adventures. And then what we're going to talk about today: he founded a company called Comma where he is currently the president, and Comma is a company that has an open source, believe it or not, self driving car system. And regular listeners to the podcast know self driving cars is an area of interest of mine. We had a full episode on self driving tech back in EP94 with Shahin Farshchi, I think is how he pronounced it. He was a guy who sold his company to Amazon, one of the self driving tech companies, and now a VC. And then in EP124 we had Jim Hackett on, who had just stepped down as CEO of Ford. And we talked about all kinds of things, but we also talked about self driving cars a little bit. So most of what we're going to talk about today is self driving cars. George, why don't you start at the beginning? What in the hell motivated you to start your own open source self driving car software company?
[04:41]
George Hotz
So I met with Elon. I was originally going to do a contract to build software for Tesla that could replace the Mobileye chip which they.
[04:50]
Jim Rutt
Had spent a zillion dollars on. Oh no. Who bought that? Intel bought that for a zillion dollars.
[04:55]
George Hotz
Intel ended up buying Mobileye. Yeah, it went up for a bit, now maybe it's down, but yeah, no. So the contract was to replace the Mobileye chip with software. It was my first encounter with Elon, an absolutely fascina. The contract didn't work out for various reasons. But yeah, I was like, well okay, I'm not gonna do this as a contract. I'll just do this and then I'll sell to the car companies. I'll build an Autopilot clone and sell to the car companies. No Mobileye. The first part actually turned out to be easier than the second part. Building an Autopilot clone took a couple months. Selling it to the car companies is impossible.
[05:26]
Jim Rutt
Explain what Mobileye does, by the way, and why that is or isn't important as a way to solve this problem.
[05:32]
George Hotz
They make chips. These chips run proprietary perception algorithms to do things like perceive lane lines, perceive cars, and they go in cars and they enable a lot of ADAS features. Mobileye, you know, for all the criticisms I've ever dished out to them, they do understand that these things do need to be more sort of end to end. They have this thing called holistic path prediction. It's a computer vision chip for ADAS systems in cars.
[06:02]
Jim Rutt
Okay, sounds good. One of the interesting forks in people's approaches to self driving tech is those who believe you can do it with just cameras and those who believe you need lidar or other kinds of more penetrative sensing systems. Why don't you address that question a little bit? And where do you come down on that?
[06:20]
George Hotz
There's one system that can drive cars and it's human beings. Self driving cars, most of the ones you see like Cruise and Waymo, are really fancy remote control cars. They are not autonomous robots operating in the world, as much as these companies might want you to believe that. The only system that is truly capable of level 5 self driving is a human, and a human does not have lidar. A human has two cameras.
[06:45]
Jim Rutt
That's of course Elon, he's a contrarian to that in that regard. He always denounces lidar vigorously because it's.
[06:52]
George Hotz
Is it still the contrarian viewpoint? It's correct now. Everyone accepts it, really.
[06:56]
Jim Rutt
Waymo still doing their thing, et cetera. We'll get to some numbers in a few minutes that show that Waymo, while in one way impressive in terms of miles driven, is not very impressive anymore. And that probably is also a good point to remind the audience, because it's been a long time since we talked about it, about the six levels of self driving automation, level zero through level five.
[07:19]
George Hotz
So the truth about the levels is they say more about liability than capability. Level two is the highest level where the human is still fully liable for decisions that the car makes. Level two is supervision of the car. I'm actually curious. You probably have a totally different take on this than me. Level two is when the human is always liable. Level three is when the human is liable in certain scenarios. Level four is when the human is not liable in cities or certain areas. And level five, the human is never liable. It says almost nothing about capability.
[07:52]
Jim Rutt
Interesting. Yeah, I guess you could interpret it that way because that's probably why it was written, because a bunch of lawyers wrote it. No doubt. I do have to admit I tend to think of it in terms of what do I as a driver do? You know, back in 2018 and 19 when they all said, oh yeah, full self driving cars are two years away, they were essentially predicting full automation where you could sleep in the back seat while the car drove you to work. And that's level five, full automation, no humans in. And in fact, Google famously, in one of their first prototypes, built it without a steering wheel. Right.
[08:26]
George Hotz
I could do this in a level zero car. I can put a brick on the gas pedal and go to sleep in the backseat. It might not be a smart idea, but I could do it.
[08:32]
Jim Rutt
Okay. I was going to say not wise. Not wise is not what?
[08:35]
George Hotz
Depends what your risk tolerance is.
[08:37]
Jim Rutt
Well, not quacking nuts. Right. Oh, by the way, this is, I'm going to throw this out here even though it comes in later, because this actually turns out to be hugely important. When thinking about self driving cars, people say, well, that ought to be easy to write self driving software because humans, they suck so bad at driving. Well, when you look at the data, they don't actually suck so bad at driving. I looked it up and in most civilized countries you get about one fatality per hundred million driven miles. That's a lot, you know, that's a lot more than the Waymos and Cruises and stuff have logged so far. Right. And 100 million miles is about what I think you guys say you guys have driven, and so, you know, you would have predicted less than one death from Cruise and Waymo and friends, Argo and Uber and those guys when they were still around. And you would predict about one death for you guys if you were at human-level equivalent. And we have zero and you got zero. So anyway, 100 million is pretty damn good. We should not denigrate humans. We're talking about we need to be better than humans.
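
The expected-fatality arithmetic in this turn can be written out directly. The sketch below takes the one-per-100-million-miles rate quoted above and, as an added assumption that is not part of the conversation, treats fatalities as a Poisson process to estimate how likely a human-level driver would be to log zero deaths over a given mileage. The function names are illustrative.

```python
import math


def expected_fatalities(miles: float, miles_per_fatality: float = 100e6) -> float:
    """Expected number of fatalities for a driver at the quoted human rate."""
    return miles / miles_per_fatality


def prob_zero_fatalities(miles: float, miles_per_fatality: float = 100e6) -> float:
    """Chance of zero fatalities, assuming a Poisson process (an assumption)."""
    return math.exp(-expected_fatalities(miles, miles_per_fatality))


for miles in (4e6, 100e6):
    print(
        f"{miles:>13,.0f} miles: expected {expected_fatalities(miles):.2f}, "
        f"P(zero) = {prob_zero_fatalities(miles):.2f}"
    )
```

Under that assumption, a human-level driver would still have roughly a 37% chance of zero fatalities over 100 million miles, so a clean record at that mileage is consistent with human-level safety but does not by itself show better-than-human safety.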
[09:38]
George Hotz
I've heard it's even higher than that. The number I hear is more like 500 million.
[09:42]
Jim Rutt
I looked it up pretty carefully. It seemed to be 100 million.
[09:44]
George Hotz
Interesting. I mean, it depends a lot on what type of miles and what car you're in. I believe your number, it is somewhere in that order of magnitude. But yes, humans are absurdly good drivers. And the simple way that I express this to people is I say, how many times have you driven to work? I don't know, a couple thousand. How many times have you crashed? Zero. Maybe one. And you remember that time? That's a pretty reliable system.
[10:07]
Jim Rutt
Yeah. Or even how many close calls have you had in your life? I've had a few. Right. And I did have one bad wreck. Human stupidity. But you know, overall, yeah, humans are better than the AI guys were saying in 2018 when we had all this. Oh, yeah, this is easy. We can certainly exceed human capacity.
[10:26]
George Hotz
I never said anything like this.
[10:27]
Jim Rutt
I know you didn't, but other people did. Right. Especially again, the Google car with no steering wheel in it. Total hubris, right? In 2018 or 2019.
[10:36]
George Hotz
I've heard self driving is demo complete. You can build any arbitrary demo and still be arbitrarily far away from solving the problem.
[10:41]
Jim Rutt
All right, so talk to me a little bit about how you guys got started and what did you do first and tell the story.
[10:47]
George Hotz
So the first basic idea is I'm going to get a camera and I'm going to have the camera predict the angle the steering wheel should be at. I'm just going to do straight up supervised learning. F of X equals Y. X is the image, Y is the steering angle. Should work, right? This turns out not to work, and it's upsetting why this turns out not to work. You can get a great training set and test set, do all your classic machine learning, do it beautifully. You know, iid, everything's great. You get a really low loss on your test set and then you put it out on the road and it doesn't drive at all. It can't even go straight on the highway. It'll drift out of lane. And the reason for this is because even at test time, your model is not acting in the world. The video that's being shown to you is all video from the human policy, not from the machine policy. So I go out and gather a whole lot of data, me driving, and then I want to learn a model to drive like me. All of that data that's collected was driven with the human policy, meaning my policy. The machine policy, even though I'm approximating the human policy and getting as close as possible, there's always going to be some epsilon error. Normally the epsilon errors are no problem if your samples are truly iid, but your samples are not iid. They're temporal. Independent and identically distributed means that every sample is independent of every other sample, that your actions at time T will not affect the data at time T+1. And this is true even if you have a complete holdout test set that was driven with the human policy. But as soon as you put the machine in the loop, this is no longer true, because the action it took at time T affects the input data at time T+1. And it's that dependence that makes the problem of self driving cars so, so difficult.
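
The gap between a low test loss and a car that drifts out of its lane can be illustrated with a toy one-dimensional model. This is a minimal sketch under simplifying assumptions that are not from the conversation: the car's lateral offset is a single number, the learned policy is wrong by a constant `eps` on every step, and nothing pushes the car back toward the center. All names are illustrative.

```python
def closed_loop_offsets(eps: float, steps: int) -> list[float]:
    """Lateral offset over time when the policy drives the car itself.

    Each action is off by `eps`, and because the action at time t changes
    the input at time t+1, the errors add up instead of averaging out.
    """
    offset = 0.0
    trace = []
    for _ in range(steps):
        offset += eps  # the error at time t shifts the state seen at t+1
        trace.append(offset)
    return trace


eps = 0.01  # per-step error; on a held-out human-driven test set this looks tiny
trace = closed_loop_offsets(eps, steps=500)
print(f"per-step error {eps}, offset after {len(trace)} steps: {trace[-1]:.2f}")
```

On human-driven test data every sample is scored separately, so the loss stays near `eps`. In closed loop the same error is integrated, and after 500 steps the offset has grown to about 500 times `eps`.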
[12:33]
Jim Rutt
And the other one we often hear about, and Gary Marcus, an AI guru and a headstrong contrarian who says self driving cars would never work, often talks about this, is that there's just a zillion corner cases and that no practical learning set could ever capture all the corner cases. What do you say to Gary?
[12:53]
George Hotz
That's absurd. There's a lot of criticisms of self driving cars, but it's definitely not that one. You talk about some corner case, how often does that corner case happen? Oh, it happens once in every 10,000 miles. Okay, I got 100 million mile data set, I got 10,000 examples of that. Also, how does that explain the human? Right, the human, the average human has seen so much less data than the data we actually train our system on today. So it's not corner cases that cause the problem.
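
The arithmetic in this answer can be checked directly. The sketch below only restates the two figures Hotz gives (a 100-million-mile dataset and a corner case that occurs once every 10,000 miles); the function name is an illustrative assumption.

```python
def expected_examples(dataset_miles: float, miles_per_event: float) -> float:
    """Expected number of times an event appears in a dataset of a given size."""
    if miles_per_event <= 0:
        raise ValueError("miles_per_event must be positive")
    return dataset_miles / miles_per_event


# Once per 10,000 miles, in a 100-million-mile dataset.
print(expected_examples(100_000_000, 10_000))  # 10000.0
```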
[13:16]
Jim Rutt
Well, that's Gary's point actually, because he would argue humans are general intelligences, right?
[13:21]
George Hotz
They are meaningless. Meaningless term, completely meaningless.
[13:24]
Jim Rutt
They're GIs. These things are narrow AIs, and so they can only do what they're trained to do. They don't generalize the way humans do. We don't even quite know how humans generalize. But you know the weird case of the person who was run over by Uber, I think it was, you know, who was taking a bicycle with bags hanging on it across the street at night and between two cars, okay, we trained on this, this, this, this and this. But this particular combination of things which a human trivially says, oh yeah, that's a person with a bunch of bags hanging from their car coming out between Two cars. It doesn't really have a world model at the level that humans do. And the ability to integrate lots of clues and come up with an integrated solution more or less instantaneously.
[14:09]
George Hotz
Well, that's definitely true. I mean, you can go into the specifics of the Uber accident and it looks much more like a bug in classical software than any failing of AI. I believe it classified the object as unknown. It didn't know what to do. It's been a while since I've read about it, but it looks nothing like the failing of deep learning. It's straight up the failing of your if statement mumbo jumbo. I mean, yeah, Uber was way statistically below where they should have been with an accident. I think that accident happened at 4 million miles, you know, but yeah, so that's not a failure of deep learning. And I also don't believe that there's any such thing as general intelligence. When you talk about a world model, there's definitely a real meaning to that. And it's true that, again, it depends exactly what you mean by world model. But to have an integrated world model like the one that humans have, that is capable of predicting the way scenarios can play out in complex ways, that is the absolute cutting edge of machine learning today and has not been deployed in any self driving cars.
[15:01]
Jim Rutt
So let's get back to where you started from. You hook up a camera that calculates the steering wheel angle. Doesn't work. What do you do then?
[15:08]
George Hotz
So it almost works. It almost works. It's off by epsilon and these epsilons accumulate over time. So all I need to do is add a small amount of corrective pressure and. Okay, fine. I train a quick algorithm to detect the two lane lines, take the center of the lane lines and compute a corrective pressure based on how far I am off from the center. So now it's 90% the machine learning algorithm, but 10% this correction pressure and.
[15:33]
Jim Rutt
This fixes it, so long as you have nice visible line markers.
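
The early fix described here, a learned steering output blended with a lane-center correction, might be sketched as follows. The 90/10 weighting comes from the conversation; the sign convention (positive means right), the gain, and every function name are assumptions made for illustration.

```python
def lane_center_error(left_line_x: float, right_line_x: float, car_x: float = 0.0) -> float:
    """Distance from the car to the midpoint of the two detected lane lines.

    Positive means the lane center lies to the right of the car.
    """
    center = (left_line_x + right_line_x) / 2.0
    return center - car_x


def blended_steer(
    model_steer: float,
    left_line_x: float,
    right_line_x: float,
    gain: float = 0.5,
    model_weight: float = 0.9,
) -> float:
    """Mix the learned steering output with a small push toward lane center."""
    correction = gain * lane_center_error(left_line_x, right_line_x)
    return model_weight * model_steer + (1.0 - model_weight) * correction


# The model says "go straight", but the car sits 0.5 units left of center,
# so the blended command steers slightly right.
print(blended_steer(0.0, left_line_x=-1.5, right_line_x=2.5))
```

The correction term only works when both lane lines can be detected, which is the limitation Rutt raises.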
[15:37]
George Hotz
So I talk about lane lines as the original sin of Comma. I was terribly upset that we had to include them because I really wanted to make an end to end solution for driving. There is no definition of what a lane line is. There is no physics based definition of a lane line. And when we started hand labeling the pictures, you quickly realize that there's pictures where 50% of humans believe something's a lane line and 50% of humans believe it's not. And wherever your calibration is on that, you will always find those pictures, because there is no physics based definition of a lane line. So we had to figure out how to take them out. It took many years, but we did end up removing lane lines.
[16:09]
Jim Rutt
Interesting. Yeah. Here I live deep in the country and the main road that comes down from the state road goes up, it's got a center line only about halfway. And then it stops for no good reason and there's no center line the last two miles till we get to our turnoff. When I drive that, I go, this would probably give Tesla's Autopilot fair fits.
[16:28]
George Hotz
Try ours now. It'll do great.
[16:29]
Jim Rutt
Interesting. And so you basically understand where the edge of the road is instead.
[16:34]
George Hotz
Nope, definitely not. So you don't do anything like that.
[16:36]
Jim Rutt
Okay, talk a little bit about how do you stay in the right part of the road. Because it's a good case. It's a road that's illegally narrow, under modern standards, in one place. It's got a 30 foot cliff and a fall into a river. It's got turkey house feed trucks and logging trucks on it. So it's pretty ugly. You better be over in the right part of your road, especially going around some of those blind corners.
[16:59]
George Hotz
So, yeah, I mean, the more complex stuff is still, you know, we do have a level two system. But our new model asks the question, given this road, where would a human drive the car? And that's the whole question. So you ask, where is it going to go in that road? Well, it's going to see training data that looks kind of like that road. When the human was in control, where did they put the car? And then once we know where the human put the car, we can actually put the car there. But it's really hard to deal with that problem that I was talking about. We refer to it as behavioral cloning. That may not quite be the industry name for it, but it happens because the error accumulates over time. So one way to fix this is to train in simulation. If you train in simulation, the training data that I'm showing is no longer data that's driven with the human policy, as it would be in a supervised learning scenario. But it's data driven with its own policy. The simulator uses the policy that the model has learned to roll out a scenario. And then at the end of the scenario says, okay, how much did I diverge from where the human went? And then it knows when it's over here. And if the human was here, it should have been here. And then we can back prop through that and it can learn how to correct itself. It can learn corrective pressure over time so it will learn to converge. We have this test called the hugging test where we use a straight up classical unreal engine simulator and we initialize the car in different places in the highway lane and we see how long it takes after we let the model go for it to come back to the center. That's called the corrective pressure in the model. But if you don't train in simulation, if you train just as a supervised learning, behaviorally cloned problem, you're going to have no corrective pressure.
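
The idea behind the hugging test, starting the car off-center and timing how long the policy takes to come back, can be shown with a toy version. Comma's test runs in an Unreal Engine simulator according to Hotz; the sketch below is only a one-dimensional stand-in, and the two policies, the tolerance, and the function name are all hypothetical.

```python
from typing import Callable, Optional


def steps_to_center(
    policy: Callable[[float], float],
    start_offset: float,
    tol: float = 0.05,
    max_steps: int = 1000,
) -> Optional[int]:
    """Count steps until the car is within `tol` of the lane center.

    `policy` maps the current lateral offset to a lateral move.
    Returns None if the car never gets back within `max_steps`.
    """
    offset = start_offset
    for step in range(max_steps):
        if abs(offset) < tol:
            return step
        offset += policy(offset)
    return None


def corrective(offset: float) -> float:
    """A policy that has learned corrective pressure (toy: 10% per step)."""
    return -0.1 * offset


def no_pressure(offset: float) -> float:
    """A purely cloned policy that never saw itself off-center."""
    return 0.0


for start in (-1.0, 0.5, 1.0):
    print(start, steps_to_center(corrective, start), steps_to_center(no_pressure, start))
```

A policy with corrective pressure returns in a finite number of steps from every starting point, while the policy without it never recovers, which is the failure Hotz attributes to plain supervised behavioral cloning.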
[18:43]
Jim Rutt
Yeah, I did go out and talk to the guys at Waymo years ago and they talked me through how much they depended upon simulators. They were thousands to one of simulator to real miles. How about you? Two questions. One, where did you get your human data from? Just some dude, right? You hook it up to your own car and start, I suppose, taking the data. And then how did you think about the evolution of actual driving data with simulation data and how those two things inform each other?
[19:11]
George Hotz
So first, when it comes to driving data, most people aren't aware of this. We have the second largest driving data set in the world after Tesla. We have 10,000 weekly active users all uploading data to us. This is a massively diverse set. We have tens of millions of miles of it and it's again not just in quantity, but in diversity. Waymo has all the same streets in Scottsdale, Arizona or wherever, three cities they're in now. We have everywhere in the world. You can ship these devices anywhere. So we have a huge, diverse, complex data set. And then our simulator is a bit different from Waymo's simulator. We didn't hand code it and it doesn't use a game engine. We call it the small offset simulator. It's reprojective. So you can take a human video and then you can apply small perturbations geometrically. If you know the depth of every pixel, you can reproject into a 3D world and it can make it seem like instead of driving here, you drove over here. So our simulator is not fully flexible. The problem with a fully flexible game engine simulator, one problem is what policy do you use for the other cars? How do the other cars drive? Sounds like you need to solve self driving cars in order to solve that problem. We solved that problem by just using what the cars really did in reality. Now there are some caveats with this, but at least for solving the behavioral cloning convergence problem, this works great.
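
The reprojection step described here can be sketched for a single pixel with a pinhole camera model. This is an illustration of the geometry only, not Comma's simulator: the pinhole model, the focal length, the principal point, and the function name are all assumptions.

```python
def shift_pixel_column(
    u: float,
    depth_m: float,
    lateral_shift_m: float,
    focal_px: float = 1000.0,
    center_px: float = 640.0,
) -> float:
    """Image column of a point after the camera moves sideways.

    u               original image column of the point, in pixels
    depth_m         distance of the point ahead of the camera, in meters
    lateral_shift_m how far the virtual camera moves to the right, in meters
    """
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    x_m = (u - center_px) * depth_m / focal_px  # back-project to a 3D lateral position
    x_shifted = x_m - lateral_shift_m  # camera moves right, so the world moves left
    return focal_px * x_shifted / depth_m + center_px  # project back into the image


# Moving the camera 0.5 m to the right: a point 10 m away shifts far more
# in the image than a point 100 m away.
print(shift_pixel_column(700.0, depth_m=10.0, lateral_shift_m=0.5))
print(shift_pixel_column(700.0, depth_m=100.0, lateral_shift_m=0.5))
```

Because the shift in the image depends on depth, this kind of perturbation needs a depth estimate for every pixel, which matches the condition Hotz states.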
[20:29]
Jim Rutt
I suppose you could perturb the other cars too, right? You could add noise to the trajectories of the other cars.
[20:35]
George Hotz
You could. And that starts to get very fancy, right? Because now I have to know where the other car is. I have to know how to move it. I have to fill in what the pixels should have been. Modern ML can do it, but again, very complex. So we call this whole simulator second paradigm, and we're moving to third paradigm now, which is even more generic. And I can go into that later, but that stuff doesn't work yet. This is pretty much what we're using today.
[20:56]
Jim Rutt
All right, well, let's now get down to the more tangible. For folks that are wondering, how do we do this at home? Right? As I understand, you've got 275 cars that you support to one degree or the other, though I did look them up, and none of the three we have do the full thing. My 2017 Jeep Grand Cherokee has adaptive cruise control available. My wife's 2019 Outback, nothing. And my 2016 Tacoma, nothing. So I'm out of luck. But there's a long list of cars that you guys can work with. Tell us, how does someone hook up your stuff on one of these 275 cars?
[21:29]
George Hotz
So, first off, people think that it is some kind of, oh, I'm going to have to put a motor on the steering wheel. It's nothing like this. Most new cars, in fact almost all new cars shipping today, have a camera mounted right behind the rearview mirror. And there's one plug that connects to it. All you have to do to install the comma is unplug that plug. We have a Y splitter: plug in there, plug in there, plug it into our device. That's it. Takes about 15 minutes. It's completely electrical. And also, it's not hacking. People think this is hacking the car. It's not. It's just looking at the messages the camera is sending and saying, you don't actually want to do that. Here's a better message. And it sends the better message along to the steering system, the braking system, etc.
[22:09]
Jim Rutt
So is that camera that's already in the car, is that sending messages to do things like emergency braking and things like that?
[22:16]
George Hotz
So we selectively block and don't block some of them. The emergency braking, by default, we don't disable. If you mess with it, you can disable it. But if you don't mess with it, if you just stock install comma, we don't disable the emergency braking. We pass all of them through. The messages we will change are the lane keep assist messages. Many cars don't have a lane centering option. They have something that looks a lot more like, if you get near a lane line, it'll put torque on the wheel. That's just stupid. We'll just put torque on the wheel to keep you in the center of the lane. We'll put torque on the wheel to not just keep you in the center of the lane, but to drive on your unmarked, no-centerline road and put it where a human would put it, to put the same torque on the wheel that a human would put in the same situation.
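The pass-through behavior described here can be sketched as a small message filter. Everything in this sketch is hypothetical: the message kinds, the `Message` type, the torque units and the `MAX_TORQUE` limit are made up for illustration and are not openpilot's or any car's actual interface.

```python
from dataclasses import dataclass

# Hypothetical message kinds; real cars use manufacturer-specific messages.
LANE_KEEP_ASSIST = "lane_keep_assist"
EMERGENCY_BRAKE = "emergency_brake"

MAX_TORQUE = 1.0  # hypothetical torque limit, arbitrary units


@dataclass(frozen=True)
class Message:
    kind: str
    value: float


def clamp(value, limit):
    """Restrict value to the range [-limit, limit]."""
    return max(-limit, min(limit, value))


def filter_message(msg, desired_torque):
    """Replace lane keep assist commands; pass everything else through unchanged."""
    if msg.kind == LANE_KEEP_ASSIST:
        return Message(LANE_KEEP_ASSIST, clamp(desired_torque, MAX_TORQUE))
    return msg


stock = [Message(EMERGENCY_BRAKE, 1.0), Message(LANE_KEEP_ASSIST, 0.0)]
print([filter_message(m, desired_torque=0.3) for m in stock])
```

The clamp reflects the torque limit mentioned elsewhere in the conversation: whatever the model asks for, the command sent onward stays within a fixed bound.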
[22:58]
Jim Rutt
And this is all from two cameras that, as you say, emulate the human. Two eyes that go right behind your rear view mirror, essentially.
[23:05]
George Hotz
Yep. It's a little box. You can buy it for $1,250.
[23:08]
Jim Rutt
You guys sell that. It's your comma 3, right?
[23:11]
George Hotz
This is the 3X. It's the same thing as the 3.
[23:13]
Jim Rutt
So now take us through what your system will do. Actually, give us an example of a popular, relatively inexpensive car this thing would work with.
[23:21]
George Hotz
Toyota Corolla.
[23:22]
Jim Rutt
Toyota Corolla, perfect. I also saw another one. My favorite I recommend to people all the time is the RAV4. Is another nice little RAV4.
[23:28]
George Hotz
Yeah, the Toyotas are great.
[23:30]
Jim Rutt
Great little functional car. If you just need a car to do random shit, it's a good one. So you plug it in. How long does it take for someone to set it up and then what will it do for them?
[23:38]
George Hotz
Takes about 15 minutes, 30 if you're careful, 5 if you rush through everything. There's plenty of videos of people online installing these things. It is way less hard than people think. People are intimidated by this. If you can set up a piece of Ikea furniture, it's easier than that. So what it does right now, think about whatever your driver assistance system is on your car. Think about how long you can go on the highway without touching anything.
[24:03]
Jim Rutt
My car don't have any driver's assistance, so I don't have to worry about it. Right.
[24:06]
George Hotz
Even most of the modern ones, it's 10 seconds, maybe a minute.
[24:10]
Jim Rutt
My wife's Outback's got. Which is annoying. It slows down when it's in cruise control. Someone's in front of you. I like the old style where it starts closing in on the guy, makes him speed up or get the hell out of the way or reminds you it's time to pass. Because sometimes you space out. You want to be doing 70, you're only doing 62 because damn adaptive cruise control slowed it down.
[24:30]
George Hotz
If your Subaru has that, I'm sure it works okay. That's adaptive cruise control. Comma helps a lot more with the lateral stuff than the longitudinal stuff. We do the longitudinal as well. But the real difference, and this is a super hard thing to convey except for the fact that it's so simple when you see it: you can go on the highway, press the cruise control button and sit back, and it will not just sometimes, but usually, drive for an hour without you having to do anything.
[24:58]
Jim Rutt
That's on the highway or not?
[25:00]
George Hotz
Interstate highways.
[25:01]
Jim Rutt
Yeah, interstate highway. So this is interstate highway only.
[25:03]
George Hotz
So it works around town as well. You're not going to get an hour. We have a mode, experimental mode, which will stop at stop signs, stop at lights. It's a little worse than Tesla FSD, but this stuff does work. And again, you can look: we released a drive at the end of last year where we went from downtown San Diego to a Taco Bell in the suburbs without a disengagement, stopping at red lights, stop signs, 90 degree turns, highway interchanges. It can do all these things. These things just turn out to be a lot less useful in day to day usage. The main feature, the one that you not just want but will be very upset if you ever don't have, is hours on the highway without touching it.
[25:45]
Jim Rutt
That makes sense as a figure of merit. Another data question I have. I know in the old days, and I don't know if they still do, Waymo invested a shitload in super high resolution mapping. Do you guys use maps? And if so, whose?
[25:58]
George Hotz
No, worthless. We have an experimental mode called Navigate on openpilot which will use Mapbox maps to navigate. But it's the same exact maps that a human uses. My general philosophy about all AI stuff is, you don't need special stuff for the computers, just look at what humans use. So yeah, humans use a nav system, but humans use a normal standard definition map. And this turns out to work totally fine for self driving cars too. Humans don't do things with centimeters of precision on a global scale. This is absurd. If you're trying to localize yourself within centimeters on a map, that's just such a non-robust system. You've built this super fragile thing. Oh, I gotta get the decimal. I can't even use float32, I have to use float64 for my ECEF coordinates. And this is not how humans drive cars.
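The float32 remark can be checked with a little arithmetic. ECEF (Earth-centered, Earth-fixed) coordinates are measured in meters from the center of the Earth, so values near the surface are around 6.4 million. The sketch below uses only the Python standard library; the helper names are made up for illustration, and `math.ulp` needs Python 3.9 or later.

```python
import math
import struct

EARTH_RADIUS_M = 6_378_137.0  # WGS-84 equatorial radius, a typical ECEF magnitude


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_spacing(x):
    """Gap between adjacent float32 values near a positive, normal-range x."""
    _, exponent = math.frexp(x)   # x = m * 2**exponent with 0.5 <= m < 1
    return 2.0 ** (exponent - 24)  # float32 has a 24-bit significand


print(float32_spacing(EARTH_RADIUS_M))   # resolution of float32, in meters
print(math.ulp(EARTH_RADIUS_M))          # resolution of float64, in meters

# A one-centimeter move is invisible in float32 at this magnitude.
print(to_float32(EARTH_RADIUS_M + 0.01) == to_float32(EARTH_RADIUS_M))
```

At this magnitude float32 can only resolve about half a meter, so centimeter-level global localization forces double precision, which is the fragility being described.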
[26:46]
Jim Rutt
Now do you know if that's still how Waymo is doing things, using these high precision maps that they spent a shitload of money creating?
[26:52]
George Hotz
Yeah. So Waymo is level four, not level five, meaning they operate in defined regions that they can carefully map. And it's a very different approach to driving from how a human drives. Our approach to driving is much more like how a human drives. But to criticize the approach, that's not even what I criticize about Waymo. And people say I criticize lidar. That's not true. I criticize the unit economics of the Waymo. I think that the thing they are building is a $500,000 robotaxi. Oh, they'll come down over time. Yeah, maybe. But you know what will definitely be cheap over time? A cell phone. How do you make a cell phone drive a car? Why can't it drive a car?
[27:32]
Jim Rutt
Presumably you could tape two cell phones together and perhaps drive a car.
[27:36]
George Hotz
On one extreme of this, right? On one extreme of what Waymo's building is, they've built a train. They've built a train with virtual rails. And yeah, you can build trains with virtual rails. And you can get into all the economic reasons why building trains doesn't make that much sense.
[27:49]
Jim Rutt
Thinking back in the history of self driving cars, you know, getting back to famous promises, level five in two years and all that stuff. But then you had various startups and you had Waymo, which decided to do its first trials in Mesa, Arizona, a place famous for its nice square streets, level, and perfect weather. And we spend a fair bit of time in Pittsburgh for family reasons. And there were Ubers mostly, and I think some Argos too, running around Pittsburgh. And that's a, well, much harder place, because the roads are triangular and they're shitty and they're old and constantly being dug up for sewer work. Weather sucks, all these bridges across these ravines and all this sort of stuff. And maybe they were trying to climb too high a hill, the guys that were doing their prototyping in Pittsburgh versus Mesa.
[28:39]
George Hotz
The whole thing never made any sense to me. What these things are, are not self driving cars. They're trackless monorails. And again, when you start to view it through that lens, it becomes much more of an economics question. Well, why don't we just replace all the streets with a normal monorail?
[28:58]
Jim Rutt
And of course, as you know, there were proposals back in the maybe 2010 time of putting a smart telemetry in all the roads, which would have cost like $5 trillion or something.
[29:09]
George Hotz
I don't know, I mean this is just what's like baffling to me. It seems that these people are very sort of out of touch with the real world. The government like won't even fix a broken stop sign. You know, like you think they're gonna, oh, we're gonna install this smart telemetry. I mean it's a, it's a scam.
[29:23]
Jim Rutt
The other thing you alluded to in passing, which I'd like to dig into in a little more detail: you suggested that the Waymos, Cruises, et cetera are still using remote control driving more than they like to let on. What do you know about that?
[29:36]
George Hotz
I mean, Cruise admits it. Cruise admits this way more than Waymo. These cars have multiple operators for each car. They took the driver out of the car, gave them a title where they get paid twice as much and made two of them. Again, it's all based on this premise that eventually the AI is going to come and eventually it's going to become economical because, well, okay, right now every five minutes a human has to intervene. And they're not RC, right? There's not a human with like a gas pedal and a wheel. It's a point and click intervention probably, but the decisions are fundamentally still being made by a human in a call center somewhere. Another way you can know this is you can just Google it: all the Cruises stop when the cell phone network goes down. They just stop. Again, I see a system like that and I'm like, you're building something that's so fragile, it's so centralized, it's so antithetical to everything I want to see about technology. Your comma doesn't need an Internet connection. It just runs a little model on the device. It's like, the AI that impresses me? Build AI that can do what an ant can do. We're not even close. An ant can self replicate, an ant can survive in new environments. And you have the pinnacle of these self driving car things that are the most fragile, dependent on the heights of civilization. And the minute anything goes down, oh well, it doesn't work anymore.
[30:55]
Jim Rutt
Yeah, sorry about that, guys. Right, yeah. Because I suppose at one level you could say there's an economic curve. You know, let's say your Waymo or Cruise is getting better and better, and it was once every five minutes the remote control driver has to intervene. Then it's 20 minutes, then it's an hour, then it's 30 hours. The economics start to work perhaps when you get out to, you know, once an hour or something like that, even though it is still a crazy system at some level. And to your point, it's still very dependent on infrastructure that you'd not like to be dependent on. Do you think that's more or less their play: to ignore all the infrastructure problems and gradually improve until they can tolerate the fact that they have to intervene once an hour, and that's economical?
[31:38]
George Hotz
One of the most hilarious things I see in the projections of all these self driving car companies is they keep the cost of transportation fixed and they assume that they are going to be the sole winner and they are going to be able to eat all those margins. There is no way this is going to be true, for two reasons. How many years ahead is Waymo of us? Let's say Waymo starts to get this down to the point where it's economical. If we get it to work, even economical for them might mean, oh, the car only costs $50,000, but if I'm doing it with a $500 cell phone, you're not competing with the Uber driver of the past, you're competing with me. I will win. Over a long time horizon, I will win. Waymo may have a very, very short window to try to recapture any value. They're assuming a static world, which is just completely not true. I think also, even if Waymo style approaches win, it's not going to be Waymo alone who solves the problem. It's going to be like 10 companies who solve it pretty much all at the same time. And then you have a market that doesn't even look like Uber. It looks like the scooter market. It looks like Lime and Bird and all these companies where you just basically pump these things out and it's a total race to the bottom and everybody loses.
[32:55]
Jim Rutt
And everybody goes broke, right? Yeah, I was watching those things proliferate a couple years ago and said, ah, this reminds me of the famous 1982 debacle when 104 companies all introduced five-and-a-quarter-inch Winchester hard drives at the same time, and four of them survived, and then two, and then one, and that was the way it went. Now the one we haven't talked about, and this is much more analogous to what you're doing, is Tesla. Why don't you compare and contrast your approach with Tesla's approach?
[33:26]
George Hotz
To quickly compare Tesla and Waymo: Tesla has positive unit economics and Waymo has hilariously negative unit economics. So regardless of whether Tesla succeeds at self driving or not, they are selling cars today and making a profit today. So when you compare and contrast us and Tesla, we're doing the same thing. We are selling boxes today and making a profit today, not quite at the same scale as Tesla, but that's very important to me. I don't believe in hockey stick growth. I don't believe in magical inflection points. I believe that slowly, over time, you build value. And you can do this in such a way that you're profitable mostly along the way. Obviously at some point you go under. But this idea of, it's all going to pay itself back in 20 years, makes no sense.
[34:06]
Jim Rutt
As we would say, pie in the sky when we die.
[34:10]
George Hotz
Tesla and Comma both have businesses where we sell things to consumers at a profit. Now, our autonomy approaches differ. You can see the differences when you read Reddit posts that compare Autopilot and openpilot. Tesla views driving much more as a physics problem and much more from a modernist perspective. They're talking now about end to end. But even their end to end stuff still looks a lot like rigid maneuvers, and thinking about what cars are. Look, they display their cars in a virtual 3D display. They localize every car. We don't do anything like that. We just say, where does a human drive the car? When does a human hit the brakes? So yeah, we have a much more holistic approach: just tell me the action, don't tell me the state. I don't care what you know about the state.
[34:58]
Jim Rutt
You're using the human as the model. Right? What would a human know in this situation? And let's emulate as closely as we can what the human would do.
[35:06]
George Hotz
And humans don't have little cars with bounding boxes in their head, especially the ones on the other side of the dividing line of the highway, but your Tesla does.
[35:13]
Jim Rutt
Ah, this might be a way to get at the difference. Guesstimate the ratio of CPU power that Tesla applies to the real time problem compared to what your comma 3X box applies to the problem. Guesstimate, you know, good faith estimate. [Hotz: It's about 100x.] 100x. Oh, only 100x. Well, that's interesting, but still, 100x is big. Two orders of magnitude. Yeah, I know they got some honking big computers and some specialized silicon, all kinds of stuff.
[35:37]
George Hotz
We're spending about 3 watts, they're maybe spending about 60. They're doing uint8, so they have like another 5x there. So it works out to about 100x on both training and testing. So we'll train on like 40 GPUs, they'll train on 4000.
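The 100x figure follows from the numbers quoted in this answer, and the arithmetic can be written out directly. The variable names are made up for illustration; the values (3 watts, 60 watts, a further 5x attributed to uint8, 40 and 4000 GPUs) are the rough estimates given in the conversation, not measured data.

```python
# On-device (inference) comparison, using the quoted estimates.
comma_power_w = 3.0
tesla_power_w = 60.0
uint8_factor = 5.0          # the extra factor attributed to uint8 arithmetic

power_ratio = tesla_power_w / comma_power_w       # 20x from power alone
inference_ratio = power_ratio * uint8_factor      # combined on-device ratio

# Training comparison, using the quoted GPU counts.
comma_gpus = 40
tesla_gpus = 4000
training_ratio = tesla_gpus / comma_gpus

print(power_ratio, inference_ratio, training_ratio)
```

Both ratios come out the same, which is the sense in which the gap is described as about 100x on both training and testing.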
[35:52]
Jim Rutt
And they are a bit ahead of you functionally, right? What would you say is the gap? Where are they ahead of you, and where are you ahead of them, if you are ahead of them?
[35:59]
George Hotz
Well, it's tricky. Tesla certainly has more capability than us, if you're asking the question like, if you're trying to drive from point A to point B without a disengagement. There's many things that comma just will never do and Tesla may do with some percent chance. But I think they're behind us considerably in usability. And again, Reddit reflects this. You put a Tesla on the highway and every once in a while it makes some really sketchy mistakes. It'll phantom brake. Not just on the highway, but if you're going through an intersection, it'll mistrack the lane. It'll put you one lane over instead of this lane as the continuation. And then it applies a lot of torque to the wheel. It's such a jarring experience as a user. Our torque limit, the amount of torque that we can apply to the wheel, is way lower than the amount of torque Tesla can apply to the wheel. And we have a saying that smooth driving is safe driving. So as far as practical day to day usability, I do now say we're ahead of Tesla. As far as high end capabilities, yeah, Tesla is multiple years ahead of us.
[37:06]
Jim Rutt
I also did see some of these YouTube side by sides, and as you guys say, you're making driving chill, right? And Tesla, apparently, is not chill. I've never messed with a self driving car, so I don't personally have any hands on experience, but apparently that's so. So where are they ahead? Give me an example of where they're clearly further ahead of you.
[37:26]
George Hotz
I would say that comma's slogan is make driving chill and Tesla's slogan is look at this crazy feature. So we have rudimentary stuff now shipped where you can put in a destination and it will navigate there. But it looks a lot more like the earliest versions of FSD than the versions of FSD that are out now. The versions of FSD that are out now may not be comfortable and may not be particularly good driving, but they are very capable. A Tesla can make a right turn at a light: it can go get in the right turn lane, turn the blinker on, wait appropriately, make a turn. It's rigid, but it can do that. And when they get overwhelmed, they behave very differently. When the Tesla gets overwhelmed, it freaks out, it'll jerk the wheel, it'll slam on the brakes. When the comma gets overwhelmed, it'll just get a little bit more shaky and unsure. We're putting the neural net policy a lot more in than Tesla is. Even Tesla's new end to end thing, they're still using the same planner. They just moved the planner off of the car and onto the back end. And it's a very rigid MPC, cost-based planner. MPC stands for model predictive control. So you can put in a list of costs and then optimize a trajectory given those costs. But you can have things that look very snappy with that kind of thing, right? You might have two local minima of the function and it can snap to either one, even if one's only a little higher than the other. Whereas ours looks a lot more like the failure modes of neural networks, which look a lot more human-like than the failure modes of a powerful optimizer.
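The snapping behavior of a cost-based optimizer with two local minima can be shown with a toy example. This is a sketch under stated assumptions, not Tesla's planner and not a full model predictive controller: it optimizes a single lateral offset over a grid, and the double-well cost function and the `bias` term are made up purely to create two nearly equal minima.

```python
def cost(offset_m, bias):
    """Made-up cost with two wells, near -1 m and +1 m.

    bias tilts the function slightly, making one well a little lower.
    """
    return (offset_m ** 2 - 1.0) ** 2 + bias * offset_m


def best_offset(bias, steps=401):
    """Pick the lowest-cost lateral offset on a grid from -2 m to +2 m."""
    candidates = [-2.0 + 4.0 * i / (steps - 1) for i in range(steps)]
    return min(candidates, key=lambda offset_m: cost(offset_m, bias))


# A tiny change in the cost terms flips the chosen target by about two meters.
print(best_offset(bias=0.01))
print(best_offset(bias=-0.01))
```

Because the optimizer always commits to the global minimum, a small change in the inputs produces a large, sudden change in the output, which is the kind of jump being contrasted with a learned policy that degrades more gradually.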
[39:02]
Jim Rutt
On the other hand, the similarities between you and Tesla is that they do most of their processing locally. You do all your processing locally pretty much. Right.
[39:12]
George Hotz
Tesla does all the back end, trains the model.
[39:14]
Jim Rutt
Oh, I see.
[39:14]
George Hotz
Okay, so both us and Tesla have data centers that train the model, but then once the model is uploaded to your car, everything about it is local.
[39:23]
Jim Rutt
Got it. Okay.
[39:24]
George Hotz
Tesla doesn't have the unit economics. You think there's a guy helping you out with Autopilot? No way. It's all on the car. It's all software.
[39:30]
Jim Rutt
I was like, maybe the big, big computer might have helped you out once in a while or something.
[39:33]
George Hotz
You know, it becomes very hard to make something like that robust, especially for something like us and Tesla, which can operate anywhere. If you're Waymo, you can bribe the city of Scottsdale to install a new cell phone tower.
[39:43]
Jim Rutt
That is actually a big distinction, that you guys are still on the track for "it works anywhere," which was the original story back in the mid-2010s.
[39:53]
George Hotz
I don't want to solve self driving. Self driving is a stepping stone. I want to solve life. I want to build artificial life, silicon stack life. A car is just a, you know, it's another form of life.
[40:04]
Jim Rutt
I've always loved self driving cars because it is narrow AI, right? You know, the thing that will drive your car will not also make you a sandwich, but it's a really big, narrow piece of AI and we're bound to learn a bunch of cool things from solving that problem. And then we can apply those to what comes next. Some of the data I dug up: as of December 2023, Waymo had driven theirs in no-driver mode only 7 million miles. I was surprised. A relatively small amount. Guesstimates of engaged Autopilot: 3.3 billion miles. You guys, something north of 100 million.
[40:43]
George Hotz
Yeah, we're 10x bigger than Waymo and Tesla is 30x bigger than us.
[40:48]
Jim Rutt
And I think when I reached out to you, I said, you guys probably ought to be more well known than you are. There ain't a lot about you out there on the Internet.
[40:54]
George Hotz
It doesn't help.
[40:55]
Jim Rutt
What doesn't help?
[40:56]
George Hotz
People knowing about us.
[40:58]
Jim Rutt
Well, how are they going to buy your stuff they don't know about you? We used to say. Well, I won't say what I was going to say, which was very politically incorrect.
[41:04]
George Hotz
But I mean we're trying to be a profitable company but we don't do marketing. We're thinking about it now. But fundamentally our mission is to solve self driving cars. And contrary to what people believe, we are not at all limited by data. We only train on about 5% of the data we have and the only reason for this is diminishing returns once you train on more. And we can iterate faster if we train on less of the data. So we're not data limited, we're not money limited. No one's money limited today. No one who like has good ideas is usually limited by money. I can raise all the money I want if I had a way to deploy it, but I don't. We're limited by solving the problem.
[41:42]
Jim Rutt
So what does that mean? You say you have 10,000 active users, only 10,000 people in the world that would like your level of capability.
[41:47]
George Hotz
I'm sure there's many more people who would like it and they'll find out about it over time. But it doesn't help me if they find out about it today versus they find out about it in two years. It's about the end point. It's not about making money tomorrow. I don't care.
[42:00]
Jim Rutt
Gotcha. All right, let's now dig into a little bit more of the nitty-gritty that I'm sure the listeners are just waiting to get into. What about the legal issues? You know, the federal government, state governments, who's liable? Are you an open source guy, hands off, if this thing blows up, tough luck, it's on you? What about the legal liability and regulatory environment that you're operating in?
[42:20]
George Hotz
So you know there was a lot of sort of fake news about Comma and NHTSA. We've had many back and forths with NHTSA since then. For the most part they're relatively reasonable. I feel that the way that cars are regulated in America is quite reasonable. You know, people are always like, is this comma thing certified? Like, who do you think certifies it? The way automotive works in America is manufacturers self certify, and we self certify that we're in compliance with the same set of standards that Bosch and Continental do when they make ADAS systems for cars. We have a safety system which follows ISO 26262. There's two more standards now, like we limit our torque. The EU led a lot of this regulation, but we really, for the most part, like regulation. If it's good regulation, if it regulates things like torque on the wheel, max braking force, you know, max acceleration, we're interested in those numbers and we make sure that our system complies with them. It's a level 2 system, meaning, you know, you're in control of the vehicle at all times. The only thing that comma can guarantee you, the only thing we promise you, is that the car will never become uncontrollable. You can always reach out, hit the brake pedal and the brakes will work. You can always massively overpower any torque we're putting on the steering wheel. Again, we put very little torque on the steering wheel. You can override it with two fingers. If the car crashes, yeah, I mean, it's on you. But pay attention at all times.
[43:41]
Jim Rutt
I will say I saw some youtubes of people hands off.
[43:44]
George Hotz
Well, hands off is very different from eyes off. Look, we don't say hands off, but people drive hands off with no driver assistance systems in their car. Whether you choose to take your hands off the wheel or not is completely up to you. You'll get a feel for the max amount of torque it's ever going to put on the wheel. We absolutely say you must keep your eyes on the road at all times. We actually have a camera which monitors and makes sure you do that. Again, it's very non-intrusive as long as you're paying attention. But if you think you're going to get this thing and use your cell phone or take a nap, you just can't. I mean, again, can you? Well, sure. You can also get a normal car, put a brick on the gas pedal and sleep in the back seat.
[44:19]
Jim Rutt
And of course, we know cases of Tesla where people have, you know, made out with their girlfriends or something, got their heads cut off when it ran into the back of a truck. And of course, part of that, what I've read about is the Cadillac system apparently is extremely Germanic in keeping its Eye on you and making sure that you have your eyes on the road and all this sort of shit. How far along that line are you with, say, compared to the high end GM system?
[44:42]
George Hotz
We have the best driver monitoring in the world. And then also our driver monitoring, I mean again, we're in a pretty good space with the open source too, which is: if you make a system that has too many false positives or alerts people when they feel it's unreasonable, what you'll get is alert fatigue and they'll just stop paying attention to the system. It's really important to us that people respect the system, and we're not going to force that respect through an iron fist. We're going to earn that respect through, wait a second, I actually did look off the road for too long there. I'm actually kind of happy that thing beeped. It will wake people up from sleeping. People are very happy to be woken up from sleeping. So a good driver monitoring system should not be viewed as an adversary. It should really be viewed as something that helps you out. And again, it's all completely local on the device. Unless you specifically opt in, we're not uploading any pictures of you.
[45:33]
Jim Rutt
Are you capturing telemetry on everybody or is that also opt in?
[45:39]
George Hotz
Driving telemetry is opt out. You can run a fork or you can disable the uploader or you could just not connect to Wi-Fi.
[45:45]
Jim Rutt
Gotcha.
[45:45]
George Hotz
So it is optional in that sense, but it is opt out in the know.
[45:48]
Jim Rutt
Truthfully, it is a common good to upload your telemetry so that the system will get better. You would think if you were a non free rider moral person, unless you're going to your illicit girlfriend's house or something, you would probably want to upload your telemetry. All right, well, this has been very interesting. What else can you tell us about your vision for the road ahead?
[46:08]
George Hotz
You know what you said before about self driving cars being narrow AI? I don't really exactly understand the distinction between narrow AI and general AI, but I will say that I think it's all on a spectrum. Self driving cars have some things about them that make them an easier problem than general purpose robotics. A robot that can make you a sandwich and clean your house is a lot more complicated than a car, for two reasons. One, it's very easy to gather data of good driving, and it's very easy to gather data of good driving from the perspective of the car. It's much harder to gather data of good sandwich making from the perspective of the human. Maybe I could do it with cameras today and some fancy recovery algorithms, but really if I wanted the true thing, I'd have to put them in like a motion capture suit. So it's hard to get data sets for these other things. And then, two, the driving problem is low dimensional. A car is basically a two dimensional system. You have a steering and you have an acceleration, whereas a hand, oh, look how many dimensions it has. You know, it's this crazy complex thing. Even if you're just talking about my hand as an end actuator, right, and then grip is one, that's still seven. You know, it's six DOF to put it in space and then one to do the grip. And the hand is of course way more complicated than that. So our goal is to solve self driving cars, but not as an endpoint. Our goal is to solve self driving cars as a jumping off point to general purpose robotics. And the end dream of comma is to sell you the comma body, the $25,000 robot companion that comes home and cooks for you, cleans for you and, you know, does whatever else you might want. We don't judge.
[47:46]
Jim Rutt
Interesting. Okay, yeah, that makes sense. Do you have any other competitors other than Tesla? Let's say that. Is there anybody else trying to do what you're doing?
[47:54]
George Hotz
Wayve AI has a lot of similar stuff. We like them. They have the fancy simulator now out there. We're pushing now on very similar simulator technologies. These things were all enabled by transformers. Transformers are allowing all sorts of data, not just language. Our new simulator is basically a video transformer.
[48:15]
Jim Rutt
All right, well then more tangibly on the road ahead, you're currently at level two, kind of the equivalent of, you know, what you might get in a high end Subaru today. Right. Do you see yourself climbing to level three, level four? What is it, the top of the line GM one, is that level three plus or is that four these days?
[48:34]
George Hotz
This is why I don't think the levels are particularly good. We have no interest in ever going past level two. We have no interest in taking liability. We have no interest in being an insurance company. Other people can definitely use our software and provide that service on top of it. Our goal is to build software that is a better driver than a human, maybe a 10x better driver than a human. But as far as Comma AI ever shipping something that's not level 2, I have no interest. Somebody can take our open source software, do the statistics themselves and be like, wow, wait, if we just had this thing controlling the car instead of humans, we'd have 10x fewer accidents. And then they can provide that liability level 5 layer on top of it. I'm also kind of a believer that, like, there's level two, maybe there's a little bit of level three and then there's level five. I don't think level four is viable. I don't think a car that works in one precision mapped city is viable. Not that it's not buildable, it's just never a good business model. The Level 5 cars will come too quickly after the Level 4 cars for you to ever recapture the amount of value that you burned creating that thing.
[49:38]
Jim Rutt
Gotcha. That's interesting for the audience. Level two means, at least officially, people have to have their hands on the wheel at all times.
[49:46]
George Hotz
Whether you put your hands on the wheel is completely up to you. We don't issue any official guidance on this. All we say is that we limit the maximum amount of torque the system is capable of applying to the wheel.
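[Editor's sketch] The idea of capping how much torque the system can apply to the wheel can be illustrated with a simple clamp. This is a minimal sketch, not openpilot's actual code: the function name clamp_steer_torque, the constant MAX_STEER_TORQUE and its value and units are all assumptions made for illustration.

```python
# Hypothetical torque ceiling in arbitrary units; the real limit is
# car-specific and is not stated in the conversation.
MAX_STEER_TORQUE = 1.0


def clamp_steer_torque(requested: float, limit: float = MAX_STEER_TORQUE) -> float:
    """Return the requested torque, bounded to the range [-limit, +limit]."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return max(-limit, min(limit, requested))


# Requests beyond the ceiling are cut down; small requests pass through.
for request in (0.3, 2.5, -4.0):
    print(request, "->", clamp_steer_torque(request))
```

Because the output can never exceed the ceiling, a driver holding the wheel can always overpower the system, which is the point being made about who remains in control.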
[49:57]
Jim Rutt
What do you believe that implies about liability for your company? You know, suppose you have a bad bug that causes your software to run over somebody.
[50:05]
George Hotz
Well, I mean, the software didn't run over somebody. The human driving the car ran over somebody.
[50:10]
Jim Rutt
Unless they didn't have their hands on the wheel. Right?
[50:12]
George Hotz
Well, that's their choice, right? What did IBM say? You can never let a computer make a decision because a computer can't be held accountable. That's my philosophy on this kind of stuff. It's like the human is in control of the car at all times. They can decide for themselves how much they'd like to. All cars have driver assistance to some level. Power steering is a form of driver assistance. Cruise control is a form of driver assistance. We just continue to move up this gradient of who's liable. If you jam the wheel to the right and drive the car into somebody, can you say the power steering system is liable?
[50:49]
Jim Rutt
Has this been litigated? Have there been any claims against you guys that a bug in the software caused this problem?
[50:56]
George Hotz
Not us. No. We were involved in one lawsuit with a patent troll that we quickly crushed.
[51:01]
Jim Rutt
I used to love crushing patent trolls when I was the CTO of Thomson Reuters. When anybody called up with a patent thing, they'd send them to me and I said, you know who we are? We own West Publishing. We have a building in Minneapolis with 7,000 fucking lawyers in it. I will litigate your ass into the dust if it costs $5 million. I don't care. I ain't giving you a fucking penny. There's a lot of weaker sisters out there than me. Go fuck with them.
[51:25]
George Hotz
This is exactly what I said.
[51:27]
Jim Rutt
We never paid a penny ever to a patent troll when I was there.
[51:30]
George Hotz
But you have to. I mean, this is the only way to do it. I'm legit willing to do it. I like, like there was, I was willing to spend whatever to make sure he doesn't get money.
[51:37]
Jim Rutt
There needs to be more people like that in our industry. There are a whole bunch of the companies succumbed to stupid ass patent trolls and paid out hundreds of thousands of dollars which you're just feeding the troll. Right.
[51:47]
George Hotz
When you do that... Comma will eventually get sued. And I will take the exact same approach. We are not going to settle. We are going to. You are in control of the car at all times. And I'm not tongue in cheek about this. Like, I'm not trying to sell devices. You know, some of the Tesla marketing stuff, like, it's way beyond anything we do. I wouldn't like call the system full self driving. We don't do that.
[52:06]
Jim Rutt
Yeah, that's kind of nuts. That was a stupid hassle. If I was their lawyers, I would not have let them do that.
[52:11]
George Hotz
But I mean, look, Elon loves these kind of things. Elon loves pushing the boundaries and pushing the limits. I'm much more of the quiet scientist guy who like, I want to solve this problem very carefully. I don't want problems from other people. I don't want to oversell anything. Buy it or don't buy it, that's up to you.
[52:25]
Jim Rutt
I presume there's some form of contract people have to sign that lays all this out.
[52:30]
George Hotz
I mean there's a terms of service and the terms of service is pretty clear about this. It's not just you indemnify Comma for liability.
[52:37]
Jim Rutt
When you're using the system, you got good lawyers.
[52:40]
George Hotz
Yeah.
[52:41]
Jim Rutt
If you don't, I've got a great one that does this kind of shit. He drew up some contracts for me and when we did get sued, we were 65 and 0, because he wrote the contracts and he was our head of litigation. It will happen.
[52:53]
George Hotz
We will get sued. We do have great lawyers. And like these things have also been well litigated throughout automotive history to be that like you can't hold the car manufacturer liable if a person does something stupid in a car.
[53:07]
Jim Rutt
That is true. On the other hand, if there is a mechanical failure that should not have happened at that point, let's say a bearing breaks, the left front wheel falls off, and it causes a head-on collision, that could result in actionable litigation.
[53:21]
George Hotz
Absolutely. And we do distinguish the concept of like functional safety. If for example you put Comma in your car and then the brakes stopped working. Well, that's a very different story. That's a very different story from you made the decision to not have your hands on the wheel and not pay attention and something bad happened.
[53:37]
Jim Rutt
So you draw a line that you will be liable for mechanical side effects caused by your system. But you are saying that any actions taken, judgment calls done reside liably with the human.
[53:49]
George Hotz
I'm not saying I will take liability, but this is my basic understanding of product liability.
[53:54]
Jim Rutt
Gotcha.
[53:54]
George Hotz
Which is yes, of course, if the product malfunctions in a way that like you can no longer use the steering wheel or you can no longer use the brake. Yeah, that's very different from. But we've never had anything like that happen. There's so many redundancies in place to make sure. Also again, we don't hack the car. We use the messages that are put in there by the manufacturer that are intended to be used for ADAS and we use them within the spec. It's a reverse engineered spec. But I think for some of these we know more than the manufacturer does.
[54:23]
Jim Rutt
That's quite possible.
[54:25]
George Hotz
Well, we have telemetry, they don't.
[54:26]
Jim Rutt
That's true though. You have the full loop, right?
[54:28]
George Hotz
We got everything. I got the full CAN bus coming back on these cars. And we find quirks in these cars. We found some Volkswagens where the power steering wasn't initializing in certain scenarios and the bug was in the Volkswagen software, not in Comma.
[54:41]
Jim Rutt
All right, any final things you want to talk about OpenPilot and comma before we go on for a brief chat about your other thing. Tiny, whatever the fuck it is.
[54:50]
George Hotz
I think that's mostly it. We had a big breakthrough last year that I think is going to start to pay off to move to this third paradigm. I'm so grateful for all the people who are doing research on things like VQ-VAEs and transformers, all the people building hardware and infrastructure to enable us to like extract signal from all of this data and make progress in solving these very hard problems that are following in the footsteps of God, building brains and building.
[55:21]
Jim Rutt
Yeah, it is amazing. I'm working on a project where we use LLMs and related technologies to write movie screenplays and I acknowledge all the time the heavy lifting is being done by others. We are basically smart appliers of tools being built by other folks. And it is amazing how much cool, good work is being done by so many people in this world right now that are allowing all this to come on. All right, now let's talk about your Tiny Grad. Just a little, just a few minutes here. That's another project that you're working on. Tell us what that is, why you're doing it, and what implications you think it has.
[55:55]
George Hotz
I'm the CEO of Tiny Grad. That is where I'm full time right now. What Tiny Grad is is a machine learning framework. It competes with TensorFlow, PyTorch and JAX and it allows you to train models on various hardware. The big difference between Tiny Grad and its competitors is Tiny Grad is 100x simpler. The code base is right now 5200 lines.
[56:19]
Jim Rutt
Wow, 3200 lines. That's amazing.
[56:21]
George Hotz
5200 lines. It can run Stable Diffusion, it can run Llama, it can train CIFAR, it can train ResNet, it's fully featured and it's pretty fast too. And I think that a lot of the problems in something like PyTorch is combinatorial explosion. You have an operator, a dtype, a device, and PyTorch will write a kernel for each one of those things and those become multiplies. We support devices in a generic way, dtypes in a generic way, operations in a generic way such that those become adds. You can add a new dtype, like a new data type, and that data type will automatically work for every operation Tiny Grad supports and on every device Tiny Grad supports. There's also a whole bunch of other things where it's just. I've refactored it and thought about things so much that we care about the simplicity of the library because eventually this stuff's going to be translated into hardware. The long term goal of Tiny Grad is to build machine learning ASICs. But we start with the software, not with the hardware. Because I don't want to end up like Dojo.
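[Editor's sketch] The "multiplies versus adds" argument can be made concrete by counting. This is a counting illustration only, not a description of how PyTorch or Tiny Grad is actually implemented: the operator, dtype and device lists are made-up examples, and the two function names are hypothetical.

```python
from itertools import product

# Made-up example lists; real frameworks have far more of each.
OPS = ["add", "mul", "max", "exp"]
DTYPES = ["float16", "float32", "int32"]
DEVICES = ["cpu", "gpu"]


def specialized_kernels(ops, dtypes, devices):
    """One hand-written kernel per (op, dtype, device) combination."""
    return len(list(product(ops, dtypes, devices)))


def generic_components(ops, dtypes, devices):
    """One generic implementation per op, per dtype and per device."""
    return len(ops) + len(dtypes) + len(devices)


print("specialized:", specialized_kernels(OPS, DTYPES, DEVICES))
print("generic:", generic_components(OPS, DTYPES, DEVICES))

# Adding a single new dtype grows the specialized count by
# len(OPS) * len(DEVICES), but the generic count by just one.
more_dtypes = DTYPES + ["bfloat16"]
print("specialized after new dtype:", specialized_kernels(OPS, more_dtypes, DEVICES))
print("generic after new dtype:", generic_components(OPS, more_dtypes, DEVICES))
```

With these example lists the specialized approach needs 24 kernels against 9 generic pieces, and one extra dtype adds 8 kernels in the first case but only 1 piece in the second.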
[57:21]
Jim Rutt
Has anybody taken up using it at this point? Because that's obviously where the rubber meets the road on machine learning frameworks.
[57:28]
George Hotz
So yeah, so it's actually used in OpenPilot. It's used in OpenPilot to run the model on the device. There's a whole bunch of people using it for what look like similar use cases to that. It's very good at the like embedded weird. Like it's very easy to port a new system, to port a new kind of accelerator to Tiny Grad. You can deploy it in microcontrollers. There's a whole bunch of people who've deployed it in those settings. The big settings, like I'm training huge models on Nvidia? Not yet, but we're working on it.
[57:56]
Jim Rutt
Okay. Very cool. Folks who are interested in those categories, probably 5% of my listeners, go check it out. tinygrad.com, is that your website?
[58:03]
George Hotz
Org.
[58:04]
Jim Rutt
Org.
[58:05]
George Hotz
tinygrad.org, yeah.
[58:06]
Jim Rutt
And then the self driving robot car stuff is at comma.ai, that one I know. So check these things out. I really want to thank George Hotz for a heck of an interesting conversation here today.
[58:16]
George Hotz
Thank you for having me.
[58:17]
Jim Rutt
It's been great. Audio production and editing by Andrew Blevins Productions. Music by Tom Muller at modernspacemusic.com.