
Latest Interview of Elon Talking About AI Grok 3!!! #ElonMusk Source: https://x.com/i/broadcasts/1gqGvjeBljOGB Follow me on X https://x.com/Astronautman627?t=RFQEunSF2NwRkCOBc6PkkQ&s=09
Elon Musk
Welcome to the Grok 3 presentation. The mission of xAI and Grok is to understand the universe. We want to understand the nature of the universe so we can figure out what's going on. Where are the aliens? What's the meaning of life? How does the universe end? How did it start? All these fundamental questions. We're driven by curiosity about the nature of the universe. And that's also what causes us to be a maximally truth-seeking AI, even if that truth is sometimes at odds with what is politically correct. In order to understand the nature of the universe, you must absolutely, rigorously pursue truth or you will not understand the universe. You'll be suffering from some amount of delusion or error. So that is our goal. Figure out what's going on. And we're very excited to present Grok 3, which is, we think, an order of magnitude more capable than Grok 2 in a very short period of time. And that's thanks to the hard work of an incredible team. I'm honored to work with such a great team, and of course we'd love to have some of the smartest humans out there join our team. So let's go.
Igor
Hi everyone. My name is Igor, leading engineering at xAI.
JB
I'm Jimmy Ba, leading research.
Tony
I'm Tony, working on the reasoning team.
Elon Musk
All right, I'm Elon. I don't do anything. I just show up occasionally.
Igor
Yeah. So like Elon mentioned, Grok is the tool that we're working on. Grok is our AI that we're building here at xAI. And we've been working extremely hard over the last few months to improve Grok as much as we can, so we can give all of you access to it. We think it's going to be extremely useful. We think it's going to be interesting to talk to, funny, really, really funny. And we're going to explain to you how we've improved Grok over the last few months. We've made quite a jump in capabilities.
Elon Musk
Actually, we should explain, maybe also, why do we call it Grok? So, grok is a word from a Heinlein novel, Stranger in a Strange Land, and it's used by a guy who's raised on Mars. The word grok means to sort of fully and profoundly understand something. That's what the word grok means: fully and profoundly understand something.
JB
And empathy is important.
Elon Musk
True.
JB
So yeah, if we chart xAI's progress in the last few months: it has only been 17 months since we started kicking off our very first model. Grok 1 was almost like a toy by this point, only 314 billion parameters. And now if you plot the progress, with time on the X axis and the performance on our favorite benchmark number, MMLU, on the Y axis, we're literally progressing at unprecedented speed across the whole field. And then we kicked off Grok 1.5 right after the Grok 1 release, after November 2023, and then Grok 2. So if you look at where all the performance is coming from: when you have a very cracked engineering team and all the best AI talent, there's only one more thing we need. Big intelligence comes from a big cluster. So we can replot the entire progress of xAI, now replacing the benchmark on the Y axis with the total amount of training FLOPs, that is, how many GPUs we can run at any given time to train our large language models to compress the entire Internet.
Elon Musk
So, effectively, all human knowledge, really.
JB
That's right, yeah.
Elon Musk
Internet being part of it, but it's really all human knowledge. Everything.
JB
Yeah, the whole Internet fits into a USB stick at this point.
Elon Musk
It's like all the human tokens.
JB
Very soon into the real world. We had so much trouble actually training Grok 2 back in the day. We kicked off the model around February, and we thought we had a large amount of chips, but it turned out we could barely get 8k training chips running coherently at any given time. And we had so many cooling and power issues. I think you were there in the data center.
Elon Musk
Yeah, it was really sort of more like 8k chips on average at 80% efficiency. More like 6,500 effective H100s training for several months. But now we're at 100k.
JB
Yeah, that's right.
Elon Musk
More than 100k.
JB
That's right. So what's the next step, right, after Grok 2? If we want to continue to accelerate, we have to take the matter into our own hands. We have to solve all the cooling, all the power issues and everything. Yeah.
Igor
So in April of last year, Elon decided that really the only way for xAI to succeed, for xAI to build the best AI out there, is to build our own data center. So really we realized we had to build the data center in about four months. It turned out it took us 122 days to get the first 100k GPUs up and running. And that was a monumental effort to be able to do that. We believe it's the biggest fully connected H100 cluster of its kind. We actually decided that we needed to double the size of the cluster pretty much immediately if we want to build the kind of AI that we want to build. So we then had another phase which we haven't talked about publicly yet. So this is the first time that we're talking about this, where we doubled the capacity of the data center yet again, and that one only took us 92 days. So we've been able to use all of these GPUs, use all this compute, to improve Grok in the meantime. And basically today we're going to present you the results of that, the fruits that came from that.
JB
Yeah, so all the paths, all the roads lead to Grok 3: 10x more compute.
Elon Musk
More than 10x, really. Maybe 15x-ish.
JB
Compared to our previous generation model. And Grok 3 finished pre-training in early January, and the model is still currently training, actually. So this is a little preview of our benchmark numbers. We evaluated Grok 3 on three different categories: general mathematical reasoning, general knowledge about STEM and science, and computer science coding. So AIME, the American Invitational Mathematics Examination, is hosted once a year. If we evaluate the model performance, we can see that Grok 3 across the board is in a league of its own. Even its little brother, Grok 3 mini, is reaching the frontier across all the other competitors. You will say, well, at this point, with all these benchmarks, you're just evaluating the memorization of the textbooks, memorization of the GitHub repos. How about real-time usefulness? How about we actually use those models in our product? So what we did instead is we actually kicked off a blind test of our Grok model, codenamed Chocolate. It's pretty hot.
Elon Musk
Yeah, Hot Chocolate.
JB
It's been running on this platform called Chatbot Arena for two weeks. I think the entire X platform at some point speculated this might be the next generation of AI coming our way. So how this Chatbot Arena works is that it strips away the entire product surface, right? It's just a raw comparison of the engines of those AGIs, the language models themselves, placed in an interface where the user submits one single query and gets shown two responses. You don't know which model they come from, and in the end you make the vote. So in this blind test, Grok 3, an early version of Grok 3, already reached like 1400. No other model has reached an Elo score this high in head-to-head comparison with all the other models. And it's not just one single category, it's 1400 aggregated across all the categories: chatbot capabilities, instruction following, coding. So it's number one across the board in this blind test, and it's still
Elon Musk
Climbing. So we actually have to keep updating it. So it's about 1400 and climbing.
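Editor's sketch: the blind pairwise voting described above is usually scored with an Elo-style rating. The snippet below is the textbook Elo formula only, written to show what a rating gap means as a win probability; the arena's actual fitting procedure is not described in this presentation and may differ, and the function names and the step size `k` are assumptions made for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A's answer is preferred over model B's
    under the standard Elo model (400-point logistic scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one blind vote.

    k is an assumed step size, not a value taken from the presentation.
    """
    e_a = expected_score(rating_a, rating_b)
    s_a = 1.0 if a_won else 0.0
    delta = k * (s_a - e_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # A 100-point gap corresponds to roughly a 64% preference rate.
    print(round(expected_score(1400, 1300), 3))
    print(update(1400, 1300, a_won=True))
```

Under this model a 1400-rated model is expected to win about 64% of votes against a 1300-rated one, which is why a lead of even a few dozen points reflects many pairwise votes.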
Igor
Yeah. In fact, we have a version of the model that we think is already much better than the one that we tested here. We'll see how far it gets, but that's the one that we're, you know, working on, that we're talking about today.
Elon Musk
Yeah. So actually one thing: if you're using Grok 3, I think you may notice improvements almost every day, because we're continuously improving the model. So literally even within 24 hours you'll see improvements.
JB
Yep. But we believe here at xAI that getting the best pre-trained model is not enough. That's not enough to build the best AI. The best AI needs to think like a human: it has to contemplate all the possible solutions, self-critique, verify all the solutions, backtrack, and also think from first principles. That's a very important capability. So we believe that as we take the best pre-trained model and continue training it with reinforcement learning, it will elicit the additional reasoning capabilities that allow the model to become so much better and scale not just in training time, but actually in test time as well. We already found the model extremely useful internally, saving hundreds of hours of coding time. So Igor, you are the power user of our Grok reasoning model. What are some use cases?
Igor
Yeah, so like Jimmy said, we've added advanced reasoning capabilities to Grok, and we've been testing them pretty heavily over the last few weeks. I wanted to give you a little bit of a taste of what it looks like when Grok is solving hard reasoning problems. So we prepared two little problems for you. One comes from physics, and one is actually a game that Grok is going to write for us. When it comes to the physics problem, what we want Grok to do is to plot a viable trajectory to do a transfer from Earth to Mars, and then at a later point in time a transfer back from Mars to Earth. And that requires some physics that Grok will have to understand. So we're going to challenge Grok to, you know, come up with a viable trajectory, calculate it, and then plot it for us so we can see it. And yeah, this is totally unscripted, by the way. This is the Grok interface, and we've typed in this text that you can see here: "Generate code for an animated 3D plot of a launch from Earth landing on Mars and then back to Earth at the next launch window." And we've now kicked off the query. And you can see Grok is thinking. So part of Grok's advanced reasoning capabilities are these thinking traces that you can see here. You can even go inside and actually read what Grok is thinking as it's going through the problem, as it's trying to solve it.
Elon Musk
Yeah, we should say we are doing some obscuring of the thinking so that our model doesn't get totally copied instantly. So there's more to the thinking than is displayed.
Igor
And because this is totally unscripted, there's actually a chance that Grok might make a little coding mistake and it might not actually work. So just in case, we're going to launch two more instances of this. So if something goes wrong, we'll be able to switch to those and show you something that's presentable. So we're kicking off the other two as well. And like I said, we have a second problem as well. Actually, one of our favorite activities here at xAI is having Grok write games for us. Not just any old game, any game that you might already be familiar with, but actually creating new games on the spot and being creative about it. So one example that we found was really, really fun is: create a game that's a mixture of the two games Tetris and Bejeweled.
Elon Musk
This is maybe an important thing. Obviously, if you ask an AI to create a game like Tetris, there are many examples of Tetris on the Internet. Or a game like Bejeweled, whatever; it can copy it. What's interesting here is that it achieved a creative solution combining the two games that actually works and is a good game. We're seeing the beginnings of creativity.
Igor
Fingers crossed that we can recreate that.
Elon Musk
Hopefully it works.
Igor
Actually, because this is a bit more challenging, we're going to use something special here, which we call Big Brain. That's our mode in which we use more computation, which is more reasoning for Grok, just to make sure that there's a good chance here that it might actually do it. So we're also going to fire off three attempts here at solving this, at creating this game that's a mixture of Tetris and Bejeweled. Let's see what Grok comes up with.
Elon Musk
I've played the game, it's pretty good. You're like, wow, okay, this is something.
Igor
Yeah. So while Grok is thinking in the background, we can now actually talk about some concrete numbers. How well is Grok doing across tons of different tasks that we've tested it on? So we'll hand it over to Tony to talk about that.
Tony
Yeah. Okay, so let's see how Grok does on those interesting, challenging benchmarks. So reasoning, again, refers to those models that actually think for quite a long time before they try to solve a problem. In this case, around a month ago, the Grok 3 pre-training finished. After that we worked very hard to put the reasoning capability into the current Grok 3 model. But again, this is very early days, so the model is still currently in training. Right now what we are going to show to people is this beta version of the Grok 3 reasoning model. Alongside it, we are also training a mini version of the reasoning model. So essentially on this plot you can see Grok 3 Reasoning Beta and then Grok 3 mini Reasoning. Grok 3 mini Reasoning is actually a model that we trained for a much longer time. You can see that sometimes it actually performs slightly better compared to Grok 3 Reasoning. This also just means that there's a huge potential for Grok 3 Reasoning, because it's trained for much less time. All right, so let's actually look at how it does on those three benchmarks, which Jimmy already introduced. Essentially we're looking at three different areas: mathematics, science and coding. For math, we're picking this high school competition math problem set. For science, we actually pick those PhD-level science questions. And for coding, it's also actually pretty challenging: it's competitive coding and also some LeetCode, which is the kind of coding interview problems that people usually get when they interview for companies. So on those benchmarks you can see that Grok 3 actually performs quite well across the board compared to other competitors. Yeah, so it's pretty promising. These models are very smart.
JB
So Tony, what are those shaded bars?
Tony
Yeah. Okay, so I'm glad you asked this question. So for those models, because they can reason, they can think, you can also ask them to think even longer. You can spend more of what we call test-time compute, which means you can spend more time to reason, to think about a problem, before you spit out the answer. So in this case, the shaded bar here means that we just ask the model to spend more time. It can solve the same problem many, many times before it tries to conclude what the right solution is. And once you give this compute, or this kind of budget, to the model, it turns out the model can perform even better. So this is essentially the shaded part in those plots.
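Editor's sketch: one common way to "solve the same problem many times before concluding" is to sample several independent answers and keep the most frequent one (often called majority voting or self-consistency). The presentation does not say which aggregation rule was used for the shaded bars, so the voting rule here is an assumption, and `solve` is a hypothetical stand-in for a call to a reasoning model.

```python
import random
from collections import Counter
from typing import Callable, Hashable, Tuple


def majority_vote(
    solve: Callable[[str], Hashable],
    problem: str,
    n_samples: int = 8,
) -> Tuple[Hashable, float]:
    """Sample n_samples answers and return (most common answer, vote share).

    Larger n_samples means more test-time compute spent on one problem.
    """
    answers = [solve(problem) for _ in range(n_samples)]
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n_samples


if __name__ == "__main__":
    rng = random.Random(0)

    def noisy_solver(problem: str) -> int:
        # Hypothetical model: right answer (42) 60% of the time,
        # otherwise a scattered wrong answer.
        return 42 if rng.random() < 0.6 else rng.randint(0, 9)

    print(majority_vote(noisy_solver, "toy problem", n_samples=1))
    print(majority_vote(noisy_solver, "toy problem", n_samples=64))
```

The design point is that wrong answers tend to disagree with each other while correct ones agree, so the vote becomes more reliable as the sample budget grows.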
JB
Right. So I think this is really exciting, right? Because now, instead of just doing one chain of thought with AI, why not do multiple at once?
Elon Musk
Yes.
JB
So that's a very powerful technique that allows us to continue to scale the model capabilities after training. And, you know, people often ask: are we actually just overfitting to the benchmarks?
Tony
Yes.
JB
So how about generalization?
Tony
So, yes, I think, yeah, this is definitely a question that we are asking ourselves: whether we are overfitting to those current benchmarks. Luckily, we have a real test. About five days ago, AIME 2025 just finished. This is where high school students compete on this particular exam. So we got this very fresh new competition, and then we asked our two models to compete on the same exam. It turns out, very interestingly, Grok 3 Reasoning, the big one, actually does better on this particular fresh exam. This also means that the generalization capability of the big model is stronger, much stronger, compared to the smaller model. If you compare to last year's exam, it is actually the opposite: the smaller model kind of learns the previous exams better. So yeah, this actually shows some kind of true generalization from the model.
JB
So 17 months ago, our Grok Zero and Grok One barely solved any high school problems.
Tony
That's right.
JB
And now we have a kid that has already graduated. Grok is ready to go to college. Is that right?
Elon Musk
Yeah, I mean, it won't be long before it's simply perfect. The human exams won't be hard. They'll be too easy.
JB
Yeah. And internally, as Grok continues to evolve, we're going to talk about what we're excited about. But very soon there will be no more benchmarks left.
Elon Musk
Yeah.
Igor
One thing that's quite fascinating, I think, is that we basically only trained Grok's reasoning abilities on math problems and competitive coding problems. Those are very, very specialized kinds of tasks, but somehow it's able to work on all kinds of other different tasks, including creating games, you know, lots and lots of different things. And what seems to be happening is that basically Grok learns this ability to detect its own mistakes in its thinking, correct them, persist on a problem, try lots of different variants, and pick the one that's best. So there are these generalizing abilities that Grok learns from mathematics and from coding, which it can then use to solve all kinds of other problems. So that's pretty.
Elon Musk
I mean, reality is the instantiation of mathematics.
JB
That's right. And one thing we're actually really excited about, going back to our founding mission, is: what if one day we have a computer just like Deep Thought that utilizes our entire cluster just for that one very important problem at test time, all the GPUs turned on, right? So I think back then we were building the GPU clusters together. You were plugging in cables. And I remember that when we turned on the first initial test, you could hear all the GPUs humming in the hallway. That almost felt spiritual.
Igor
Yeah, that's actually a pretty cool thing, that we're able to do that. We can go into the data center and tinker with the machines there. So, for example, we went in and we unplugged a few of the cables and just made sure that our training setup is still running stably. So that's something that I think most AI teams out there don't usually do, but it actually totally unlocks a new level of reliability and what you're able to do with the hardware.
Elon Musk
Okay, so when are we going to solve Riemann?
JB
So the easiest solution is to enumerate over all possible strings, and as long as you have a verifier and enough compute, you'll be able to do it. My projection will be.
Elon Musk
What's your guess? What does your neural net calculate?
JB
So my bold prediction. Three years ago I told you this; I think now it's two years out. Two things are going to happen. We're going to see machines win some medals: Turing Award, Fields Medal, Nobel Prize, with probably some expert in the loop, right? So, expert uplifting.
Elon Musk
So this year or next year, that's what it comes down to, really.
JB
Yeah.
Igor
So it looks like Grok finished all of its thinking on the two problems. So let's take a look at what it said. So this was the little physics problem we had. Now, we've collapsed the thoughts here, so they're hidden. And then we see Grok's answer below that. So it explains that it wrote a Python script here using matplotlib, then gives us all of the code. So let's take a quick look at the code. It seems like it's doing reasonable things here. Not totally off the mark. "Solve Kepler," it says here, so maybe it's solving Kepler's laws numerically. Yeah. There's really only one way to find out if this thing is working. I'd say let's give it a try. Let's run the code. All right. And we can see Grok is animating two different planets, Earth and Mars, here. And then the green ball is the vehicle, the spacecraft that's transiting between Earth and Mars. And you can see the journey from Earth to Mars, and it looks like, yeah, indeed, the astronauts return safely at the right moment in time. So now, obviously, this was just generated on the spot, so we can't tell you if that was actually a correct solution. So we're going to take a closer look. Maybe we're going to call some colleagues from SpaceX, ask them if this is legit.
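Editor's sketch: the generated script is not reproduced in this transcript, so the following is not Grok's code. It only illustrates what a routine named "solve Kepler" typically does: solve Kepler's equation M = E - e*sin(E) for the eccentric anomaly E with Newton's method, then convert E into a position along an elliptical orbit. The function names, the starting guess, and the sample orbital elements for Mars are assumptions made for illustration.

```python
import math


def solve_kepler(mean_anomaly: float, eccentricity: float,
                 tol: float = 1e-12, max_iter: int = 50) -> float:
    """Solve M = E - e*sin(E) for E (radians), valid for 0 <= e < 1."""
    # Common starting guess: M itself, or pi for very eccentric orbits.
    E = mean_anomaly if eccentricity < 0.8 else math.pi
    for _ in range(max_iter):
        f = E - eccentricity * math.sin(E) - mean_anomaly
        step = f / (1.0 - eccentricity * math.cos(E))  # f / f'
        E -= step
        if abs(step) < tol:
            break
    return E


def position_in_orbit_plane(a: float, e: float, mean_anomaly: float):
    """Return (x, y) in the orbital plane, focus at the origin,
    x axis pointing at perihelion. Units follow the semi-major axis a."""
    E = solve_kepler(mean_anomaly, e)
    x = a * (math.cos(E) - e)
    y = a * math.sqrt(1.0 - e * e) * math.sin(E)
    return x, y


if __name__ == "__main__":
    # Approximate elements for Mars: a = 1.524 AU, e = 0.0934.
    for k in range(4):
        M = k * math.pi / 2
        print(position_in_orbit_plane(1.524, 0.0934, M))
```

Stepping the mean anomaly forward linearly in time and plotting the resulting points is enough to animate a planet on its ellipse, which is the kind of loop a matplotlib animation would call per frame.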
Elon Musk
It's pretty close. There are a lot of complexities in the actual orbits that have to be taken into account. But this is pretty close to what it looks like. Awesome. In fact, I have that on my pendant here. This has got the Earth-Mars Hohmann transfer on it.
JB
When are we going to install Grok on a rocket?
Elon Musk
Well, I suppose in two years.
JB
Two years? Everything is two years away.
Elon Musk
Well, the Earth-Mars transit window occurs every 26 months. We're currently in a transit window, approximately. The next one would be November of next year, roughly end of next year. And if all goes well, SpaceX will send Starship rockets to Mars with Optimus robots and Grok.
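Editor's sketch: the "every 26 months" figure follows from the synodic period of Earth and Mars, and the one-way trip time of an idealized Hohmann transfer follows from Kepler's third law. The calculation below assumes circular, coplanar orbits and rounded textbook values for the orbital periods and radii, so it is an approximation, not mission planning; the function and constant names are made up for illustration.

```python
EARTH_PERIOD_DAYS = 365.256  # sidereal year
MARS_PERIOD_DAYS = 686.980   # Mars sidereal period
EARTH_ORBIT_AU = 1.000       # mean orbital radius
MARS_ORBIT_AU = 1.524


def synodic_period_days(p_inner: float, p_outer: float) -> float:
    """Time between repeats of the same relative alignment of two planets."""
    return 1.0 / (1.0 / p_inner - 1.0 / p_outer)


def hohmann_transfer_days(r1_au: float, r2_au: float) -> float:
    """One-way flight time: half the period of the transfer ellipse.

    Kepler's third law for orbits around the Sun: T[years] = a[AU] ** 1.5.
    """
    a_transfer = (r1_au + r2_au) / 2.0
    period_years = a_transfer ** 1.5
    return 0.5 * period_years * EARTH_PERIOD_DAYS


if __name__ == "__main__":
    window = synodic_period_days(EARTH_PERIOD_DAYS, MARS_PERIOD_DAYS)
    trip = hohmann_transfer_days(EARTH_ORBIT_AU, MARS_ORBIT_AU)
    print(f"launch windows repeat every {window:.0f} days "
          f"(~{window / 30.44:.1f} months)")
    print(f"idealized one-way transfer takes {trip:.0f} days")
```

With these inputs the window spacing comes out near 780 days, a little under 26 months, and the idealized one-way transfer near 259 days.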
Igor
I'm curious what this combination of Tetris and Bejeweled looks like, the Tetrist, as we've named it internally. So, okay, we also have an output from Grok here. It says it wrote a Python script, and explains what it's been doing. If you look at the code, there are some constants that are being defined here, some colors, then the tetrominoes, the pieces of Tetris, are there. Obviously it's very hard to see at one glance if this is good, so we've got to run this to figure out if it's working. Well, let's give it a try. Fingers crossed. All right, so this kind of looks like Tetris, but the colors are a little bit off. The colors are different here. And if you think about what's going on here, Bejeweled has this mechanic where if you get three jewels in a row, then they disappear, and also gravity activates. So what happens if you get three of the colors together? Oh, yeah. So something happened. I think what Grok did in this version is that once you connect at least three blocks of the same color in a row, they disappear, and then gravity activates and all the other blocks fall down. I'm kind of curious if there's still a Tetris mechanic here where, if the line is full, does it actually clear it, or what happens then? It's up to interpretation. Also, who knows, it'll do different variants.
Elon Musk
When you ask it. It doesn't do the same thing every time.
Igor
Exactly. We've seen a few other that work very differently, but this one seems cool.
JB
Are we ready for a game studio at xAI?
Elon Musk
Yes. So we're launching an AI gaming studio at xai. If you're interested in joining us and building AI games, please join xai. We're launching an AI gaming studio. We're announcing it tonight. Let's go Epic Games. But wait, that's an actual game studio? Yeah.
Tony
All right.
JB
So I think one thing that is super exciting for us is that once you have the best pre-trained model, you have the best reasoning model, right? We already see that when you actually give those models the capability to think harder, think longer, think more broadly, the performance continues to improve. We're really excited about the next frontier: what will happen if we not only allow the model to think harder, but also provide more tools, just like how real humans solve those problems? For real humans, we don't ask them to solve the Riemann hypothesis with just a pen and a piece of paper and no Internet. So all the basic web browsing, search engines and code interpreters, together with the best reasoning model, build the foundations for the Grok agents to come. So today we're actually introducing a new product called Deep Search. That is the first generation of our Grok agents, which not only help engineers, researchers and scientists to do coding, but actually help everyone to answer the questions that you have day to day. It's like a next-generation search engine that really helps you to understand the universe. So you can start asking questions like, for example, hey, when is the next Starship launch date? So let's try that. If we hit enter, on the left-hand side we see a high-level progress bar. Essentially the model now is not just going to do one single search like the current RAG systems, but actually think very deeply about, hey, what's the user intent here, what are the facts it should consider at the same time, and how many different websites it should actually go to and read the content of. So this can save hundreds of hours of everyone's Google time if you want to really look into certain topics. And then on the right-hand side you can see the bullet summary of what the current model is doing: what website it's browsing, what source it's verifying. And oftentimes it actually cross-validates different sources out there to make sure the answer is actually correct before it outputs the final answer. And we can at the same time fire up a few more queries. How about, you're a gamer, right?
Elon Musk
Sure.
JB
Yeah. So how about: what are some of the best builds and most popular builds in the Path of Exile hardcore league?
Elon Musk
You can technically just look at the hardcore ladder; that might be a fast way to figure it out.
JB
Yeah, we'll see what the model does and then we can also do something more fun. For example, how about make a prediction about the March Madness out there.
Elon Musk
Yeah. So this is kind of a fun one, where Warren Buffett has a billion-dollar bet. If you can exactly match, I think, sort of the entire winning tree of March Madness, you can win a billion dollars from Warren Buffett. So it would be pretty cool if AI could help you win a billion dollars from Buffett. That seems like a pretty good investment.
JB
Let's go.
Elon Musk
Yeah.
Tony
All right.
JB
So now let's fire up the query and see what the model does. So we can actually go back to our very first one.
Elon Musk
Buffett wasn't counting on this.
JB
That's right. Okay. So we got the very first one, and the model thought for around one minute. Okay, so the key insight here: the next Starship launch is going to be on the 24th or later. So no earlier than February 24th.
Elon Musk
It might be sooner.
JB
Yeah. So I think we can go down and see what the model does. So it does a little research on Flight 7, what happened, that it got grounded, and actually it looks into the FCC filing from the data collection, and it made a new conclusion. If we continue to scroll down, let's see. Right. Yeah. So it makes a little table. I think inside xAI we often joke that the time to the first table is the only latency that matters. Yeah. So that's how the model makes the inference and looks up all the sources. And then we can look into the gaming one. Right. So for this particular one, we look at, hey, the builds, and it's kind of a cool meta. So there's the Infernalist. But if we go down, the surprising fact about all the other builds: it looks into the 12 classes. So we'll see that the minion build was pretty popular when the game first came out, and now the Invokers of the world took over.
Elon Musk
Invoker Monk Invoker for sure.
JB
Yeah, that's right. Yeah. Followed by the Stormweavers, and that's really good at mapping. And then we can see the March Madness one. How about that? So one interesting thing about Deep Search is that if you actually go into the panel where it shows what the subtasks are, you can click the bottom left of this, right? And then in this case you can actually scroll through, reading through the mind of Grok. What information does the model think is trustworthy? What is not? How does it actually cross-validate different information sources? So that makes the entire search experience and information retrieval process a lot more transparent to our users.
Igor
This is much more powerful than any search engine out there. You can literally just tell it only use sources from X. You know, it will try to respect that. And so it's much more steerable, much more intelligent than.
Elon Musk
I mean, it really should save you a lot of time. So something that might take half an hour or an hour of researching on the web or searching media, you can just ask it to go do that and come back, and 10 minutes later it's done an hour's worth of work for you. That's really what it comes down to.
JB
Exactly.
Elon Musk
And maybe better than you could have done it yourself.
JB
Yeah. Think about an infinite amount of interns working for you. Now you can just fire up all the tasks and come back a minute later. This is going to be an interesting one. So March Madness has not happened yet, so I guess we have to follow up with the next live stream.
Elon Musk
Yeah, it seems pretty good. Like, $40 might get you a billion dollars. A $40 subscription.
JB
That's right.
Elon Musk
I mean, it might work.
JB
So when are the users going to have their hands on Grok 3?
Igor
Yes. So the good news is we've been working tirelessly to actually release all of these features that we've shown you: the Grok 3 base model with amazing chat capabilities, that's really useful, that's really interesting to talk to; Deep Search; the advanced reasoning mode; all of these things. We want to roll them out to you today, starting with the Premium+ subscribers on X. So that's the first group that will initially get access. Make sure to update your X app if you want to see all of the advanced capabilities, because we just released the update as we're talking here. And if you're interested in getting early access to Grok, then sign up for Premium+. And also we're announcing that we're starting a separate subscription for Grok that we call Super Grok, for those real Grok fans who want the most advanced capabilities and the earliest access to new features. So feel free to check that out as well.
Elon Musk
This is for the dedicated Grok app and for the website.
Igor
Exactly. So our new website is called grok.com.
Elon Musk
Yeah.
Igor
And you also find.
Elon Musk
You never guess.
Igor
Yeah, you never guess. And you can also find our Grok app in the iOS app store and that gives you even more polished experience that's totally Grok focused. If you want to have grok easily available, one tap away.
Elon Musk
Yeah. And the version on grok.com in a web browser is going to be the latest and most advanced version, because obviously it takes us a while to get something into an app and then get it approved by the App Store. And then if something's in a phone format, there are limitations on what you can do. So the most powerful version of Grok, and the latest version, will be the web version at grok.com.
Igor
Yeah. So watch out for the name Grok 3 in the app.
Elon Musk
That's the giveaway.
Igor
Yeah, exactly. That's the giveaway that you have Grok 3. And if it says Grok 2, then Grok 3 hasn't quite arrived for you yet. But we're working hard to roll this out today, and then to even more people over the coming days.
JB
Yeah. Make sure you update your phone app too, where you are going to get all the tools we showcased today, with the thinking mode, with Deep Search. So, yeah, really looking forward to all the feedback you have.
Igor
Yeah.
Elon Musk
And I think we should emphasize that this is kind of a beta, meaning that you should expect some imperfections at first, but we will improve it rapidly, almost every day. In fact, every day I think it'll get better. So if you want a more polished version, I'd say maybe wait a week, but expect improvements literally every day. And then we're also going to be providing a voice mode so you can have conversations. In fact, I was trying it earlier today. It's working pretty well, but it needs a bit more polish. It's the sort of thing where you can just literally talk to it like you're talking to a person. That's awesome. It's actually, I think, one of the best experiences of Grok, but that's probably about a week away.
JB
So with that said, I think we.
Igor
Might have some audience questions.
Elon Musk
Sure.
Igor
All right, let's take a look.
JB
Yeah, let's take a look at the audience questions from the X platform.
Elon Musk
Yeah. Cool.
Igor
So the first question here is when Grok Voice Assistant, when is it coming out? As soon as possible. Just like Elon said, just a little bit of polishing away from being released to everybody. Obviously it's going to be released in an early form and we're going to rapidly iterate on that.
JB
Yeah. The next question is like, when will Grok 3 be in the API? So the Grok 3 API, with both the reasoning models and deep search, is coming your way in the coming weeks. We're actually very excited about the enterprise use cases of all these additional tools that Grok now has access to, and how the test-time compute and tool use can really accelerate all the business use cases.
Igor
Another one is, will voice mode be native or text to speech? So I think that means, is it going to be one model that is understanding what you say and then talking back to you, or is it going to be some system that has text to speech inside of it? And the good news is it's going to be one model, like a variant of Grok that we're going to release, which basically understands what you're saying and then generates the audio directly from that. So very much like Grok 3 generates text, that model generates audio. And that has a bunch of advantages. I was talking to it earlier today and it said, hi, Igor, reading my name, probably from some text that it had, and I said, no, my name is Igor. And it remembered that. So it could continue to say Igor just like a human would. And you can't achieve that with text to speech.
JB
Yeah. So here's a question for you. Pretty spicy, Elon. Is Grok a boy or a girl? And are they single?
Elon Musk
Grok is whatever you want it to be.
JB
Are you single?
Elon Musk
Yes.
JB
All right. The shop is open.
Elon Musk
Honestly, people are going to fall in love with Grok. It's like 1000% probable.
JB
The next question, will Grok be able to transcribe audio into text? Yes. So we'll have this capability in both the app and also the API. We found that Grok should just be your personal assistant, looking over your shoulder and follow you along the way, learn everything you have learned and really help you to understand the world better, become smarter every day.
Elon Musk
Yeah. I mean, the voice mode of Grok isn't simply voice to text. It understands tone, inflection, pacing, everything. It's wild. I mean, it's like talking to a person.
JB
Yep. So any plans for conversation memory?
Igor
Absolutely, we're working on it right now.
JB
That's right. Let's see, what are the other ones? So what about the DM features? So if you have personalization, if Grok remembers your previous interactions, should it be one Grok or multiple different Groks?
Elon Musk
It's up to you. You can have one Grok or many Groks. I suspect people will probably have more than one.
JB
Yeah, I want to have a Dr. Grok.
Elon Musk
Yeah, the Grok doc.
Igor
That's right. Cool. So in the past we've open sourced Grok 1. So somebody's asking us, are we going to do it again with Grok 2?
Elon Musk
Yeah, I think our general approach is that we will open source the last version when the next version is fully out. Like when Grok 3 is mature and stable, which is probably within a few months, then we'll open source Grok 2.
JB
Okay, so we probably have time for one last question. What was the most difficult part about working on this project, I assume Grok 3, and what are you most excited about? So I think for me, looking back, you know, getting the whole model training on the 100k H100s coherently, that's almost like battling against the final boss of the universe, the entropy. Because at any given time you can have a cosmic ray beaming down that flips a bit in your transistor. And now, if it flips a mantissa bit, the entire gradient update is out of whack. And now with 100,000 of those, you have to orchestrate them every time. At any given time, any of the GPUs can go down.
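The bit-flip scenario JB describes can be illustrated with a small sketch. It assumes gradients are stored as IEEE 754 float32 values; the helper name `flip_bit` and the sample value 0.5 are hypothetical, chosen for illustration, and this is not code from the Grok 3 training stack. It shows how much a single flipped bit changes a value depending on where the bit sits, including an exponent bit for comparison, which is an addition not mentioned in the talk.

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its float32 encoding flipped.

    Bit 0 is the least significant mantissa bit, bits 0-22 are the
    mantissa, bits 23-30 the exponent, and bit 31 the sign.
    """
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Flip the requested bit and reinterpret the result as a float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped

grad = 0.5  # hypothetical gradient value

low_mantissa = flip_bit(grad, 0)    # tiny relative error
high_mantissa = flip_bit(grad, 22)  # value grows by 50%
exponent = flip_bit(grad, 30)       # value jumps by many orders of magnitude

print(low_mantissa, high_mantissa, exponent)
```

A flip in a low mantissa bit is nearly harmless, a flip in the top mantissa bit changes the value by 50%, and a flip in a high exponent bit produces a wildly wrong number. That spread is one reason large training runs typically rely on error detection and checkpointing rather than assuming the hardware is always correct.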
Elon Musk
Yeah, I mean, it's worth breaking down. Like, how were we able to get the world's most powerful training cluster operational within 122 days? Because when we started off, we actually weren't intending to do a data center ourselves. We went to the data center providers and said, how long would it take to have 100,000 GPUs operating coherently in a single location? And we got timeframes from 18 to 24 months. So we're like, well, 18 to 24 months, that means losing is a certainty. So the only option was to do it ourselves. Then if you break down the problem, I guess I'm doing like reasoning here.
JB
It makes you think in one single chain of thought.
Elon Musk
Yeah, exactly. So we needed a building. We can't build a building, so we must use an existing building. So we looked basically for factories that had been abandoned, but where the factory was in good shape, like the company had gone bankrupt or something. So we found an Electrolux factory in Memphis. That's why it's in Memphis, home of Elvis and also one of the oldest. I think it was the capital of ancient Egypt. And it was actually a very nice factory that, for whatever reason, Electrolux had left. And that gave us shelter for the computers. Then we needed power. We needed at least 120 megawatts at first, but the building only had 15 megawatts. And ultimately for 200,000 GPUs, we needed a quarter gigawatt. So we initially leased a whole bunch of generators. So we have generators on one side of the building, just trailer after trailer after trailer of generators, until we get the utility power to come in. But then we also need cooling. So on the other side of the building, it was just trailer after trailer of cooling. So we leased about a quarter of the mobile cooling capacity of the United States on the other side of the building. Then we needed to get the GPUs all installed, and they're all liquid cooled. So in order to achieve the density necessary, this is a liquid cooled system. So we had to get all the plumbing for liquid cooling. Nobody had ever done a liquid cooled data center at scale. So this was an incredibly dedicated effort by a very talented team to achieve that outcome. You might think, now it's going to work. Nope. The issue is that the power fluctuations for a GPU cluster are dramatic. So it's like this giant symphony that is taking place. Like imagine having a symphony with 100,000 or 200,000 participants in the symphony, and the whole orchestra will go quiet and loud in 100 milliseconds. And so this caused massive power fluctuations, which then caused the generators to lose their minds. And they weren't expecting this.
So to buffer the power, we then used Tesla Megapacks to smooth out the power. So the Megapacks had to be reprogrammed. So with xAI working with Tesla, we reprogrammed the Megapacks to be able to deal with these dramatic power fluctuations, to smooth out the power so the computers could actually run properly. And that worked. It was quite tricky. But even at that point, you still have to make the computers all communicate effectively. So all the networking had to be solved. And debugging a bazillion network cables, debugging NCCL at 4 in the morning. We solved it at, like, roughly 4:20am. Yes, thank you. That was when we figured it out. Well, there were a whole bunch of issues. Like one, there was a BIOS mismatch.
Igor
The BIOS was not set up correctly. We compared lspci outputs between two different machines, one that was working, one that was not working. Many, many other things.
Elon Musk
Yeah, exactly. This would go on for a long time if we actually listed all the things. But it's interesting. It's not like, oh, we just magically made it happen. You have to break down the problem just like Grok does for reasoning into the constituent elements, and then solve each of the constituent elements in order to achieve a coherent training cluster in a period of time that is a small fraction of what anyone else could do it in.
Igor
And then once the training cluster was up and running and we could use it, we had to make sure that it actually stays healthy throughout, which is its own giant challenge. And then we had to get every single detail of the training right in order to get a Grok 3 level model, which is actually really, really hard. So we don't know if there are any other models out there that have Grok 3's capabilities. But whoever trains a model better than Grok 3 has to be extremely good at the science of deep learning and at every aspect of the engineering. So it's not so easy to pull this off.
JB
And this is not going to be the last cluster we build and the last model we train.
Elon Musk
Oh yeah, we've already started work on the next cluster, which will be about five times the power. So instead of a quarter gigawatt, roughly 1.2 gigawatts. What was the Back to the Future number? What's the power? There's like the Back to the Future car. Anyway, the Back to the Future power, it's roughly in that order, I think. So these will be sort of the GB200/GB300 cluster. Once again, it'll be the most powerful training cluster in the world. So we're not stopping here and our.
JB
Reasoning model is going to continue to improve by accessing more tools every day. So yeah, we're very excited to share any of the upcoming results with you all.
Igor
Yeah, the thing that keeps us going is basically being able to give Grok 3 to you and then seeing the usage go up, seeing everybody enjoy it. That's what really gets us up in the morning.
JB
Thanks for tuning in.
Elon Musk
Thanks, guys.
Podcast Summary: Elon Musk Thinking – Latest Interview of Elon Talking About AI Grok 3!!!
Release Date: February 19, 2025
Host: Astronaut Man
Guest: Elon Musk and the Xai Team
In the latest episode of Elon Musk Thinking, hosted by Astronaut Man, Elon Musk introduces Grok 3, the latest iteration of Xai's artificial intelligence project. Elon opens with a compelling vision for Grok:
"The mission of Xai and Grok is to understand the universe. We want to understand the nature of the universe so we can figure out what's going on."
– Elon Musk [00:28]
Elon explains the origin of the name "Grok," drawing inspiration from Robert A. Heinlein's novel, Stranger in a Strange Land:
"Grok is a word from a Heinlein novel... it means to fully and profoundly understand something."
– Elon Musk [02:13]
This name encapsulates Xai's aspiration for Grok to achieve deep and comprehensive understanding across various domains.
Elon and the team detail the rapid progression of Grok's development over 17 months:
"Grok3 is, we think, an order of magnitude more capable than Grok 2 in a very short period of time."
– Elon Musk [00:28]
To support Grok 3's computational demands, Xai undertook the ambitious task of constructing its own data center:
"We had to build our own data center... it took us 122 days to get the first 100k GPUs up and running."
– Igor, Lead Engineering at Xai [04:17]
Grok 3 has demonstrated exceptional performance across various benchmarks:
"Grok3 across the board is in a league of its own... it's number one across the board in this blind test."
– JB Pa, Leading Research [05:47]
Grok 3 achieved a remarkable Elo score of 1400, outpacing all other models in the Chatbot Arena blind tests.
The team showcased Grok 3's advanced reasoning capabilities through live demonstrations:
Physics Problem Solving: Grok successfully generated a Python script to plot a viable trajectory for a transfer from Earth to Mars and back, adhering to Kepler's laws.
"It's pretty close to what it looks like... This has got the Earth, Mars Hohmann Transfer on it."
– Elon Musk [21:12]
Game Creation: Demonstrated Grok's creativity by merging elements of Tetris and Bejeweled into a novel game, showcasing the AI's ability to innovate beyond scripted responses.
"We've seen the beginnings of creativity."
– Elon Musk [12:10]
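For readers curious about the Earth to Mars demonstration mentioned above, the one-way transfer time of a Hohmann transfer can be estimated directly from Kepler's third law. The following is a minimal sketch, not the script Grok generated in the demo. It assumes circular, coplanar orbits and an approximate mean orbital radius for Mars of 1.524 AU; the function name is hypothetical.

```python
R_EARTH_AU = 1.0    # Earth's orbital radius in astronomical units
R_MARS_AU = 1.524   # Mars's mean orbital radius (approximate assumption)
DAYS_PER_YEAR = 365.25

def hohmann_transfer_days(r1_au: float, r2_au: float) -> float:
    """Estimate one-way Hohmann transfer time between two circular orbits.

    The transfer ellipse touches both orbits, so its semi-major axis is
    the average of the two radii. Kepler's third law for orbits around
    the Sun (T in years, a in AU) gives T = a ** 1.5, and the transfer
    covers half of that ellipse.
    """
    semi_major_axis = (r1_au + r2_au) / 2.0
    period_years = semi_major_axis ** 1.5
    return 0.5 * period_years * DAYS_PER_YEAR

days = hohmann_transfer_days(R_EARTH_AU, R_MARS_AU)
print(f"Estimated Earth to Mars transfer time: {days:.0f} days")
```

Under these simplifying assumptions the estimate comes out to roughly eight and a half months each way. A real mission plan would also have to account for orbital eccentricity, inclination, and launch windows.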
Xai unveiled Deep Search, a next-generation search engine powered by Grok 3:
"Deep Search... really help you to understand the universe."
– JB Pa [24:15]
Grok 3 and its advanced features are set for public release with a phased approach:
"We are launching an AI gaming studio at Xai... We're launching it tonight."
– Elon Musk [23:53]
In an engaging Q&A session, Elon and the team addressed various audience inquiries:
Voice Assistant:
"As soon as possible... just a little bit of polishing away from being released to everybody."
– Igor [33:48]
API Integration:
"Grok3 API with both the reasoning models and deep search is coming your way in the coming weeks."
– JB Pa [34:04]
Conversation Memory:
"Absolutely, we're working on it right now."
– Igor [36:15]
Customization and Personalization:
"You can have one Grok or many Groks... people will probably have more than one."
– Elon Musk [36:38]
Open Sourcing:
"Once Grok3 is mature and stable, we'll open source Grok 2."
– Elon Musk [36:59]
Future Challenges and Developments: Discussed the complexities of maintaining the massive GPU cluster and plans for scaling beyond current capabilities.
"We've already started work on the next cluster which will be about five times the power."
– Elon Musk [43:03]
Elon Musk concluded the episode by emphasizing the relentless pursuit of innovation and excellence:
"We will improve it rapidly almost every day. In fact, every day I think it'll get better."
– Elon Musk [32:50]
The episode highlighted Grok 3's groundbreaking advancements, Xai's strategic infrastructure developments, and the promising future of AI-driven solutions in understanding and interacting with the universe.
Notable Quotes:
"In order to understand the nature of the universe, you must absolutely, rigorously pursue truth or you will not understand the universe."
– Elon Musk [00:28]
"The issue is that the power fluctuations for a GPU cluster are dramatic."
– Elon Musk [38:44]
"Grok is whatever you want it to be... people are going to fall in love with Grok."
– Elon Musk [35:27]
Final Thoughts:
This episode of Elon Musk Thinking provides an in-depth look into the monumental efforts behind Grok 3, showcasing not only technological prowess but also visionary leadership aimed at unlocking the mysteries of the universe. For enthusiasts and skeptics alike, Grok 3 represents a significant leap forward in artificial intelligence, setting the stage for future innovations that blend creative problem-solving with profound scientific inquiry.