Elon's xAI Grok Gets 2X Faster After Devs Re-Write Code in 3 Days - The AI Podcast

Summary

The AI Podcast: Episode Summary
Title: Elon's xAI Grok Gets 2X Faster After Devs Re-Write Code in 3 Days
Release Date: November 19, 2024
Host: The AI Podcast

Introduction to xAI and Grok

In this episode of The AI Podcast, the host discusses recent advances at xAI, Elon Musk's artificial intelligence venture. The host acknowledges Musk's track record of building impactful companies while admitting initial skepticism about xAI’s timing in the competitive AI landscape.

"I just felt like xAI was coming a little bit late to the party. It felt like Elon Musk has kind of missed his boat with OpenAI and kind of stepping away from that right before it went parabolic."
— Host [00:02]

Despite these reservations, the host reveals a growing admiration for the progress made by xAI's flagship AI model, Grok, emphasizing the significant enhancements introduced in the latest iteration, Grok 2.

Grok 2: Doubling the Speed

The core focus of the episode is the speed improvement in the Grok 2 models. The host explains that within three days, two xAI developers rewrote the inference stack from scratch, making Grok 2 mini twice as fast as it had been before the rewrite. The same rewrite also allowed xAI to serve the larger Grok 2 model, which requires multi-host inference, at a reasonable speed.

"Grok 2 mini is now 2 times faster than it was yesterday. In the last 3 days, Lianmin Zheng and Saeed Maleki rewrote the inference stack from scratch using SGLang."
— Igor Babuschkin, in a post on X read by the host [04:15]

This swift enhancement not only boosts the model's speed but also slightly improves its accuracy, marking a substantial achievement without necessitating a complete retraining of the AI model.
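
To make the "2 times faster" claim concrete, here is a minimal sketch of how a serving speedup translates into waiting time for a response. The throughput figure is a hypothetical placeholder, not a number reported by xAI or the host, and the function names are illustrative only.

```python
def speedup(baseline_tps: float, new_tps: float) -> float:
    """Ratio of new to baseline throughput (tokens per second)."""
    if baseline_tps <= 0 or new_tps <= 0:
        raise ValueError("throughput must be positive")
    return new_tps / baseline_tps


def seconds_to_generate(num_tokens: int, tps: float) -> float:
    """Time needed to emit num_tokens at a steady tokens-per-second rate."""
    if tps <= 0:
        raise ValueError("throughput must be positive")
    return num_tokens / tps


# Hypothetical baseline: 50 tokens/second before the rewrite (assumption).
baseline_tps = 50.0
new_tps = baseline_tps * 2  # the "2 times faster" figure from the episode

print(speedup(baseline_tps, new_tps))          # 2.0
print(seconds_to_generate(500, baseline_tps))  # 10.0
print(seconds_to_generate(500, new_tps))       # 5.0
```

Under these assumed numbers, a 500-token answer that took 10 seconds would take 5 seconds after the rewrite, with the model weights unchanged.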

The Role of SGLang in Code Optimization

A significant portion of the episode is dedicated to SGLang, the open-source system xAI's developers used to achieve these performance gains. Released under the Apache 2 license, SGLang is described as a highly efficient system for executing complex language model programs. According to the host, it was developed by researchers at Stanford University, UC Berkeley, Texas A&M University, and Shanghai Jiao Tong University.

"SGLang can get up to 6.4 times higher throughput than existing systems. For Grok 2, that was able to get it two times faster, which is quite impressive."
— Host [08:30]

The adoption of SGLang by developers Lianmin Zheng and Saeed Maleki shows how rewriting the serving layer on a more efficient system can yield significant performance improvements without retraining the model.

Impact on AI Efficiency and Development

The host underscores the broader implications of xAI’s achievement, suggesting that similar efficiency gains could be achieved for other AI models without the costs typically associated with model retraining and infrastructure scaling. The episode presents this as a viable pathway for AI companies to improve their offerings quickly and cost-effectively.

"We're able to use technology to make these AI models more efficient and faster without having to do like a whole retrain and spend millions of dollars."
— Host [09:45]

This approach not only accelerates development cycles but also democratizes access to high-performance AI by reducing the barriers to entry related to financial and technical resources.

Broader Implications and Adoption of SGLang

Beyond Grok 2, the host mentions that SGLang currently supports a variety of other models, including Llama, Mistral, and LLaVA, and that it works with both open-weight and API-based models. The host sees this as a sign that the system could be adopted more widely in the AI community.

"There’s Llama, Mistral, and LLaVA, which are all compatible with open weight and API-based models. So this is going to be very interesting to see who adopts this next and how this goes."
— Host [12:00]

The host suggests SGLang could become a widely used tool in AI development, given the performance gains it delivered for Grok 2 and its support for multiple models.

Grok 2's Performance and User Reception

The episode also touches upon Grok 2's standing in the competitive AI market. According to the LMSYS Chatbot Arena Leaderboard, an independent, third-party ranking based on user votes, Grok 2 has secured the number two position with a score of 1293 from over 6,000 votes, trailing only ChatGPT-4o.

"After all of these upgrades, Grok 2 has now secured the number two spot after ChatGPT-4o with an impressive score of 1293."
— Host [14:20]
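
As a rough illustration of what a leaderboard score like 1293 means, the sketch below converts a score gap into an expected head-to-head win rate. It assumes the arena score behaves like a standard Elo rating on the usual 400-point logistic scale; the rival score of 1250 is a hypothetical value chosen for illustration, not a figure from the episode or the leaderboard.

```python
def expected_win_probability(score_a: float, score_b: float) -> float:
    """Elo-style expected score of model A against model B.

    Uses the standard logistic curve with a 400-point scale:
    equal scores give 0.5, and a 400-point lead gives about 0.91.
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / 400.0))


grok2_score = 1293.0         # score quoted in the episode
hypothetical_rival = 1250.0  # assumption, not a leaderboard value

p = expected_win_probability(grok2_score, hypothetical_rival)
print(f"Expected win rate vs. hypothetical rival: {p:.1%}")
```

Under this assumption, a gap of a few dozen points corresponds to only a modest edge in pairwise votes, which is why closely ranked models can trade places as more votes arrive.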

Additionally, the integration of image generation positions Grok as a versatile tool. The host says their production team now uses Grok’s image generator instead of DALL-E from ChatGPT, because its output looks more photorealistic and less cartoony.

"The images seem photorealistic, so that's fantastic."
— Host [18:10]

Conclusion and Future Outlook

In wrapping up, the host expresses optimism about xAI's trajectory and the potential for further improvements in Grok's performance. The host encourages listeners to follow xAI's developments and to try Grok firsthand, highlighting its relevance and competitiveness in the rapidly evolving AI landscape.

"It's quite an impressive model. I've been quite impressed with it and especially with their new image generator."
— Host [20:35]

The episode serves as an overview of xAI's latest advancements, emphasizing the role of efficient inference systems like SGLang in improving AI performance and efficiency.

Key Takeaways:

xAI made Grok 2 mini 2X faster through a three-day rewrite of its inference stack using SGLang.
SGLang, developed collaboratively by researchers at several universities, is open-source and reported to offer up to 6.4 times higher throughput than existing systems.
Both Grok models also became slightly more accurate, and Grok 2 secured the second spot on an independent chatbot leaderboard.
These developments highlight a scalable and cost-effective approach to enhancing AI models without extensive retraining.

Notable Quotes:

"Grok 2 mini is now 2 times faster than it was yesterday." — Igor Babuschkin [04:15]
"SGLang can get up to 6.4 times higher throughput than existing systems." — Host [08:30]
"After all of these upgrades, Grok 2 has now secured the number two spot after ChatGPT-4o." — Host [14:20]
"It's quite an impressive model. I've been quite impressed with it and especially with their new image generator." — Host [20:35]

This episode illustrates how a fast rewrite on an efficient open-source inference system can drive significant improvements in AI products, positioning xAI and Grok 2 as noteworthy players in the artificial intelligence domain.

Transcript

[00:02]
A
The team over at xAI continues to blast out new AI updates at a blistering pace. I'm actually very impressed. This was a company that, I mean, I would say, like, never underestimate Elon Musk. People love him or hate him, whatever, don't underestimate him. He's built a lot of big companies that have done a lot of impressive things. But at the same time, I just felt like xAI was coming a little bit late to the party. It felt like Elon Musk has kind of missed his boat with OpenAI and kind of stepping away from that right before it went parabolic. And so starting xAI, I wasn't 100% sure where it was going to go. I've been quite impressed with the progress that Grok has made. And this week something very interesting happened, and that is that Grok 2, which is their big model, got a huge speed bump, which is interesting because developers essentially rewrote all the code in three days and then saw some massive speed bumps. I'm going to talk about how this happened, what it's looking like, and how they were able to do this.

Before I get into that, I wanted to mention my podcast course. You might have heard, I launched this about a week ago. Right now I'm trying to get more reviews and testimonials for it. So for this week only, for AI Chat listeners, I'm doing a discount code. Normally this thing's $300. If you're interested in starting a podcast, growing it, all the strategies I use to scale my podcast to over 4 million downloads will be available in that course. So go get it. If you're interested in it, do it this week. I'll obviously increase that in the future, but go get it. The link is in the description and the promo code is AI Chat. The only thing I ask is, if you are going to take advantage of that, to please shoot me over a review or testimonial on LinkedIn if you love the course. So let's talk about Grok.
The thing that I think is really interesting here is that in order to get Grok, it's of course the LLM, it's on X, so you have to have at least the $8 a month subscription, which to be honest is not bad because ChatGPT is $20. So if you just get an X Premium subscription or whatever, you do get Grok and it's essentially the same thing. You know, ChatGPT is slightly better. I won't discount that, but Grok has just added image generation, which is impressive, and now this new speed boost. So, kind of like ChatGPT with GPT mini or, what, 4o mini or whatever, they have Grok 2 and then Grok 2 mini, which is a little bit less powerful but it's faster. So you can use both of those. But this has all been due to something that has happened over the last three days, where some developers at xAI were able to rewrite the inference code stack completely in three days. So a lot of this comes from a tweet over on X by Igor Babuschkin, who posted and said: Grok 2 mini is now 2 times faster than it was yesterday. In the last 3 days Lianmin Zheng and Saeed Maleki rewrote the inference stack from scratch using SGLang. This has also allowed us to serve the big Grok 2 model, which requires multi-host inference, at a reasonable speed. Both models didn't just get faster, but also slightly more accurate. Stay tuned for further speed improvements. What I think is really interesting here is that this is something that a lot of these AI models can do. It's not like they had to retrain the whole model or spend hundreds of millions of dollars to do that. They essentially just rewrote the inference stack using SGLang, something that, you know, virtually, yeah, a lot of different people could do. SGLang is on GitHub, so very, very interesting. So the two developers that did that are Lianmin Zheng and Saeed Maleki, and they were able to use SGLang, which is actually open source, Apache 2 licensed.
It's, you know, a highly efficient system. Essentially what it does is it executes complex language model programs, but it can get up to 6.4 times higher throughput than existing systems. And so for Grok 2 that was able to get it two times faster, which is quite impressive. SGLang was actually developed by a bunch of researchers over at Stanford University, the University of California, Berkeley, Texas A&M University, and then Shanghai Jiao Tong University. So all of these big universities working together, this is something that I love. Essentially we're able to use technology to make these AI models more efficient and faster without having to do like a whole retrain and spend millions of dollars on data centers and GPUs and all the crazy infrastructure that's required. They're just rewriting code, and it's kind of an undertaking. But two developers for Grok did this in three days, which is, I mean, obviously these are superhero developers, but two developers, three days, and this thing's two times as fast. That is absolutely phenomenal and sets an amazing precedent. So great work to the team over there, and I hope to see this rolled out other places. So the same system, though, is currently supporting a bunch of different models. There's Llama, Mistral, and LLaVA, which are all compatible with open weight and API-based models. So this is going to be very interesting to see who adopts this next and how this goes. So the last thing I wanted to bring up is, you know, what are the results? How are Grok 2 and Grok 2 mini doing? Does anyone even use this? So there is a third-party thing called the LMSYS Chatbot Arena Leaderboard, and it's essentially ranking chatbots. People vote on them and rate their performance. It's independent, it's third party. And impressively, after all of these upgrades, Grok 2 has now secured the number two spot after ChatGPT-4o, and they have a really impressive score of 1293.
This is based off of over 6,000 votes of people testing and reviewing this. And so yeah, very, very impressive. They made some big updates, and it seems like this is getting better. The last thing that I wanted to mention is that it didn't just get faster, it also got better. They said that the responses were slightly better, so I thought that was also very interesting. It'll be interesting to see how this changes and if other people adopt this in the future. So definitely a company to keep your eye on, and I'll keep you up to date on everything happening with Grok. If you haven't tried it already, I would highly recommend making an account on X, or, you know, getting it if you don't have one, and giving Grok a try. It's quite an impressive model. I've been quite impressed with it and especially with their new image generator. I think their new image generator is better than what you get out of DALL-E on ChatGPT. So I'm exclusively having my production team use it for thumbnails and things like that instead of DALL-E from ChatGPT, which feels a little too cartoony. The AI generator on Grok, which is an API they have on there, is much better. The images seem photorealistic, so that's fantastic. If you enjoyed the episode today, I would love it if you could leave a review for the podcast. I hope you all have an amazing rest of your day. And I also wanted to mention that if you are interested in ways to make money with AI or start a side hustle with AI, you can join the AI Hustle School community. This is a school community I launched with the co-host of my AI Hustle podcast, Jamie. We record exclusive content every single week talking about different ways that we are personally making money from AI tools. Last week we released an episode, or I guess this week we posted an episode, about the Amazon Influencer program and a new initiative that they're doing. Last year I made $12,500 from this initiative in about three days of making videos.
They had an incentive program and they're running the incentive program again this year. So if you're interested in making money from AI and AI tools, go and join the community. It's $25 a month. We'll increase it to $50 or $100 a month in the future, but for now it's $25 a month. We would love to have you as part of the community. You can get a link in the description to the AI Hustle School community. Hope you all have an amazing rest of your day.