Transcript
A (0:00)
In what I view as an absolutely wild turn of events for AI, Alibaba has come out with a brand new way of generating high quality AI model responses. This isn't something you've heard before; they just dropped a research paper on it, and it's called ZeroSearch. Essentially, it allows an AI model to Google itself without ever touching a real search engine, and it's cutting training costs by about 88%. That's the big headline: this cuts training costs a ton. I expect a lot of AI labs to copy this template, because it's absolutely fascinating. Researchers at Alibaba came up with this, and we're going to be diving into all of it.

Before we do, I wanted to mention that my startup AI Box has officially launched. We have our beta at AI Box AI for our Playground, which lets you use all the top AI models, text, image, and audio, in the same chat for $20 a month, so you don't have to have subscriptions to everything. For $20 a month you can access all the top AI models from Anthropic, OpenAI, Meta, DeepSeek, ElevenLabs for audio, Ideogram and others for image, and you can chat with them all in the same chat. One of the features I love about Playground is the ability to ask a question to a certain model and then rerun the chat with another model. A lot of times I'll get ChatGPT to write a document for me, or help me with an email, or change some wording, and I'm like, I just don't like the tone of that. I rerun it with Claude and get a better result. Or sometimes I want it to be a little edgier, so I rerun it with Grok. You have all the options there, and then there's a tab where you can open all of the responses side by side and compare them to see which one you like best. So if you're interested, check it out at AI Box AI. The link is in the description.

All right, let's get back to what's going on over at Alibaba. This new technique they've unveiled, like I mentioned, is called ZeroSearch, and it lets them develop what they're calling advanced search capabilities. Essentially what they're doing is simulating search result data. You ask a question and it creates a simulated Google results page. When you do a search on Google you get, say, 20 links to websites you could go look at; ZeroSearch instead generates 20 fake, AI-generated documents that it thinks would commonly be shown for that question. Then the AI model runs through them with an algorithm that picks out which ones are high quality and which are low quality, which ones are the best responses, and that's essentially what helps it give you a good answer. This is so fascinating to me. At first I was like, why would they do this? This seems so weird. Why are you generating multiple results? Why do you have another AI model generate them? They're accomplishing a couple of things. Number one, higher quality results. It's kind of like when we came up with chain of thought and told the model to walk through its thought process, and all of a sudden it started giving higher quality results.
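To make that idea concrete, here's a minimal sketch of a simulated search engine in Python. This is not Alibaba's actual ZeroSearch code; the prompt wording, the `llm_generate` callable, and the `useful` flag are hypothetical stand-ins for whatever simulation LLM and prompting the paper actually uses.

```python
# Minimal sketch of the "simulated search engine" idea described above.
# NOT the paper's implementation: prompt text, llm_generate(), and the
# quality flag are all placeholder assumptions.

def simulated_search(llm_generate, query: str, k: int = 5, useful: bool = True) -> list[str]:
    """Ask a simulation LLM to write k fake search-result snippets for a query.

    llm_generate: any callable that takes a prompt string and returns text.
    useful=False asks for deliberately noisy/irrelevant documents, which can
    be used to make training harder over time (a quality curriculum).
    """
    style = "useful, factual" if useful else "noisy, irrelevant"
    prompt = (
        f"You are a search engine. For the query: '{query}', "
        f"write {k} short {style} result snippets, one per line."
    )
    text = llm_generate(prompt)
    # Split the raw completion into individual "search results".
    return [line.strip() for line in text.splitlines() if line.strip()][:k]
```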
This is really cool because it's generating 20 pages, going through and looking at the 20 different results, and determining what the best answer is. It's generating the same kind of thing 20 times over, so you're getting better responses there. But the other interesting thing they're saying is that this replaces having an expensive API to Google Search. Google Search gives you an API, and if you want to train an AI model against, you know, all the data on the Internet, you grab the Google API, run your queries through it, and train your model off all that content. But that's really expensive, and you're paying Google a ton of money for it. So they've essentially replaced that Google API with synthetic data. It sounds crazy, it sounds impossible, but it's not actually that far off. And the interesting thing is that these AI models already have pretty much all of the data on the whole Internet. They've already slurped up all the data from Wikipedia and every dataset they could grab, so they really have all the responses already. If they've already gone and scraped everything from Google, they don't need to re-scrape it again just because they're doing a new model training run. They can use synthetic data from an existing model to create new data to train on. It sounds kind of crazy, but this is what they said specifically about it: "Reinforcement learning training requires frequent rollouts, potentially involving hundreds of thousands of search requests, which incur substantial API expenses and severely constrain scalability. To address these challenges, we introduce ZeroSearch, a reinforcement learning framework that incentivizes the search capabilities of LLMs without interacting with real search engines." This is just so fascinating to me, such an interesting concept. And what they found while doing this is that it actually outperforms Google. They also said: "Our key insight is that LLMs have acquired extensive world knowledge during large-scale pretraining and are capable of generating relevant documents for a given search query. The primary difference between a real search engine and a simulation LLM lies in the textual style of the returned content." Like they mentioned, the models already have all that data from pretraining, and when they actually go to train, they don't want to query Google again and pay all that money all over again. So how good is the quality of the output? That was my big question, and I was blown away. They ran experiments on seven different question-answer datasets, and ZeroSearch, their new method, not only matched but often beat the performance of a model trained with real search engine data. They have a 7 billion parameter retrieval model, which is not very big, and it achieved the same performance as a real Google search. When you go do a search on Google, the combined quality of the information in those first 20 links was the same quality as what the 7 billion parameter model could produce.
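Here's a rough skeleton of where that simulated search engine slots into the reinforcement learning loop the quote describes. Again, this is a hedged sketch rather than the paper's code: `policy_answer`, the exact-match reward, and the trajectory format are placeholders, and it reuses the `simulated_search` helper from the sketch above. The point it illustrates is that the inner rollout loop never hits a paid search API.

```python
# Placeholder skeleton of an RL rollout loop with a simulated search engine.
# Not Alibaba's training code: policy, reward, and update step are stand-ins.

def exact_match_reward(answer: str, gold: str) -> float:
    """Toy reward: 1.0 if the predicted answer matches the gold answer."""
    return 1.0 if answer.strip().lower() == gold.strip().lower() else 0.0

def collect_rollouts(policy_answer, llm_generate, qa_pairs):
    """Collect one batch of rollouts.

    policy_answer: callable (question, docs) -> answer string, i.e. the model
    being trained. The key detail is that retrieval goes through
    simulated_search() (just another LLM), so hundreds of thousands of
    rollouts cost GPU time instead of per-query API fees.
    """
    trajectories = []
    for question, gold in qa_pairs:
        docs = simulated_search(llm_generate, question, k=5)
        answer = policy_answer(question, docs)
        reward = exact_match_reward(answer, gold)
        trajectories.append((question, docs, answer, reward))
    return trajectories  # would then feed a PPO/GRPO-style policy update
```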
So that's the smaller model. Then they bumped it up to a 14 billion parameter model, which still isn't huge. I think Meta's biggest is something like a 400 or 500 billion parameter model, so there are way bigger models out there. But their 14 billion parameter model actually outperformed the real Google search. So at 7 billion parameters the simulation LLM was on par with Google Search, and at 14 billion parameters it was better. And the cost savings are absolutely huge. About 64,000 search queries through Google Search's API would cost them about $586. When they instead simulate the search with their 14 billion parameter model running on A100 GPUs, it costs about $70. So $586 down to $70 on that training run, which is an 88% reduction. In their paper they said, quote, "This demonstrates the feasibility of using a well-trained LLM as a substitute for real search engines in reinforcement learning setups." And I would argue we'll get to the point where it replaces search engines altogether, in a literal way. We're seeing ChatGPT pretty much do this already; people are just using ChatGPT instead of Google. I think the need for Google will eventually be gone, because all the data on Google has been sucked into these models, and as they get better and better at surfacing that data without hallucinating, Google as we know it won't really need to exist and send people off to other places. Now I know what you're thinking: how could you possibly replace Google? There's all this new information coming out. This article, for example, is new information that isn't in their model but is on Google. And I think there's always going to be a place for quote-unquote news, for new information. You're probably going to need an API to wherever that news breaks, which is social media. Facebook, of course, is completely locked down, so that's off the table for everyone except Meta, which has access to it. Then you have something like Twitter or Reddit. I think Twitter especially, because it has a lot of firsthand journalism and video, so the Twitter, slash X, whatever you want to call it, dataset is incredibly valuable. And I think Grok is going to do very, very well in this new world. They could essentially create their own search engine that ties into the information on X and links out to news articles and other things, so they really have everything you need. And then news articles are the other thing you want. You can see OpenAI is obviously aware of this, because they're making all these deals with Axel Springer and all these different news organizations to get their data. Journalists writing all these new articles is great, but oftentimes they're grabbing it from Twitter anyway. So a Twitter-plus-news combo tied to an LLM means you essentially don't need Google anymore. You don't need that API; you can run without it. Companies like Meta that have access to Facebook are probably good to go on their own, because users are sharing news there, so they can grab what's trending and add it to their LLM. Boom, they're good to go. And then of course Twitter, where a lot of stuff is uploaded firsthand, should be good too.
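For the cost comparison above, the arithmetic works out like this; the dollar figures are the ones quoted in the episode, and the per-query pricing is not broken out here.

```python
# Back-of-the-envelope version of the cost comparison quoted above.
# Dollar figures are as cited; everything else is just arithmetic.

google_api_cost = 586   # ~64,000 queries through the real search API (USD)
simulated_cost = 70     # 14B simulation LLM running on A100 GPUs (USD)

reduction = 1 - simulated_cost / google_api_cost
print(f"Cost reduction: {reduction:.0%}")   # -> 88%
```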
Reddit could maybe even make a play here too, or they'll keep licensing their data to Google, so that partnership is probably going to be between Reddit and Google. But this is fascinating. This is completely shifting the way we look at information, for better or for worse, because I'm sure tons of people whose websites have been scraped, and whose information is no longer needed now that it's baked into these models, are unhappy about it. So it's going to be interesting to see where this goes. I've been blown away by the cost savings, and blown away by the way they're able to outperform Google on this. It's a very, very interesting technique coming out of Alibaba, a fascinating new training concept. Thank you so much for tuning into the podcast today. If you enjoyed it, make sure to leave a rating and a review. And if you're looking for a way to cut down on 20 different subscriptions to different AI models, check out AI Box AI. We have a ton of exciting new features coming soon, and you get access to the top 30 AI models for $20 a month. A ton of fun. Thank you so much for tuning in, and I will catch you next time.
