Transcript
A (0:00)
Welcome to the podcast, I'm your host, Jaden Schaefer. Today on the show I want to talk about a bunch of AI advancements that have been coming out of Google. Number one, they just released Gemini 3.1 Pro. This is a huge upgrade to their flagship model, and it's breaking high scores on a bunch of different benchmarks. So I want to break that down, including some really important stuff for AI agents. They've also had a big YouTube TV expansion where they're putting a bunch of AI into that program as well. I love Google because anytime they update their base Gemini model, you get to see a whole bunch of implications across all of their products, because Gemini is built into Gmail, YouTube, Google Drive, and every other Google product, and there are so many of them. So I'm excited to get into everything going on with Google. Before we do, if you want to try all of the latest models from OpenAI, Anthropic, Google, Grok, ElevenLabs for audio, and tons of other cool image models, go check out AI Box AI. We have a platform with a playground, so it's like talking to ChatGPT, but you get access to over 50 of the top AI models for $8.99 a month. And if you want to vibe code different AI apps and tools and you're not a developer, we have an AI builder that we've also just launched. I have painstakingly redesigned the entire platform from scratch over the last few weeks, and we've rolled this out, so the whole thing is different. If you've tried it in the past, we've tried to simplify it dramatically and make this an amazing tool for non-developers like myself to create incredible tools. So go check it out at AI Box AI. The whole reskin is officially live on the product. Okay, let's talk about what's going down with Google. Right now I think they're really trying to speed up how fast they push out updates to their AI models.
There's a bunch of different places where they're doing this, but of course you have the baseline Gemini model getting a lot better. We just saw them roll out music inside of Gemini, plus a whole bunch of other cool upgrades like that. The one thing that I will say: this is obviously the latest from their flagship LLM, but it is not an actual general release that everyone can try right now. It's a preview that they put out, so certain people, academics and people testing it on benchmarks, can try it out, but it's not generally available to everyone. A lot of people who are trying it right now are saying it's a big upgrade. But if I'm being 100% transparent, Google, and really all of the AI model companies, usually give early access to people who are big fans and are quite positive and supportive, right? Like if Google came to me and said, hey, we'll let you test out Gemini 3.1 for a week before everyone else gets it, let us know what you think, there'd be pretty strong pressure for me to say good things about it, and if I didn't, I might not get early access next time. So I'm not trying to say Google is cooking the books in any way, shape, or form on the benchmarks, but this is the reality of the situation. So they have a preview release out, and a lot of people are excited about it and saying it's a big upgrade. The one thing I will give Google a lot of credit for is that the last version, Gemini 3, came out in November. Here we are in February; that's not far at all, and we already have the next model out. Even if it's only in testing and not fully released to everyone, it is out on the market.
And so I'm really excited by that, and I'm stoked that Google has been pushing so hard and has made a big upgrade on speed specifically. Now the other thing that I think is funny, which is completely unimportant, is the naming convention. I appreciate that they're going with Gemini 3.1, kind of like OpenAI is doing with GPT 5.1 and 5.2, because these aren't completely newly trained models; they've just fine-tuned it. Some people will ask, well, why are they having to fine-tune it? Why don't they just do that before they release it? But I appreciate getting Gemini 3 Pro out, getting to play with it for a couple of months, and then having them come up with all of their tweaks. A lot of these tweaks are like software integrations, these little upgrades. If you've noticed with ChatGPT recently, when you ask it some sort of math question, it actually pulls up a literal calculator inside of ChatGPT, where it computes your question for you and shows you the answer on a calculator. Those are the kind of nice things you'd see in, for example, GPT 5.2, when that calculator feature came out. And what's nice is that once GPT 5.3 comes out, or even GPT 6, that calculator tool is already built in. So what's exciting to me is that when they make these incremental updates, the 3.1, 3.2, 3.3, all of the little features and nice-to-haves they're building in get rolled over when the whole model gets an entire overhaul. That's what I'm excited about. They were sharing a bunch of results from some independent evaluations, a bunch of the benchmarks, especially Humanity's Last Exam. It feels like the AI models cooked a lot of the older benchmarks and just beat them.
They basically weren't hard enough or built well enough for the AI models, and so now we've come up with some more challenging ones. One of those is Humanity's Last Exam. Gemini 3.1 Pro outperformed Gemini 3, and obviously if it didn't, I don't think they'd be releasing it to us, but it did so by a huge margin. The model is also climbing a bunch of real-world performance leaderboards, which I think is actually the most important part. Anytime these AI companies test their own model on a benchmark, it feels like they could be cheating or being scammy in some way. And I don't mean to be the pot calling the kettle black, but I feel like Anthropic, Google, and OpenAI have all been caught doing some form of this over the last few years. So I don't put as much stock in those screenshots where a company says, we scored 72% on this exam, when it turns out they skipped a couple of the questions they probably wouldn't have done well on anyway. I'm not saying that's Google, but there is an AI company that has done this. So when it comes to these companies testing themselves, I trust them a lot less than the real-world leaderboards. Some of those leaderboards work by showing side-by-side comparisons of one model's response versus another's and having people vote blindly on which response they like better. When a new model really starts crushing it on that type of leaderboard, I take stock in that, because these are actual people blind-testing and saying the model is better. So that's great. One of these real-world leaderboards comes from a company called Mercor. Their CEO, Brendan Foody, was posting about this.
He says that Gemini 3.1 Pro is now number one on their leaderboard, the Apex Agents Leaderboard. It's basically a benchmark designed to measure how well AI systems handle professional knowledge-based tasks. And he says this basically shows how quickly the model can move into a lot of the systems that agents are using to improve real work. What's interesting to me is that it feels like Google is putting a lot of stock in this knowledge-based-tasks field. They're doing a lot with education, and it seems like it's paying off in the benchmarks with this release. This is obviously really heating up the competition. OpenAI, Anthropic, everyone is rolling out new systems, and it feels like they're only months apart. The other exciting update from Google is that they are expanding where their AI shows up. On the consumer side, YouTube is bringing the latest Gemini AI assistant to smart TVs, gaming consoles, and streaming devices. Previously, this Gemini assistant was basically just on mobile and web as an experimental feature, and they've now added the ability for viewers to ask questions about what they're watching directly from their TV, which is honestly kind of cool. Obviously you could see some of these features on YouTube on your phone or your computer, but now you're going to start seeing them from your game console, your TV, or your streaming device. So what's cool is users can just click to ask; there's an ask button next to the assistant, and if they're watching something, they can ask questions. I would imagine this is useful for a TV show. Maybe you missed a couple of episodes and want to get the backlog. Maybe you fell asleep for 10 minutes and don't want to rewind. Or maybe you really just don't get it.
Because sometimes I feel like when I'm watching a show with my wife, she understands at least 20% more about what's going on than I do. And maybe that's just a me problem. But I do think it would be useful, rather than having to whisper to her, wait, what did they say about that? and then getting shushed because she's trying to listen to what they're currently saying, to just be able to ask on the side. I don't know, maybe this use case is only useful for me. In any case, you can basically use your remote's microphone to ask questions, and about more than just TV shows. One of the suggestions they gave was to ask about the recipe ingredients in a cooking video you're watching, or to get the meaning behind song lyrics you're listening to. I will say that this feature is currently only available to users who are 18 and older, and it supports English, Hindi, Spanish, Portuguese, and Korean. I think right now YouTube is really trying to dominate and become the biggest screen in your house. They're trying to have you watch everything on YouTube, so they're adding as many features as they can. According to a Nielsen report that came out last April, YouTube is now 12% of all television viewing time, beating both Disney and Netflix, which is quite interesting. There are a lot of rivals trying to get into this AI conversation in home entertainment. Amazon, of course, has Alexa+ on Fire TV, which lets you do a lot of similar things, like having conversations about shows, actors, and scenes. Roku has upgraded their voice assistant to handle open-ended questions about movies. Netflix is also testing its own AI-powered search experience. So beyond conversational prompts, YouTube is also layering AI into other parts of their TV ecosystem.
They had a recent feature that basically automatically enhances lower-resolution uploads to full HD. This is really fascinating to me. I know this costs them a lot of compute, but personally I think it's a great feature. Why? Sometimes when you're watching a news clip or some sort of live world event, which is something YouTube is trying to become more and more popular for, those clips are taken on low-quality cell phones; people in other parts of the world may have less high-end phones. They're uploading clips, and if you could get those auto-enhanced, which AI is pretty good at, it could make them better, more digestible, and easier for people to watch. So I think that's kind of cool. They also have a comment summarizer, and they have an AI search result carousel that's trying to help people navigate content more efficiently. In January, YouTube also announced that creators are going to be able to generate Shorts using AI-created versions of their own likeness. And last week YouTube launched a dedicated app for the Apple Vision Pro, which lets users watch content on a theater-sized virtual screen. So taking everything together, Google is definitely firing on all cylinders. Gemini 3.1 Pro is doing incredibly well on the benchmarks, YouTube is expanding their AI footprint, and I think this shows that Google has a really broad strategy. They're trying to build state-of-the-art foundational models, and on top of that, they're trying to deploy them across all of their different consumer platforms. And so I think they have the power to be the winner in AI. Even though it feels like OpenAI came out of the gate ahead of time and had a big lead, I feel like Google is going to be either slightly ahead of OpenAI, slightly behind them, or tied with them.
Either way, Google is a top leader and you can't count them out. They just have too big of a footprint when it comes to consumer software. We love all of the software they create, and now that they've embedded their AI into it, I think Google is going to stay at the front. I love the competition in the market. I don't want OpenAI to run away with it, I don't want Anthropic to run away with it, and I don't want Google to run away with it. So I appreciate that we have a lot of really solid companies competing at a high level. I'm excited to see everything I'm able to do with Gemini 3.1 once it's out, and I'll let you guys know and keep you up to date. Now if you want to try it once it's publicly available, along with all of the other Google Gemini, Anthropic, and OpenAI models, you can test them all side by side, get rid of all of your subscription plans, and keep everything in one place. Go check out AI Box AI: one platform with access to all of your AI models for $8.99 a month. I'll leave a link in the description. Thanks so much for tuning in, and I'll catch you in the next episode.
