
A
It feels like every time you check the news there's another AI update.
B
Right. It's hard to keep up.
A
Yeah. Like you're constantly worried you're going to miss something big.
B
Yeah, exactly.
A
But, well, we're here for you because today we're doing a deep dive.
B
That's right.
A
To help cut through all that noise and get to the like, the most important AI stuff.
B
The real meat. Yeah. So you can like stay, stay up to date but not get totally overwhelmed.
A
Yeah, exactly. So to do this, we've got the latest AI Deep Dive newsletter. So we're really going to try to bring together the most important stuff from these.
B
That's right.
A
So there's four main things we're going to cover.
B
Okay.
A
Google's got like a brand new AI Reasoning Model. ChatGPT has upgraded its image generation, which is a really big deal.
B
Huge.
A
Microsoft is adding AI powered research tools to Copilot.
B
Yep.
A
And then Earth AI is using its algorithms to discover critical minerals.
B
That's a fun one.
A
Yeah, that is a fun one.
B
Yeah.
A
Okay, so let's jump right into Google's announcement. Okay, so they've unveiled this next gen family of AI reasoning models, the Gemini 2.5 family.
B
2.5.
A
But the really key thing here is they can pause to think before answering questions.
B
Yeah, that's really the big story here.
A
Yeah. What do you think about that?
B
Well, it's definitely a different approach from how they've done it in the past. So the first model they're releasing is Gemini 2.5 Pro Experimental.
A
Okay.
B
Google's calling it their like, quote, most intelligent model yet.
A
Oh, wow.
B
And it's also multimodal. Multimodal, meaning it can handle text, images and more.
A
So it's available like right away?
B
Yeah, you can get it right now through Google AI Studio so developers can start playing around with it.
A
Cool.
B
And it's also available for anyone who subscribes to their Gemini Advanced plan.
A
Okay, and how much does that cost?
B
20 bucks a month.
A
Okay.
B
Pretty affordable.
A
Yeah. And it sounds like this isn't just like a one time experiment for them.
B
No, no, not at all.
A
Google's saying that all their AI models are going to have reasoning capabilities.
B
Yeah. They want this to be like a core part of how their AIs work.
A
That's really interesting.
B
It is. And the AI Deep Dive newsletter pointed out that this whole push for reasoning models really started after OpenAI released their o1 model.
A
Oh, right, right.
B
Back in September 2024.
A
Yeah, that was.
B
It was.
A
So if you look around the industry, there are a bunch of other companies doing the same thing.
B
Oh, yeah, definitely. Anthropic, DeepSeek, xAI, they're all working on these reasoning models.
A
So what is it that makes these reasoning techniques so valuable?
B
Well, basically, they allow the AI to, like, take more time and use more resources to think through problems. So ideally, they can give you better, more accurate answers.
A
So it's almost like we used to do in math class, you know, like, show your work.
B
Yeah, exactly. Show your work.
A
So the AI has to kind of, like, show its work?
B
Yeah, they have to show how they got to the answer.
A
Yeah. And the newsletter said that this is, like, especially good for math problems.
B
Yeah, it really shines in those kinds of complex tasks.
A
And coding.
B
Coding, too. Yeah. And a lot of people think that these models are going to be really important for developing, like, fully autonomous AI agents.
A
Okay, now explain that to me.
B
So an AI agent is basically a system that can handle tasks on its own without much human intervention. Like, imagine a robot that can do your laundry, fold it, and put it away all by itself.
A
Okay, so how does that relate to these reasoning models?
B
Well, for an AI to be truly autonomous, it needs to be able to think for itself.
A
Right.
B
Make decisions, solve problems.
A
Okay.
B
And that's exactly what these reasoning models are good at.
A
So basically, we're trying to teach the AI to think like we do?
B
In a way, yes.
A
Okay. But, you know, this all sounds great.
B
It does.
A
But I imagine there's a downside.
B
Well, yeah, there is.
A
Like, what?
B
Running these models takes a lot of computing power, which means it's more expensive.
A
So is it, like, a lot more expensive?
B
It can be, yeah. It depends on how complex the model is.
A
But it sounds like it's worth it.
B
Well, it depends on what you need it for.
A
Okay.
B
If you need an AI that can handle really complex tasks, then, yeah, it's probably worth the extra cost.
A
Okay, so back to Google.
B
Yeah.
A
This Gemini 2.5 Pro is like, their big attempt to beat OpenAI, right?
B
Yeah, it seems like it.
A
They had that thinking version of Gemini back in December, right? Right. But this is, like, their big push.
B
It seems that way, yeah.
A
So Google's actually released some performance results.
B
They have. Yeah.
A
So on this benchmark called Aider Polyglot, which tests how well an AI can edit code, Gemini 2.5 Pro scored 68.6%. And they say that's better than OpenAI, Anthropic, and DeepSeek.
B
That's pretty impressive.
A
Yeah, it is. But on another benchmark called SWE-bench Verified, which is more about, like, overall software development, Gemini 2.5 Pro did better than OpenAI's o3-mini and DeepSeek's R1, but Anthropic's Claude 3.7 Sonnet actually did better.
B
Interesting.
A
Yeah. So it's not like, the best across the board.
B
Right.
A
But it's good at certain things.
B
Yeah, it's definitely up there with the top models.
A
Then there's this benchmark called Humanity's Last Exam.
B
What?
A
Yeah.
B
What is that?
A
I know, it sounds kind of ominous.
B
It does.
A
It's like a multimodal test with thousands of questions on all kinds of subjects.
B
Wow.
A
Math, humanities, science, you name it.
B
And how did Gemini do on this?
A
Well, Google says it got almost 19%.
B
Okay.
A
And that's apparently better than most other leading models.
B
So it's good at a lot of different things.
A
Yeah, it seems like it. Okay, now this part really blew me away. Okay, the context window for Gemini 2.5 Pro is a million tokens. Wow. The newsletter says that's like 750,000 words.
B
That's longer than Lord of the Rings.
A
I know, it's crazy.
B
That's a lot of information.
A
And they're going to double it.
B
Double it?
A
Yeah, to 2 million tokens.
B
So what does that actually mean?
A
So, basically, the AI can process way more information at once. Okay, so think about it. A researcher could feed an entire research library into the model.
B
Wow.
A
And then ask it to, like, connect the dots.
B
Find things that humans might miss.
A
Exactly.
B
It's pretty amazing.
A
Yeah. It's not just about reading more. It's about being able to handle these complex analytical tasks and stay on topic for these really long interactions.
B
Yeah, that's really important.
A
Okay, one last thing about Gemini.
B
Okay.
A
Google hasn't said how much it's going to cost to use the API.
B
Oh, okay.
A
But they said they'll release that info soon.
B
Okay. So stay tuned.
A
Exactly. Stay tuned. Okay, let's move on to something a little more visual.
B
All right.
A
ChatGPT's image generation just got a huge upgrade. So Sam Altman, OpenAI CEO, announced this himself.
B
Oh, wow.
A
And it's their first major update in over a year.
B
That's a long time.
A
I know. So ChatGPT can now, like, create and modify images directly using their GPT-4o model, which is a big deal because before, GPT-4o was just for text.
B
Right.
A
And they used DALL-E 3 for the images.
B
Yeah.
A
But now it's all integrated.
B
So GPT-4o can do it all now.
A
Exactly. And this is available right now.
B
Really?
A
Yeah, in ChatGPT and Sora for Pro Plan subscribers, which is $200 a month, but they're going to roll it out to everyone else soon.
B
That's good.
A
Yeah, Plus users, free users, and developers using the API.
B
Cool.
A
So OpenAI says that this new GPT-4o image generation produces more accurate and detailed images.
B
Okay.
A
And that it can edit existing images too.
B
Oh, wow.
A
Even ones with people in them.
B
So you can like, change people's clothes or something?
A
Yeah. And do all kinds of transformations and inpainting.
B
Inpainting?
A
Yeah, it's like where you seamlessly fill in or modify parts of an image.
B
Oh, I see it.
A
Yeah. So, you know, it does make you wonder, like, how much data they're using to train these models.
B
That's a good question.
A
So the Wall Street Journal, according to the newsletter, said that OpenAI trained GPT-4o on a mix of publicly available data and proprietary data.
B
Proprietary data?
A
Yeah, like from partnerships.
B
Okay.
A
So they've got a deal with Shutterstock, for example.
B
Oh, I see.
A
But there's not a lot of transparency around training data.
B
Yeah, that's pretty common in the AI industry.
A
Yeah, a lot of companies consider it, like, a competitive advantage, and they're worried about lawsuits.
B
Makes sense.
A
Yeah. Intellectual property lawsuits.
B
Right.
A
So it's good that OpenAI at least, like, acknowledges respecting artists' rights.
B
Yeah, that's important.
A
They say they don't want their models to copy living artists.
B
Okay.
A
And they have a way for creators to opt out of having their work used.
B
Oh, that's good.
A
Yeah. They also respect requests to block their.
B
Web scraping bots, so they're trying to be responsible about it.
A
Yeah, it seems like it. This whole thing reminds me of when Google tried to do native image output with Gemini 2.0 Flash.
B
Yeah, but they had some problems with that, didn't they?
A
Yeah, their guardrails weren't great.
B
So what happened?
A
So the model ended up removing watermarks.
B
Uh oh.
A
And generating copyrighted characters.
B
Oh, no.
A
Yeah, it wasn't good.
B
Yeah. It just shows how difficult it is to control these powerful tools.
A
It really does. Okay, moving on to Microsoft.
B
Okay.
A
They're adding these new AI-powered tools to Microsoft 365 Copilot called Researcher and Analyst.
B
Okay.
A
And they're all about like, deep research.
B
Deep research. Okay.
A
But this is actually part of a bigger trend, you know, with ChatGPT, Gemini and Grok.
B
Right.
A
They're all incorporating these deep research agents.
B
And these agents use those reasoning models.
A
To think through problems more thoroughly.
B
Okay.
A
So Researcher combines OpenAI's deep research model with, like, what they call advanced orchestration and deep search capabilities.
B
So what can you actually do with it?
A
Well, they say you can use it to develop a go-to-market strategy.
B
Okay.
A
Or create a client report.
B
That's pretty impressive.
A
Yeah, it is. Then there's Analyst, which uses OpenAI's o3-mini reasoning model.
B
Okay.
A
And it's specifically for data analysis.
B
Okay.
A
So it uses, like, an iterative approach to problem solving.
B
Iterative.
A
Yeah. So it means it refines its thinking through multiple steps.
B
Oh, I see.
A
Yeah. And it can run Python scripts.
B
For, like, complex data queries.
A
Exactly. And this is important. It shows you its work.
B
It shows you how it got to the answer.
A
Exactly. So you can see the steps it took.
B
That's good for transparency.
A
Exactly.
B
So what makes Microsoft's tools different?
A
Well, the newsletter pointed out that they can access both your work data and the Internet.
B
Oh, wow.
A
So they've got connectors for things like Confluence, ServiceNow, and Salesforce.
B
So it can pull in information from all over the place.
A
Exactly.
B
That's pretty powerful.
A
It is. But there's still the problem of hallucinations.
B
Right. The AI can still make mistakes.
A
Yeah. Even these advanced models, like o3-mini, can get things wrong.
B
So you can't just blindly trust the results.
A
Exactly. You still need human oversight.
B
So is this stuff available now?
A
Well, they're starting a program called Frontier.
B
Okay.
A
For Microsoft 365 Copilot customers. So they can get early access starting in April.
B
So they're kind of testing it out.
A
Yeah. And get feedback before they release it to everyone.
B
That's a good idea.
A
Okay. Last story.
B
Okay.
A
This one's about a startup called Earth AI.
B
Okay.
A
And they're using their algorithms to find critical minerals.
B
Oh, wow.
A
Yeah. In Australia.
B
Interesting.
A
In places that traditional mining companies have missed.
B
Really?
A
Yeah. It's kind of like what KoBold did with copper in Zambia.
B
Oh, right, right.
A
Using AI to find new deposits.
B
So AI is changing the way we find minerals.
A
It really is.
B
That's pretty cool.
A
It is.
B
So what exactly did Earth AI find?
A
So they found copper, cobalt and gold.
B
Wow.
A
In the Northern Territory.
B
Okay.
A
And then silver, molybdenum and tin.
B
Wow.
A
In New South Wales.
B
And where is that?
A
About 310 miles northwest of Sydney. So Earth AI actually came out of this guy's doctoral research. Roman Teslyuk.
B
Okay.
A
Of the University of Sydney.
B
Okay.
A
And he found this national archive with all this mining data.
B
Mining data?
A
Yeah. Going back to the 1970s.
B
Wow.
A
But nobody was really using it.
B
Really?
A
Yeah. So he thought, why not build an algorithm to learn from all this data.
B
To see if they could find new deposits.
A
Exactly.
B
That's a great idea.
A
So Earth AI started out as a software company making these predictive models.
B
Okay.
A
But they had trouble convincing the mining companies to buy in.
B
Oh, why is that?
A
Well, you know, mining's a pretty conservative industry. They like to stick to what they know.
B
I see.
A
So Earth AI decided to build their own drilling equipment.
B
What?
A
Yeah, to prove that their predictions were right.
B
Wow. That's a big step.
A
Yeah, it is.
B
So did it work?
A
Well, they got into Y Combinator.
B
Okay.
A
In spring 2019.
B
Okay.
A
And then they raised $20 million in Series B funding.
B
Wow.
A
In January.
B
So people are starting to believe in them.
A
Yeah, it seems like it. So their algorithms are trained to scan, like, large areas really quickly.
B
Okay.
A
To find these overlooked deposits.
B
So it's a different approach than some other companies are taking.
A
Yeah, it's more about, like, finding the hidden gems.
B
That makes sense.
A
Okay, so that's it for our deep dive today.
B
We covered a lot of ground.
A
We did.
B
But hopefully you feel more informed about the most important AI developments.
A
Exactly.
B
Without feeling overwhelmed.
A
Yeah. So, to recap, we talked about Google's Gemini 2.5, ChatGPT's upgraded image generation, Microsoft's new research tools, and Earth AI's mineral discoveries.
B
It's a lot to think about.
A
It is. So, as AI models get better at reasoning and are used in more fields, how do you think this is going to change our relationship with technology?
B
That's a great question.
A
Yeah. What new possibilities will we see?
B
And challenges, too.
A
Exactly.
B
It's an exciting but also kind of uncertain time.
A
It is.
B
But definitely something to keep an eye on.
AI Deep Dive Podcast Summary
Episode: Google’s Gemini 2.5, ChatGPT’s New Visual AI, and Earth AI is Finding Hidden Minerals
Release Date: March 26, 2025
Host: Daily Deep Dives
In this episode of the AI Deep Dive Podcast, hosted by Daily Deep Dives, the hosts delve into the latest advancements and applications in the field of artificial intelligence. Covering groundbreaking developments from industry giants like Google, OpenAI, and Microsoft, as well as innovative startups such as Earth AI, the episode provides listeners with an in-depth analysis of how AI is reshaping various sectors. The discussion is enriched with insightful quotes and detailed explanations, ensuring that even those unfamiliar with the latest AI trends can grasp the significance of these advancements.
Unveiling Gemini 2.5
The episode opens with the hosts discussing Google’s latest AI reasoning model, Gemini 2.5, introduced at [01:02]. Speaker A notes, “Google's got a brand new AI Reasoning Model. ChatGPT has upgraded its image generation, which is a really big deal” ([00:37]). This model is touted as Google’s most intelligent AI yet, boasting multimodal capabilities that handle text, images, and more ([01:37]).
Key Features and Availability
Gemini 2.5 Pro Experimental, the first model in the family, is available through Google AI Studio and to subscribers of the Gemini Advanced plan at $20 per month ([01:50]). The hosts highlight that Google intends to integrate reasoning capabilities across all its AI models, aiming to make reasoning a core component of their AI systems ([02:03]).
Industry Context and Significance
The push for enhanced reasoning models follows OpenAI’s release of the o1 model in September 2024, sparking a competitive surge among companies like Anthropic, DeepSeek, and xAI ([02:22]). Speaker B elaborates, “Running these models takes a lot of computing power, which means it's more expensive” ([04:01]), addressing the challenges associated with deploying such advanced systems.
Performance Benchmarks
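Google has released performance metrics for Gemini 2.5 Pro:
Aider Polyglot: On this code-editing benchmark, Gemini 2.5 Pro scored 68.6%, which Google says is ahead of models from OpenAI, Anthropic, and DeepSeek.
SWE-bench Verified: On this software development benchmark, it outperformed OpenAI’s o3-mini and DeepSeek’s R1 but trailed Anthropic’s Claude 3.7 Sonnet.
Humanity’s Last Exam: On this multimodal test spanning thousands of questions in math, the humanities, and science, Google reports a score of almost 19%, which it says is better than most other leading models.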
Innovative Context Window
One of the standout features is Gemini 2.5 Pro’s context window of one million tokens ([05:39]). This translates to approximately 750,000 words, allowing the AI to take in far more material in a single session. The hosts express amazement, with Speaker A exclaiming, “So, the AI can process way more information at once” ([06:08]).
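The token-to-word figure quoted in the episode works out to roughly 0.75 words per token. The short sketch below applies that ratio to the current window and to the two-million-token window the hosts mention. The ratio is a rule of thumb taken from the episode's numbers, not a property of Gemini's tokenizer, and the function name is purely illustrative.

```python
# Rule of thumb from the episode: 1,000,000 tokens is about 750,000 words.
# Real ratios vary by tokenizer and language; treat this as an estimate only.
WORDS_PER_TOKEN = 750_000 / 1_000_000  # 0.75, an assumed average


def approx_words(tokens: int) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


if __name__ == "__main__":
    for window in (1_000_000, 2_000_000):
        print(f"{window:,} tokens is roughly {approx_words(window):,} words")
```

Under this assumption, doubling the window to two million tokens would correspond to about 1.5 million words.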
Future Enhancements and API Availability
Google plans to double the context window to two million tokens, enabling even more comprehensive data processing ([06:06]). However, details regarding the API’s pricing remain undisclosed, with Google promising to release this information soon ([06:38]).
Major Update Announcement
Shifting focus to OpenAI’s ChatGPT, the hosts discuss the upgrade to its image generation capabilities announced by CEO Sam Altman ([06:50]). This marks ChatGPT’s first significant update in over a year ([06:59]).
Integration of GPT-4o
ChatGPT can now create and modify images directly with GPT-4o. Previously, GPT-4o handled only text and image generation relied on DALL-E 3, but the update brings both into a single model ([07:03]). The feature is available now in ChatGPT and Sora for Pro plan subscribers at $200 per month and will soon extend to Plus users, free users, and developers using the API ([07:25]).
Capabilities and Ethical Considerations
The enhanced image generation boasts more accurate and detailed visuals and introduces capabilities like inpainting, allowing users to modify specific parts of an image seamlessly ([07:48]). Speaker A raises concerns about the data used for training, noting that OpenAI utilizes a mix of publicly available and proprietary data ([08:09]).
Respecting Intellectual Property
OpenAI has emphasized respecting artists' rights: it says it does not want its models to copy living artists, it lets creators opt out of having their work used, and it honors requests to block its web scraping bots ([08:48]). The hosts tie this to the industry's broader lack of transparency about training data and its concern over intellectual property lawsuits ([08:33]).
Lessons from Google’s Previous Attempts
The hosts draw parallels to Google’s earlier native image output in Gemini 2.0 Flash, whose weak guardrails led to it removing watermarks and generating copyrighted characters ([09:12]). Speaker B cautions, “It just shows how difficult it is to control these powerful tools” ([09:23]).
Introduction of New Tools
Microsoft is expanding its Microsoft 365 Copilot with two new AI-powered tools: Researcher and Analyst ([09:31]). These tools are designed for deep research ([09:40]).
Functionality and Integration
Researcher: Combines OpenAI’s deep research model with advanced orchestration and deep search capabilities. This tool enables tasks such as developing go-to-market strategies and creating client reports ([10:07]).
Analyst: Utilizes OpenAI’s o3-mini reasoning model specifically for data analysis. It employs an iterative approach to problem-solving, can execute Python scripts for complex data queries, and shows the steps it took to reach an answer ([10:21]).
Advantages and Limitations
Microsoft’s tools stand out by accessing both work data and the internet, integrating with platforms like Confluence, ServiceNow, and Salesforce ([10:49]). However, the hosts caution about AI hallucinations, emphasizing the need for human oversight to ensure accuracy ([10:58], [11:05]).
Availability and Feedback
These tools are being introduced through the Frontier program for Microsoft 365 Copilot customers, with early access starting in April ([11:18]). This phased rollout allows Microsoft to gather feedback before a broad release ([11:21]).
Innovative Mineral Discovery
The episode concludes with a spotlight on Earth AI, a startup leveraging AI to discover critical minerals in Australia ([11:34]). Their algorithms have identified deposits of copper, cobalt, gold, silver, molybdenum, and tin in regions previously overlooked by traditional mining companies ([12:04]).
Origin and Development
Earth AI originated from Roman Teslyuk’s doctoral research at the University of Sydney. After finding a largely unused national archive of mining data dating back to the 1970s, Teslyuk developed algorithms to learn from that historical data and predict new mineral deposits ([12:23]).
Overcoming Industry Challenges
Initially facing resistance from the conservative mining industry, Earth AI took a bold step by building their own drilling equipment to validate their predictions. This proactive approach led to their acceptance into Y Combinator in Spring 2019 and subsequent $20 million Series B funding in January ([13:07]).
Impact and Future Prospects
Earth AI’s ability to scan large areas rapidly and identify overlooked mineral deposits exemplifies the transformative potential of AI in traditional industries. Speaker B remarks, “AI is changing the way we find minerals” ([11:59]), highlighting the startup’s role in driving innovation within mining.
The AI Deep Dive Podcast masterfully navigates through significant AI advancements, offering listeners a comprehensive understanding of how these technologies are evolving and impacting various industries. From Google's ambitious Gemini 2.5 and OpenAI’s enhanced visual capabilities to Microsoft’s integrated research tools and Earth AI’s groundbreaking mineral discoveries, the episode underscores the profound influence of AI in shaping our future. As the hosts aptly summarize, “As AI models get better at reasoning and are used in more fields, how do you think this is going to change our relationship with technology?” ([14:02]). This reflection encapsulates the episode’s exploration of both the possibilities and challenges that lie ahead in the ever-evolving landscape of artificial intelligence.
Stay tuned to the AI Deep Dive Podcast for more insights and updates on the rapidly advancing world of artificial intelligence.