
A
All right, welcome to the Deep Dive, where we take a whole pile of, you know, reports and news and try to boil it down to the stuff that really matters. And really matters for you in particular, because you're the kind of person who wants to get a good understanding of these, you know, complex topics quickly and thoroughly without you feeling like you need to read every single article and report out there. And that's really what we try to do. We try to help you get to that point without having to put in all that work. And today we're going to deep dive into some recent AI news, trying to summarize some of the big developments, insights, so that you can kind of stay ahead of the curve without spending hours and hours reading all these different things yourself. The AI landscape is changing so fast, so it really is this constant kind of fire hose of information. So consider this your shortcut to getting the most important stuff.
B
Yeah, I mean, it really can be overwhelming trying to keep up with it all. What I found particularly interesting in this batch of news is that there's just so much activity across all different levels of AI. We've got things happening all the way down to the chip level with new designs and strategies, and then all the way up to, you know, super sophisticated applications in cybersecurity. And even how we measure AI intelligence itself is changing and evolving.
A
Okay, so let's break this down. Let's start with maybe one of the big players making waves in the AI security space, which seems to be Microsoft. They're really pushing this idea of an AI first end to end security platform. What does that actually mean, you know, in practical terms? Like, how does that affect people?
B
Yeah, so they're really focused on two key things. One is using AI to, like you said, make their security offerings better and stronger. And the second is making sure that AI systems themselves are actually secure. And, you know, a big part of this is their introduction of these new AI agents that work within their security copilot platform.
A
Okay, so these AI agents for security, that sounds like a pretty significant kind of step. What do they actually do? Can you paint me a picture of how this actually works in practice?
B
Sure. So think of them as kind of specialized autonomous tools that are designed to tackle very specific security challenges. Right. And when you consider the sheer scale of cyber threats out there, I mean, Microsoft's processing something like 84 trillion security signals every single day, and that includes tens of billions of phishing emails. So you can imagine, I mean, it's just a huge volume of data. It's impossible for human security teams to keep up with all that. And these AI agents are designed to kind of help them out by automating and streamlining some of the key tasks.
A
84 trillion signals, that's an insane number. You know, it's mind boggling. It really highlights the scale of what we're dealing with here. So what are some specific examples then of how these AI agents are being used?
B
Yeah, so let's say, for example, you have the phishing triage agent, which is part of Microsoft Defender. This agent analyzes, you know, those billions of phishing emails that are coming in. It automatically tries to figure out which ones are actually real threats and which ones are just false alarms. Right. So it's like a filter for the human analyst so they can really focus on the things that actually matter. And then you have alert triage agents in Microsoft Purview, and these work in a similar way, but they're focused on data loss prevention and insider risks. So making sure that the really important issues are flagged and addressed quickly.
A
So it's all about using AI to sort of cut through all the clutter and the noise and help these security teams actually focus on what's really important. Right. Especially since they have limited time and resources. What other areas are these AI agents targeting?
B
Yeah, so they're also working on agents that focus on identity management and device security. So for example, you've got the Conditional Access Optimization Agent in Microsoft Entra, which looks at user and application access policies and it tries to identify any gaps where maybe a new user or a new app might not be covered by existing policies. And then it can even suggest fixes. You've also got the Vulnerability Remediation Agent in Microsoft Intune, which helps manage all the software patches, and it kind of identifies vulnerabilities and figures out which ones should be prioritized. And then there's the Threat Intelligence Briefing Agent in Security Copilot. And that's like having a personal intelligence analyst, because it pulls together all the most relevant threat information based on your organization's specific risks, which is pretty cool.
A
It's a lot to keep track of. That's a pretty robust set of AI assistants. When are we actually going to see these things in action?
B
Well, Microsoft's saying that they should be available for preview in April 2025. And actually it's not just Microsoft who's developing these agents. They're working with a bunch of different security vendors too.
A
Yeah, I think I remember seeing something about partner agents or something. Who are these partners? What are they bringing to the table?
B
Yeah, so they got five partners, OneTrust, Aviatrix, BlueVoyant, Tanium and Fletch. And they're all building their own AI agents that will then integrate with Security Copilot. So for instance, OneTrust is working on an agent that's all about making privacy breach responses much more streamlined, you know, helping organizations kind of navigate all those complex regulatory requirements. Then you have Aviatrix and they've built an agent that's going to help with network issue analysis, you know, making it easier to figure out what's wrong with your VPNs and other connections. BlueVoyant's agent is designed to evaluate and optimize your security operations center, basically figuring out how well it's working. And then Tanium's agent is going to give security analysts more context around security alerts, which can be super helpful. And finally Fletch is building an agent that's all about prioritizing threats. So it's really going to help reduce alert fatigue for security teams.
A
It's interesting how they're really like kind of broadening it out and bringing in all these different partners to build these agents. It sounds like a force multiplier for security teams. Beyond the agents though, what other sort of AI driven security improvements did Microsoft talk about?
B
Right, they're also beefing up data security with some new AI powered investigation capabilities within Microsoft Purview. So this involves using AI to do some deep content analysis to really try to understand, you know, the risks associated with sensitive data being exposed. And these AI driven investigations are going to be tied into security incidents in defender and insider risk cases in Purview. So it's all going to be kind of connected together to give a more complete picture. And that's supposed to be in preview in April 2025 as well?
A
Yeah, I mean, with all the sensitive data organizations are handling these days, that kind of deeper AI analysis sounds pretty crucial. So. Okay, let's shift gears now to a different part of the AI landscape. The, the chips, you know, the hardware. I was reading about Ant Group, the Alibaba affiliate and their chip strategy. It sounds like they're taking a pretty interesting approach, especially considering all the global stuff happening with semiconductors.
B
Yeah, the really interesting thing about what Ant Group's doing is that they're apparently using a mix of Chinese and US made semiconductors for their AI development. And this isn't just about saving money. It's a strategic move to reduce their reliance on any one supplier, especially Nvidia, especially with all the geopolitical stuff and restrictions on technology exports to China.
A
That makes perfect sense. Spreading the risk, diversifying the supply chain is always a good idea, especially for these critical components like AI chips. And you know, there was also something in there about a mixture of experts approach. I wasn't really sure what they meant by that.
B
Yeah, so the mixture of experts, or MoE, is kind of like a trend in the industry. Imagine you've got a team of specialized AI models. Each one is really good at a specific part of a complex task. And instead of just using one massive general purpose model, you use these smaller, more efficient models that all work together to accomplish the same goal. Ant Group is saying that by using this MoE approach and using lower cost hardware, in some cases, they've been able to cut their computing costs by something like 20%.
A
So it's like being smart about where you deploy the really expensive chips and using this mixture of experts idea to get more bang for your buck. Who are they actually working with? Which chip makers?
B
So Bloomberg reported that they've been using chips from Alibaba, their parent company. They've also used chips from Huawei. And while they have used Nvidia in the past, it seems like they're shifting more towards AMD and other Chinese chip companies.
A
Yeah, it's pretty interesting how these Chinese companies are finding ways to work around these restrictions. Ant Group seems like they're pretty resourceful. They also had some interesting news about their AI and healthcare, right?
B
That's right. They announced some major upgrades to their AI powered healthcare solutions which are being used by seven big hospitals. These solutions are built on a combination of different AI models, including DeepSeek's R1 and V3, Alibaba's Qwen and Ant's own Bailing model.
A
What are these AI healthcare solutions actually doing? I mean, what are they good at?
B
Well, they're designed to help with a whole bunch of different things, from answering complex medical questions to just streamlining all sorts of patient services so they can help doctors with diagnosis, automate administrative tasks, and ultimately improve the patient experience.
A
Yeah, it sounds like they're taking all these investments in AI infrastructure and actually putting them to work in a really important area. Okay, let's move on to another interesting development in the AI chip space. This story about Furiosa AI and how they turned down this big acquisition offer.
B
Yeah, the South Korean AI chip startup, Furiosa AI, they reportedly turned down an $800 million acquisition offer from Meta, and that's a lot of money. So it definitely makes you wonder about their future plans and how they see their value.
A
Yeah, 800 million. That's not chump change. The article said it wasn't about the money. It was disagreements over what would happen after the acquisition, like the business strategy and the organizational structure. So it sounds like Furiosa AI has a pretty clear vision for what they want to do.
B
It definitely does. And it also tells us something about Meta's motivations. You know, they're investing heavily in large language models and AI infrastructure, so they're trying to reduce their reliance on Nvidia for those crucial training chips. You know, they're developing their own custom AI silicon and they have massive plans for AI investment this coming year. So buying a company like Furiosa AI could have really helped them become less dependent on Nvidia.
A
So Meta saw the value in what Furiosa AI is doing, especially in their expertise. What do you know about Furiosa AI as a company?
B
Well, they were founded back in 2017 by someone who used to work at both Samsung and AMD. So they definitely have a strong background in chip design. And they've already developed two AI chips, Warboy and Renegade, or RNGD. They're trying to be a real competitor in this AI chip market, which is getting pretty crowded, and they're aiming to match the performance and efficiency of companies like Nvidia and AMD.
A
And what about their Renegade or RNGD chips? What are they good at?
B
Yeah, they're supposed to be especially good at handling reasoning models, which are really important for a lot of the cutting edge AI applications. They've been testing them with LG AI Research and Aramco and apparently LG AI Research is planning to use them in their infrastructure. They're aiming for a commercial launch later this year. And they're also in the process of raising about $48 million in additional funding.
A
So it sounds like they're doing their own thing. Even without Meta, it'll be interesting to see how they do. Okay, for our last topic, let's move away from the hardware and talk about a more fundamental question, which is how do we actually measure AI intelligence? It sounds like there's a new benchmark that's making things a lot harder.
B
That's right, the ArcPrize foundation, which was co founded by Francois Chollet, a very well known AI researcher. They've just released this new benchmark called ARC AGI 2. And this test is meant to be a much tougher evaluation of general AI intelligence. And the initial results are showing that even the most advanced AI models are really struggling with it.
A
I saw those scores and it was kind of surprising. Even those top reasoning models like OpenAI's o1 Pro and DeepSeek's R1, they only got scores of 1 to 1.3%. And those powerful non reasoning models like GPT-4.5, those were only around 1%. And humans are getting 60%. That's a pretty huge difference.
B
Yeah, and it really shows just how big the gap is between what current AI systems can do and what we would consider to be true artificial general intelligence. At least according to this benchmark. These ARC AGI tests are basically these visual reasoning puzzles. The AI has to look at patterns and figure out what the next grid in the sequence should be. And the key is that these puzzles are specifically designed to test the AI's ability to adapt and generalize to completely new problems, things they've never seen before. It's not just about recognizing patterns they've already learned.
A
So it's not about memorization or just matching patterns. It's about actually understanding and applying new rules. What's different about this arc AGI2 compared to the first one? What makes it so much harder?
B
Well, Francois Chollet has said that this new version is meant to be a much more accurate measure of real intelligence. And they've done that by making it much harder for the AI to cheat, to get a high score without actually understanding. He calls it brute force, which is basically using a ton of computing power to try every possible solution until they get it right. So in ARC AGI 2, they've added a new efficiency metric. The AI now has to figure out the patterns and find the solutions in a more efficient way, instead of just trying every possible combination till they stumble on the right answer.
A
That makes a lot of sense, because real intelligence is not just about getting the right answer. It's about doing it in a smart and efficient way. Greg Kamradt from the ArcPrize foundation said the core question being asked is not just can AI acquire the skill to solve a task, but also at what efficiency or cost? Which is really interesting.
B
Yeah, it's a really important point. And when you compare it to the old benchmark, it's pretty revealing. OpenAI's o3 (low) model, which is a pretty powerful model, got an incredibly high score of 75.7% on the first version of the test. It even beat humans in some cases. But on ARC AGI 2, it only got about 4%. And it did that while using a crazy amount of computing power, like $200 per task, which is ridiculous.
A
Yeah, you can't just throw more computing power at the problem and expect it to get Smarter. I think the AI community is realizing that they need better ways to measure progress towards AGI. Ways that are harder to cheat.
B
Yeah, definitely. Thomas Wolf, who co founded Hugging Face, he was saying that we just don't have enough good standardized tests to measure things like creativity and adaptability, you know, the real hallmarks of intelligence. So this ARC AGI 2 is a big step in the right direction. And to really push things forward, the ArcPrize foundation has announced a new competition, the ArcPrize 2025. They're challenging developers to reach 85% accuracy on Arc AGI 2, but they can only spend 42 cents per task.
A
Wow, that's a tough one. It'll be interesting to see if anyone can pull that off. It just shows how far we still have to go to really crack this AGI thing.
B
Yeah, it definitely does. We're seeing some amazing progress in specific areas, but when it comes to replicating that broad general intelligence that humans have and doing it efficiently, there's still a long way to go.
A
Well, this has been a really great deep dive into some of the biggest things happening in AI right now, from AI security, which is clearly a huge priority, to all the new strategies for AI chips, and, of course, the ongoing quest to figure out what AGI really is and how to measure it. It feels like there's never a dull moment in this field.
B
Absolutely. And even though it might seem like all these things are separate, they're all connected. Each one is playing a part in shaping the future of AI and how it's going to impact our world.
A
All right, so as we wrap things up, here's a final thought for everyone listening. We're seeing all these advancements. Things are moving so fast, and it's not always easy to understand AI, especially when it comes to intelligence itself. So I want you to think about this. What are the most important ethical and practical things that we should be focusing on right now as individuals and as organizations, as we're navigating this new world of AI? It's something to really think about. Thanks for joining us for this deep dive.
AI Deep Dive: Microsoft’s AI Security Suite, Meta’s Failed Acquisition, and The Toughest AI Benchmark Yet
Episode Release Date: March 25, 2025
Host: Daily Deep Dives
Welcome to a comprehensive summary of the latest episode of the AI Deep Dive podcast, hosted by Daily Deep Dives. In this episode, the hosts delve into three pivotal topics shaping the AI landscape: Microsoft's advancements in AI-driven cybersecurity, Furiosa AI's notable decision to decline an acquisition offer from Meta, and the introduction of a new, stringent AI benchmark by the ArcPrize Foundation. Below, we explore these topics in detail, enriched with direct quotes and insights from the conversation.
Overview: Microsoft is spearheading innovation in AI-driven cybersecurity with its new AI-first, end-to-end security platform. The focus is twofold: enhancing security offerings through AI and ensuring the security of AI systems themselves.
AI Agents in Security Copilot Platform: The podcast delves into the deployment of specialized AI agents within Microsoft's Security Copilot platform.
Functionality: These agents act as autonomous tools targeting specific security challenges. For instance, Microsoft processes approximately 84 trillion security signals daily, including billions of phishing emails—a volume that overwhelms human security teams.
Host A emphasizes the scale:
"84 trillion signals, that's an insane number. You know, it's mind boggling." ([02:39])
Agent Examples:
- Phishing Triage Agent (Microsoft Defender): automatically separates genuine phishing threats from false alarms so analysts can focus on real incidents.
- Alert Triage Agents (Microsoft Purview): prioritize data loss prevention and insider risk alerts.
- Conditional Access Optimization Agent (Microsoft Entra): finds gaps in user and application access policies and suggests fixes.
- Vulnerability Remediation Agent (Microsoft Intune): identifies software vulnerabilities and prioritizes patches.
- Threat Intelligence Briefing Agent (Security Copilot): curates threat intelligence tailored to an organization's specific risks.
Partnerships with Security Vendors: Microsoft collaborates with several partners to expand the capabilities of its AI agents:
Partners Included: OneTrust, Aviatrix, BlueVoyant, Tanium, and Fletch.
Specific Contributions:
- OneTrust: streamlines privacy breach responses and helps organizations navigate complex regulatory requirements.
- Aviatrix: analyzes network issues, such as problems with VPNs and other connections.
- BlueVoyant: evaluates and optimizes security operations centers.
- Tanium: gives security analysts more context around security alerts.
- Fletch: prioritizes threats to reduce alert fatigue for security teams.
Quote on Partnerships:
"It's interesting how they're really like kind of broadening it out and bringing in all these different partners to build these agents. It sounds like a force multiplier for security teams." ([05:38])
Additional AI-Driven Security Enhancements: Microsoft is also enhancing data security with AI-powered investigation capabilities within Microsoft Purview, integrating deep content analysis to assess risks related to sensitive data exposure.
Availability:
These advancements are slated for preview in April 2025, with ongoing collaborations across multiple security vendors.
Overview: Furiosa AI, a South Korean AI chip startup, reportedly declined an $800 million acquisition offer from Meta. This decision underscores Furiosa AI's commitment to its strategic vision despite significant financial incentives.
Background on Furiosa AI: Founded in 2017 by a veteran from Samsung and AMD, Furiosa AI has developed two AI chips, Warboy and Renegade (RNGD), aimed at competing with industry giants like Nvidia and AMD in performance and efficiency.
Quote on Acquisition Decision:
"It wasn't about the money. It was disagreements over what would happen after the acquisition, like the business strategy and the organizational structure." ([09:28])
Meta’s Motivation: Meta's interest in Furiosa AI was driven by its goal to reduce dependency on Nvidia for AI training chips. By potentially acquiring Furiosa AI, Meta aimed to bolster its in-house AI silicon capabilities amidst extensive AI infrastructure investments.
Furiosa AI’s Strategic Moves:
- Its RNGD chips are reportedly well suited to reasoning models and have been tested with LG AI Research and Aramco, with LG AI Research planning to deploy them in its infrastructure.
- The company is targeting a commercial launch later this year and is raising approximately $48 million in additional funding.
Quote on Furiosa AI’s Vision:
“So it sounds like Furiosa AI has a pretty clear vision for what they want to do.” ([09:42])
Overview: Ant Group, an affiliate of Alibaba, is strategically navigating the global semiconductor landscape by diversifying its AI chip sources and adopting innovative approaches to reduce costs and dependency on specific suppliers like Nvidia.
Mixture of Experts (MoE) Approach: Ant Group employs the MoE strategy, which utilizes specialized AI models for different tasks instead of relying on a single, general-purpose model. This method enhances efficiency and reduces computing costs by approximately 20%.
Quote on Chip Strategy:
"By using this MOE approach and using lower cost hardware, in some cases, they've been able to cut their computing costs by something like 20%." ([07:18])
Chip Manufacturers and Partnerships:
- Ant Group has reportedly used chips from Alibaba, its parent company, as well as from Huawei.
- While it has used Nvidia chips in the past, it appears to be shifting toward AMD and domestic Chinese chipmakers.
AI in Healthcare: Ant Group has also enhanced its AI-powered healthcare solutions, deployed in seven major hospitals. These solutions integrate multiple AI models, including DeepSeek's R1 and V3, Alibaba's Qwen, and Ant’s proprietary Bailing model, to assist in medical inquiries, streamline patient services, aid in diagnostics, and automate administrative tasks.
Quote on Healthcare Solutions:
“They're designed to help with a whole bunch of different things, from answering complex medical questions to just streamlining all sorts of patient services so they can help doctors with diagnosis, automate administrative tasks, and ultimately improve the patient experience.” ([08:44])
Overview: The ArcPrize Foundation, co-founded by renowned AI researcher Francois Chollet, has introduced ARC AGI 2—a new benchmark designed to rigorously evaluate general AI intelligence. This benchmark poses greater challenges than its predecessor, emphasizing efficiency and true understanding over mere pattern recognition.
Benchmark Details:
- Tasks are visual reasoning puzzles: the AI must infer the rule behind a sequence of grids and produce the next grid.
- Puzzles are designed to test adaptation and generalization to entirely novel problems, not recall of previously learned patterns.
- A new efficiency metric penalizes brute-force search, requiring solutions to be found at reasonable computational cost.
Quote on Benchmark Design:
“The key is that these puzzles are specifically designed to test the AI's ability to adapt and generalize to completely new problems, things they've never seen before.” ([12:32])
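As a rough illustration of the task format (not an actual ARC AGI 2 puzzle — the rule here is invented and far simpler than real tasks), an ARC-style problem provides a few input-output grid pairs demonstrating a hidden rule, and the solver must infer that rule and apply it to a new grid:

```python
# Two demonstration pairs; the hidden rule is "reverse each row".
train_pairs = [
    ([[1, 0], [2, 3]], [[0, 1], [3, 2]]),
    ([[5, 6], [7, 8]], [[6, 5], [8, 7]]),
]

def apply_rule(grid):
    # Hypothesized rule inferred from the demonstrations: mirror every row.
    return [list(reversed(row)) for row in grid]

# Verify the hypothesis against the demonstrations, then solve a test grid.
assert all(apply_rule(inp) == out for inp, out in train_pairs)
print(apply_rule([[9, 4], [1, 2]]))  # [[4, 9], [2, 1]]
```

The hard part, which this sketch omits entirely, is discovering the rule from scratch for a rule family the model has never encountered — that is what the benchmark is designed to measure.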
Performance Insights:
- Top reasoning models such as OpenAI's o1 Pro and DeepSeek's R1 scored only 1 to 1.3%.
- Powerful non-reasoning models such as GPT-4.5 scored around 1%.
- Humans score roughly 60%, underscoring the gap between current AI systems and general intelligence.
Quote on Benchmark Results:
“When you compare it to the old benchmark, it's pretty revealing. OpenAI's o3 (low) model, which is a pretty powerful model, got an incredibly high score of 75.7% on the first version of the test. It even beat humans in some cases. But on ARC AGI 2, it only got about 4%.” ([13:17])
Community and Future Directions:
Competitions: ArcPrize Foundation has launched ArcPrize 2025, challenging developers to achieve 85% accuracy on ARC AGI 2 while maintaining a cost efficiency of only 42 cents per task.
Industry Perspectives: Leaders like Hugging Face co-founder Thomas Wolf discuss the necessity for standardized tests that measure creativity and adaptability, which ARC AGI 2 aims to address.
Quote on Future Competitions:
“ArcPrize foundation has announced a new competition, the ArcPrize 2025. They're challenging developers to reach 85% accuracy on Arc AGI 2, but they can only spend 42 cents per task.” ([14:11])
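Using only the figures quoted in the episode, a toy "accuracy per dollar" ratio (our own illustrative metric, not one the ArcPrize Foundation defines) makes the efficiency gap concrete:

```python
# Figures as quoted in the episode; the ratio is an illustrative metric only.
entries = {
    "o3 (low) on ARC AGI 2": {"accuracy": 0.04, "cost_per_task": 200.00},
    "ArcPrize 2025 target":  {"accuracy": 0.85, "cost_per_task": 0.42},
}

for name, e in entries.items():
    ratio = e["accuracy"] / e["cost_per_task"]
    print(f"{name}: {ratio:.4f} accuracy per dollar")
```

The target is roughly four orders of magnitude more cost-efficient than o3 (low)'s reported ARC AGI 2 performance, which is exactly the kind of gap the new efficiency requirement is meant to expose.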
The episode of AI Deep Dive underscores the rapid advancements and complex challenges within the AI ecosystem:
AI-Driven Cybersecurity: Microsoft's innovative use of AI agents enhances security operations, offering scalable solutions to manage vast amounts of security data efficiently.
Strategic Industry Moves: Furiosa AI's rejection of Meta's acquisition offer highlights the competitive and strategic maneuvers companies undertake to secure their positions in the AI chip market.
Efficiency and Innovation in AI Hardware: Ant Group's strategic diversification and adoption of the MoE approach demonstrate the importance of flexibility and cost-effectiveness in AI hardware development.
Evolving AI Benchmarks: The introduction of ARC AGI 2 represents a significant step towards more accurately measuring AI's general intelligence and fostering advancements that move beyond superficial pattern recognition.
Final Reflections: Host A leaves listeners with a thought-provoking question on the ethical and practical considerations individuals and organizations must navigate in the evolving AI landscape, emphasizing the importance of staying informed and proactive in addressing the challenges and opportunities presented by AI advancements.
Closing Quote:
“What are the most important ethical and practical things that we should be focusing on right now as individuals and as organizations, as we're navigating this new world of AI?” ([15:27])
This episode of AI Deep Dive provides valuable insights into the current state and future directions of AI, highlighting the interplay between technological innovation, strategic business decisions, and the ongoing quest to understand and measure artificial intelligence's true potential.