
A
Claude Mythos is finally here for us mere mortals. Sort of, mostly, kind of. Well, you tell them, Gavin.
B
Claude Fable is Anthropic's newest AI. That is right. It is one of the Mythos family, but it is not the full Mythos model. It is going to be very good. It is also going to be very expensive and maybe kind of guardrailed.
A
For right now, you never go full Mythos. That is a hard and fast rule. We will tell you everything we know about this just released model as of this recording and try to figure out what our future looks like when we do finally hit AGI. Even though I think we did actually hit it just now.
B
Plus, Apple Intelligence is finally getting intelligent with a brand new Siri that's remarkably called Siri AI and it might actually work this time. We believe that truly helpful AI must be centered around you and your needs. Hey Craig, my needs were last year. I own an iPhone 16 Pro and it's not happening for me. What's up with that?
A
Oh, don't worry, Gavin. That phone was built from the ground up for Apple Intelligence. It just doesn't support Apple Intelligence.
B
Fair enough. We will dive into that and so much more on this week's episode of AI for Humans.
A
For humans.
B
Welcome everybody to the wonderful world of AI. This is AI for Humans, your twice-a-week guide to AI. And Kevin, today we have big, big, big news. On top of the Apple Intelligence stuff, which we will get back to, literally about 30 minutes ago Mythos dropped. It is not called Mythos, or this version of it is not called Mythos. This is Anthropic's new LLM model. They're calling it Fable 5. So not only do we have a 5 in the number...
A
It remembers the choices that you make as you navigate from village to village. Are you a chicken kicker? It knows.
B
By the way, I will say the original Fable video game is a classic. I am very excited for the new Fable. This is a new top-of-the-line model for Anthropic. This is based on their Mythos models, which we have talked about here before and which have not been out yet. This version is available; both Kevin and I have access to it. It has just appeared in my Claude. It should have appeared in your Claude. But Kevin, there are a few things we have to talk about before we even ask it to do anything. First and foremost, we should talk very quickly about the benchmarks, and bring in the Benchmark Boys for the very fast intro here, with a benchmark voice: turn the test up loud, check
A
the charts, make the game. Benchmark boys are clearing the bleachers for this one. This isn't even in the same stadium as other models, if we're just talking about the numbers, Gavin.
B
So numbers-wise, Kevin, I think we are seeing a pretty large jump here from Opus 4.8. I want to be clear, not on everything. So if you remember, whatever it was, three weeks ago Opus 4.8 launched. This was Anthropic's previous state of the art. But the big number that I think is really important to pay attention to is the agentic coding number, and all of the numbers around coding, because we know Anthropic has focused on coding for a bit. And Kevin, the Fable 5 number on agentic coding is 80.3%. So these are all numbers. I understand they don't mean anything until they get practical, but Opus 4.8 was at 69.2%. So that is a pretty large jump within, what is it, like a three-week span, to this set of models. So when you see these numbers without playing with them, what is your first kind of immediate feeling?
A
I'm sorry to everybody that bet on a slowdown on Polymarket.
B
Yes.
A
Like we're not. It's not. It's still not slowing down. There's still clearly plenty of runway left for this takeoff, and the wheels might not even be off the tarmac. If you look at the other agentic coding benchmark, Gavin, which is Frontier Code, otherwise known as the diamond benchmark: GPT 5.5, which by the way is still my daily driver within their Codex app, scores 5.7% on extremely high thinking mode. Claude's Mythos 5, or Fable 5, scores 29.3%.
B
Oh, wow.
A
This is a significant jump.
B
Yeah.
A
That's amazing. Significant jump. And, and look, it's incredibly early and I like, I love slash.
B
I just, I'm just standing up, Kevin. So if the viewers... if the listeners can't see, I'm standing up and applauding.
A
We've.
B
We've. That's a pretty big jump. I, I feel like that's an applause jump. That's an applause.
A
That's fair. Okay. I'll be honest though. Your video is frozen and it literally startled me. I didn't know what that was. It was a bit of a jump scare. But again, we will see how the vibes pan out. A lot of times you and I record these things right on the heels of a release, and then over time it turns out they're maybe not as incredible. But it's hard to ignore just the raw numbers. It's really hard to ignore them, especially against the hype, which has mostly proven real, at least in the security community, over the capabilities of Mythos, the parent model. On that front, Gavin, I wanted to play a quick clip from one of the release videos from Anthropic regarding security.
B
Our safety systems for Fable 5 automatically
A
review requests that touch on high risk areas like cybersecurity or biology. Those requests are then redirected to Opus 4.8.
B
We do that intentionally so people can
A
continue to benefit from the capabilities of
B
a powerful model like Fable without the cyber and biology risks that come with it.
A
So that's an interesting... so they're dumbing it...
B
So it's interesting. So they're dumbing it down for specific prompts. Now the question I will have is, how good is it at determining what is a cybersecurity risk and what isn't a cybersecurity risk?
A
Turns out it's a phenomenal foundational model at everything but classifying what's a biological weapon prompt.
B
Exactly. Like, if I'm asking it to figure out how to fight 64 animals against each other, is there something in that prompt that is going to say, no, you cannot do that, that is not allowed?
A
Well, heck, I mean, it's super anecdotal and we won't get too in the weeds. We will get right back to this stuff. But even last week Claude was refusing to take action for you based off the notes of our show. And this was the old model.
B
Oh my gosh, I forgot about that, Kevin. But you're right. One of the things that was so frustrating about last week: I use Claude to help me write the show notes every week. I still go over them with my own human brain, but it's a way to kind of speed up the process of being a YouTuber. It's a very difficult thing sometimes, all the extra stuff you have to do. And Claude didn't want to write the show notes for the story about the Chipotle AI thing from last week. If you remember, there was kind of a hack going around where people were using Chipotle AI to kind of do their coding stuff because there was a backend hack. They found the thing had already been plugged, but Claude was like, I don't think you should include this story and I am not going to write the notes for it. And I was like, God, just give me this thing. It's 10:30 p.m. I just need this thing done. So I ended up writing it for him. But that is a big thing to be aware of, right? Because now, if intelligence is this important and this big, it does start to have that Big Brother feel, where you're like, well, was Claude gonna let me do this? Am I gonna be able to get this thing through? Like, will it allow this thing? And then, you know, I think we are gonna get past this point. Like, Anthropic is famously the most safety-oriented company. They're being very careful with this. I mean, the leak of Mythos came out, whatever it was, three months ago now, and with Project Glasswing they've had people that have had access to it for a while. I am really curious if Anthropic is a step ahead of OpenAI or... we'll see, I guess, what OpenAI's response to this is. But Kevin, the other big thing we have to talk about here, and it's a big warning that comes up when you select the Fable 5 model: this is going to cost two times as much in tokens as Opus 4.8. So if you remember, Opus 4.8 was the most expensive tool to use at this point.
And when you think about tasks, things you want to do with AI... in fact, a good example of this is just this week I got my wife onto her own Claude account so she could use it to do a bunch of work stuff she's doing. One of the things she wanted to do was go through her Gmail and get every email she had so that she can make a list for her book. It uses a lot of tokens to do that, right? Because it has to kind of understand how to go through this stuff. And also, you know, Google's MCP has a limit of 50 messages per thing, so blah, blah, blah. She used up all of her tokens within, like, a very quick window on the Pro account. I am worried that this is just going to be mostly not very useful for me in this way. I don't know. What do you think about the cost side of this?
A
It is clearly expensive. It shouldn't be, and I don't expect it will be, the daily driver for a majority of the people that are in our audience. I don't expect they're going to routinely use it to power an OpenClaw or a Hermes agent. This is the model that you will go to for massive codebase refactoring, or for super high-level thinking, or if you need a super long-horizon task, or to orchestrate your 15 other agents that are running, with little check-ins, while much cheaper sub-agents are going off and doing the work. Like, this appears to be, full stop, the most powerful general or wide-release model in the history of AI.
B
Like that. Yes.
A
I mean that. But that doesn't mean it's for everyone.
B
Right.
A
If you're using it to do email inbox triage, you know, which, you know, I don't think anybody should, that is like summoning an asteroid because there's an ant on the sidewalk and you want to make sure.
B
Sometimes that can be fun, Kevin. Sometimes you just want that ant to get just demolished. Right? That's.
A
I am okay with you. I am more than okay with you taking those little fingers of yours and dipping into the Patreon jar to rip a Fable 5 one-shotter of your animal battle tournament site.
B
Like, or whatever happens.
A
Yeah, let's do it. But again, I don't think it's for everybody. But for the people that it is for...
B
Yes.
A
Again, the early demos that are already coming out are insane. And so we have to, as we typically do, shout out Every and Dan Shipper, whose whole team got early access to this. Did you see the Library of Babel?
B
Yeah. Very cool. So this follows up on a bunch of other leaks that we had seen earlier. But basically Dan and his team at Every built, like, a 3D environment of the Library of Babel that you can walk through. And this is, again, the idea that it one-shot this thing, right? And I think the important thing to understand is, as anybody out there who's listening and has tried agentic coding, or has tried kind of back-and-forth working with AI to do stuff, knows, it's a lot of work, right? It does not automatically do stuff. So this is actually (hopefully you'll be seeing this if you're watching the YouTube video right now) pretty impressive to one-shot as an experience. And it does make me think, Kevin, not only for the video game world, but just at large, how much the world of our interactions with computers is going to change if something like this is doable in a single attempt in some form.
A
Yeah. I mean, wild. The amount of research that it went off and did, and the amount of taste that it injected into the project as well. Like, it went out of its way to not be lazy and to make some creative decisions that, you know, improve the quality of the product. And we're seeing that with all of the early leaks. I don't know if you've seen some of the voxel stuff that is...
B
Oh, they're crazy. Yeah. Or the Minecraft stuff. There's a lot of really great Minecraft leaks that have come out from people. You know, I do want to point out a couple things that Dan's team at Every said, and I think it is a really important understanding. You just kind of echoed one of these too. It is really important, if you're going to use it, to use this as kind of a planner and an orchestrator, not as the thing that's going to do the grunt work. Because you will eat up your tokens crazy fast if you use it that way. It's actually great, probably (again, we are going to dive into it this week, we'll have more on it on Friday), to plan out bigger systems and to kind of help orchestrate smaller coding agents. I think the other thing that's really important, and I've heard this now from multiple people who have tested it: it is very slow. So you also have to plan on these things taking a while. So it's something you want to kind of walk away from and let it go work for a bit. I mentioned my bear drum project that I've talked about before. I had put about 80 hours of work into it. I might spend some Fable tokens to see what it can do on this, just to see what happens. I did end up blowing through my whole, like, you know, weekly Codex usage by letting it work. But I think this will be interesting to see as we go forward. There was a really, really good post from Noam Brown that was posted today. Noam Brown, if you remember, goes by polynoam on X and is an OpenAI researcher, very famously part of the agentic coding team. Very smart guy.
He wrote a long post about how we have to start judging AIs based not only on their capabilities; because they will be able to work for so long and churn through so many tokens, we also have to judge them based on cost and how many tokens they use. So I think this may be a slight change. And obviously this is a nice OpenAI talking point, right? Mythos is very expensive. It is many, many tokens. And OpenAI says 5.5 is very good. And I assume 5.6 will have some version of that as well. But it does feel like... oh, oh, what do you got there? You got some triangles.
A
The tried and true Pyramid, baby. You got quality, speed, cost.
B
Yes, you're right.
A
Three points.
B
You're right.
A
That's exactly what the benchmarks need to be. As we're now seeing, okay, quality is going to go through the roof. All of these companies are going to release amazing foundational models. Speed and cost, they're going to be there as well. But it did beat Pokemon just using vision, and there's a time-lapse video of it on their official blog, so.
B
Oh, damn, now we're talking. See, this is the kind of benchmark I want. And if that's the kind of benchmark you want, you know what, you got to hit that subscribe button. You got to like it. You got to hype us. This is what we do every week. We talk about the interesting stuff and the dumb stuff in equal measure. Because that's who we are. Interesting and dumb at the same time.
A
We tend to focus on one side of that a bit more. But okay, listen, can we, can we bang through the WWDC and all these new Siri artificial intelligence updates? I think we can knock them out, Gav.
B
I think so too. So that's the other thing we have to talk about today, Kevin: the big event that happened on Monday, which is WWDC. Let's start. You start first. Where are we going at the very beginning of this?
A
If you're getting the visual version of this old podcast into your eye holes, you'll see the concentric rings that Apple uses to describe their approach. So, 10,000-foot view: all devices within their ecosystem running Apple foundational models that can, you know, support image, voice and text, and that are then enriched by a system orchestrator, on-screen awareness, your personal context, and some world knowledge. That means up-to-date info, by the way, Gavin, not provided by Google.
B
Go ahead.
A
What? Go.
B
I'd say this image is so ridiculous to me. It's like, you look at this and, like, I understand what they're trying to do, and I get it if you think of it as a 3D cone, essentially.
A
Yeah. But this is how it was pitched to Tim. You know that this is exactly the slide that Tim got. He was like, okay, that looks like a plan. Let's do it. Well, listen, okay, the too-long-didn't-read is that Apple Intelligence, or Siri AI, is finally going to be a thing. They demoed it with real, actually running demonstrations, in at least some semblance of real time, in their presentation. So there was no more, like, oh, is this real? Is this actually working? Like, they went out of their way.
B
It works for Frickin's sake. It works. And they promised stuff that didn't work last year and now it works. Thanks a lot, Apple.
A
And does it do anything that you couldn't believe that AI could do? No, absolutely not. It just, I mean, but it works. So that means if you say, hey, Siri, what was that thing that Gavin was talking about? Make an appointment, and if I need antibiotics, go ahead and set a reminder. And it will do that.
B
Yeah. Well, here's the thing I'll say about this, and I don't want to underestimate what a big deal this is, because AI that works and is cheap and is local (we can talk briefly about that too) is a lot more valuable to most people than, you know, Fable 5, which is going to cost you a fortune to go do stuff on the very far edge. Right. And I think the important thing in what you just said is there is a demo they showed off of somebody saying, like, hey, what was that thing I was talking about for my mom that I was thinking I might get her for a present? And it's like that idea of a second brain. Up until now, this has been a major fail of AI, and yet it is also one of the best potential use cases of AI, which is, how can it just search everything in my life, right? Like, if Google is good at searching information, what can it do for life search? And I do think Apple is finally onto something, because so much of our lives are on these things, right? On our phones, between text, email, all the apps we use. So that part I am exceedingly excited about. I feel like it is unfortunate that they promised all this stuff early. And, you know, one of the interesting stories that came out of this, Kevin, was that early on they kind of pooh-poohed AI at large and didn't even pursue a chatbot, and are really trying to catch up here, which is an interesting thing. I do think the Siri seems good. I guess I'm waiting till I get it and am able to use it and see, but it does feel like it might be way more useful than it was before.
A
It's, you know, it's powered in part by distillation of Google's AI models, but it supposedly is 100% Apple in the tooling that allows Siri to run. They have a version of their secure cloud, basically running with Google Nvidia hardware in the cloud. So they're not fully hosting your secured cloud, but they have new technology from Nvidia that allows them to say with confidence that nobody, not even Apple, not even the host, can see what your requests are. Which is important when you're talking about, like, go search all of my texts, or let me take a personal photo and then shift the perspective of it or erase some things from it. Like, you want, you know, your privacy secured. I will say, like, a couple top hits again, because it's on every device. It's not just the phone. It is on your, you know, macOS, it is on your watch. It's even on Vision Pro for the six people out there that use it. You get things like the AI-driven shortcuts, which are interesting, right? Every time I enter this building or load this app, do this action. You can have AI monitor it.
B
It makes me think, anytime I do something, I could have it text you something stupid. That might be a thing I'll try to do on the side. Like every time, I don't know, what could it be? Every time I drive by a building in Los Angeles that has, like, memories for us. Like every time I drive by the old G4, text me three toilet emojis.
A
Yeah, exactly. You can do AI-organized tabs. Okay, big deal. Chrome can do that. I like that they had one where it could monitor tabs for you in the background, basically. So if you're waiting for, like, tickets to a pre-sale to drop, you can say, keep an eye on this tab and alert me when it's ready. Their AI password app had an interesting agentic feature where, if it noticed that a password of yours was compromised, it could go out and automatically change the password for you securely and then update it in your password vault. Like, I do love that as the child of a now Mac-owning father who has to provide tech support. Oh wow, that is a nice one.
B
Yeah. My biggest thing here is, again, it's kind of like we're seeing the two edges of AI right now, right? The Mythos Fable is at the far edge, what people are going to be doing at the high end. The Apple AI is kind of like the mainstream, right? Which is funny, because Apple is now such a mainstream company. It used to be kind of the company that pushed things forward, but now it's really where the masses get all this stuff. If the masses get this Apple stuff and it starts to feel actually useful to them, maybe there is a shift in the way that people start to think about AI. Because up until now, Kevin, if you remember, last year's Apple update gave us a really terrible thing where it would text you summaries of your texts and just gave you the most ridiculous text on top of the things you got. So I do think it's useful. The other thing I do want to mention here is that they have done some really cool stuff with photos. And I know that their photos model, on site, in app, may or may not be good for just generating photos, but they're making it so that you can move a photo around after you take it, which is kind of cool. So you can take a photo and then you can reframe it in some ways. And then a very cool, small nerdy thing is that they are Gaussian splatting Apple Maps. Gaussian splats we've talked about on the show before. It is kind of a way to capture real-life images and drop them in. But there is a really remarkable part of the WWDC demo where they showed off what Apple Maps looks like with Gaussian splats. And it does make me think about this idea that if we start mapping the entire world and we have all this data, the amount of interesting stuff that can happen for a platform like the Apple Vision Pro is actually quite big. And so as much as the AR/VR world has been kind of written off as dead in some form, I do feel like maybe that, in the next couple years, could have a big uptick.
A
I fully agree. Look, I'm still very excited for all of this stuff, even though I'm kind of flippant about certain things. Like, I'm excited for it to arrive. I want Apple to be a major competitor here. I have Apple devices, I'm in a Mac ecosystem primarily, so I really want them to do well. But I just have to end on this image, because you mentioned their message summarization issues from the past.
B
Yes.
A
Did you see the one about the house?
B
I saw this. Yeah. Just describe this for the people listening.
A
Oh, it is... and shout out to Brooks Otterlake, who said, quote, "Insane that this is the main image Apple is using to show off the new AI Siri." It literally is a text message from someone where, above the text message, it tries to summarize the message. They're the same length. It's basically the exact same message popping up.
B
Nothing solved. Nothing's totally solved. All right, we will see you all on Friday for more Fable 5 talk. Go try it. It's out there. And we'll see you all soon. Bye. Bye.
A
Bye.
Hosts: Kevin Pereira & Gavin Purcell
Date: June 10, 2026
In this episode, Kevin and Gavin break down the just released Anthropic Claude Fable 5 (from the Mythos family of models) and Apple's latest AI-related announcements from WWDC, including the much-anticipated reimagining of Siri as "Siri AI." The hosts share excitement and skepticism about the rapid advances in AI, focusing both on benchmark results and real-world utility, safety, cost, and the shifting AI landscape for both power users and mainstream audiences.
Release Context:
Benchmark Results & Agentic Coding
Security and Guardrails
Cost & Usability
Intended Use Cases & Community Feedback
Feature Rundown:
What Works, What Doesn’t
Misses, Skepticism & Fun Moments
Benchmark wow moment:
Guardrails & Frustration:
Cost & Strategy:
On the state of benchmarks:
On Apple’s visual metaphors:
Summing up Apple’s AI message summarization:
Kevin and Gavin paint a picture of an inflection point: AI models like Fable 5 are accelerating rapidly in capability but remain costly, slow, and sometimes overzealous with safety. Meanwhile, Apple's entry marks AI "life search" coming to the mainstream—promising to actually work this time, although early implementations are still catching up on usefulness. Whether you're an AI power user or just along for the ride, the world of artificial intelligence feels both “incredible, and a little scary.”
Next up: Hands-on impressions with Fable 5 in Friday’s episode—plus deeper dives on Apple’s new AI features as they roll out.
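For anyone who wants to sanity-check the benchmark jumps quoted in the episode, here is a minimal sketch in Python. The scores are the figures as stated by the hosts during the recording, not independently verified, and the helper name `compare` is just an illustrative choice.

```python
# Benchmark figures as quoted by the hosts in this episode (percent scores).
# They are taken from the conversation, not from an official results table.
AGENTIC_CODING = {"Opus 4.8": 69.2, "Fable 5": 80.3}
FRONTIER_CODE = {"GPT 5.5": 5.7, "Fable 5": 29.3}


def compare(baseline: float, candidate: float) -> tuple[float, float]:
    """Return (percentage-point gain, candidate/baseline ratio), rounded for display."""
    return round(candidate - baseline, 1), round(candidate / baseline, 2)


agentic = compare(AGENTIC_CODING["Opus 4.8"], AGENTIC_CODING["Fable 5"])
frontier = compare(FRONTIER_CODE["GPT 5.5"], FRONTIER_CODE["Fable 5"])

print(f"Agentic coding: +{agentic[0]} points, {agentic[1]}x Opus 4.8")
print(f"Frontier Code:  +{frontier[0]} points, {frontier[1]}x GPT 5.5")
```

Looking at both the point gain and the ratio helps explain the hosts' reaction: the agentic coding gain is large in absolute points, while the Frontier Code gain is large mainly as a multiple of a low baseline.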