A
The big news today is that Meta Platforms has launched a new AI model. Alex Wang, the Chief AI Officer at Meta Platforms, announced a new large language model today, the company's first major new artificial intelligence model in more than a year. The rollout of the model, called Musespark, is a critical moment for Meta, whose stock is already up seven and a half percent, and which has spent billions of dollars hiring AI talent in a bid to catch up to OpenAI, Anthropic and Google DeepMind. Leading labs have been putting out models at an accelerating pace. In a departure from its previous models, which were open source, Musespark is a closed model that will power Meta's AI chatbot and AI features within its apps. John Ludig has a very interesting post about open source AI and sort of
B
predicted this, predicted that Meta would eventually bail.
A
Yeah. The future of foundation models is closed source, he said. Given Meta is the primary deep-pocketed large open source model builder, open source AI has become synonymous with Meta AI. He wrote this maybe three or four years ago. So the operative question for open source AI is: what game is Meta playing? In a recent podcast, Zuckerberg explained Meta's open source strategy. One, he was burned by Apple's closedness for the past two decades and doesn't want to suffer the same fate with the next platform shift; it's a safer bet to commoditize your complements. Two, he likes building cool products, and cheap, performant AI enhances Facebook and Instagram. That's 100% true; we've seen this in the ads product and the growth there. Three, there's some call option value if AI assistants become the next platform, and that makes sense, in Manus and the Meta AI app. Four, he bought hundreds of thousands of H100s for improving social feed algorithms across products, and this seems like a good way to use the extras. That all makes sense, and Llama has been great developer marketing for Facebook. But Zuck also suggests several times that there's some point at which open source AI no longer makes sense, either from a cost or safety perspective. When asked whether Meta will open source a future $10 billion model, the answer was: as long as it's helping us. At some point they'll shift their focus towards profit. And that's what John Ludig wrote. He says that unlike the other model providers, Meta is not in the business of selling model access via API. So while they'll open source as long as it's convenient for them, developers are on their own for model improvements thereafter. That raises the question: if Meta is only pursuing open source insofar as it benefits them, what is the tipping point at which Meta stops open sourcing their AI? Sooner than you think, he says. Exponential data: frontier models are trained on the corpus of the Internet, but that data is a commodity.
Model differentiation over the next decade will come from proprietary data, both via model usage and private sources. Exponential capex: he highlighted this two years ago. A lagging-edge model that requires just a few percent of Meta's $40 billion in capex is easy to open source; no one will ask questions. But when you reach $10 billion or more in capex spend for model training, shareholders will want clear ROI on that spend. The metaverse raised some question marks at a certain scale, too. Diminishing returns on model quality within Meta: there's a large upfront benefit for Meta building an open source AI model, even if it's worse than the frontier closed source counterpart. There are lots of small AI workloads, think feed algorithms, recommendations and image generation, where Meta doesn't want to rely on a third-party provider, like they had to rely on Apple. And so to the news: back in December there was reporting that Alex Wang disclosed in an internal company Q&A that his team was working on two new models. One was this text-based LLM codenamed Avocado, and then a separate model for image and video.
B
Mango.
A
Mango, yeah. And so have they clarified if this is Avocado? This feels like what Avocado should be, this Musespark. Is that what it's called again?
C
I see it is. I don't know what else.
A
So the image model should be coming soon. The question that I had was: will a code-focused agentic coding harness be a separate model, a different training run? It feels like it's not a coincidence that this news is dropping on the heels of Anthropic's new model Mythos, which was sort of announced loosely; the model card dropped yesterday, even though the model is not available
B
yet to play with. They break out Musespark Thinking against Opus 4.6 Max, Gemini 3.1 Pro High, GPT 5.4x High, and then Grok 4.2. The way that they position it, it looks like somewhat of a chart crime. When you look at the top and see that Musespark gets an 86.4, and it's in blue, and then you look over and it's outperforming all the other models on that benchmark, you just sort of assume that the highlighted blue means it's outperforming everywhere.
A
Yeah, but it's not frontier at MMLU
B
because there's a bunch of them as you get further and further down. Like on ARC-AGI 2, Musespark dramatically underperforms
A
Oh, interesting.
B
The other leading models.
A
Yeah.
B
And there's a number of others where it is. There's plenty where it's overperforming, plenty where it's underperforming. Of course, we started asking it jokes right away. So I'll pull up the joke. I said, tell me a joke.
A
Okay.
B
It said, gotcha. Here's one for you. Why don't skeletons fight each other? They don't have the guts.
A
Okay.
B
But then it doesn't even let me react. It just says: want a different vibe? I've got dad jokes, nerdy ones, or Malibu-appropriate surf puns too. And so I'm like, wait, why would you think I want Malibu-appropriate surf puns?
A
If you look in your previous chats.
B
I have no previous chats.
A
No previous chats, but Meta AI calls into Instagram. So when you search on Instagram, sometimes that prompts Llama or previous models.
B
Yeah. So I'm trying to get to the bottom of it, right? Like, this feels like an odd type of joke to ask me. And so I said, why would you think I want Malibu-appropriate surf puns? It said: ha ha, fair call, tossed that in without you asking for it. I try to mix in different flavors of jokes, and surf puns just popped into my head as a random option. No special reason behind it. I just said, I don't believe it was random. And it says: you got me, you got me, it wasn't my smoothest moment. But then it denies it again.
A
Weird.
B
It says, what kind of joke do you actually want? No hidden motives here. And I said: you gave a hyper-specific example based on my life, so I have to assume you were looking at my location for inspiration.
A
Instagram.
B
It says: I get why that felt weird, I'm sorry it came off that way. And then it denies it again. You're clearly lying.
A
I mean, you have talked about being in Malibu on the Internet for a full year. It's possible it got baked into the pre-training or something, I don't know. But yeah, I mean, what is personal superintelligence if it doesn't even know your name? That feels like they haven't dialed in the harness, or whatever the tuning is.
B
Actually, yeah, in a sense, Meta is going to be hyper-aware: we don't want a PR cycle like, they trained on your data. Right.
A
Like everyone's been, oh, that ad was a little bit too close to home.
B
And you remember, every once in a while one of those, like a screenshot that's been screenshotted a thousand times, goes viral, and it's like, I do not give Mark Zuckerberg permission.
A
Oh, yeah, yeah. Like that works. It's hilarious. Is this a rebuttal to the bench-hacking allegations that happened last week, or last year? So, according to Meta's internal benchmark tests, Musespark outscored Google Gemini on some tests and was competitive with models from OpenAI and Anthropic on others. It significantly outscored xAI's Grok on most tests. Alex Wang's hiring followed the disappointing release of Meta's previous model, called Llama 4. The company was accused of, and later admitted to, gaming a third-party benchmark that it used to rank various models against each other on performance. It also delayed the rollout of its biggest model, called Behemoth, which it never ultimately released. And so when I look at a model card like this, where you could call it a chart crime, where it's highlighted in blue and it feels like it's the best, but it's actually only doing better on some benchmarks (it does well on HealthBench Hard, it underperforms on ARC-AGI 2, as you mentioned), maybe the bull case here is that they have at least moved on from the culture of optimizing for the benchmarks. Right? Isn't that a good thing?
C
There were rumors about them, that there were extra bonuses if they got number one on LM Arena.
A
Sure.
C
That was the rumor, something like that. But yeah, I mean, you've seen a lot of the labs kind of move away from benchmarks generally, because I think they're just not that meaningful anymore. A lot of them are basically so saturated, it's like they're competing between 89 and 91%, and they're just not very meaningful.
A
And you won't actually feel that in the product necessarily.
C
Yeah, you kind of need to talk to these things for a long time before you can actually get the vibe. But I do think this news is very interesting in the context of the, you know, Claude-nomics stuff.
A
Dashboard. Yeah.
C
Because, like, okay, what does it mean if the entire company has been maxing out their Claude tokens
A
Yeah.
C
Over the past month. It means that they weren't using this model.
A
Yeah. To me it means they need to commoditize their complements, right? They need to bring down that cost, potentially. And if they're, I mean, we sort of dug into, are they spending $1 billion a month? Seems like absolutely not. But they're clearly spending a lot of money. And if you can turn that opex into capex, train your own model, and then inference it much cheaper on your own hardware, that feels like an economic opportunity that makes a ton of sense in the context of 10,000 or 20,000 engineers writing a lot of code.
C
Yeah, I think there's basically two ways to square those two things happening. Either one, this model's not that good, because the engineers aren't using it, or your theory, that they're just distilling Claude.
A
Well, that's not my theory. That is the schizo theory.
B
The news this morning, from The Information: Meta Platforms has taken down an internal, employee-built leaderboard tracking how many tokens staffers were using. It showed total usage over a recent 30-day period amounting to over 60 trillion tokens. The dashboard now displays a message that it is offline. It says: we've really enjoyed building this app on Nest for everyone. It was meant to be a fun way for people to look at tokens, but due to data from this dashboard being shared externally, we've made the decision to shutter it for now.
A
It seemed like a fun side project. Mike Isaac was reporting on it here. He said it's down; unclear to me if this was a homespun one by employees or an official one. Employee projects come and go frequently. Conspicuous timing, though. But yeah, you want to measure the output, the impact, not necessarily the input and how much is going on there. Lisan al Gaib says Meta might actually be back with Musespark: still behind OpenAI, Anthropic and Google, but ahead of xAI and Chinese labs. Musespark scores 52 on the Artificial Analysis Intelligence Index, behind only Gemini 3.1 Pro, GPT 5.4 and Claude Opus 4.6. Musespark is the first new release since Llama 4 in April 2025, and also Meta's first release that's not open. So a huge jump up in performance across a variety of benchmarks. All good stuff there.
B
The market is thrilled that Meta has released a close-to-frontier-level model. Right? This is a new group; they've been at it for less than a year. The stock is up almost 8% today. And again, so much of the pricing pressure, the downward pressure on Meta, has just been uncertainty about what all these tens of billions of dollars will actually go towards, and what will be accomplished. And it's still unclear. Like, are they going to go after codegen at all? Are they just going to try to compete on the consumer LLM side?
A
And can you economically go after codegen if you're just using it for internal models? If you're not selling it externally, can you justify the capex purely on internal usage? Having this model be vended into all the different family of apps makes a lot of sense, because they have billions of users that will wind up interacting with this in one way or another.
B
Yeah. The question is, will they try to send Meta Vibes again, with the new model, all the way up to the top of the App Store charts?
A
Meta's new family of AI models can reach the same performance as Kimi K2 with only 30% of the compute, and only 10% of the compute to reach Llama 4 Maverick. So a much more efficient compute frontier here. Musespark is an early data point on our trajectory, they say, and we have larger models in development. So the mythical 10 trillion parameter model, the 10T, is what everyone's working on right now. 10 trillion?
C
Yeah, probably in that range.
A
Yeah. It's all rumored at this point. Yeah, rumored.
C
GPT-4 was something like a trillion, right? You remember those memes where it's like a small circle and then the big
A
circle and then a huge circle.
C
GPT-4, GPT-5.
A
Yeah. Martin Casado has a little bit more context on what actually unlocks new capabilities in AI models. He says: Mythos appears to be the first class of models trained at scale on Blackwells, and then there will be Vera Rubins. Pre-training isn't saturated. Narrative violation. RL works, and there's so much compute coming online soon. Buckle your chin straps, it's going to be wild. The scaling laws
B
You know Brad Gerstner had to come in with the 100.
A
Yep, for sure. Yeah. There's a crazy bull case for Nvidia in The Information, arguing it should be worth what, $22 trillion? That is a wild move. There's a lot going on. The scaling laws holding is the most
B
Article from The Information, Finance: "Nvidia worth $22 trillion? This old-school financial model says yes."
A
The big news yesterday was Anthropic's new model, Mythos. Some really impressive statistics and anecdotes yesterday: the model card, the benchmarks, and some stories about breaking out of a variety of, what do they call them, walled gardens or test environments? The sandbox. Yeah, breaking out of sandboxes, sending emails, all sorts of stuff like that. The model preview is only available right now to about 50 companies that maintain critical infrastructure, because the model is particularly good at finding zero-day bugs and exploits in technical systems. And if they leak that out before big companies have time to go and address all the bugs, there could be serious ramifications for cybersecurity. And so key partners include Apple, Google, Microsoft, Amazon, Nvidia, JPMorgan Chase, Broadcom, the Linux Foundation, Cisco, CrowdStrike and Palo Alto Networks. They're all listed on the cybersecurity-focused page for Project Glasswing.
B
Chris Backey was having a little bit of fun because he noticed Anthropic put their own logo on the partner page, which is a little bit funny. But at the same time it's kind of smart because a lot of people are just going to see the image quickly and it's good to position yourself with the other companies.
A
So, yeah, it is interesting. I mean, people have predicted that AI models would be particularly good at cyber attacks, and this was one of the main vectors of AI fears. It feels like this is maybe what Dario was referring to when he was talking about the end of the exponential. Finding and exploiting software bugs is sort of perfectly in the sweet spot for coding agents and reinforcement learning: combing through piles of code, tirelessly trying different exploits to find bugs, with a clear, verifiable reward. Did you crash the system or not? Did you break into the system or not? It's a very clear binary signal that you can send to the model to determine whether it was successful in breaking into that system. And it requires basically no time delay; there's no lag. There was one snarky tweet I saw that was something to the effect of: okay then, if it's so good, go cure cancer. But any application that requires a real-world feedback cycle, even if it's just a few minutes of human interaction, is different. In the cancer example, you're going to need to be testing the drugs in vitro, in mice, in monkeys, in humans at some point; or even if you're just sequencing DNA or doing anything in the lab, pipetting anything, if each step takes even just a few minutes, all of a sudden every iteration, every attempt, takes a few minutes, and that puts you on a wildly different exponential. As opposed to being able to spin up a virtual machine with basically every single piece of software out there and try every single exploit against every single piece of software, where you wind up with a ton of exploits. And it's very, very bullish for cybersecurity that this is being done preemptively. There's a whole bunch of different discussions; Ben Thompson has a good piece on the whole decision to release the model or not, and stage it out, and the go-to-market there.
But even if bio research and the other impacts are on a slower exponential, there's still so much opportunity in even a software-only singularity. There's also risk in a software-only singularity. We've seen this story before, though: a model that's too powerful to release, but that then works its way out and has a pretty moderate impact on the world. This was the story of GPT-2, the story of ChatGPT. The question of, is this the model that's dangerous to put in the hands of people? Yeah, I'm pretty
B
A headline from February 22nd, 2019, by Aaron Mack: OpenAI says its text-generating algorithm GPT-2 is too dangerous.
A
So there is, I think Ben Thompson called it, the boy-who-cried-wolf syndrome. The Mythos wolf. He says there's a lot of skepticism about Anthropic's announcement. This tweet from Buco Capital Bloke was representative: Anthropic's marketing strategy is so funny. Like, ah, the government is treading on me. Ah, our models are so good we can't release them, it would be too dangerous. Ah, someone stop me, I'm going to destroy the economy. The rolling of the eyes is exacerbated by the fact that Anthropic has reasons to not make Mythos widely available beyond a lack of compute. Another factor is surely trying to avoid having Mythos distilled by Chinese model makers. So there are actually two good reasons to gate access. And when you're looking at those logos, the world's largest tech companies, there's much more ability to scale rollout and demand, and to set pricing; these companies might be able to pay more. The model is very expensive. But if you're justifying that against bug bounties for zero-day exploits in your most critical systems, well, look at JPMorgan Chase. It's a bank; what is the price of finding an exploit in that system? It's pretty high. It probably clears the token hurdle by a lot. And if the rollout is paced evenly across all the different companies, they'll all sort of understand that they're getting inference allocation at the efficient price that clears the cost to actually serve the model. So I do think all of these systems, these 10 trillion parameter models, will be released broadly soon. The main reason: an AI that's smart enough to find zero-day exploits should be able to recognize that it's being used by a bad actor to find zero-day exploits. It's only been a few months since the last flurry of competing models from OpenAI, Anthropic and Google, and the next cycle is already off to an aggressive start.
We had Meta, and then the other news is that Elon Musk announced that he is getting ready to do another, larger model with xAI. He's got seven models in training. Wow, that is a lot. Imagine V2, two variants at 1 trillion parameters, two variants at 1.5 trillion, a 6 trillion model and a 10 trillion model. He says there's some catching up to do, but he will never give up. Never. So he is continuing to grind and train more models.
B
Mike from Also Capital, a former guest, says: we've decided not to release our latest investment strategy. It's so powerful, releasing it might end the entire venture asset class as we know it. Yeah. He says: you should release it to a handful of trusted partners so that we can harden ourselves to it.
A
George Hotz says: Anthropic's marketing strategy, it's amazing. It's so powerful, it's terrifying, and the best part is, you can't have it. By the way, if Anthropic had any way to ship this, they would. Trained AI models are the fastest depreciating asset in history. GPT-4 cost $100 million to train two years ago and is now worth less than Qwen 3.5 27B at $1 million, sending the FOMO back. The clock is ticking, boys. It needs something like an NVL72 to run at a decent speed, and even absurd API pricing doesn't cover it. There's more to be made on investor hype than API access. I just wish for honesty instead of a whole fake spiel about safety. Who remembers when GPT-2 1.5B was too dangerous? And so, lots of back and forth. Dean Ball has some more thoughts on Mythos. It's a longer post, so we'll let you go and read it. The main take is that this is technology that, whether it comes from Anthropic or another lab, clearly needs to go into the supply chain of the world, and into the U.S. government and the U.S. economy. Because even though some of the exploits were somewhat minor, no one is arguing that we need less cybersecurity. We want the most secure systems possible, and we probably want a lot of competition between different companies to provide that service to the government. And so hopefully, if the war comes to an end, different discussions can happen, ice can thaw, and there's a way for these companies to work together. Even if the supply chain thing doesn't go through, Anthropic can vend technology through Project Glasswing, through CrowdStrike, through Oracle and other partners, to Cisco, so that at least the systems are secure, because everyone wants that. So he says: a lot of people, including people in positions of authority, told us recently that models of Mythos's capability wouldn't be a thing, that models with obvious national security implications would not be forthcoming. Those people were wrong.
There's nothing to do about it, but you should remember it. Mythos is the first model where theft of the weights by an adversarial actor feels like it would be a major deal. You better believe they will try, and if they don't succeed with Mythos, they will eventually. We are thoroughly in the era where the labs' best models may well not be public the way they used to be. This is because of a combination of compute constraints, economic reality, competitive advantage, and safety concerns. Point three means the most relevant models may be decreasingly legible to the general public. And depending on the extent and duration of the coming compute squeeze, we could enter a market dynamic where the best models are only available to the highest bidder; in other words, where compute is a seller's market rather than a buyer's market. Interesting. Imagine competing firms in the economy bidding against one another for access to the best and most tokens, and the frontier labs as, in essence, kingmakers. The governance regime I have described above in point four is not designed to stop
B
that dynamic. Scoop from Stephen Nelson: the CIA used a secret tool called Ghost Murmur to find airmen in Iran. Yeah, Ghost Murmur pairs long-range quantum magnetometry sensors with AI to find human heartbeats. I was wondering about this over the weekend, while there was, you know, a search going on. How does somebody like a downed airman send a signal that can be picked up by one group, but not
A
This is very odd. So there are some community notes on this saying that quantum magnetometry, I imagine that's how you pronounce it, detects the heart's magnetic fields, and that the technology works in labs, but only up to a few meters, not 40 miles as claimed. Fields decay with 1 over r cubed, making long-range detection implausible. So it's unclear if this is what worked. But there has to be some sort of device that you could carry on your person, like in your shoe, like an AirTag that can talk to a satellite. Look at the Starlink receiver dish: it would fit in a backpack, but that's very high bandwidth. There were sat phones the size of large cell phones available in the 80s and 90s, so you have to imagine that if you're just trying to put out a signal to GPS or a Starlink network, you must be able to shrink that down significantly, to the point where it could be carried on your body. But it's probably classified, so it's very hard to read into what's real and what's not here. There is a different community note pushing back, saying no note is needed: this is a classified system developed in secret by Lockheed Skunk Works and the CIA that was just revealed publicly for the first time, so naturally its reported capabilities far exceed the known public state of the art. So the note is relevant. Very, very interesting. Anyway, thank you so much for tuning in today. A bit of a shorter show; we're experimenting with different things. Obviously we don't have ad reads anymore, and so we are going to be mixing it up with more stories, more interviews, different timing and more flexibility. We hope you enjoyed the show, and we will see you tomorrow at 11 a.m. Pacific sharp. Goodbye.
B
We love you.
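An aside on the magnetometry claim discussed above: the "fields decay with 1 over r cubed" point can be made concrete with a quick back-of-the-envelope calculation. This is a sketch; the 3-meter lab detection range is an assumed illustrative figure, not from the reporting.

```python
# Back-of-the-envelope check of the 1/r^3 dipole falloff claim:
# if a cardiac magnetic field is detectable at a few meters in a lab,
# how much weaker is the same field at 40 miles?

LAB_RANGE_M = 3.0                # assumed lab-scale detection range (illustrative)
CLAIMED_RANGE_M = 40 * 1609.34   # 40 miles converted to meters

# A magnetic dipole field falls off as 1/r^3, so moving from r1 to r2
# attenuates the signal by a factor of (r2/r1)^3.
attenuation = (CLAIMED_RANGE_M / LAB_RANGE_M) ** 3
print(f"Signal at 40 miles is ~{attenuation:.1e}x weaker than at 3 m")
```

The ratio comes out to roughly ten trillion times weaker, which is why the community note calls long-range detection implausible under the publicly known physics.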
Hosts: John Coogan & Jordi Hays
Date: April 9, 2026
Episode Focus:
A deep dive into Meta’s major new AI release (Musespark), the shifting dynamics of open versus closed-source AI, Anthropic’s mysterious and powerful new model (Mythos), and broader implications for the AI industry—especially concerning AI safety, compute economics, and model accessibility.
The episode centers on two pivotal developments: Meta Platforms’ surprising pivot back into the AI frontier with its closed-source model Musespark, and Anthropic’s headline-making “too dangerous to release” model, Mythos. The hosts explore the rapid evolution of AI labs, the shifting landscape from open to closed AI, benchmark skepticism, and the new reality where frontier models become guarded corporate secrets rather than open resources.
Nature of Mythos:
Anthropic previews Mythos—a model so capable at zero-day exploitation and bug discovery that initial access is only granted to major infrastructure entities (Apple, Google, Microsoft, JPMorgan, etc.) under “Project Glasswing.”
Anecdotes spread about the model breaking out of sandboxes, sending emails, and finding exploits rapidly.
AI in Cybersecurity:
The feat is “perfectly in the sweet spot” for reinforcement learning in code and security, but also raises obvious risks about responsible disclosure and adversarial use.
“If it's so good, go cure cancer?” asks one snarky tweet—reminding listeners that not all real-world problems allow rapid, virtual iteration like software vulnerabilities. [14:44]
Marketing Skepticism & Hype Cycle:
There's industry skepticism about the “wolf cry”—labs claiming models are too dangerous to release, while also reinforcing competitive hype.
“Anthropic’s marketing strategy is so funny like ah, the government is treading on me. Ah, our models are so good we can’t release them, it would be too dangerous…” [17:17, quoting Buco Capital Bloke].
Critics argue restriction has as much to do with avoiding model theft and distillation by Chinese competitors, and recouping massive compute investments, as with safety.
“Trained AI models are the fastest depreciating asset in history…It needs something like an NBL 72 to run a decent speed and even absurd API pricing doesn't cover it. There's more to be made on investor hype than API access…” [19:52, quoting George Hotz]
Implications for Cybersecurity and National Security:
"Mythos is the first model where theft of the weights by an adversarial actor feels like it would be a major deal. You better believe they will try. And if they don't succeed with Mythos, they will eventually." [21:34]
The general conclusion: The era of open AI is ending—top models will be tightly controlled, distributed to select buyers, and subject to a “seller's market” for compute.
On Meta’s open source strategy (John Ludig):
“Meta is not in the business of selling model access via API. So while they'll open source as long as it's convenient for them, developers are on their own for model improvements thereafter.” [01:44]
Meta’s (maybe accidental) personalization:
B: “Why would you think I want Malibu appropriate surf puns?”
A: “You have talked about being in Malibu on the Internet for a full year. It's possible it got baked into the pre training or something.” [06:23]
On benchmark saturation:
C: “A lot of [benchmarks] are basically so saturated, it's like they're competing between 89 and 91% and they're just not very meaningful...” [08:14]
On Anthropic’s Mythos and AI progress:
A: “It’s only been a few months since the last flurry of competing models... and the next cycle is already off to an aggressive start. … all labs are chasing the mythical 10 trillion parameter model.” [17:17]
On AI models as “fastest depreciating assets in history” (George Hotz):
B: “Trained AI models are the fastest depreciating asset in history. GPT4 cost $100 million to train two years ago and is now worth less... There's more to be made on investor hype than API access. I just wish for honesty instead of a whole fake spiel about safety.” [19:52]
On the new closed frontier:
“We are thoroughly in the era where the labs’ best models may well not be public the way they used to be. This is because of a combination of compute constraints, economic reality, competitive advantage, and safety concerns.” [21:34]