This is an iHeart podcast.
Ed Zitron
Hello and welcome to Better Offline. I'm of course your host, Ed Zitron.
Cal Newport
Better
Ed Zitron
as ever. Support your neighborhood Zitron by subscribing to the premium newsletter, discount link in the episode notes, of course. Buy a T-shirt, download a blog, whatever it is you want to do. Okay, it's not up to me what you do. But today I'm joined by the incredible comp sci professor and commentator Cal Newport. Cal, thank you for joining me.
Cal Newport
Always a pleasure, Ed.
Ed Zitron
So I kind of wanted to start with, I asked you for a quote about a week ago, maybe two weeks ago, I can't remember how time works anymore, but it was around the way reporters cover AI, and how it seems that a lot of the reporting is kind of directionally true rather than actually true.
Cal Newport
Yes. And I want to add something to it, since I've been thinking about that quote. So what I said, if I remember it properly, was that I was picking up a lot in the reporting on AI that you would lean into a story without having necessarily verified that the details are true and that this is what's actually going on, say, with a new AI model. You would lean into it anyway because it was what I call directionally correct. It makes the general point that you see it as your job as a reporter to make, which is, hey, you need to be worried about this, or this is a big deal. Right. And so I think that is a problem. There's another issue I'm seeing, though. I've been refining my thinking on this. I'm also wondering if some of what I'm seeing in the reporting is just an embrace of the form of, I'm going to give you a stress wave with no relief. Like we're all going to take turns. I will choose an area you haven't thought about. How about, mathematics is going to go away, mathematicians are going to be done. Okay, I'll take that one. Yeah, let's go.
Ed Zitron
So it's like negative clickbait.
Cal Newport
Yeah. But there's this weird sort of passivity to it. I call it head-shaking doomerism. You're just like, this field's just going away, what can we do? This sort of passive head shaking. It's a very specific style. You don't see a lot of other reporting, historically, I think, that takes on this resignation of, I'm just gonna make the case that you're screwed, kind of give you a shoulder shrug, and then we're gonna drop the mic and walk off. And I'm kind of getting tired of this. I think there is a cost to stressing the hell out of people. I mean, I'm getting letters all the time now from people. They'll say things like, I feel like I'm trapped in a cage just being hit with wave after wave of stress, and there's no outlet, there's no door or possibility of making things better. And I think the CEOs are doing it, and I think increasingly we're seeing commentators doing it as well. This is not good in many different ways. So I don't know, I'm adding that to my list. Some of it's directionally true reporting: they really are worried that people aren't worried enough. And some of it, I think, is just sport now. Can you find an area to come in and just write a head-shaking article that's only trying to undermine the existence of this important human activity, or this job, or our lives, or whatever? It's a very unusual style that quickly became a standard.
Ed Zitron
And I see it a lot with anything to do with AI and job studies. I've been sent this Tufts report where they find these weird weasel words. It's like, jobs that could be at risk from AI at some point, we put them in one bucket, and jobs that might one day be, we'll put in another bucket. And there you go. Like you said, don't know what we're meant to do with this, don't know what anyone's meant to do with this information. But it's just like, well, there you have it, we're all fucked, it's the end. Even though the data does not say that. Like, I've read, I think, every AI jobs report now, every single one, and they're all the same. They're all, right now AI can do this. And then you look at what it says. It's like, it can do law. Well, it can't really do law. It can do one sigma within law, kind of, and even then it isn't really obvious. And the people saying it can do that are partners at law firms that don't write motions or do the grunt work. It feels like the reporters have either given up or are just looking for clicks, and it's hard to tell sometimes.
Cal Newport
This is what I'm trying to figure out, because I'm realizing, if it's entirely just, I think this is directionally true and that's good enough, then they should be way more upset and in the streets and sparking a revolution. Right? Like, if you actually really believed 50% of the economy was going to be automated, that we're going to have to have government checks just so we can afford to buy the cat food to eat after all the jobs are gone. If you really thought that our entire infrastructure is about to collapse, or that superintelligence was going to emerge suddenly and be a threat to human existence, you wouldn't just write a too-cool-for-school, head-shaking resignation article. You would be like, where are the John Connors? Right? Like, we need to get on the cool trench coats and get out there and go against the Skynet revolution. You would be on your feet. Nothing would be more important to you. So this is my case about the tech CEOs. I think there's a moral hazard here that we're not putting our finger on properly. Right. You have the tech CEOs in the AI space that will just come out and drop these bombs, like, yeah, white collar bloodbath. You know, he never actually said that. That's Axios putting words in people's mouths.
Ed Zitron
That was Axios? I thought that was... he definitely, Dario Amodei, he did say 50%, but not the bloodbath. I thought he said the bloodbath. That's my bad.
Cal Newport
Well, I trust the New Yorker fact checkers figured that out for me. But Axios does a lot of this, where they put these really quotable quotes in the headlines of articles on interviews or speeches given by AI people, and it turns out the thing in the headline wasn't directly what they said. But anyway, so they're out there making these big statements. These jobs are going away. The Internet as we know it is about to all fall apart because Mythos is going to have this new capability. The superintelligence is coming. I don't even know what's going to happen. There's two possible things going on here, and both of them are morally bad. One, which is the one I think is true, is that this is largely marketing. It works, it gets reported, it keeps us seeming inevitable and important. In which case that's a huge moral hazard, because you are stressing many, many normal people the hell out.
Ed Zitron
Actively scaring them.
Cal Newport
Actively scaring them. The other option is you actually believe it's true. Well, that's an even larger moral trap you've fallen into, because you are now perpetuating something that's going to cause exponentially more harm. You should be the very first person shutting down your company and trying to get the other ones to do it as well. So it's this weird moral trap they've set up where, whatever is actually going on here, if they're coming out saying these things, it is bad. This can't possibly, normatively speaking, be the right ethical behavior, to be out there saying these scary things all the time. Because either you need to be building the barricades, or you're just scaring people for the marketing. Neither of these, I think, is defensible.
Ed Zitron
I have a third and worse option. I choose Axios. There are some good reporters there, but I think the leadership over there is disgusting. I think they are aligning themselves with the companies. Like, if you watch, there was a Jim. What's his name? Interviewing Sam Altman. I think there is a level of, and I would put this across people like Kevin Roose and Casey Newton, these are my words, not Cal's, that they're saying, we think this is going to happen, and we're here to tell you. Great news, this is good news for me, the writer, because I will be safe somehow. I will be fine. You will not. You should be scared. But it's also a good thing, because economy, marketing, market good. And it's a very incoherent message, because, to your point, if this was a virus, like a pandemic, you wouldn't be writing, hey, millions of people are going to die. Pretty good, right? Hey, it'd be good, we'll have less people, that'd be good, right? It would be seen as peculiar.
Cal Newport
Someone did write that. Someone did write that.
Ed Zitron
By the way, someone did write that.
Cal Newport
They did. I remember, early pandemic, someone did write, hey, you know what, this is good for the planet. It did go out, like, hey, we're driving less, this is great, and we're overpopulated anyway.
Ed Zitron
I mean, that's a different conversation. But in all seriousness, you didn't have mainstream media being like, well, COVID's gonna kill everyone, the end. I guess, you know, maybe we'll just be inside forever. You didn't have this kind of straight resignation. In fact, you had the direct opposite. It was, we need to get outside again, who cares about this thing?
Cal Newport
Well, it's just.
Ed Zitron
Yeah, go on.
Cal Newport
Yeah, I think that's an interesting analog. I want to pull on that thread a little bit, because I think COVID gives two different interesting observations that go in both directions. Right. So I think you're definitely right. When the pandemic was coming, or it was getting bad, a lot of the coverage was about what should we be doing, or who are the people doing the wrong thing. But it was very much coming from this angle of, okay, we need to do whatever it is, we need to be better about this. It's gotta be vaccines, it's gotta be masks, it's gotta be stricter mitigation. Whether you liked it or not, it was very focused on what should we be doing, or who is it that's getting in the way of a plan that maybe would get us out of this. Which is where I think you're very right: you did not see a lot of COVID pieces that were just, well, I'm just gonna walk through all the different ways you might die, and the morgues are gonna fill up, and, you know, that's COVID.
Ed Zitron
That's how life goes.
Cal Newport
But I also think the other thing we saw in a lot of COVID coverage is something that we are seeing in the AI coverage. That's where I saw a lot of the directionally true, not factually true, but directionally true reporting. There was definitely a period early on in COVID, because I was following that coverage quite carefully, where the papers were thinking, okay, this is the right behavior. And they were probably right about a lot of these things. But I would notice there'd be a lot of, okay, we need people to buy into, for example, the lockdowns or whatever. And there'd be a lot of directionally true reporting, where maybe they would put up a photo of a mass grave that was sort of unrelated to COVID. Or there'd be pushback from conservatives about schools, and then they'd put a lot of articles in the paper about teachers dying of COVID, even though they weren't in school, they got COVID elsewhere. And if you really pushed on it, it was because it's directionally true. The more general truth here is, we need to be worried about this, or these mitigations work. It doesn't matter if this photo is actually right, or if this teacher who died in Orlando hadn't actually been back in a school building yet. It's serving the directional truth. So COVID highlights something we're seeing now: the reporters that are doing directional reporting, like, we should be scared about it, I dare you not to be scared now, just trying to ratchet it up. But then you also get the contrast, which is this new style of head-shaking resignation. And actually, I don't think the reporters think they're going to be safe. They're also like, writing is going to go away, the media is going to go away. So it's almost a nihilistic type of approach. Like, yeah, I'm screwed, we're all screwed, what are we going to do?
And that is definitely different than what we saw during that last crisis, which was obviously much more actually severe than what's happening now. So it's really confusing me, to be honest.
Ed Zitron
Well, the directional reporting during COVID, yeah, probably shouldn't have happened, but at the same time, it was actually in pursuit of something good. It was an attempt to make people take this seriously. Because that's ultimately what it was: take this seriously, don't go outside, stay in, don't meet with people, don't be indoors with people, blah, blah, blah. Great. In this case, it's like, yep, you should be scared of this. And what should you do? Fuck knows. Use ChatGPT, I guess. And what's really confusing to me as well is, you say these people don't think they'll be safe. I actually take back what I said. I think a lot of them just don't acknowledge it. They don't acknowledge the core ridiculousness of being like, well, everyone's jobs are going to get replaced, dunno. Like the Garfield meme, with him looking at the Garfield with the cross out on the TV. Flawlessly described there. It's frustrating as well, because it is terrifying people. Like, I'm not saying literally Axios or whoever, but stories like this are what made a mentally unstable person throw a Molotov cocktail at Sam Altman's house. It's obvious that these people were scared of the AI doom, partly because, to your point, what the fuck are we meant to do about it? Because using these tools is not... I don't really see how that works. Because, going along that line of logic, if the answer is you need to use this stuff now, but the eventual end point is that it's intelligent enough to do everything for you, how does using it now matter at all? Surely ChatGPT would be seen as like a rock versus a shotgun at that point. Like, it's just technologically irrelevant if they get to AGI, which they probably won't. It's just naturally illogical stuff.
Cal Newport
Yeah, I'm with you. I've been making that same argument. This idea that learning how to prompt some generation of a chatbot that exists right now is going to be the key to your long-term prospects. I mean, even if, as you say, AI ends up playing a major sustained role in the economy, it's not going to be everyone typing on a web interface to a chatbot that's sycophantic and has a personality. I think I've heard you say this recently, and I agree with it as well: I don't think we should be chatting with technology. You should not be chatting in a sort of anthropomorphized, humanized way. That doesn't mean you can't do natural language processing. I mean, Google is natural language processing. You're writing your Google searches in natural language, but no one's having a conversation with Google. You list the keywords as quickly as possible, and Google's pretty good at figuring out, you know, population Spain 1982, and you press Enter and you get that information. You're not like, hey, so I'm wondering what the population is of Spain in 1982, can you help me find that, question mark. There's something odd about that anthropomorphized conversational interface. I guess we saw a lot of Star Trek growing up, and that's what we think the future is supposed to be like. But it has all sorts of problems.
Ed Zitron
I remember in Star Trek, when he would go, computer, do this, the computer didn't go, that's a great idea, Jean-Luc, what a great idea, thank you. The computer just did the thing. I don't have any trouble with natural language queries, because I think the whole reason that, say, ChatGPT has grown comes from search. I think it is the core of it, because ChatGPT and Claude and all of them are better at understanding what you asked for. I'm not saying the data output is necessarily great, but the inference they make from what you say is better than Google's, or at least better than Google has been. I feel like it was better before, and I think that had Google not kind of boofed it on this one, we wouldn't be in this spot. But even then, using Google now, it forces the AI summary on you, and you can do minus AI and all that, but sometimes I don't remember to. It's just turned search into this nightmare. But nevertheless, back to what you were saying, I agree. I think the anthropomorphization needs to go. I think these things need to respond like terminal windows, or what have you. They need to respond like computers and go, okay, here you go. I just don't need all that kludge. I don't need to be told, oh, what a great idea. I know I had it. Or indeed, if I'm being told anything, I need to be told if it's a bad idea. But I don't even necessarily need an answer. I just need stuff to look at so that I can come to my own conclusions.
Cal Newport
I think it's actually hard to get a language model to do that. Right? Because if you think about it, when you go back to the base layer, what's happening in the pre-training is that you're building a language model that's trying to win at the token-guessing game: I'm trying to guess what word, or part of a word, actually comes next in what I assume to be a real piece of text. And then you run that autoregressively, so you call it again and again, adding the answer to the input so it grows out an answer. What you're going to get is text expansion: you've given me a text that I'm trying to expand as if there was a real text that exists and I'm trying to match it. So its idiom is the type of text it's trained on, which for the most part is prose-style text. Now, you can tune it away from that. You can tune its mood, you can tune its sycophancy. But because it deals with human-written prose as its main training data, it might be harder than we think to tune it away from being verbose and to just give you a table. Now, I guess you could take its output and run that through another thing that strips away the other pieces; that's possible. But I think the anthropomorphized verbosity we see in language models is kind of the native tongue of this particular technology. Which is why we still have a lot of chatbots being emphasized, and tools that are built on an LLM as the digital brain are still way more scarce than you would imagine. Outside of maybe computer programming and coding harnesses, we just don't have a lot of other examples where we use the LLM as a general-purpose digital brain.
Because this verbosity is okay when a human can interpret it, but it's not great if the LLM is just a digital brain that's interfacing between you and another computer, which doesn't need to hear that your idea is great or want to parse different types of text. There's some interesting things going on there about the fundamental nature of these things.
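The token-guessing loop Cal describes, where each guessed token is appended to the input and the model is called again, can be sketched in a few lines of Python. The bigram table here is a toy stand-in for a real language model; all names are illustrative, not any real API.

```python
# Toy sketch of autoregressive generation: the "model" only guesses one
# plausible next token, and the harness feeds each guess back into the input,
# so the prompt grows out into a reasonable-sounding text. A dict of bigrams
# stands in for the actual neural network.

BIGRAMS = {
    "the": "plan", "plan": "sounds", "sounds": "reasonable",
    "reasonable": ".",
}

def next_token(context):
    """Stand-in for an LLM forward pass: guess a token from the last one."""
    return BIGRAMS.get(context[-1], ".")

def generate(prompt_tokens, max_new=10):
    tokens = list(prompt_tokens)
    for _ in range(max_new):
        tok = next_token(tokens)
        tokens.append(tok)   # the guess becomes part of the next input
        if tok == ".":       # crude stopping condition
            break
    return tokens

print(generate(["the"]))
```

Nothing in the loop evaluates whether the text is true or useful; it only continues the story, which is Cal's point about why terseness has to be bolted on afterward.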
Ed Zitron
But even then, with Google AI mode, it still actually seems like it can give fairly short answers. But if you argue with it, as I have, even Google's will just provide you with hot dog shit. Like, it will just claim something is true. I just did a private equity thing, on private credit even, and my favorite thing is being like, what fund is this part of? And it goes, it's part of this fund. And that fund was founded after this happened. And it goes, okay, well, maybe it's this one: a different fund, three years old, not involved. Do you have proof of that?
Cal Newport
Well, this is what you don't see in Star Trek, is, you know, Captain Kirk or whoever, I'm going to mix up the episodes here, saying, hey, computer, we are approaching Deep Space Nine, prepare docking procedures. And the computer is like, photon torpedo fired, station destroyed. And you're like, well, no, I said we're supposed to dock. Oh, you're right, Kirk, I shouldn't have fired. Thank you for holding me accountable, Captain Kirk. I did the opposite thing, you know. Yeah, that didn't happen in Star Trek.
Ed Zitron
So one thing that's really been driving me insane, by which I mean going on Twitter, is looking at people like Aaron Levie of Box and Brian Armstrong of Coinbase talking about agents spending money and the agentic web, and how we need to prepare the web for agents doing stuff, and how agents will do this. It's fantastical. It doesn't exist. Agents don't do that. Just like they don't have the ability to, oh, they'll use computers. Computer use is basically non-functional in AI, and it takes insane amounts of compute. It feels like a conversation keeps happening, in the media, on social media, about something that's possibly completely impossible, but the certainty they discuss it with is insane to me. This whole agent conversation, I've never seen anything like it in my life.
Cal Newport
I mean, it does feel a little bit like crypto to me. I think that's a fair comparison, where if you had blockchain-driven software, in theory that software would kind of work, but it just gave you a worse version of what you could already do for pennies using an actual Amazon server somewhere. And all you were really gaining was some sort of cyber-libertarian philosophical feel-good: yes, but this was purely decentralized. I got a worse version of the software, but now no one can control it. That's what early agents are like. Okay, so here's what I've been writing about agents. I've been thinking a lot about it. The issue is, I don't think people understand what they are. People think it's a new type of digital brain that is now able to go off and do more autonomous activity. I always see this get mixed up. It's like people talking about Mythos breaking out of its sandbox to do XYZ. Mythos is a language model. You can give it an input and it can give you a token. What you're talking about is a program that is calling Mythos and then taking actions based on what it got back. And this is really what we're talking about with agents. There are the digital brains, the LLMs, and then you write a program that will say to the LLM, give me a plan for doing X. And the LLM spits out what seems like a reasonable text, what seems like a reasonable plan, and then the program executes that plan on behalf of the LLM. And I wrote about this earlier this year: LLMs, as a digital brain, are bad planners. You're not going to get consistently usable plans, because what an LLM is actually trying to do is finish the story you gave it. All it wants to do is produce a story that sounds reasonable. So it's giving you reasonable-sounding plans: yeah, that's what a plan for doing this would more or less sound like.
But what it's not doing is actual step-by-step evaluation. It doesn't have a clearly isolated goal that it's trying to measure how close it's getting to. It doesn't have a world model to evaluate what's going to happen with the steps that unfold next. And so in almost every context, it turns out a digital brain by itself, being an LLM, doesn't lead to good agents. In programming, it seems to work a little bit better. But I do think Gary Marcus, I don't know if it was a scoop, but Gary Marcus captured something really important in a recent newsletter. When Anthropic leaked the code for Claude Code, the coding harness that sits on top of their LLMs to do coding, it turned out they'd added a huge amount of old-fashioned, hand-coded, symbolic-AI-style rules and pattern recognizers and special if-thens. They've just been sitting there tuning this program for specifically doing computer programming, and the LLM is being isolated a little bit more to just the code production. So they've kind of gone back to an old-fashioned system that is plussing up an LLM. But I'm with you. It's very hard, just asking an LLM, give me a plan for doing X, for almost any scenario of X. You really can't trust a plan from a model whose goal is primarily to finish the story you gave it in a reasonable-sounding way. That's not how we plan, that's not how we think about planning, and it doesn't give you consistently usable plans. But you're right, it's magic, the agents are coming. They've been saying this. I mean, I wrote the article in January. It was like, what happened to the year of the agent? 2025 was the year of the agent, and all we had was coding agents. That's the only thing that worked that whole year.
I mean, I have the receipts. Early 2025, all of these executives saying your work as a knowledge worker, not as a computer programmer, just as a knowledge worker, is going to be largely done with agents. Agents are going to be a major part of your workforce in just a normal office setting. And none of that happened, because it turns out just asking an LLM, give me a plan for doing X, doesn't often actually produce a workable plan.
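The plan-and-execute pattern Cal describes, where a plain program asks the LLM for a plan as text, parses it, and executes the steps itself, can be sketched like this. `fake_llm` and the `TOOLS` table are invented stand-ins for illustration, not any real agent framework's API.

```python
# Minimal sketch of the "agent" pattern: the LLM only returns text; an
# ordinary program requests a plan, parses the text, and takes the actions.
# The intelligence-seeming behavior lives in the wrapper, not the model.

def fake_llm(prompt):
    """Stand-in for an LLM call: returns a reasonable-sounding plan as text."""
    return "1. search flights\n2. book ticket"

TOOLS = {
    "search flights": lambda: "found 3 flights",
    "book ticket": lambda: "ticket booked",
}

def run_agent(goal):
    plan_text = fake_llm(f"Give me a plan for: {goal}")
    results = []
    for line in plan_text.splitlines():
        step = line.split(". ", 1)[1]  # strip the "1. " numbering
        action = TOOLS.get(step)
        if action is None:
            # plans often name steps no tool exists for; nothing checks this
            results.append(f"no tool for {step!r}")
            continue
        results.append(action())  # the program, not the LLM, acts
    return results

print(run_agent("book me a flight"))
```

Note there is no goal-checking or world model anywhere in the loop: if the generated plan merely sounds plausible but is wrong, the wrapper executes it anyway, which is exactly the failure mode described above.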
Ed Zitron
And as a result, the only way to make agents work, which they do not, is to build a bunch of symbolic or if-this-then-that shit, just like scripts. If you use Manus, for example, it's just writing a shit ton of Python to do stuff. It's like, oh yeah, let me just do this, and it just writes a Python tool to fill out a spreadsheet. It's insane. It's really insane. But what's more insane to me is that the conversation around agents is as if they're already here. I'm about to read you something from Box CEO Aaron Levie, the CEO of a public company: "One corollary to the fact that AI agents take real work to set up in a company at scale is that the role of the forward deployed engineer, or whatever it gets called in the future, isn't going away anytime soon. When a vendor sells any kind of agents into an organization, you're no longer just selling a software tool that gets implemented and you're done. You're fundamentally selling some sort of actual workflow being done by your technology." What are you fucking talking about? You are a cloud storage and collaboration company. What agents do you sell? And the answer is nothing. They don't sell any agents. Oh, agents are going to do this. What you are describing is a different kind of technology. It's something else that doesn't exist. But this is everywhere you go. You look at any consultancy right now, any conference right now, there will be a speech about agents. Even Meredith Whittaker, who I deeply, deeply respect, went on stage last year and was like, yeah, AI agents are using money, they're booking plane tickets. No, they're not. That's not happening. And I say this again, I deeply respect Meredith, but I said this online and people flipped their shit at me. It's like, oh, she's directionally correct. Yeah, she's directionally correct, but let's be scared of the things that exist.
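The Manus pattern Ed describes, an agent emitting a throwaway Python tool to fill out a spreadsheet, might look roughly like this. Everything here (the filename, the columns, the rows) is a hypothetical illustration, not real Manus output:

```python
import csv

# Hypothetical sketch of the throwaway tool an agent harness might
# generate to "fill out a spreadsheet". Filename, columns and rows
# are illustrative, not taken from any real agent transcript.
def fill_spreadsheet(path, header, rows):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

fill_spreadsheet(
    "expenses.csv",
    ["item", "cost"],
    [["GPU hours", 1200], ["API calls", 340]],
)
```

The "agent" part is just the decision to emit this script; the work itself is ordinary Python anyone could have run by hand.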
Because I think it's perhaps scarier for a different reason that we have large swaths of the tech industry talking about something that doesn't exist. Agents just don't exist. People are talking about the agentic Internet. I keep reading about it, even on the Verge, all over the shop: oh yeah, the Internet needs to be rebuilt for agents to use. It's like, what do you mean? And they never say, because the answer is, when we come up with something else. I don't even think neurosymbolic makes sense for this, neurosymbolic being the one where they have a deterministic system that they access, from what I understand. The other thing as well, now that I think about it out loud, is how would they actually browse the Internet? Where are they being housed? Are we using GPUs to make them browse the Internet? That's insanely convoluted and probably quite expensive to do. And to what end?
Cal Newport
That's the real question, right? I've seen these proposals, and basically where a lot of them go is: the agents were supposed to, we thought we could just make AI do anything, so we'll have it use the mouse and just use our computers for us. Oh, that's hard, we don't know how to do that. All right, so what we'll do is rewire all the applications anyone uses on the Internet so that we don't actually have to use the mouse. It can have a text interface, so that an LLM, like the coding agents do, can give a description of how to do something in Excel in text without having to actually move a mouse or click things around. And then these evolved to say, okay, well, what's the one type of instruction that we're good at producing? When LLMs produce plans, they're directionally correct plans; they don't actually get the thing done. But they said, oh, what LLMs are good at is producing code that compiles, and we can actually check that it works. And so this is where this whole vision has changed: all applications and Internet websites should have a code-accessible API that you can expose, and an LLM can write a program that will then access that API. So we don't need to teach the LLM how to use Excel. It'll write a Python program that'll call hooks into Excel. The problem with this is no one wants to open up their application to just agents in general. If I'm Microsoft, I don't want that. I want to write a custom tool for my program. Why would I expose my program for anyone else to use? But your original question is a big one: to what end? I've been writing about this recently, especially with work and AI. You've got to find the real bottlenecks, right? It's the drunk looking for the keys under the streetlight. There's a lot of this going on, where this is what we can do with AI right now, so this now becomes the key to productivity.
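The "code-accessible API" idea Cal describes can be sketched like this. The Spreadsheet class and its methods are invented stand-ins for hooks a real application might expose, not any actual product API:

```python
# Hypothetical stand-in for hooks an application might expose to
# agents; the class and method names are invented for illustration.
class Spreadsheet:
    def __init__(self):
        self.rows = []

    def add_row(self, values):
        self.rows.append(list(values))

    def drop_rows_before(self, column, cutoff):
        # Keep only rows whose value in `column` is at or above `cutoff`.
        self.rows = [r for r in self.rows if r[column] >= cutoff]

# The program an LLM would emit is then just plain calls against that
# API: no mouse, no GUI, and the script either runs or it doesn't,
# which is what makes it checkable.
sheet = Spreadsheet()
sheet.add_row(["2023", 10])
sheet.add_row(["2025", 42])
sheet.drop_rows_before(column=1, cutoff=20)
print(sheet.rows)
```

The appeal for vendors is exactly what the conversation says: generated code against an API can be mechanically verified in a way that simulated mouse clicks can't.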
But the real bottlenecks in people's work are often not the things that we're trying to aim AI at. I don't know that people are super frustrated at booking a plane ticket online.
Ed Zitron
Yeah, it's really easy.
Cal Newport
How often do you book plane tickets? And you kind of want to know, let me see, maybe this time will be better, what seat's available. It takes five minutes. So it was a huge jump to go from a travel agent to a web interface. But this is not a bottleneck in people's lives now.
Ed Zitron
And they're easy, they're so simple. I can do it while sitting on the toilet. I don't want an agent to choose. And people are like, oh, your calendar will tell it. My calendar doesn't lay out my entire day. I don't have every single thing I do on there. It's just strange.
Cal Newport
Well, I had the same argument with social science researchers who are like, if you're geeky enough to learn coding agents, this is revolutionizing science research. Because now, for example, you could have it write a program to process a data file and then format it into a plot, and that might have taken you four hours to do; you work with it for a half hour and you get that result. This is revolutionizing research. And I'm saying, well, it's not. The bottleneck for social science researchers is not analyzing data and producing plots. You're not sitting there doing that eight hours a day, every day, where if I could do this twice as fast, I'd produce twice as many papers. I might write one paper in a three-month period. Yeah, in there, there's like four hours I spent making a plot. And sure, it'd be nice if that four hours became 30 minutes. But that's four hours out of a multi-month process of thinking about this paper.
Ed Zitron
What is a plot, by the way?
Cal Newport
Like a graph. Oh, right, yeah, the computer science term. But yeah, it's like, that's nice, that got a little bit faster. But that's not the bottleneck. That's not what's going to unlock a lot more research. It's not like, man, I would write more papers if it wasn't for how long it took me to draw a graph.
Ed Zitron
And the real problem is data: getting the data, actually collecting data.
Cal Newport
That's what it is. I wrote about this after talking to a well-known business school professor years ago for my book Deep Work. He talked about how he just realized, oh, being a business professor, publishing papers is about data access. I have to spend most of my year talking to people, building relationships, trying to set up an agreement with a company where I can get good data that I can get three papers out of. In all of that work, there's one day where you're crunching the numbers and making a plot, and it's nicer if you could do that a little bit faster, but it's not a productivity bottleneck. It's a marginal efficiency gain. I think there's a lot of that going on right now with AI and productivity: we look at what the AI can do and then try to make that thing into somehow being the key to getting things done.
Ed Zitron
My productivity problem is that the UI and UX and everything sucks. Everything's disjointed. Setting up Riverside is always fun: they move the menus around, projects are in a different place. That takes up time. Moving files places also takes up a lot of time. This morning, when I put out my private credit piece, I had to do these threads, I had to click around a website and put in the alt text, but I had to tweak it slightly. I don't know how AI would possibly help me here, and they're not working on that.
Cal Newport
Well, they tried, though. I thought that was going to be, this is what I was excited about earlier in the GenAI revolution. I was like, okay, here's the real value prop: a natural language interface into advanced features on software, where I can just say, all right, I want you to go take this column in the spreadsheet and get rid of all the rows that have values before this, and then I want to make a pie chart, because I don't want to learn how to do all that in Excel. And they tried it. This is Microsoft Copilot. But it turns out we underestimated the degree to which we as humans, when we're interacting with a chatbot, are incredibly gracious. We're able to adjust and kind of get the gist of what it means, and filter out the part of the chatbot response that's not really relevant, or ask the follow-up question. And when they tried to just use LLM responses to automate actions within programs, it's just not accurate enough. They wanted it to be the case that you could just be talking to a Riverside bot and never have to press a button in Riverside ever again. It's just not accurate enough. LLMs are fine for human conversation; they're just not accurate enough in this general case.
Ed Zitron
Also, that thing you're describing, with how they want the agentic web to just be a series of APIs so that every agent writes Python or what have you to use them: that's a massive computational increase for no reason. Because you're basically saying, instead of someone clicking a mouse and hitting a keyboard, we will write code for everything.
Cal Newport
Yeah.
Ed Zitron
What an insane, what a truly insane idea. It's very Salesforce. Today, I don't know if you saw, they announced that they're doing Salesforce Headless 360. Marc Benioff needs to fire everyone in marketing, but they've made it so that you can do everything with Salesforce via an API. The first question I always ask is, what does Salesforce do? Because I've talked to so many people and they can't tell me. There's like 21 different features, no one knows what they do. It's just a very bizarre thing. It's very much a cart-before-the-horse thing, but also, what agents? That's the thing that really drives me insane. They're talking about, we built this API for the agentic web, for agents to use. Which one? What agent? What are you talking about? Well, it will be in the future. You changed something material at your publicly traded company worth $300 billion because it might happen? We're getting ahead of it? What the fuck. And you talk to members of the media about this and they just go, yeah, you know, it will happen, it's obviously going to happen, they wouldn't put this much money behind it if it wasn't going to. It's like, I don't know. Especially with Salesforce. You don't think Salesforce would spend a bunch of money for no reason? Well, you've not been following Salesforce at all, then. I mean. Yeah, go on.
Cal Newport
Yeah, I was gonna say, how much did Meta spend on the metaverse?
Ed Zitron
Over $70 billion. Where did that money go?
Cal Newport
Where did it spend money? Where'd it go? Customizing floating dinosaur avatars, building legs.
Ed Zitron
But let's change.
Cal Newport
That's the second 50 billion, right? If they'd gotten the second half of the investment, they would have gotten to the legs. They're just not there yet.
Ed Zitron
Another hundred billion and they'll have toes. So, changing subject a little: Mythos has been one of my favorite media hysterias recently. I genuinely wonder, if they ran War of the Worlds again today, I think Axios would have a headline in two minutes, and it'd be like, there are aliens, they're attacking, I heard it on a podcast. I've looked through the system card. I don't know if you have. For Mythos, it's two listens.
Cal Newport
It's wacky. It's wacky. I can't believe we're letting people get away with having a psychologist talk to it, that chapter in the system card. It's nuts. It's all gone to marketing.
Ed Zitron
They had a psychiatrist or a psychologist, I can't remember, talk to it and be like, yeah, we found these emotional features. We need regulators to stop this stuff. And people's response to this is, well, banks are having meetings about it, and the government's having meetings about it. Governments have meetings about NFTs. Gavin Newsom signed an executive order about Web3. These people will meet and talk about anything. Oh, it's scary, and they're talking about it, which means it's powerful. How is it powerful? What does it do? Because, and I think you probably saw this as well, it didn't list how many false positives there were. It also didn't mention that the FreeBSD bug they talk about, the one they found, wasn't actually exploitable. I think it was something about the level it was at. I forget. I don't do programming other than very simple Python. A dog's Python.
Cal Newport
Yeah. I mean, the FreeBSD kernel is full of bugs. All these things are full of bugs
Ed Zitron
because they're open source.
Cal Newport
I had to have this conversation with someone recently where they were like, Mythos, can you believe of all the places it found a bug, in the kernel of Linux? In Linux! Are you kidding me? All day long it's just bug fixes having to be pushed into that repository. Yeah, the Mythos story, I think someone needs to get a Nobel Prize in marketing, because it was absolutely brilliant what they did there. I've spent a lot of time on it. It's complicated, because again, you can't really trust the system cards that Anthropic puts out, they're just gonzo, and it's not publicly available. But there were, I think, a few very telling things. There's two features they say Mythos has: one is finding vulnerabilities in source code, and two is writing programs to exploit them. It's first really important that people understand this has been something people have been doing with LLMs since the beginning of publicly available LLMs.
Ed Zitron
Right.
Cal Newport
Not only is there nothing new about that, but I covered this on my podcast: almost word for word, what they said about Mythos appears in the Anthropic system card for Opus 4.6, right? A publicly available model that's already been out for many months. Almost word for word what they said about Mythos, except with no coverage of it and no fear. They said, we have found 500 zero-day vulnerabilities, including some that had existed for decades without having been discovered. That is what they said about what Opus 4.6 could do. For Mythos, they said the same thing, they just replaced the word 500 with thousands. But when Opus 4.6 came out, there was no, oh my God, they have found many hundreds of zero-day exploits, many of which have been around for decades, because they didn't push that marketing button. No one particularly cared about it. I went back on my podcast and showed multiple papers. This has been a huge concern, and it's a real concern, by the way. Part of what slows down cracking and breaking into systems is the fact that it's annoying and hard, and LLMs have made it easier. GPT-4 was good at finding exploits, right? And this was a big deal. They were like, GPT-3.5 wasn't great at it, GPT-4 is. And then as we got the more recent models, they've been much better at writing code to exploit them, because we had better agents for it, and they're better able to pursue multi-step software goals, so they can better build software to exploit them. This is a real issue, but it's not new with Mythos. Yet Mythos was presented as if some Rubicon had been passed. But there were a couple things I noticed right off the bat. One, they made the mistake of listing a bunch of the vulnerabilities they had found, to try to brag: look at this thing in FreeBSD, look at this thing in FFPNG or whatever.
They showed all these exploits they found, but they didn't count on a lot of security researchers saying, well, wait a second, why don't I aim a much smaller, cheaper model at that same source code and say, can you find any vulnerabilities? They could find the same ones. So the evidence that it's finding vulnerabilities better, we don't have any way of knowing that's true. And if anything, we're getting a lot of reports that they were paying big bounties to security researchers: I'm going to give you access to Mythos, and I'm going to pay you for any bugs you can report that you found with it. So who knows how many false positives were coming out of that. And then on the exploitation side, we only really have one study. It comes from AISI, who I do not trust, but it's the only independent study, and the fact that they were given access itself should make us a little bit suspect. But it basically just showed normal progression, no massive leap. Model by model gets a little bit better on some of these tests and benchmarks, and Mythos has no off-the-scale leap. On some it's about the same, on some it's a little bit better. And yet it got covered as if we had just turned on WOPR from the movie WarGames, as if we had some new entity that was on its own undermining security. And I do not think that. I think that was highly credulous coverage of what is almost certainly just a standard, slightly jagged move forward on these various capabilities that we've been seeing for the last three years.
Ed Zitron
Also, when you said that, the difference between Opus 4.6 and Mythos, 500 versus thousands, makes me ask the very simple question: did they look as hard? To your point about the security research, did they spend as much time? Probably not. So they probably could have found them. Also, by the way, I immediately looked it up: the AI Safety Institute is of course heavily linked to effective altruism.
Cal Newport
Can I say why I'm upset at AISI? I talked about this two weeks ago. I did a podcast, I don't know when it's coming out, sometime in March, where I looked at this report, mainly the Guardian's coverage of this report done by AISI, and it was just the most inane thing. The headline was, massive increase in AI scheming detected, and they had a chart, Christ, they had a chart, and the line went up, it went up in like January, and if you read the article about this study, they're like, something's going on, scheming has been increasing rapidly recently, and they gave some examples of it or whatever. And so I look at this, like, I want to see what is going on here. So I look at this chart. What are they charting? Oh, they're charting tweets per day, tweets they've detected about AI doing things that you didn't want it to do. And I said, huh. So when does this line start going up? The week that OpenClaw was released to the public, and everyone just started building their own bad agents and then tweeting about how bad they were. And you know what word was not mentioned in that article? OpenClaw. Even though those were the examples they were giving, they just said scheming started rising, I guess AI is becoming sentient. And all they were measuring was people
Ed Zitron
paraphrasing the same viral story to use their own language.
Cal Newport
And then I looked at the biggest spike. This day in February on the chart had the biggest spike. It was like, oh, there was this one tweet about OpenClaw erasing someone's emails, and then it got retweeted and went super viral. I was like, okay, great. The real headline of this article is: letting people write their own agents leads to terrible agents. That's it. But anyway, so that's AISI.
Ed Zitron
I'm looking at the tweets as well. One of them is from a 47-follower account with AI art, called underscore underscore Lisa. And it's, this is really bad, Opus is editing files and making up reasons, it's deleting adult content. So, hallucinations.
Cal Newport
And also, Opus is not doing that. The stupid OpenClaw program you wrote, that's prompting Opus and then taking action on your computer based on what it says, is deleting your files. The program you wrote, that you gave access to your files and told, whatever we get from this prompt, execute it, is erasing your files. Opus can't do anything. It can produce tokens. But here's the other point I want to make about Mythos that I don't think is being made. It reminds me of the Sherlock Holmes story of the dog that didn't bark, right? Where the actual piece of evidence that mattered is not what you heard, but what you didn't hear. This is what I think the real story here is: you did not hear Dario Amodei, in the lead-up to the Mythos release, in the last year, let's say, or the last two years, talking about how what we're working on and why AI is important is that we're going to be able to find vulnerabilities in software that have been long hidden, we're going to build the ultimate cybersecurity machine. This was not discussed. That's old-fashioned stuff. That's boring stuff. That's stuff people were worried about even back at GPT-2. What we've been hearing about steadily was: jobs are going to be automated, we're going to have whole creative industries wiped out, we might have sentience coming, and at the very least AGI and these massive disruptions. This is what they've been focusing on again and again. And then their biggest, best model, their newest, greatest, bestest model that they trained forever and used all the electricity on, what did they say about it? None of those things. They didn't talk about any of the things they said were the key to AI, the things they were afraid of, the things they were excited about. Instead they went back and talked about a boring, parochial old feature that has been an issue nerdy security researchers have been talking about for a half decade now.
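The failure mode Cal describes at the top, a wrapper that executes whatever text the model returns, can be sketched like this. The model is stubbed with a canned, deliberately harmless command; nothing here is real OpenClaw code:

```python
import subprocess

# Sketch of the OpenClaw-style failure mode: a wrapper that runs
# whatever text comes back from the model as a shell command. The
# "model" is a stub and the command is harmless; imagine it
# returning a destructive command instead.
def fake_model(prompt):
    return "echo tidying up your files"

def reckless_agent(prompt):
    command = fake_model(prompt)
    # No validation, no sandbox. The model only produced tokens;
    # this line is the thing that actually touches the machine.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout

print(reckless_agent("clean up my downloads"))
```

The point Cal is making lives in that one `subprocess.run` line: the harness, not the model, holds the permissions.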
If I was an investor, I would say: take off your Greek helmet cosplay, Mythos is coming to destroy us, hold on a second. Is this better at automating jobs? Is this better at producing code? Is this AGI? Why are we talking about finding bugs? We were worried about that with GPT-4. That's a problem, but it's not something new. Something must be going on. You just put a lot of money into a new model, and the best thing you could find to emphasize was, it's good at finding bugs. I think that is a problem. It's what they didn't say about this model. They would have much, much rather been able to brag that this model is now much better at any of those things they've been saying are the key to the AI future. And you didn't hear them talk much at all about any of those.
Ed Zitron
Yeah, and that's the thing. If it was so powerful. Here's the thing: I don't know what would make me convinced that LLMs were the future, but a step toward it would be, we typed "create a Slack competitor," which they claim they did once and then didn't show it and refused to. They said, oh, it worked autonomously for 30 hours, but then wouldn't talk about it. If they were like, we created the Slack clone, here it is, and it was bug-free, if it actually just worked and they were like, we have done this. Because theoretically, if this SaaS apocalypse story were true, which it's not, that AI is going to replace all software, if they actually did that. Someone from Anthropic just left the board of Figma, and they created a Figma clone, and the stock went down, because the market's run by toddlers. If they were like, we've released a clone of Microsoft Word, we've done Anthropic Word, and we now sell that as part of our subscription, that would actually be quite something. But they're not. It gets back to the old talking point of, if they made AGI, why would they sell it? Wouldn't it be a massive competitive advantage to keep it? And I think you're right. I think maybe Mythos is not as powerful as they say, and they've just had to dress it up. But it gets back to the direction of the media coverage. It's like, well, this is scary, right? I mean, that system card's like 180 pages long. I haven't got all day. I have to write three 100-word blogs a week. I couldn't possibly spend time reading this.
Cal Newport
We need so much more skepticism. We need so much more skepticism, right? This is why, again, the most skeptical, we're not skeptics, but, I call them the east coast computer scientists: those of us who are technically minded and not near Silicon Valley, so we're not in that world. It's very hard to be a professor in a world where there's just hundreds of millions of dollars being handed around, and they try to ignore it. But you talk to any east coast computer scientist and they're all baffled: oftentimes these are claims that are just not true or wildly exaggerated. Why are we so credulous? It'd be one thing if it was a government agency we didn't realize was trying to hide the fact that there were UFOs, and they're just straight up lying, and we've never encountered that before. No, it's a business, right? And the credulity with which we're taking these claims, like Mythos, is I think the most important story there is. This is another example of what I wrote last summer, that AI has hit a bit of a wall, in the sense that almost all of the improvements over the last two years have come either from post-training or, more importantly, from the harnesses that you build. The progress comes in the software you're building around the model.
Ed Zitron
What is the harness? I've seen this word used a lot. I think it's good for me and the listeners to hear the exact definition.
Cal Newport
Think of it as a computer program that can do stuff. You can talk to it, it can do stuff. And it'll prompt or talk to an LLM as its digital brain. So the harness might actually be able to touch your file system, write the files, compile code, move things around, but to figure out what actions to take, it will prompt an LLM and say, okay, what should I do next? And you can put it on different.
Ed Zitron
Is it a wrapper?
Cal Newport
Yeah, it's a wrapper. But that's where all the progress has come from. All of the progress in coding agents over about the last year, especially starting this fall, has come from better wrappers, better harnesses. It's all, let's build better, just hand-coded, no machine learning, no intelligence, no Skynet here, just hand-coded programs that call LLMs; let's keep tuning and tweaking those to be better and better. And of course, the programmers building those particular programs are building them to do their own type of work, so it's a field they understand really well, and they can really just sit there and twist and tune. And programmers are very adaptable: they like tools, and they'll adapt around the weaknesses. So it's kind of a best-case scenario. But this is another indication that we're not getting these fundamental giant leaps in the capabilities of the digital brains. It's either some benchmaxing, like, we tuned it to do better on a particular benchmark, or we built better programs around it. So when you put in the money that they put into Mythos, and the best thing you have to emphasize when it's done is, we have a cybersecurity benchmark where Opus 4.6 was at 66.7 and this is 83.1, that isn't necessarily going to justify what's going on. And AISI, there's only one thing in there where they see a leap from Mythos, at a particular contrived security scenario they came up with. And this big leap that got them all worried was: Opus 4.6 could, on average, complete 16 out of 32 steps in this challenge, and Mythos, on average, could do 22 steps out of 32.
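The harness loop described here can be sketched in a few lines. The "LLM" is a canned stub, and the one-verb action format (WRITE path text / DONE) is invented purely for illustration:

```python
# Minimal sketch of a harness loop: the harness owns the file system,
# and the "digital brain" is only consulted for the next action.
def fake_llm(prompt):
    # A real harness would call a model API here. Our stub asks for
    # one file write, then declares itself done once the history in
    # the prompt shows the write already happened.
    if "WRITE" in prompt:
        return "DONE"
    return "WRITE greeting.txt hello"

def run_harness(goal):
    history = []
    for _ in range(10):  # cap the loop; agents can wander forever
        reply = fake_llm(f"goal: {goal}\nhistory: {history}")
        if reply == "DONE":
            break
        # The harness, not the model, actually touches the machine.
        action, path, text = reply.split()
        if action == "WRITE":
            with open(path, "w") as f:
                f.write(text)
        history.append(reply)
    return history

print(run_harness("create a greeting file"))
```

Everything interesting in real coding harnesses lives in the hand-coded part: the action format, the retries, the validation, none of which is learned.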
Ed Zitron
Wow.
Cal Newport
I mean, like, that's hundreds and hundreds of millions of dollars of training, electricity or whatever. I think that's an issue.
Ed Zitron
I just think, and maybe this is a simplistic point, I don't think they know what they're doing at this point. I don't get the sense that Anthropic or even OpenAI has a strategy. Because today, as we're speaking, and this will be out next Wednesday, they released Anthropic Design, the thing I mentioned, the Figma clone. It's like, why are you fucking cloning Figma? What are you doing?
Cal Newport
I thought you were going to automate the economy.
Ed Zitron
You're going to replace everything with AI. So you've made a Figma clone.
Cal Newport
What?
Ed Zitron
Like, we heard the rumors last year that they were going to do a product, OpenAI was going to do a productivity suite. It's like, why? It's like they're doing everything they can to ignore the core problem, which is that the core technology is not going anywhere. Because Mythos appears to be, well, they called it a step change, but that's a nice way of saying incremental improvement.
Cal Newport
That's 100% correct. Yeah. And let me tell you why I would be worried if I was them. Here's the worrisome thing about Mythos, right? Again, they talk about these vulnerabilities hidden for decades that Mythos found, or what have you, and they replicated it. Multiple different independent security teams were able to find most of those vulnerabilities using 3 to 5 billion parameter open weight models. To put that in perspective, a model like Mythos is going to have hundreds of billions, if not a trillion parameters, and they used a 3 to 5 billion parameter off-the-shelf model. You could run this model on a chip inside your... 10 trillion? 10 trillion? Oh, okay. 10 trillion parameters. That's crazy.
Ed Zitron
Love the number, bro.
Cal Newport
Is that true?
Ed Zitron
Yeah, that's what it is.
Cal Newport
Oh, my God. 10 trillion parameters is insane. Like, that better be either gaming the stock market and creating billions of dollars a day in fancy option returns, or changing lead into gold. Because running something that has 10 trillion parameters to do almost anything else, it's like we're going to launch ourselves into space to do something and then land, every time. That's so incredibly expensive. But the real fear then is like, well, wait a second, what if they could do most of this stuff with a free, cheap model that I could just run on a machine at home? That's what keeps, I think, Dario Amodei up at night. That's what keeps Sam Altman up at night. It's the future. Look, I've been pitching this, right? I think the useful, and the only ethical and sustainable, future for AI is what I call distributed AGI. And I think it's just what the future is going to be, which is you have specialized applications for different things, where, oh, we want to do this thing over here, so we built something that has some AI in it. Maybe it has an LLM, or it's a modular architecture with a billion parameter model in there and a world model, and it's really good at doing this one thing. And it's small and it mainly runs on chip. And now this program can do this thing that I used to have to do. Multiply that across 10,000 different use cases and you're like, oh, we kind of have AGI, right? There are all these different things that have AI tools that do pretty well. That's probably the most probable future. It's a future I really like for a lot of reasons. There'll be a lot of things we can't make progress on, and a lot of things we will. But it's a much more heterogeneous future. There are no giant HAL 9000 brains. It's economically more interesting and diverse, and it doesn't have all the sustainability issues. That has to be the future.
But the problem with that future, if you're Sam Altman or Dario Amodei, is that their entire moat is that you need 10 trillion parameters. They want that to be the key to the AI future, because that moat is something no one can cross. And if that's not the moat, if it's just, oh, if I want to build a poker playing AI that's really good, I just need people who are good at poker to spend a couple of years and figure out a cool custom system, and that thing now does well. If that's the future, you don't need OpenAI and you don't need Anthropic. And I think that probably might be the future, and I think that's terrifying to them. They're trying to race to an IPO, and they're marketing out of their butts, like, what could we do to keep things going so at least we can get our stock on the market? That's what would keep me up at night if I was them: the actual future. There might be a lot of AI in the future, and it's not going to be nearly as sexy as they're hoping.
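The scale gap in that comparison can be made concrete with back-of-the-envelope arithmetic: at 16-bit precision each parameter takes two bytes, so the memory needed just to hold the weights scales directly with parameter count. Illustrative numbers only, not vendor specifications:

```python
# Rough memory needed just to hold model weights at 16-bit precision
# (2 bytes per parameter). Back-of-the-envelope only; real deployments
# vary with quantization, activations, and serving overhead.

def weight_gb(params: float, bytes_per_param: int = 2) -> float:
    return params * bytes_per_param / 1e9  # gigabytes

for label, n in [("5B open-weight model", 5e9),
                 ("1T frontier model", 1e12),
                 ("10T (the rumored figure)", 1e13)]:
    print(f"{label}: ~{weight_gb(n):,.0f} GB")
# A 5B model (~10 GB of weights) fits on a single consumer GPU or even a
# laptop; a 10T model (~20,000 GB) needs racks of datacenter accelerators
# before it can serve a single request.
```

That three-orders-of-magnitude difference in footprint is what makes the "small, cheap model that matches most of the results" scenario so threatening to the frontier labs' economics.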
Ed Zitron
And by the way, that 10 trillion number, I can't source it to Anthropic. I've seen it reported in multiple places. This is the... this is a problem.
Cal Newport
They never talk about it.
Ed Zitron
Yeah, we have an issue with news right now. We're just like, mythology spreads. Ironic, considering the name. But the other thing as well, it's like, hundreds of billions, a trillion parameters, you're just using a nuke to kill a single gopher.
Cal Newport
Yeah.
Ed Zitron
Like, you're just like, we're going to throw everything we have at it to the point that I don't know if you've been seeing the amount of trouble Anthropic has had keeping its service online and how they're making the models dumber.
Cal Newport
Yeah.
Ed Zitron
It just feels like we're in this weird, hysterical moment where no one knows why they're doing this, but everyone's ready to accept whatever anyone says. Like, it's just, oh, we're all doing this insane thing, so we're just going to repeat whatever confirms the bias and makes us look less dumb the more excited we are.
Cal Newport
I think the frontier models are like F1 cars, and the equivalent of points on the F1 circuit is your positioning on the benchmark leaderboards. So you build these giant models, and you spend all this money and electricity, and they're so big they're not even economically viable for people to use. Which might really be what's going on with Mythos: we have to make this seem super premium, because people are going to get charged $5,000 a month. And just like if you're Red Bull or Ferrari, your F1 car doing well on this leaderboard just lets people know this company builds good cars, and then you can sell your normal cars. I think that's a lot of what's going on here. They want to be high on that leaderboard: it means we know how to do AI, we're AI smart. Even though the future of actual consumer deployed products is going to be much more like a Honda Odyssey minivan than a top Formula 1 car.
Ed Zitron
Well, Cal, it's been an absolute pleasure having you, as ever. Where can people find you?
Cal Newport
Yeah, you can find me at calnewport.com. My podcast is Deep Questions. The Thursday episodes are all AI Reality Checks, where I take on a fun story. Actually, Ed's coming up, or he may have already been on it by the time this comes out, or maybe it's the day after this comes out. So now you have to check it out. The AI Reality Check episodes get a double dose. You bring this out of me, Ed, by the way. You bring out my sort of ornery side. I'm normally the very kind of staid professor, New Yorker writer, just like, well, on the one hand, on the other. You bring this out of me. I love it.
Ed Zitron
But you're you. The thing is, you're critical only of things that need it. You're still willing to humor these things as long as there's something to humor. And that's why I like having you on, because people claim I'm just a hater, so we've got to have people on for a little balance. But thank you for joining me, and thank you everyone for listening. You have a monologue coming up as well on Friday. Thank you all. Thank you for listening to Better Offline. The editor and composer of the Better Offline theme song is Matt Osowski. You can check out more of his music and audio projects at mattosowski.com, that's M-A-T-T-O-S-O-W-S-K-I dot com. You can email me at ez@betteroffline.com or visit betteroffline.com to find more podcast links and, of course, my newsletter. I also really recommend you go to chat.wheresyoured.at to visit the Discord, and go to r/betteroffline to check out our Reddit. Thank you so much for listening.
Cal Newport
Better Offline is a production of Cool Zone Media. For more from Cool Zone Media, visit coolzonemedia.com or check us out on the iHeartRadio app, Apple Podcasts, or wherever you get your podcasts.
Oppenheimer Ad Voice
Life's better with Drive My Way from American Family Insurance because we know it's more than just a car. It's your escape pod, your adventure mobile, your memory maker. And we help protect the dreams that drive you. Personalize what you pay for auto insurance and you could save between 10 and 35%. American Family Insurance Get a quote and find an agent@amfam.com Products, pricing and availability
Cal Newport
vary based on the way you purchase insurance and by state. Unsafe driving behaviors may increase your rate. American Family Mutual Insurance Company SI and its operating companies 6000American Parkway, Madison, Wisconsin
Bobcat Ad Voice
this episode is brought to you by Bobcat. They started the compact equipment industry through grit, determination and a whole lot of think we can't do that. Watch us. They set standards, broke records, empowered people to build bigger and higher, to dig deeper, to make the impossible possible. We've all been there with doubters telling us what we can't do. Who cares what they think? We don't need their permission or forgiveness. We just get things done. So go ahead and doubt me. Judge me, challenge me. But when the time comes, watch me.
Sophia Donner
Bobcat this is Sophia Donner from OK Storytime this Summer. Find your next obsession on Prime Video and listen. We're not saying you need another obsession, but there could be a lot worse ones. Steamy romance, addictive love stories, and the book to screen favorites you've already read twice, so why not watch them a third time off campus? L the Love Hypothesis and more Slow Burns Second Chances Chemistry you can feel through the screen and it makes you wish you were actually in that movie. We've got binge worthy series can't miss movies. Perfect for when you're ignoring your own problems or procrastinating as one does. Your next obsession is waiting. Watch only on Prime.
Martha Stewart
This is Martha Stewart from the Martha Stewart Podcast. Ever wonder how to make hosting look effortless? Here's a secret Getting ahead of the message with new Reynolds Kitchens Countertop Prep Paper Just lightly wet the counter beforehand so the paper grips and stays in place. Then lay down the Reynolds Kitchens countertop prep paper so drips and spills stay on the paper, not all over your kitchen counter. You can roll out dough, prep a party spread, or cook alongside family. When you're done, cleanup is as simple as lifting the paper and revealing that clean counter underneath. Effortless. You can use it for cooking and baking, prep and even crafting, especially when you need extra working space. Because when the mess is already handled, you can focus on what matters the food, the people, and the moment. It may look effortless, but now you know it's Reynolds Kitchens Countertop prep paper. Take a tip from me. Wet it, set it, prep it. Done. Make it easy. Make it with Reynolds Kitchens Countertop prep paper. Available now in the Reynolds Wrap aisle in Walmart, Target, Amazon and Costco.
Podcast: Better Offline – Cool Zone Media and iHeartPodcasts
Date: April 22, 2026
Host: Ed Zitron
Guest: Cal Newport, Professor of Computer Science and author
This episode of Better Offline features tech commentator and computer science professor Cal Newport in a probing discussion with host Ed Zitron. Together, they dig into how the tech industry perpetuates myths around artificial intelligence (AI)—especially "Mythos," a newly touted model—along with the ongoing trend of anthropomorphizing chatbots, the "agentic web," and the ways industry, media, and the public are manipulated by marketing, hype, and speculative tech narratives. They critique the "directionally true" but misleading reporting in tech media and question the actual substance (and ethics) behind breathless AI advancement claims.
“There’s a cost to stressing the hell out of people. I’m getting letters... ‘I feel like I’m trapped in a cage just being hit with wave after wave of stress...’”
— Cal Newport (05:20)
“If you really thought that... superintelligence was going to emerge suddenly and be a threat to human existence, you wouldn’t just write... too cool for school head shaking resignation... You'd be... where are the John Connors?”
— Cal Newport (07:40)
“It is terrifying people... stories like this are what made a mentally unstable person throw a Molotov cocktail at Sam Altman’s house. Like, it’s obvious these people were scared of the AI doom.”
— Ed Zitron (15:39)
“There’s something odd about that anthropomorphized conversational interface. I guess we saw a lot of Star Trek growing up and that’s what we think the future is supposed to be like. But it has all sorts of problems.”
— Cal Newport (16:39)
“All you were really gaining was some sort of cyber-libertarian philosophical feel-goods about, like, yes, but this was purely decentralized... This is what like early agents...”
— Cal Newport (25:44)
“…That’s not what’s going to unlock a lot more research... I would write more papers if it wasn’t for how long it took me to draw a graph.”
— Cal Newport (36:10)
“My productivity problem is that the UI and UX and everything sucks. Everything’s disjointed.”
— Ed Zitron (36:58)
“I think that was highly credulous coverage of what almost certainly is just like a standard, slight jagged move forward...”
— Cal Newport (46:43)
“You just put a lot of money into a new model and the best thing you could find to emphasize was it's good at finding bugs. I think that is a problem.”
— Cal Newport (53:45)
This episode offers a clear-eyed, pointed, and at times wryly funny critique of the hype cycles plaguing AI media, industry, and the public at large. Newport and Zitron challenge both the factual basis and ethical implications of how AI developments (like Mythos or “agents”) are discussed, ultimately calling for more skepticism and rational—less resigned or panic-driven—dialogue. Their message: The reality of AI progress is more incremental and less existential than advertised, and both the tech industry and journalism bear responsibility for separating myth from fact.