‘A.I.-Washing’ Layoffs? + Why L.L.M.s Can’t Write Well + Tokenmaxxing

Hard Fork

Fri Mar 20 2026

Companies are using A.I. as a reason for layoffs, but the truth may be more complex.

Summary

Hard Fork Podcast Summary

Episode: ‘A.I.-Washing’ Layoffs? + Why L.L.M.s Can’t Write Well + Tokenmaxxing
Hosts: Kevin Roose (The New York Times), Casey Newton (Platformer)
Guest: Jasmine Sun (Freelance Journalist)
Date: March 20, 2026

Episode Theme

This episode explores seismic shifts in the tech industry brought on by AI, focusing on three intertwined topics: whether recent tech layoffs are truly caused by AI (“A.I.-washing”), why large language models (LLMs) seem to hit a ceiling on good writing, and the rise of token “leaderboards” that track who is spending the most on AI tools within tech companies. The hosts, joined by Jasmine Sun, dig deep into the reality behind the headlines and examine the human and economic stakes of AI’s next wave.

1. ‘A.I.-Washing’ and Tech Layoffs

[02:00–17:55]

Key Points

Recent Wave of Layoffs: Tech giants such as Atlassian, Block, and reportedly Meta are cutting large portions of their workforce and citing AI-driven efficiency as partial justification.
AI as an Excuse (‘AI-Washing’): The term "AI-washing" is discussed: Are companies using AI as a convenient cover for cost-cutting, or is AI truly displacing jobs?
- Quote (Casey, 06:29): “I thought it was when a software engineer finally took a shower.”
- Quote (Kevin, 06:34): “Basically the thesis is like, these aren't really layoffs about AI. This is just sort of a convenient excuse that these companies are using.”
Economic Drivers: Company narratives often align to please markets; referencing AI can boost stock prices, reminiscent of past crypto hype.
Shift in Spending: Layoffs are not always reducing costs—they’re often redirecting budget from people to AI infrastructure and data centers.
- Quote (Kevin, 11:40): “They are plowing this money that they are going to save by laying off these thousands of people into the building of data centers and other AI infrastructure.”
Worker Anxiety and Labor Power:
- Employees wonder if using AI makes them more valuable or simply easier to automate away.
- Hostile layoffs have made workers more compliant; unionization may become a more pressing topic, marking a major shift from earlier tech labor relations.
- Quote (Casey, 17:26): “I cannot think of anything that would make Mark Zuckerberg more mad than a union of software engineers at Meta.”

Notable Moments

Jokes about “AI washing” paralleling “Crypto washing” and superficial market tricks.
Direct acknowledgment of personal ties and biases, e.g. Casey’s fiancé working at Anthropic.

2. Why LLMs Can’t Write Well

Guest: Jasmine Sun, Author of “The Human Skill That Eludes AI”
[19:46–44:59]

Key Points

LLMs Still Struggle With Truly Good Writing: Despite advances, LLM outputs tend to be formulaic or lack genuine literary flair, especially in creative or lived-experience writing.
- Quote (Jasmine, 21:55): “Most writing period is very bad. And so I think that language models are definitely better at writing and language than most humans are. But... why can't they write at a sort of literary, creative fiction level?”
Comparison Across Model Eras:
- Jasmine found earlier models (e.g. GPT-2, GPT-3) sometimes surprisingly stylistic, even weird, compared to sanitized modern models like ChatGPT 5.4, which are “optimized for blandness.”
- Quote (Jasmine, 23:12): “I found so much more compelling than ChatGPT today. It doesn't have any of the annoying tics, it doesn't have the em dashes, the tripartite list.”
What Happened? RLHF and Post-Training:
- “Alignment” and reinforcement learning from human feedback (RLHF) have made LLMs safer and more useful but also boring.
- Human editors are often asked to judge “quality” using mismatched rubrics (e.g., limiting exclamation marks or grading fan fiction on factuality).
- Quote (Jasmine, 27:02): “The rubrics just didn't make any sense. He would be told things like, you have to grade them based on the number of exclamation marks… He got a bunch of fan fictions and he was supposed to grade them on their factuality.”
LLMs Lack Lived Experience:
- At their creative best, writing draws on specific, lived experience; LLMs only remix patterns from their data.
- Quote (Jasmine, 31:49): “They don't have lives. That means that all of the metaphors they choose, all of the words they choose, the examples they choose, they're just ungrounded.”
Objections and Debate:
- Would AI rapidly catch up with “true” writing if the incentive shifted?
- Is our resistance to AI-generated literature just bias—a “blind taste test” paradox?
- Quote (Kevin, 35:02): “Is it possible that the models have already become superhuman at writing, but the minute we learn that they are AI models generating text… we lose all interest in it just because of the source, not because of the quality?”
Human-AI Collaboration, Not Competition:
- Jasmine describes her own method for using Claude as a partner in editing—training its critique to her personal standards, not generic rubrics. This achieves better editorial guidance and self-improvement.
- Quote (Jasmine, 41:11): “I wanted to learn what do I aspire to be and where do I see myself falling short and where. What am I proud of, right?... we were able to co-develop a rubric of... qualitative criteria.”

Notable Moments

Recalling Sam Altman’s hedging: promising AI super-skills but conceding, “maybe in the future ChatGPT will be able to write, quote, a real poet's okay poem” (Jasmine, 21:55).
Discussion of the “centaur” model: best work comes from human + AI collaboration.
Jasmine’s transparency about her own attempts to automate herself—or not—and a candid, practical segment on how writers use AI to up their game.

3. Tokenmaxxing: The AI Leaderboard Craze

[47:21–62:53]

Key Points

Token Leaderboards:
- New trend at AI-forward tech companies: tracking and ranking employees by number of AI “tokens” they consume—essentially measuring the atomic units of AI-generated work.
- Quote (Kevin, 47:45): “It’s a token frenzy out there, and the employees of these companies are competing among their colleagues... They want to be the people at their company who are using the most AI tokens.”
Why Track Tokens?
- Seen as a proxy for adoption of new AI tools and productivity.
- Top users (“billion-token club”) spend staggering amounts on tokens—sometimes more than their salary, with free use for in-house employees.
- Quote (Kevin, 51:39): “The top user of Claude Code, the top individual user of Claude Code as measured by Anthropic, spent more than $150,000 on tokens last month.”
Bad Incentives & Goodhart’s Law:
- Creates perverse incentives to “waste” tokens or start side hustles rather than do truly valuable work.
- Quote (Casey, 53:51): “That just seems like it would create the worst incentives. Right. There's this idea of Goodhart's law, right. Like when a measure becomes a target, it ceases to be a good measure.”
Broader Implications & Transfer to Other Industries:
- Creating leaderboards for AI use in other domains (e.g., marketing) may incentivize performative rather than productive behavior.
- Managers need to focus on actual value, not just metrics.
- Job negotiations at AI labs now consider “token budgets” as a perk or necessity for power users.

Notable Quotes

Casey, 58:58: “I have been struck at how this idea of the token leaderboard just represents a new incarnation of something that the software industry has been trying to figure out for a long time, which is how can I figure out if my software engineers are productive?”
Kevin, 61:17: “There will be people who are tokenmaxxing who are way more productive than their colleagues and doing way more projects way more quickly. I think there will be other people whose managers look at their, like, token budgets and say, you spent this many tokens on what? And we'll have to have some hard conversations.”

Additional Notable Quotes and Timestamps

On ‘A.I.-Washing’
- Kevin, [09:57]: “There’s sort of this narrative power around AI where if you seem like a company that is investing heavily in the AI tools and the AI way of working, your investors say, oh, that company is really forward looking.”
- Casey, [10:08]: “It turns out that the public markets actually can just be tricked that easily.”
On Creative AI
- Jasmine, [31:49]: “...their writing has stakes. It comes from an emotional place. And the fact that LLMs... don't have lives. That means that all of the metaphors they choose... they're just ungrounded.”
On Editing with AI
- Jasmine, [41:11]: “We're co-developing these qualitative criteria and then I split it into phases...I put this all in a Claude project. I said, your job is to evaluate my drafts based on this criteria, but not to do the writing for me and to make sure to prompt out of me what I can do better.”

Episode Flow and Tone

Style: The episode blends journalistic curiosity, playful banter, and skepticism. Kevin and Casey riff on headlines and hold nuanced discussions—mixing skepticism, optimism, and humor.
Language: Conversational, at times irreverent ("Jay Z washing," "software engineer finally took a shower"), with sharp, accessible explanations of technical and economic concepts.

Conclusion

This episode is a deep dive into the murky realities behind AI’s market hype: major tech layoffs justified (maybe opportunistically) under the AI banner, the struggle of even “superhuman” LLMs to write with real voice and soul, and the wonky new culture of “tokenmaxxing” that’s reshaping incentives and office politics. The conversation with Jasmine Sun is particularly rich for anyone interested in how AI is (and isn’t) changing creative work.


Transcript

Casey Newton (0:29)

I just read the most heartwarming news this morning that I wanted to share with you, Kevin.

Kevin Roose (0:35)

What's that?

Casey Newton (0:35)

The UK government has withdrawn a proposal to let AI companies train on copyrighted works after a backlash from artists like Dua Lipa. Did you see this?

Kevin Roose (0:45)

No.

Casey Newton (0:45)

Dua Lipa said, don't start now with this AI. My sugar boo, she litigating, Kevin. She's making some new rules and she's saying we're not going to train on my copyrighted works. Wow. And that's why she is a queen. And so, Dua Lipa, if you're listening, we salute you.

Kevin Roose (1:06)

Yeah. Dua Lipa. You're a Dua Keepa, period.

Casey Newton (1:11)

Dua Lipa said artists' rights.

Kevin Roose (1:14)

Wow. I'm Kevin Roose, a tech columnist at the New York Times. I'm Casey Newton from Platformer, and this is Hard Fork.

Casey Newton (1:24)

This week, a big wave of tech layoffs is raising the question, has AI job loss truly begun? Then writer Jasmine Sun is here to help us answer the question, why are chatbots bad at writing? And finally, it's tokenmaxxing time. Why tech companies are building leaderboards to measure who is spending the most on AI.

Kevin Roose (1:54)

Well, Casey, for years now, we've been monitoring for signs of an AI job apocalypse.

Casey Newton (1:59)

Yeah, we've been monitoring the situation, it's true.

Jasmine Sun (37:12)

Some of it, but not all of it. I mean, so I talked to, for example, James Yu, who is the co-founder of Sudowrite, which is one of the earliest creative fiction AI writing assistants. I talked to some other folks who similarly were in the fiction writing LLM space and like you said, to an extent, a lot of writers are already using these. A lot are already leaning on LLMs to generate large amounts of text and it can be very successful and it can meet readers' needs and whatever. But like, even these people who I was talking to, they were describing to me how freaking hard it is to undo all of the post-training that the labs have done. So they are applying immense amounts of engineering effort that clearly, in my conversations with them, clearly frustrates them that it is so hard to get these models to stop being so chirpy, so sycophantic, so PG-13 and everything in order to get them to this sort of base model state where they're able to be weird again. So I think it's certainly possible, but I think the labs have made it quite challenging just because of the way that these models are trained.

The other thing that I think is important is I tend to think that writing and a lot of creative work is actually like the perfect use case for these centaur models, right? Like the idea that the human plus AI collaboration is where you can get the furthest. And when I listened to the interviews that you guys did about the fiction authors, I was thinking, this is a centaur model, right? Because without the human prompting and bullying the AI into getting weird and getting sensual and whatever, it was not going to do that on its own. And I myself, I do use LLMs as a research assistant. I wrote about that in the Atlantic piece, about the way that Claude has now sort of helped me edit my own work in a way that I found incredibly useful. But I do feel like the collaborative element is important for any domain where the personal perspective, lived experience, whatever, really matters.

Jasmine Sun (40:11)

I mean, these are very low quality bullet points. But I also gave it that because I wanted to learn my taste, I wanted to learn what do I aspire to be and where do I see myself falling short and where. What am I proud of, right? And so from those two things, plus a little bit more information about, like, here's my audience, this is my beat, this is my goals, we were able to co-develop a rubric. Instead of like, how many exclamation marks does it have? It would say things like, does this take advantage of your quote-unquote, like, insider anthropologist position in Silicon Valley? Because that's one of the things that Claude and I think distinguish my voice. Or it'll also notice like, oh, Jasmine, you tend to move between registers. You'll switch between startup jargon and Internet slang and whatever. And I think the fact that you can do the high-low or move from policy to personal scene, this is something that is characteristic of your writing.

And so again, we're co-developing these qualitative criteria and then I split it into phases of ideation phase rubric, structure rubric, prose rubric, final fact-checking. And so what I do now, I put this all in a Claude project. I said, your job is to evaluate my drafts based on this criteria, but not to do the writing for me and to make sure to prompt out of me what I can do better. I dumped a draft into Claude. Claude will run like phase two structure on it. It'll say things like, your conclusion is just a summary and this is really boring. In fact, in your piece about this and that, you actually ended on a scene. And I thought that was much more powerful. So why don't you try ending this one on a scene? And Claude will say, rather than inventing a scene, it will say, what were you thinking when the plane took off? What were you feeling inside? Can you think of a scenario where you had a conversation with, say, a kids' safety advocate about AI that really resonated with you? Because right now it sounds like a dry policy explainer.

And that feedback I actually found incredibly useful. Like, I'm still applying my own judgment to say, do I take it or not? But I'm like, you know, this is about me becoming the best version of myself as a writer. It's about, like, me self-improving and Claude pushing me to do that, which I found much, much more helpful.
