Better Offline – Exclusive: How GPT-5 Actually Works

Host: Ed Zitron
Podcast Network: Cool Zone Media and iHeartPodcasts
Episode Date: August 22, 2025

Episode Overview

In this candid and incisive episode of Better Offline, Ed Zitron tears through the hype and confusion surrounding OpenAI's launch of GPT-5. With a blend of dry British humor and sharp analysis, Ed exposes the mishaps, technical inefficiencies, and marketing misdirection that define the rollout of GPT-5 and its implementation in ChatGPT. The episode unpacks how user experience, cost, and performance have been affected by OpenAI's architectural changes, especially the introduction of a complex routing system, and questions the sustainability and legacy of the current AI arms race.

Key Discussion Points and Insights

1. The Hype and Reality of GPT-5's Launch

Ed compares the anticipation around GPT-5 to "the software equivalent of the launch of St. Anger," highlighting the industry's mixed reviews and confusion.
- Quote: "Every time Lars Ulrich hit the snare drum, it cost them $55,000." (01:48)
Tech media and YouTubers were either underwhelmed or suspicious, with notable voices like Simon Willison and SemiAnalysis making dubious or poorly substantiated claims.
The so-called “router” at the heart of GPT-5 quickly becomes the focal point of confusion, especially regarding its role in costs, performance, and ad monetization.

2. Influencer Misdirection and OpenAI’s Demo Tactics

Influencer Theo Brown publishes a glowing video from inside OpenAI, calling it a "Oh fuck moment" and warning, “Keep an eye on your job because I don't know what this means for us long term.” (05:27)
Viewer skepticism abounds: "If OpenAI is holding you hostage, blink twice," among other snarky comments. (05:31)
Theo later reverses his stance in a follow-up video, admitting his demo experience in OpenAI's offices was not representative of the public model:
- Quote: “...the experience that you probably are having with ChatGPT and GPT5 right now is not the experience that I had when I was first testing it.” (06:29)
Ed asserts that OpenAI misled testers by showing enhanced demo models not indicative of general use—setting up influencers for hype, then providing consumers with something notably worse.

3. User Backlash and the "Fandom" Problem

The ChatGPT subreddit flooded with complaints about GPT-5's performance; users angry about the deprecation and reinstatement of previous models like 4o (08:16).
Ed posits that OpenAI's user base acts more like a fandom than a typical product market, causing volatile swings in loyalty and perception.
The narrative shifts from product excitement to a sense of collective disillusionment, even among AI's biggest boosters.

4. How GPT-5/ChatGPT 5 Actually Works: The Router Mess

Ed dives into the router's mechanics, emphasizing that instead of making ChatGPT more efficient, the router architecture makes queries more expensive and unwieldy:
- In some cases, ChatGPT5 can burn up to double the tokens per query compared to prior models, substantially driving up costs. (13:28)
- The new routing system means every interaction can trigger a different set of models/tools, each needing freshly generated static prompts (instructions), wiping out previous efficiencies from caching.
Tokenization Explained:
- Input tokens = user query; output tokens = response. More tokens = higher cost.
Static prompts — invisible instructions that guide the models — must be regenerated for every interaction, eliminating the efficiencies of caching used in prior versions.
The router (potentially an LLM itself) decides which model or tool (text, code, image, reasoning, etc.) to use for each prompt, ramping up latency and compute needs.

Memorable Analogy:

Quote: "It's like you started a job and every time you do a task... your workplace requires you to complete the entire mandatory onboarding training first. Want to edit a spreadsheet? Not before you brush up on your anti-bribery legislation first, you prick." (16:42)

5. Architectural Breakdown

Router and Splitter Models:
- When a user prompts ChatGPT, the query is first processed by a "splitter leg," which decides if it’s a simple (fast path) or complex (thinking path) request.
- The router/splitter model(s) are likely separate large language models tasked with orchestration, unable to leverage previous caches, and adding substantial overhead.
In practice, a single user request might get split, with different parts handled by different models, each receiving their own static prompts.
The cumulative effect: endless repetition of instructions, expensive queries, and complex decision-making layers that don't seem to produce meaningfully better results.

6. Illustration: Old vs. New Model Handling

Under GPT-4O: One model processes everything; static prompt can be reused; multi-modal tasks (e.g., analyzing a chart and generating an image) are streamlined.
Under GPT-5: The router splits the prompt:
- Vision model → analyzes chart (needs unique static prompt)
- Reasoning model → decides best player (needs another static prompt)
- Generation model → creates image (requires yet another static prompt)
- Conclusion: "GPT5 is a rat king of OpenAI's models and tools that gets reborn every single time you ask it to do anything." (24:00)

7. OpenAI’s (Lack of) Product Strategy and the Billion-Dollar Burn

OpenAI’s public messaging about GPT-5 sidesteps developers and the API, a notable shift from prior releases which emphasized API access and partner engagement.
Sam Altman’s willingness to burn billions of dollars on a still-unprofitable product is characterized as "phenomenally silly and desperate":
- Quote: "OpenAI, a company that has already incinerated upwards of $15 billion in the last two years, has chosen to create a less efficient way of doing business as a means of eking out modest at best performance improvements. It just sucks." (30:45)
Ed questions the sustainability of this approach, contrasting it with the individual accountability and scrutiny regular people face for financial recklessness.
Quote: “Companies like OpenAI live by a different set of standards. Sam Altman intends to lose more than $44 billion by the end of 2028...” (31:35)

8. Existential Questions: Why Are We Doing This?

Ed lambasts the entire exercise of spending billions with no clear return or innovation. The model's sophistication is both questionable and self-defeating.
Quote: “This is the best. They've got a large language model that chooses which large language model will answer your question. Gee fucking whiz, Sam Altman sounds dandy. And how much better is this, you say? Oh, you can't really say. Fucking brilliant.” (34:10)
The episode closes on a bleak, sardonic note: "None of this is going to end well, and not even the boosters seem to be having fun anymore. Even Sam Altman seems tired of it all. I know I bloody well am." (36:10)

Notable Quotes & Timestamps

"Every time Lars Ulrich hit the snare drum, it cost them $55,000."
— Ed Zitron (01:48)
"I didn't know it could get this good. This was kind of the like, oh fuck moment for me in a lot of ways."
— Theo Brown (05:06)
"Keep an eye on your job because I don't know what this means for us long term."
— Theo Brown (05:27)
"If OpenAI is holding you hostage, blink twice."
— YouTube commenter on Theo Brown (05:31)
“…the experience that you probably are having with ChatGPT and GPT5 right now is not the experience that I had when I was first testing it.”
— Theo Brown (06:29)
"It's like you started a job and every time you do a task... your workplace requires you to complete the entire mandatory onboarding training first. Want to edit a spreadsheet? Not before you brush up on your anti-bribery legislation first, you prick."
— Ed Zitron (16:42)
"GPT5 is a rat king of OpenAI's models and tools that gets reborn every single time you ask it to do anything."
— Ed Zitron (24:00)
"OpenAI, a company that has already incinerated upwards of $15 billion in the last two years, has chosen to create a less efficient way of doing business as a means of eking out modest at best performance improvements. It just sucks."
— Ed Zitron (30:45)
"Companies like OpenAI live by a different set of standards. Sam Altman intends to lose more than $44 billion by the end of 2028..."
— Ed Zitron (31:35)
"This is the best. They've got a large language model that chooses which large language model will answer your question. Gee fucking whiz, Sam Altman sounds dandy. And how much better is this, you say? Oh, you can't really say. Fucking brilliant."
— Ed Zitron (34:10)
"None of this is going to end well, and not even the boosters seem to be having fun anymore. Even Sam Altman seems tired of it all. I know I bloody well am."
— Ed Zitron (36:10)

Structure/Timestamps for Important Segments

| Timestamp | Content/Topic | |-------------|---------------------------------------------------------| | 01:48–03:00 | Ed introduces the episode and the GPT-5 "router" hype | | 05:06–06:39 | Theo Brown's initial reaction and community response | | 06:39–08:16 | Theo Brown's reversal and suspicion about OpenAI demos | | 13:24–19:40 | Deep technical explanation of GPT-5 routing flaws | | 24:13–29:30 | Detailed router/splitter model description, inefficiency| | 29:30–34:10 | Systemic waste, OpenAI’s billions lost | | 34:10–36:10 | Ed’s closing existential question and summary |

Episode Tone and Style

Sharp, sardonic, irreverent: Ed doesn't pull punches and skewers both the technical and cultural failings of OpenAI.
Technical, yet accessible: Though he dives deep into how the models and routing work, Ed makes frequent analogies and asides ("If you do want to correct me, please don't") to keep it light and understandable.
Candid and skeptical: The tone throughout is one of deep skepticism about both technical claims and business strategy.

Concluding Thoughts

Ed Zitron’s analysis points to a fundamental contradiction at the heart of GPT-5’s launch: despite enormous investment and an avalanche of hype, OpenAI’s new architecture is likely less efficient, more convoluted, and has left users and customers even more dissatisfied. The layering of large language model upon language model, exemplified by the so-called "router," has resulted in rising costs, increased complexity, and little measurable gain — suggesting that, for now, genuine AI progress might be more about marketing than meaningful advances.

Listening to this episode is a must for anyone interested in how AI hype and reality collide — and for those who want to understand the inner, messy logic behind today’s most talked-about tech launches.

Better Offline – Exclusive: How GPT-5 Actually Works

Host: Ed Zitron
Podcast Network: Cool Zone Media and iHeartPodcasts
Episode Date: August 22, 2025

Episode Overview

Key Discussion Points and Insights

1. The Hype and Reality of GPT-5's Launch

Ed compares the anticipation around GPT-5 to "the software equivalent of the launch of St. Anger," highlighting the industry's mixed reviews and confusion.
- Quote: "Every time Lars Ulrich hit the snare drum, it cost them $55,000." (01:48)
Tech media and YouTubers were either underwhelmed or suspicious, with notable voices like Simon Willison and SemiAnalysis making dubious or poorly substantiated claims.
The so-called “router” at the heart of GPT-5 quickly becomes the focal point of confusion, especially regarding its role in costs, performance, and ad monetization.

2. Influencer Misdirection and OpenAI’s Demo Tactics

Influencer Theo Brown publishes a glowing video from inside OpenAI, calling it a "Oh fuck moment" and warning, “Keep an eye on your job because I don't know what this means for us long term.” (05:27)
Viewer skepticism abounds: "If OpenAI is holding you hostage, blink twice," among other snarky comments. (05:31)
Theo later reverses his stance in a follow-up video, admitting his demo experience in OpenAI's offices was not representative of the public model:
- Quote: “...the experience that you probably are having with ChatGPT and GPT5 right now is not the experience that I had when I was first testing it.” (06:29)
Ed asserts that OpenAI misled testers by showing enhanced demo models not indicative of general use—setting up influencers for hype, then providing consumers with something notably worse.

3. User Backlash and the "Fandom" Problem

The ChatGPT subreddit flooded with complaints about GPT-5's performance; users angry about the deprecation and reinstatement of previous models like 4o (08:16).
Ed posits that OpenAI's user base acts more like a fandom than a typical product market, causing volatile swings in loyalty and perception.
The narrative shifts from product excitement to a sense of collective disillusionment, even among AI's biggest boosters.

4. How GPT-5/ChatGPT 5 Actually Works: The Router Mess

Ed dives into the router's mechanics, emphasizing that instead of making ChatGPT more efficient, the router architecture makes queries more expensive and unwieldy:
- In some cases, ChatGPT5 can burn up to double the tokens per query compared to prior models, substantially driving up costs. (13:28)
- The new routing system means every interaction can trigger a different set of models/tools, each needing freshly generated static prompts (instructions), wiping out previous efficiencies from caching.
Tokenization Explained:
- Input tokens = user query; output tokens = response. More tokens = higher cost.
Static prompts — invisible instructions that guide the models — must be regenerated for every interaction, eliminating the efficiencies of caching used in prior versions.
The router (potentially an LLM itself) decides which model or tool (text, code, image, reasoning, etc.) to use for each prompt, ramping up latency and compute needs.

Memorable Analogy:

Quote: "It's like you started a job and every time you do a task... your workplace requires you to complete the entire mandatory onboarding training first. Want to edit a spreadsheet? Not before you brush up on your anti-bribery legislation first, you prick." (16:42)

5. Architectural Breakdown

Router and Splitter Models:
- When a user prompts ChatGPT, the query is first processed by a "splitter leg," which decides if it’s a simple (fast path) or complex (thinking path) request.
- The router/splitter model(s) are likely separate large language models tasked with orchestration, unable to leverage previous caches, and adding substantial overhead.
In practice, a single user request might get split, with different parts handled by different models, each receiving their own static prompts.
The cumulative effect: endless repetition of instructions, expensive queries, and complex decision-making layers that don't seem to produce meaningfully better results.

6. Illustration: Old vs. New Model Handling

Under GPT-4O: One model processes everything; static prompt can be reused; multi-modal tasks (e.g., analyzing a chart and generating an image) are streamlined.
Under GPT-5: The router splits the prompt:
- Vision model → analyzes chart (needs unique static prompt)
- Reasoning model → decides best player (needs another static prompt)
- Generation model → creates image (requires yet another static prompt)
- Conclusion: "GPT5 is a rat king of OpenAI's models and tools that gets reborn every single time you ask it to do anything." (24:00)

7. OpenAI’s (Lack of) Product Strategy and the Billion-Dollar Burn

OpenAI’s public messaging about GPT-5 sidesteps developers and the API, a notable shift from prior releases which emphasized API access and partner engagement.
Sam Altman’s willingness to burn billions of dollars on a still-unprofitable product is characterized as "phenomenally silly and desperate":
- Quote: "OpenAI, a company that has already incinerated upwards of $15 billion in the last two years, has chosen to create a less efficient way of doing business as a means of eking out modest at best performance improvements. It just sucks." (30:45)
Ed questions the sustainability of this approach, contrasting it with the individual accountability and scrutiny regular people face for financial recklessness.
Quote: “Companies like OpenAI live by a different set of standards. Sam Altman intends to lose more than $44 billion by the end of 2028...” (31:35)

8. Existential Questions: Why Are We Doing This?

Ed lambasts the entire exercise of spending billions with no clear return or innovation. The model's sophistication is both questionable and self-defeating.
Quote: “This is the best. They've got a large language model that chooses which large language model will answer your question. Gee fucking whiz, Sam Altman sounds dandy. And how much better is this, you say? Oh, you can't really say. Fucking brilliant.” (34:10)
The episode closes on a bleak, sardonic note: "None of this is going to end well, and not even the boosters seem to be having fun anymore. Even Sam Altman seems tired of it all. I know I bloody well am." (36:10)

Notable Quotes & Timestamps

"Every time Lars Ulrich hit the snare drum, it cost them $55,000."
— Ed Zitron (01:48)
"I didn't know it could get this good. This was kind of the like, oh fuck moment for me in a lot of ways."
— Theo Brown (05:06)
"Keep an eye on your job because I don't know what this means for us long term."
— Theo Brown (05:27)
"If OpenAI is holding you hostage, blink twice."
— YouTube commenter on Theo Brown (05:31)
“…the experience that you probably are having with ChatGPT and GPT5 right now is not the experience that I had when I was first testing it.”
— Theo Brown (06:29)
"It's like you started a job and every time you do a task... your workplace requires you to complete the entire mandatory onboarding training first. Want to edit a spreadsheet? Not before you brush up on your anti-bribery legislation first, you prick."
— Ed Zitron (16:42)
"GPT5 is a rat king of OpenAI's models and tools that gets reborn every single time you ask it to do anything."
— Ed Zitron (24:00)
"OpenAI, a company that has already incinerated upwards of $15 billion in the last two years, has chosen to create a less efficient way of doing business as a means of eking out modest at best performance improvements. It just sucks."
— Ed Zitron (30:45)
"Companies like OpenAI live by a different set of standards. Sam Altman intends to lose more than $44 billion by the end of 2028..."
— Ed Zitron (31:35)
"This is the best. They've got a large language model that chooses which large language model will answer your question. Gee fucking whiz, Sam Altman sounds dandy. And how much better is this, you say? Oh, you can't really say. Fucking brilliant."
— Ed Zitron (34:10)
"None of this is going to end well, and not even the boosters seem to be having fun anymore. Even Sam Altman seems tired of it all. I know I bloody well am."
— Ed Zitron (36:10)

Structure/Timestamps for Important Segments

Episode Tone and Style

Sharp, sardonic, irreverent: Ed doesn't pull punches and skewers both the technical and cultural failings of OpenAI.
Technical, yet accessible: Though he dives deep into how the models and routing work, Ed makes frequent analogies and asides ("If you do want to correct me, please don't") to keep it light and understandable.
Candid and skeptical: The tone throughout is one of deep skepticism about both technical claims and business strategy.

wavePod

Exclusive: How GPT-5 Actually Works

Powered by Wave AI

Summary

Better Offline – Exclusive: How GPT-5 Actually Works

Episode Overview

Key Discussion Points and Insights

1. The Hype and Reality of GPT-5's Launch

2. Influencer Misdirection and OpenAI’s Demo Tactics

3. User Backlash and the "Fandom" Problem

4. How GPT-5/ChatGPT 5 Actually Works: The Router Mess

Memorable Analogy:

5. Architectural Breakdown

6. Illustration: Old vs. New Model Handling

7. OpenAI’s (Lack of) Product Strategy and the Billion-Dollar Burn

8. Existential Questions: Why Are We Doing This?

Notable Quotes & Timestamps

Structure/Timestamps for Important Segments

Episode Tone and Style

Concluding Thoughts

Summary

Better Offline – Exclusive: How GPT-5 Actually Works

Episode Overview

Key Discussion Points and Insights

1. The Hype and Reality of GPT-5's Launch

2. Influencer Misdirection and OpenAI’s Demo Tactics

3. User Backlash and the "Fandom" Problem

4. How GPT-5/ChatGPT 5 Actually Works: The Router Mess

Memorable Analogy:

5. Architectural Breakdown

6. Illustration: Old vs. New Model Handling

7. OpenAI’s (Lack of) Product Strategy and the Billion-Dollar Burn

8. Existential Questions: Why Are We Doing This?

Notable Quotes & Timestamps

Structure/Timestamps for Important Segments

Episode Tone and Style

Concluding Thoughts