Podcast Summary: "Using Codex Upgrades for Better Coding"
Podcast: How I AI Stuff
Host: Jaden Schaefer
Air Date: April 17, 2026
Overview
In this episode, host Jaden Schaefer breaks down some of the most significant recent advances and stories in AI coding tools, including a head-to-head between OpenAI’s new Codex upgrades and Anthropic’s offerings, notable VC activity in enterprise AI, a technical explainer of “token maxing,” and the promise of the latest robotics foundation models. The tone is enthusiastic, practical, and candid, with a focus on how these updates impact real-world workflows for developers, startups, and enterprises.
Key Discussion Points & Insights
1. Enterprise AI Coding and VC Activity
Timestamp: 04:22
- Spotlight on Factory: Factory, an AI coding startup focused specifically on enterprise engineering teams, just raised a $150M Series A at a $1.5B valuation.
- Key Differentiator: Model flexibility; users can switch between different coding models (Claude, DeepSeek, etc.).
- Insight: “Even with Anthropic and OpenAI and Cursor already in the market, enterprise AI coding still has room for some category specific players. Morgan Stanley isn’t going to let some, you know, random developer tool run inside their network unless it’s built with their compliance and security posture in mind. And I think that’s basically the gap that Factory is filling.” (Jaden, 06:21)
- Notable customers: Morgan Stanley, Ernst & Young, and Palo Alto Networks.
- VCs flocking: Led by Khosla Ventures, with Sequoia, Insight Partners, and Blackstone also participating.
- Trend: The enterprise segment is seeking highly customizable, secure, and compliant AI coding tools, making room for niche players despite competition from Big Tech.
2. Anthropic’s Launch: Claude Design
Timestamp: 08:09
- Claude Design: New design tool for Pro, Max, Team, and Enterprise plans, powered by Claude Opus 4.7. Users describe what they want (pitch decks, mockups, landing pages), and Claude generates an editable first draft.
- Export/Integration: Outputs can be shared as PDFs, PPTX files, or URLs, or sent directly to Canva. It also reads company code and design files to maintain brand consistency.
- Key Audience: Non-designers: founders, PMs, and operators who need decent-looking outputs quickly.
- Strategic Play: “Anthropic is continuing to move up the stack...they’re not just trying to be an API company, they want to actually own actual workflows and surface area. It’s the same play that OpenAI has been making.” (Jaden, 10:55)
- Quote on competitors: “Google has Stitch, which is a very similar design tool as well...we’re going to see a lot of these players get more into the software itself beyond just the models, which is pretty interesting.” (Jaden, 11:28)
3. Token Maxing: Productivity or Vanity?
Timestamp: 12:40
- Definition: “Token maxing” is when companies boast about how many tokens their AI tools consume, equating higher usage with greater productivity.
- Reality Check: AI-generated code is often accepted at first (80-90%), but reviews two weeks later show that only 10-30% remains unchanged; engineers are constantly rewriting it.
- Data Points:
  - AI users see 9.4x higher code churn than non-AI users (GitClear).
  - Code churn increased 861% with high AI adoption (Faros AI).
  - Higher token budgets doubled throughput, but at 10x the token cost (Jellyfish, study of 7,500 engineers).
- Insight: “The productivity gains from AI coding are real, but they’re also a fraction of what the raw output numbers suggest...If you look at it three weeks later, a big chunk of it has to be rewritten or fixed, which is fine. I mean, a normal developer writes code and...fixes it.” (Jaden, 14:29)
- Seniority Lens: Senior engineers are more critical and less likely to accept AI-generated code.
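The arithmetic behind these figures can be sketched in a few lines. This is a rough illustration only: the function names are hypothetical, and it assumes the quoted multipliers and rates can be compared directly, which the episode does not spell out.

```python
def cost_per_unit_output(throughput_multiplier: float, token_cost_multiplier: float) -> float:
    """Relative token cost per unit of delivered work, versus a baseline of 1.0."""
    if throughput_multiplier <= 0:
        raise ValueError("throughput_multiplier must be positive")
    return token_cost_multiplier / throughput_multiplier


def overstatement_factor(accepted_rate: float, unchanged_rate: float) -> float:
    """How many times larger the initial acceptance rate is than the share
    of code still unchanged at a later review (assumes both rates are
    fractions of the same body of AI-generated code)."""
    if unchanged_rate <= 0:
        raise ValueError("unchanged_rate must be positive")
    return accepted_rate / unchanged_rate


# Jellyfish figure from the episode: 2x throughput at 10x token cost.
relative_cost = cost_per_unit_output(throughput_multiplier=2.0, token_cost_multiplier=10.0)
print(f"Token cost per unit of output: {relative_cost:.1f}x baseline")

# Acceptance (80-90%) versus code unchanged two weeks later (10-30%).
best_case = overstatement_factor(accepted_rate=0.80, unchanged_rate=0.30)
worst_case = overstatement_factor(accepted_rate=0.90, unchanged_rate=0.10)
print(f"Acceptance overstates retained code by {best_case:.1f}x to {worst_case:.1f}x")
```

Under these assumptions, doubling throughput at ten times the token spend means each unit of output costs about five times as much in tokens, which is why raw token counts are a weak proxy for productivity.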
4. Physical Intelligence & Robotics Foundation Models
Timestamp: 17:00
- PI 0.7 by Physical Intelligence: A robotics model capable of novel tasks by combining previously learned skills, even with minimal exposure.
- Demo Example: An air fryer the robot had only seen briefly was successfully operated using step-by-step verbal instructions.
- Generalist Model Success: “In broader testing, the generalist model actually matched specialized models on jobs like making coffee, folding laundry, and assembling boxes.”
- Valuation & VC Interest: Over $1B raised; currently valued at $5.6B and reportedly aiming for $11B soon; co-founder Lachy Groom is a proven backer (Figma, Notion, Ramp).
- Limitations: “PI 0.7 still can’t handle a lot of multi-step tasks.... The robotics field doesn’t really have a lot of clean benchmarks like LLMs do.” (Jaden, 20:35)
- Broader Implication: Generalized models are getting close to handling real-world, messy environments, not just single tasks.
5. Major OpenAI Codex Upgrades — Desktop Battle Begins
Timestamp: 22:12
- Codex Desktop App:
  - Background operation: Now runs in the background on Mac, opening apps and performing actions without interrupting the user.
  - Parallel Agents: Can run multiple agents handling different tasks at once (bug fix, testing, docs, etc.).
  - In-app browser: Can interact with web apps directly.
  - Plugin Ecosystem: 111+ integrations (e.g., CodeRabbit, GitHub/GitLab, Google tools).
  - Memory: Can remember previous sessions.
  - Image Generation: Built-in image generation (unlike Claude Code).
  - Flexible Pricing: Pay-as-you-go for business and enterprise customers.
- Comparison to Anthropic's Claude: “Honestly has been really crushing it...OpenAI is basically swinging directly at Anthropic’s Claude Code.” (Jaden, 22:45)
- Quote on background automation:
  - “It is annoying with Claude Cowork that lots of times when those automated tasks start happening, all of a sudden this Chrome browser pops up on my screen...I'm like swatting flies, like trying to get this thing away while I keep working on something different...OpenAI is trying to combat that and have it work on things in the background.” (Jaden, 23:51)
- Quote on plugin advantage:
  - “The plugin ecosystem is probably part of one of the most underrated pieces of this entire announcement because they have like 111 different plugins at launch…with Claude Cowork...there’s so many different tools I use that don’t integrate very well with it. I think a lot of these integrations that OpenAI is pulling in are going to be very useful.” (Jaden, 25:47)
Notable Quotes & Memorable Moments
- On enterprise coding needs:
  - “Morgan Stanley isn’t going to let some, you know, random developer tool run inside their network unless it’s built with their compliance and security posture in mind.” (06:19)
- On AI productivity stats:
  - “AI users have 9.4 times higher code churn than non-AI users...” (13:34)
- On reality vs. hype:
  - “The productivity gains from AI coding are real, but they’re also a fraction of what the raw output numbers suggest.” (14:33)
- On robotics milestones:
  - “This kind of generalized behavior is...pretty significantly stepping us towards robots that actually work in really messy real world environments.” (20:56)
- On Codex background automation:
  - “So it’s not just writing code in an editor, it’s actually operating your entire machine. So this is what I’m excited about.” (23:39)
- On the plugin ecosystem:
  - “The plugin ecosystem is...one of the most underrated pieces of this entire announcement...” (25:50)
Important Segments with Timestamps
- VC Funding for Factory & Enterprise AI: 04:22 – 08:09
- Claude Design Preview and Integration: 08:09 – 12:40
- Token Maxing Deep Dive: 12:40 – 17:00
- Physical Intelligence/PI 0.7 Robotics Discussion: 17:00 – 22:12
- OpenAI Codex Upgrades and Desktop Control: 22:12 – 27:00
Tone & Final Thoughts
Jaden’s discussion is practical and sometimes skeptical (especially regarding hype around AI code productivity), but clearly enthusiastic about the transformative impact of these tools. He highlights the practical friction points and real-life workflow improvements these new tools bring, while underscoring the importance of actually measuring successful outcomes, not just AI-generated output.
Useful for: Anyone in software, product management, enterprise engineering, or those tracking the competitive landscape in AI tooling and automation.
Skipped: Ads/promos, intros, outros, and podcast reviews.
