Podcast Summary: The Startup Ideas Podcast
Episode: Claude Opus 4.6 vs GPT-5.3 Codex: Live Build, Clear Winner
Host: Greg Isenberg
Guest: Morgan Linton
Date: February 6, 2026
Overview
In this episode, Greg Isenberg and his guest, engineer and AI entrepreneur Morgan Linton, dive deep into the freshly launched Claude Opus 4.6 (Anthropic) and GPT-5.3 Codex (OpenAI). They compare the two new large language models head-to-head by doing a live, real-world coding test: rebuilding the core of Polymarket, a real multi-billion-dollar prediction market app. The episode is a hands-on, practical exploration for developers and tech enthusiasts, focusing on model strengths, configuration tips, live coding demos, and actionable advice.
Key Discussion Points & Insights
1. Setting the Stage: Two Model Releases, Same Day
- Both Opus 4.6 and GPT-5.3 Codex were released within 20 minutes of each other, prompting immediate and intense comparison across the developer community.
- Morgan (01:38): “By the end of this, you're going to know…how to make sure you are running Opus 4.6 and all of the little details you can change in the settings JSON file to use some of the cool features in Opus 4.6, especially agent teams, which is probably the feature I’m the most excited about.”
2. Housekeeping: Ensuring You’re Really Running Opus 4.6
- Many users mistakenly run old versions, missing new features.
- Key configuration steps:
- Update via npm (npm update).
- Check the version number (it should be 2.1.32 or above).
- In settings, set “model” to “Claude Opus 4.6”.
- To unlock ‘agent teams’ (multi-agent orchestration) in Opus 4.6, you must enable it by setting the environment variable CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in your settings.
- “The coolest feature they added with 4.6 is agent teams. I’m super excited to demo that with you. You have to make sure to turn that on because it is an experimental feature.” (Morgan, 04:45)
- New “adaptive thinking” for Opus 4.6’s API: set “effort” to max level for unconstrained thinking—only on 4.6.
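The configuration steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the flag name CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS is reconstructed from the spoken description in the episode, and the "model" value is copied from the summary, so the exact identifier your installation expects may differ.

```python
import json

MIN_VERSION = (2, 1, 32)  # minimum version quoted in the episode summary


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '2.1.32' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_new_enough(version: str) -> bool:
    """Compare numerically, so '2.1.100' counts as newer than '2.1.32'."""
    return parse_version(version) >= MIN_VERSION


def with_agent_teams(settings: dict) -> dict:
    """Return a copy of a settings dict with the model and experimental flag set."""
    updated = dict(settings)
    # Value as quoted in the summary; the real model identifier may differ.
    updated["model"] = "Claude Opus 4.6"
    # Copy the env block so existing variables are preserved.
    env = dict(updated.get("env", {}))
    # Assumed flag name, reconstructed from the episode's description.
    env["CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"] = "1"
    updated["env"] = env
    return updated


if __name__ == "__main__":
    print(is_new_enough("2.1.32"))
    print(json.dumps(with_agent_teams({"env": {"EXISTING": "x"}}), indent=2))
```

Comparing versions as integer tuples rather than strings avoids the common mistake where "2.1.100" sorts before "2.1.32".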
3. Philosophical Differences: Pair Programmer versus Autonomous Agent
- Morgan shares a Hacker News take:
- GPT-5.3 Codex is an interactive collaborator—think “pair programmer” where you steer mid-execution.
- Opus 4.6 is a delegator—launch agent teams that work autonomously.
- “Some want tight human-in-the-loop control. Others want to delegate whole chunks of work and review the result… Codex really is your collaborator. …Opus 4.6: probably the best of the best now being able to say I want to spin up three or four agents, I want them to go do stuff. Hey, don't bug me.” (Morgan, 08:22)
- No absolute “winner”; choice depends on your working style and needs.
4. Detailed Feature Comparison: Context, Benchmarks, and Behaviors
- Opus 4.6
- Giant 1M token context window—great for understanding large, complex codebases; plans deeply.
- Excellent for codebase comprehension, refactoring, and architectural sensitivity.
- Multi-agent orchestration (“agent teams”) is the killer feature.
- May “overthink” or hesitate on ambiguous requirements.
- GPT-5.3 Codex
- ~200k context window (not the focus).
- Fast, interactive coding; optimized for “progressive execution.”
- Wins on most coding benchmarks (SWE-Bench Pro, Terminal-Bench).
- More “founding engineer” than “staff reviewer.”
- Relies on task-driven autonomy, human-in-the-loop mid-execution steering.
- Can be overconfident, but lets you course-correct easily mid-execution.
- “Claude's really asking, like, should we do this? GPT 5.3 is like, how fast can I ship this right?” (Morgan, 14:40)
5. Live Build Showdown: Polymarket Clone Challenge
Setup
- Both models tasked with building a Polymarket competitor app, each given similar prompts (adjusted for each model’s strengths).
- Testing not only code and output, but also usability, UX, and development workflows.
The Codex (GPT-5.3) Build
- Completed in 3 minutes, 47 seconds.
- Generated:
- Core LMSR market maker engine
- REST API
- Responsive front-end
- Test suite (10 tests, all passed)
- Ran locally: allowed Greg to create a trader, set a crypto market, execute trades.
- Design: functional but minimal; “not that different” and not visually impressive.
- “Codex built a competitor to Polymarket in 3 minutes and 47 seconds.” (Morgan, 22:13)
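For readers unfamiliar with the "LMSR market maker engine" mentioned above, here is a minimal sketch of the standard logarithmic market scoring rule. It is not the code Codex generated in the episode; the function names and the liquidity parameter value are assumptions chosen for illustration. The cost function is C(q) = b * ln(sum_i exp(q_i / b)), the price of an outcome is its softmax weight, and a trade costs the difference in C before and after.

```python
import math


def lmsr_cost(quantities: list[float], b: float) -> float:
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b))."""
    scaled = [q / b for q in quantities]
    peak = max(scaled)  # subtract the max before exponentiating for stability
    return b * (peak + math.log(sum(math.exp(s - peak) for s in scaled)))


def lmsr_prices(quantities: list[float], b: float) -> list[float]:
    """Instantaneous prices: a softmax over q_i / b, so they sum to 1."""
    scaled = [q / b for q in quantities]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def trade_cost(quantities: list[float], outcome: int, shares: float, b: float) -> float:
    """What a trader pays to buy `shares` of `outcome`: C(q_after) - C(q_before)."""
    after = list(quantities)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(quantities, b)


if __name__ == "__main__":
    b = 100.0            # hypothetical liquidity parameter
    market = [0.0, 0.0]  # outstanding shares for YES and NO
    print(lmsr_prices(market, b))
    print(trade_cost(market, 0, 10, b))
```

With an empty two-outcome market and b = 100, both prices start at 0.5 and buying 10 YES shares costs roughly 5.12. A larger b makes prices move less per share traded, at the cost of a larger worst-case loss for the market maker.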
The Opus 4.6 Build
- Utilized multi-agent orchestration:
- Technical architect, prediction market expert, UX designer, QA/tester work in parallel.
- Consumed ~150–250k tokens vs. Codex’s ~30–40k.
- More comprehensive output:
- 96 tests
- Modular monolith NextJS app
- Multi-page UI (leaderboard, sports, crypto, portfolio)
- Realistic prediction markets, hover states in UI, dark mode, color-coded trades.
- Far superior UI/UX:
- “This looks really clean. What happens when you hover over?” (Greg, 42:07)
- More “creative leap”: UI, content, seed markets, storytelling aspects.
- “I wasn’t expecting to click in and actually get a well-designed page like this. …this is pretty wild.” (Morgan, 43:05)
Iteration and Prompting
- Attempted to push Codex for more imaginative design, invoking “Jack Dorsey” as inspiration.
- Codex could be steered in real-time during execution—but responses to high-level creative requests were still incremental, not transformational.
- Opus’s multi-agent process appeared slower and more expensive, but the result was more robust and better designed.
6. Cost, Token Consumption, and Real-World Implications
- Opus's agent system dramatically increases token usage—which may be intentional (drives usage and subscription value).
- “I can tell you I’ve never used so many tokens in one day as today.” (Morgan, 28:57)
- For the $200/month Claude Max plan, ~10 million Opus tokens are included (unofficial estimate).
- Building with agents eats into this quickly, but still “like the price of a cocktail in Miami.” (Greg, 31:00)
- Codex is much more efficient, blazing fast, and cheaper in this scenario.
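The figures quoted in the episode can be turned into rough per-build arithmetic. The sketch below uses only numbers from this summary (the $200/month plan, the unofficial ~10 million token estimate, and the per-build token ranges); the function names are hypothetical, and the result is a back-of-the-envelope estimate, not real pricing. Codex is billed under a different plan, so only the token ratio is compared for it.

```python
PLAN_PRICE_USD = 200
PLAN_TOKENS = 10_000_000                 # unofficial estimate from the episode
OPUS_BUILD_TOKENS = (150_000, 250_000)   # low and high estimate per build
CODEX_BUILD_TOKENS = (30_000, 40_000)    # low and high estimate per build


def build_cost_usd(tokens: int) -> float:
    """Share of the monthly plan price consumed by one build."""
    return round(tokens * PLAN_PRICE_USD / PLAN_TOKENS, 2)


def builds_per_month(tokens: int) -> int:
    """How many builds of this size fit into the plan's token estimate."""
    return PLAN_TOKENS // tokens


def token_ratio_range() -> tuple[float, float]:
    """Smallest and largest Opus-to-Codex token ratio implied by the ranges."""
    low = OPUS_BUILD_TOKENS[0] / CODEX_BUILD_TOKENS[1]
    high = OPUS_BUILD_TOKENS[1] / CODEX_BUILD_TOKENS[0]
    return (round(low, 2), round(high, 2))


if __name__ == "__main__":
    for tokens in OPUS_BUILD_TOKENS:
        print(tokens, build_cost_usd(tokens), builds_per_month(tokens))
    print(token_ratio_range())
```

Under these assumptions one agent-team build uses about $3 to $5 of the plan, which allows roughly 40 to 66 such builds per month, and Opus consumes somewhere between about 4 and 8 times as many tokens as Codex for the same task.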
7. Beginner-friendliness and Guidance
- For non-tech or “vibe coders,” Codex may feel easier—faster, more direct, less configuration.
- Opus 4.6, with agents, is powerful, but requires more setup—potentially a steeper learning curve, but more robust for teams and ambitious builds.
8. Final Assessment: Which Model Wins?
- “I’m not going to say which one is. It’s not that Opus is better than Codex or vice versa, but I would say in this test, Opus won.” (Morgan, 45:03)
- Codex: Faster, efficient, collaborative, great for rapid builds and iterations.
- Opus 4.6: Slower, more expensive in tokens, but more thoughtful output, better design, more comprehensive “team” approach.
Notable Quotes & Memorable Moments
- Morgan (08:22): “Codex really is your collaborator…Opus 4.6: probably the best of the best now being able to say I want to spin up three or four agents, I want them to go do stuff.”
- Greg (14:40): “It’s really, I mean it's so cool because it almost feels like they're different people, you know what I mean? Like they have different styles.”
- Morgan (22:13): “Codex built a competitor to Polymarket in 3 minutes and 47 seconds.”
- Morgan (28:57): “I can tell you I've never used so many tokens in one day as today. So it's working.”
- Greg (42:07): “This looks really clean. What happens when you hover over?”
- Morgan (43:05): “I wasn’t expecting to click in and actually get a well-designed page like this. Huh. So if I were to do that, I have to sign in a trade...It's clean. This is pretty neat.”
- Greg (45:16): “In this test, Opus won.”
- Morgan (48:26): “Let your teams loose with this stuff. Let them try it… Some of this stuff is really cutting edge and really performing and gives us the opportunity to do better, more creative work.”
Timestamps for Key Segments
- [01:38] — What listeners will gain; intro to agent teams
- [04:37] — Ensuring you're running the latest Opus 4.6; setting up agent teams
- [08:22] — Philosophical divergence: “collaborator” model vs. “delegator” model
- [13:07] — Model feature breakdown: context windows, benchmarks, agentic behavior, failure modes
- [15:41] — Live build challenge: setup and prompts
- [22:13] — Codex (GPT-5.3) completes build, demo, test run
- [31:00] — Token consumption, cost, and value discussion
- [41:54] — Reviewing the Opus 4.6 build: UI, features, agent output
- [45:03] — Declaring a (situational) winner
- [48:26] — Final advice, engineering team culture, and Morgan’s plug
Takeaways & Practical Tips
- For devs: Make sure you’re up-to-date and properly configured; enable Opus agent teams for the full feature set.
- For teams: Choose a model based on workflow preference; try both—Codex for speed/iteration, Opus for comprehensive, multi-faceted builds.
- For everyone: Push prompts creatively and iteratively. The tools are different—explore both to maximize what’s possible.
Where to Find More
- Morgan Linton: Co-founder & CTO, Bold Metrics; active on X (formerly Twitter)—talks a lot about “vibe coding.”
- Greg Isenberg: Host; find more startup ideas and resources at gregisenberg.com/30startupideas
Conclusion
This was a high-energy, informative, and practical head-to-head with real code, real configs, and honest technical storytelling. For anyone wanting to dive into or choose between the two hottest new AI developer models, this episode is a goldmine of actionable advice and authentic insight.
Like the show? Give it a like, comment, and subscribe. Then, stop listening—and start building!
