Building an AI Mathematician with Carina Hong - #754 - The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Summary

The TWIML AI Podcast #754: Building an AI Mathematician with Karina Hong (Axiom)
Date: November 4, 2025
Guest: Karina Hong, Founder and CEO at Axiom
Host: Sam Charrington

Episode Overview

In this episode, Sam Charrington explores the burgeoning intersection of artificial intelligence, mathematics, and programming languages with guest Karina Hong, founder and CEO of Axiom. Together, they discuss the challenges and opportunities in building an "AI mathematician" — a system capable of formal mathematical reasoning, conjecture, and discovery. Karina shares her personal journey from math Olympian to research mathematician, the current state of formal mathematics with languages like Lean, the role of code generation and reinforcement learning, data challenges, and the roadmap to making mathematics as programmable as code.

Key Discussion Points

1. Karina Hong’s Mathematical and AI Journey

Background and Motivation
- Karina describes her love for math since childhood, her experience in math Olympiads (02:20), and how research mathematics contrasts with problem-solving sprints (“a lot more pain and suffering… stuck on a problem for months” – 02:34).
- Quote:
  
  “Wouldn’t it be great if you can just let your intuitions take you?… Wouldn’t it be great to just think there’s a high chance if your intuition is right that the lemma is correct and move on…” (04:06, Karina)
Inspiration for AI Approaches
- Frustration with the verification grind inspires the vision to bring creative workflows from coding and AI art to mathematics.

2. Why Now? The Confluence of Math, AI, and Programming Languages

Three Key Trends Enabling Progress (05:51)
1. Advances in Large Language Models (LLMs):
  - LLMs have reached a point of strong mathematical and coding reasoning.
  - Quote:
    
    "The O series of OpenAI really marked an era of post-training, reasoning, reinforcement learning… great technique breakthoughs." (06:16, Karina)
2. Widespread Adoption of Formal Mathematical Languages (Lean):
  - Lean 4 has brought formalization to the mainstream, attracting top mathematicians.
  - “Lean makes this more like how computer programs work… if it’s not [correct], then it will report a bug, say at line 37…” (08:35, Karina)
3. Threshold Crossing in Code Generation (CodeGen):
  - CodeGen breakthroughs in reinforcement learning and code generation can underpin formal math reasoning at scale, despite the “ridiculous” data gap between code and formal mathematics (07:45).
Why Math Is Uniquely Unsolved
- Despite coding being heavily invested, “math has not been turned into programming language yet” (01:05, Karina; echoed at 47:50).

3. The Role of Formal Mathematical Languages (Lean)

How Lean Works and Its Role (08:19–13:39)
- Emulates the rigorous, step-wise nature of mathematical proofs.
- Acts as a programming language for mathematics with type theory foundations.
- Key advantage: Proofs in Lean, if correctly formalized, offer provable correctness.
- Quote:
  
  “If your statement is correctly formalized and your proof compiles, then congratulations, you have a correct proof.” (13:39, Karina)
Limitations and Challenges
- Lean proofs are only as good as correct formalizations of the original mathematical statement.
- Library coverage remains limited, especially in combinatorics, graph theory, or geometry (44:42).

4. Autoformalization and the Data Gap

What Is Autoformalization?
- The process of translating informal/natural language math proofs/statements into formal (Lean) equivalents (14:46).
- Two key motivators: closing data scarcity by turning informal data into formal data, and entangling proving/autoformalizing in a single loop for better systems.
The "Trillion vs. Million" Data Problem
- Trillions of code tokens online, but only about 10 million Lean tokens (07:45).
- Most formal math datasets skew to algebra/number theory; others lack representation.
Tackling Cold Start
- Reinforcement learning is crucial for bridging the cold start, “chicken and egg” problem (17:22).
- Quote:
  
  “You do have that sort of like cold start chicken and egg problem…” (17:22, Karina)
Current State and Barriers
- No robust, widely accepted benchmarks for autoformalization.
- Existing systems handle only trivial or short proofs, often with significant human intervention (41:17).

5. Reinforcement Learning and Curriculum Learning in Proofs

Role of RL in Formal Math
- RL delivers performance on benchmarks (e.g., Seed Prover and Aristotle models).
- With sparse feedback and the complexity of proofs, models often use curriculum learning: gradually increasing abstraction and creativity of problems (19:05–20:24).
Model Strategies
- Approaches differ: e.g., Seed Prover uses sophisticated system design and lammet pools, Aristotle uses Monte Carlo tree search (20:52).

6. Axiom’s Unique System Design and Research Focus

Prover-Conjecturer Self-Improving Systems
- Axiom bets on systems where provers and conjecturers interact in a feedback loop, enabling self-improvement and continual curriculum generation (21:55).
- Quote:
  
  “…form like a self improving loop as the prototype of self improving AI.” (22:13, Karina)
Synthetic Data and Benchmarking
- Heavy focus on autoformalization, synthetic data generation, and careful benchmarking (21:55).

7. Applications, Commercialization, and Impact

Commercial Potential Beyond the 'Geeky Subculture'
- While Lean and formal math communities may seem niche, history shows many 'subcultures' (like GPUs) became mainstream tech foundations (47:50).
- Key near-term applications: software/hardware verification, where formal guarantees are crucial (48:00–50:56).
- Quote:
  
  “…it is going to be a fundamental technology that for the first time you have provable guarantees of things in large language models.” (47:50, Karina)
- Productization & UI Uncertainty
  - Making formal reasoning UIs accessible is a design challenge—likely, the formal backend needs to be “mostly hidden” from end users (53:00).

8. Mathematical Discovery and Specialized Problem-Solving

Francois Charton’s Team & Pattern Boost
- Pattern Boost generates mathematical examples using transformers and perturbations, leading to new discoveries, including disproving century-old conjectures (34:38).
- Expanding methods to evolutionary algorithms and diffusion models (“diffuse boost”) (35:09).
Exploring Beyond Lean
- Some areas, like graph theory or Euclidean geometry, require new formal universes or approaches (43:22).

9. Entry Points and Community for Aspiring Researchers

Welcoming AI Researchers to AI for Math
- Many transferable skills for AI/CodeGen researchers.
- Recommended resources: survey papers (e.g., Formal Mathematical Reasoning; Next Frontier in AI) and the “AI for Math” GitHub repo (30:29).
- Vibrant, open community with conferences and collaborative initiatives.

Notable Quotes & Moments (with Timestamps)

On the challenge of math vs. coding:

“Math has not been turned into programming language yet and there will be new markets and new use cases that get unlocked because of this.” (01:05, Karina)
On autoformalization’s difficulty:

“Autoformalization… is surprisingly very hard. I mean, it’s not just a translation problem.” (14:46, Karina)
On the intersection of formal and informal reasoning:

“We don’t believe in only formal. We don’t believe in purely informal for sure… a hybrid approach… could be very interesting.” (29:19, Karina)
On the feedback loop of conjecturer/prover:

“You can have the prover as the reward signals for the conjecture… have the prover talk to the conjecture… form like a self-improving loop…” (22:06, Karina)
On commercialization:

“…there needs to be good product design… the right level of abstraction that one communicates should be at the natural language layer… [but] lean system working to make sure that it’s provable and correct…” (52:24, Karina)

Key Timestamps for Segments

| Timestamp | Segment Description | |--------------|--------------------------------------------------------------| | 02:20–04:41 | Karina’s early math journey and MIT research experience | | 05:48–07:39 | Why now? The three converging fields enabling AI math | | 08:19–13:39 | Lean explained: programming language for math, strengths | | 14:46–17:22 | Autoformalization explained and data gap challenge | | 17:54–20:24 | Reinforcement learning, formal math benchmarks, curriculums | | 21:55–24:52 | Axiom’s approach: Prover-conjecturer, self-play systems | | 27:40–30:07 | Data exploitation, informal vs. formal data | | 30:29–33:24 | Advice for newcomers, resources, and community | | 33:32–36:31 | Mathematical discovery: Pattern Boost and specialized problems| | 41:17–44:08 | Autoformalization limits and Lean’s coverage | | 47:50–50:32 | Commercial applicability and transformative potential | | 53:00–54:31 | Productization, UI/UX, and the challenge of adoption |

Conclusion & Takeaways

Karina Hong, through Axiom, is spearheading a future where math is not just solved by machines, but formalized, proven, and even creatively discovered with AI. The journey remains fraught with technical and data hurdles: scaling up autoformalization, bridging data gaps, expanding formal language coverage, and creating tools mathematicians will actually use. Yet, as math, programming, and AI coalesce, the prospect grows for both practical applications (verification, discovery) and foundational advances — not just for math, but across all high-stakes domains requiring rigorous, trustworthy reasoning.

Podcast fans in AI, mathematics, or software engineering will find this a rich, hopeful look at a deeply technical fast-moving frontier, still very much open for contribution and innovation.

Summary

The TWIML AI Podcast #754: Building an AI Mathematician with Karina Hong (Axiom)
Date: November 4, 2025
Guest: Karina Hong, Founder and CEO at Axiom
Host: Sam Charrington

Episode Overview

Key Discussion Points

1. Karina Hong’s Mathematical and AI Journey

Background and Motivation
- Karina describes her love for math since childhood, her experience in math Olympiads (02:20), and how research mathematics contrasts with problem-solving sprints (“a lot more pain and suffering… stuck on a problem for months” – 02:34).
- Quote:
  
  “Wouldn’t it be great if you can just let your intuitions take you?… Wouldn’t it be great to just think there’s a high chance if your intuition is right that the lemma is correct and move on…” (04:06, Karina)
Inspiration for AI Approaches
- Frustration with the verification grind inspires the vision to bring creative workflows from coding and AI art to mathematics.

2. Why Now? The Confluence of Math, AI, and Programming Languages

Three Key Trends Enabling Progress (05:51)
1. Advances in Large Language Models (LLMs):
  - LLMs have reached a point of strong mathematical and coding reasoning.
  - Quote:
    
    "The O series of OpenAI really marked an era of post-training, reasoning, reinforcement learning… great technique breakthoughs." (06:16, Karina)
2. Widespread Adoption of Formal Mathematical Languages (Lean):
  - Lean 4 has brought formalization to the mainstream, attracting top mathematicians.
  - “Lean makes this more like how computer programs work… if it’s not [correct], then it will report a bug, say at line 37…” (08:35, Karina)
3. Threshold Crossing in Code Generation (CodeGen):
  - CodeGen breakthroughs in reinforcement learning and code generation can underpin formal math reasoning at scale, despite the “ridiculous” data gap between code and formal mathematics (07:45).
Why Math Is Uniquely Unsolved
- Despite coding being heavily invested, “math has not been turned into programming language yet” (01:05, Karina; echoed at 47:50).

3. The Role of Formal Mathematical Languages (Lean)

How Lean Works and Its Role (08:19–13:39)
- Emulates the rigorous, step-wise nature of mathematical proofs.
- Acts as a programming language for mathematics with type theory foundations.
- Key advantage: Proofs in Lean, if correctly formalized, offer provable correctness.
- Quote:
  
  “If your statement is correctly formalized and your proof compiles, then congratulations, you have a correct proof.” (13:39, Karina)
Limitations and Challenges
- Lean proofs are only as good as correct formalizations of the original mathematical statement.
- Library coverage remains limited, especially in combinatorics, graph theory, or geometry (44:42).

4. Autoformalization and the Data Gap

What Is Autoformalization?
- The process of translating informal/natural language math proofs/statements into formal (Lean) equivalents (14:46).
- Two key motivators: closing data scarcity by turning informal data into formal data, and entangling proving/autoformalizing in a single loop for better systems.
The "Trillion vs. Million" Data Problem
- Trillions of code tokens online, but only about 10 million Lean tokens (07:45).
- Most formal math datasets skew to algebra/number theory; others lack representation.
Tackling Cold Start
- Reinforcement learning is crucial for bridging the cold start, “chicken and egg” problem (17:22).
- Quote:
  
  “You do have that sort of like cold start chicken and egg problem…” (17:22, Karina)
Current State and Barriers
- No robust, widely accepted benchmarks for autoformalization.
- Existing systems handle only trivial or short proofs, often with significant human intervention (41:17).

5. Reinforcement Learning and Curriculum Learning in Proofs

Role of RL in Formal Math
- RL delivers performance on benchmarks (e.g., Seed Prover and Aristotle models).
- With sparse feedback and the complexity of proofs, models often use curriculum learning: gradually increasing abstraction and creativity of problems (19:05–20:24).
Model Strategies
- Approaches differ: e.g., Seed Prover uses sophisticated system design and lammet pools, Aristotle uses Monte Carlo tree search (20:52).

6. Axiom’s Unique System Design and Research Focus

Prover-Conjecturer Self-Improving Systems
- Axiom bets on systems where provers and conjecturers interact in a feedback loop, enabling self-improvement and continual curriculum generation (21:55).
- Quote:
  
  “…form like a self improving loop as the prototype of self improving AI.” (22:13, Karina)
Synthetic Data and Benchmarking
- Heavy focus on autoformalization, synthetic data generation, and careful benchmarking (21:55).

7. Applications, Commercialization, and Impact

Commercial Potential Beyond the 'Geeky Subculture'
- While Lean and formal math communities may seem niche, history shows many 'subcultures' (like GPUs) became mainstream tech foundations (47:50).
- Key near-term applications: software/hardware verification, where formal guarantees are crucial (48:00–50:56).
- Quote:
  
  “…it is going to be a fundamental technology that for the first time you have provable guarantees of things in large language models.” (47:50, Karina)
- Productization & UI Uncertainty
  - Making formal reasoning UIs accessible is a design challenge—likely, the formal backend needs to be “mostly hidden” from end users (53:00).

8. Mathematical Discovery and Specialized Problem-Solving

Francois Charton’s Team & Pattern Boost
- Pattern Boost generates mathematical examples using transformers and perturbations, leading to new discoveries, including disproving century-old conjectures (34:38).
- Expanding methods to evolutionary algorithms and diffusion models (“diffuse boost”) (35:09).
Exploring Beyond Lean
- Some areas, like graph theory or Euclidean geometry, require new formal universes or approaches (43:22).

9. Entry Points and Community for Aspiring Researchers

Welcoming AI Researchers to AI for Math
- Many transferable skills for AI/CodeGen researchers.
- Recommended resources: survey papers (e.g., Formal Mathematical Reasoning; Next Frontier in AI) and the “AI for Math” GitHub repo (30:29).
- Vibrant, open community with conferences and collaborative initiatives.

Notable Quotes & Moments (with Timestamps)

On the challenge of math vs. coding:

“Math has not been turned into programming language yet and there will be new markets and new use cases that get unlocked because of this.” (01:05, Karina)
On autoformalization’s difficulty:

“Autoformalization… is surprisingly very hard. I mean, it’s not just a translation problem.” (14:46, Karina)
On the intersection of formal and informal reasoning:

“We don’t believe in only formal. We don’t believe in purely informal for sure… a hybrid approach… could be very interesting.” (29:19, Karina)
On the feedback loop of conjecturer/prover:

“You can have the prover as the reward signals for the conjecture… have the prover talk to the conjecture… form like a self-improving loop…” (22:06, Karina)
On commercialization:

“…there needs to be good product design… the right level of abstraction that one communicates should be at the natural language layer… [but] lean system working to make sure that it’s provable and correct…” (52:24, Karina)

Key Timestamps for Segments

Conclusion & Takeaways

Podcast fans in AI, mathematics, or software engineering will find this a rich, hopeful look at a deeply technical fast-moving frontier, still very much open for contribution and innovation.

wavePod

Building an AI Mathematician with Carina Hong - #754

Powered by Wave AI

Summary

Episode Overview

Key Discussion Points

1. Karina Hong’s Mathematical and AI Journey

2. Why Now? The Confluence of Math, AI, and Programming Languages

3. The Role of Formal Mathematical Languages (Lean)

4. Autoformalization and the Data Gap

5. Reinforcement Learning and Curriculum Learning in Proofs

6. Axiom’s Unique System Design and Research Focus

7. Applications, Commercialization, and Impact

8. Mathematical Discovery and Specialized Problem-Solving

9. Entry Points and Community for Aspiring Researchers

Notable Quotes & Moments (with Timestamps)

Key Timestamps for Segments

Conclusion & Takeaways

Summary

Episode Overview

Key Discussion Points

1. Karina Hong’s Mathematical and AI Journey

2. Why Now? The Confluence of Math, AI, and Programming Languages

3. The Role of Formal Mathematical Languages (Lean)

4. Autoformalization and the Data Gap

5. Reinforcement Learning and Curriculum Learning in Proofs

6. Axiom’s Unique System Design and Research Focus

7. Applications, Commercialization, and Impact

8. Mathematical Discovery and Specialized Problem-Solving

9. Entry Points and Community for Aspiring Researchers

Notable Quotes & Moments (with Timestamps)

Key Timestamps for Segments

Conclusion & Takeaways