Podcast Summary: "Humans&: Bridging IQ and EQ in Machine Learning"

Podcast: No Priors: Artificial Intelligence | Technology | Startups
Hosts: Sarah Guo, Elad Gil
Guest: Eric Zelikman (formerly Stanford, xAI, founder of Humans&)
Date: October 9, 2025

Episode Overview

In this episode, Sarah Guo interviews Eric Zelikman, an influential AI researcher and founder of the new startup Humans&. The discussion covers Eric’s foundational work on scaling reasoning in machine learning (notably the STaR and Quiet-STaR methods), his time at xAI contributing to Grok, and his new focus on bringing emotional intelligence (EQ) and human-centric design to AI models. It also addresses the limitations of today’s “IQ-heavy” models, the future of human-in-the-loop AI, and the vision behind Humans&, which aims to bridge the gap between technical reasoning and understanding people’s goals and emotions.

Key Discussion Points & Insights

Eric’s Early Motivation for Machine Learning

Eric was drawn to ML by the potential for AI to unlock human talent and passions constrained by circumstance.
- [00:33] “There’s just so much talent out there… AI is all of humanity’s not living up to their full potential.” — Eric Zelikman
Initially enamored with automation, Eric realized empowering people requires truly understanding their goals—far more complex than mere task automation.

Technical Innovations: STaR and Quiet-STaR

STaR (Self-Taught Reasoner):
- [03:15] Eric describes STaR’s intuition: let the model generate solutions to problems, reward it for correct answers, and iterate.
- Surprise finding: continual training yielded improvements in abilities like N-digit arithmetic, with no early plateau.
- [04:47] “As you actually trained for more and more iterations, the number of digits that it was actually able to do kept increasing… there’s no obvious plateau here.” — Eric Zelikman
- Introduced variants addressing learning from failures, not just successes.
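
The loop described above can be sketched roughly as follows. This is a toy illustration, not the implementation discussed in the episode: ToyAdder, its skill counter, the 0.3 lucky-guess rate, and the fine-tuning rule are all hypothetical stand-ins for a real language model and a real training step. What carries over is the shape of the loop: sample a solution, keep it only if the final answer checks out, train on what was kept, and repeat.

```python
import random
from typing import List, Tuple

Problem = Tuple[str, int]        # (question text, gold answer)
Example = Tuple[str, str, int]   # (question, rationale, answer)


class ToyAdder:
    """Hypothetical stand-in for a language model (an assumption, not from the episode).

    It adds reliably when both operands have at most `skill` digits and
    only occasionally gets longer operands right.
    """

    def __init__(self, seed: int = 0) -> None:
        self.skill = 1
        self.rng = random.Random(seed)

    def generate(self, question: str) -> Tuple[str, int]:
        a, b = (int(x) for x in question.split("+"))
        digits = max(len(str(a)), len(str(b)))
        correct = digits <= self.skill or self.rng.random() < 0.3
        answer = a + b if correct else a + b + 1  # wrong answers are off by one
        return f"add {a} and {b} column by column", answer

    def finetune(self, examples: List[Example]) -> None:
        # Toy update rule: training on correct solutions to longer problems
        # raises the number of digits the model handles reliably.
        for question, _, _ in examples:
            a, b = (int(x) for x in question.split("+"))
            self.skill = max(self.skill, len(str(a)), len(str(b)))


def make_problems(rng: random.Random, n: int, max_digits: int) -> List[Problem]:
    problems = []
    for _ in range(n):
        d = rng.randint(1, max_digits)
        a = rng.randint(10 ** (d - 1), 10 ** d - 1)
        b = rng.randint(10 ** (d - 1), 10 ** d - 1)
        problems.append((f"{a}+{b}", a + b))
    return problems


def star_round(model: ToyAdder, problems: List[Problem]) -> List[Example]:
    """One round: sample solutions, keep verified-correct ones, train on them."""
    kept = []
    for question, gold in problems:
        rationale, answer = model.generate(question)
        if answer == gold:  # reward only answers that match the gold label
            kept.append((question, rationale, answer))
    model.finetune(kept)
    return kept


rng = random.Random(0)
model = ToyAdder(seed=0)
history = []
for _ in range(5):
    problems = make_problems(rng, 20, max_digits=model.skill + 1)
    kept = star_round(model, problems)
    history.append(model.skill)
print(history)  # digits handled reliably after each round
```

Because only verified-correct solutions are kept, the toy model's skill can never decrease between rounds; how quickly it grows depends entirely on the made-up update rule and says nothing about real training dynamics.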
Quiet-STaR:
- [06:09] Eric’s last project at Stanford, which scaled reasoning to pretraining-sized datasets.
- Highlighted key advances: keeping learning online, using baselines to focus on hard problems, and moving beyond simple Q&A to arbitrary text chunks for broader reasoning.
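
One way to read "using baselines to focus on hard problems" is the standard variance-reduction trick from policy-gradient training: subtract a baseline from each sample's reward so that inputs the model already handles contribute no learning signal. The summary does not say which baseline was used, so the mean-reward baseline below is an assumption, and the function name and reward values are purely illustrative.

```python
from statistics import mean
from typing import List


def advantages(rewards: List[float]) -> List[float]:
    """Reward minus the mean reward of the samples drawn for the same input.

    Assumption: a mean-reward baseline; other baselines are possible.
    """
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


# Easy input: every sampled attempt succeeds, so every advantage is zero
# and the input contributes nothing to the update.
easy = advantages([1.0, 1.0, 1.0, 1.0])

# Hard input: one attempt in four succeeds, so the successful attempt gets
# a large positive advantage and the failures get small negative ones.
hard = advantages([0.0, 0.0, 1.0, 0.0])

print(easy)
print(hard)
```

Under this reading, the training signal concentrates on inputs where the model's attempts disagree, which is where there is still something to learn.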

IQ Progress & Contemporary Model Capabilities

How smart are today’s models?
- [08:24] Models are “reasonably smart,” often surpassing humans on non-trivial tasks (e.g., advanced physics, math, trick questions).
- Failure modes cluster around tricky, jagged edge cases and, in particular, a lack of emotional intelligence.
- [09:27] “One category… are trick questions that require basically people… But also they don’t really… I think one of the core things is that they’re not smart emotionally…” — Eric Zelikman
Advice for Non-Researchers:
- [10:41] The more context you feed current models, the better they perform; they thrive in “closed-form” answer spaces.

Challenges in Verifiable Domains (e.g., Code)

[11:42] Limitations:
- Responsiveness requirements vs. model thoroughness.
- In/out-of-distribution issues (models falter in novel domains).
- Verifiability strongly impacts model performance.

Reflection on Scaling and “Human in the Loop”

Most scaling strategies focus on minimizing human input because it’s “messy”; however, Eric argues for the importance of keeping people involved:
- [16:10] “Being very mindful… to effectively keep people in the loop, it’s actually a very active decision.”
- Eric finds trends such as models running autonomously for hours worrying, because they reduce transparency and user agency.
- [18:41] “If you have a model that goes off and does its own thing for eight hours and comes back… This is a weird regime where people probably feel less real agency.” — Eric Zelikman
Human-in-the-loop: The why
- Error correction, philosophical agency, and true innovation require human input.
- [19:17] “If you actually have models that really understand what people’s goals are and really empower them more, you end up in a very different situation.”

From IQ to EQ: The Vision Behind Humans&

Eric shifts focus to emotional/interpersonal capabilities that go beyond the current “task-centric” paradigm.
- [22:49] “We have these incredibly smart models that are capable of so much, but… The role that they play in people’s lives is a lot less deep, a lot less positive… fundamentally these models don’t really understand people.”
On Current Benchmarks & Training:
- [24:19] Calls out the field’s focus on narrowly quantifiable benchmarks and attributes it to incentives; longitudinal, interactive metrics get little consideration.
- [26:12] “And the most popular [environments] are… coding and computer use rather than anything that requires simulating people.”
Missing Capabilities:
- [26:28] “The most fundamental thing is that the models… don’t understand the long term implications of the things that they do and say.”
- Today’s AIs are “single-turn focused,” rarely clarifying or expressing uncertainty, failing to act proactively or remember useful context.
- [31:37] “Imagine if you had a friend where you had to re-explain everything about yourself to them every time you spoke.” — Sarah Guo
The Data Challenge:
- Collecting true multi-turn, longitudinal human interaction data is hard—but essential for moving beyond the current limitations.

What Does "Humanly Intelligent" AI Look Like?

Future AI should:
- Remember rich personal context over time;
- Understand evolving, sometimes contradictory user goals and preferences;
- Assist in productivity and well-being (not just character/companionship);
- Collaborate and coordinate, not just automate.
- [34:59] “Everyone kind of has things that they’re passionate about, and… can do really cool things. I think the role of the model should be to allow people to do those really cool things…” — Eric Zelikman
Addressing Uniqueness:
- [32:49] “I’m a unique snowflake. You can’t possibly simulate me…” — Sarah Guo
- [33:26] Eric acknowledges the difficulty but asserts the goal: models should try to learn about each user, improving well beyond today’s generic outputs.

The Recruiting Pitch for Humans&

Hiring builders across engineering, infrastructure, research, and product:
- [35:43] “I’m looking for really strong infra folks… researchers… product folks… who’ve thought a lot about users, who’ve thought a lot about memory… about building beautiful, tasteful products.”

Notable Quotes & Memorable Moments

On the reality of model agency:
[18:43] “20,000 lines of generated code looks good to me.” — Sarah Guo, on the opacity of fully autonomous codegen.
On current AI as bad friends:
[31:37] “Imagine if you had a friend where you had to re-explain everything about yourself to them every time you spoke.” — Sarah Guo
On the long-term value of EQ-focused models:
[29:51] “For most labs, the human is kind of the intermediate until you have this fully automated system… optimizing things for being really good at… interacting and collaborating… is almost like an intermediate thing until you get to this fully automated point.” — Eric Zelikman
On the limits of task-centric training:
[24:23] “It’s ludicrous that all the benchmarks are still oriented this way.” — Sarah Guo
On empowering people, not just automating:
[19:17] “If you actually have models that really understand what people’s goals are… you end up in a very different situation.” — Eric Zelikman

Timestamps for Important Segments

00:33 — Eric’s motivation for AI as a tool for human potential
03:15 — The intuition and steps behind STaR
04:47 — Model scaling and the surprise of continual improvements
06:09 — Quiet-STaR and advancing the reasoning paradigm
08:24 — How smart are today’s models, and limits of their IQ
10:41 — Advice for practitioners using current LLMs
11:42 — The challenge of AI in code: verifiability, context, and responsiveness
16:10 — On the importance (and challenge) of keeping humans in the loop
18:41 — Dangers of fully autonomous long-horizon models
24:19 — On the field’s fixation with task-based benchmarks
29:51 — Why most industry sees “human interaction” as temporary
31:37 — Memory, context, and what empathy in models really means
34:59 — The “Culture” sci-fi example: AI overlords or AI enablers?
35:43 — Who Humans& is hiring

Episode Tone & Takeaways

The episode is candid, deep, and leans on Eric’s technical expertise while pulling the conversation toward the imperative for more human-centric and emotionally intelligent AI models. Both Eric and Sarah are reflective but practical, pushing past buzzwords to interrogate what real collaboration between AI and people could—and should—look like.

For listeners:
This episode is essential for anyone interested in the “next leap” in AI: not just smarter algorithms, but systems with memory, self-awareness, and real collaboration with humans. If you care about the future relationship between technology and humanity, and how AI can augment—not replace—our potential, you’ll find both insight and inspiration here.
