Pedro Domingos, "The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World" (Basic Books, 2018) - New Books Network

Podcast Summary:

New Books Network — Pedro Domingos on "The Master Algorithm"

Host: Gregory McNiff
Guest: Pedro Domingos, author, Professor Emeritus of Computer Science and Engineering, University of Washington
Date: May 30, 2026

Episode Overview

This episode features a deep dive into Pedro Domingos’s influential book The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. The host, Gregory McNiff, guides a comprehensive conversation covering the history, progress, and paradigm shifts in machine learning and artificial intelligence (AI) since the book’s first publication nearly a decade ago.

Domingos explores the five foundational traditions ("tribes") of machine learning, discusses the notion of a "master algorithm," and reflects on both the technical advances and societal impacts of AI, including concerns surrounding automation, consciousness, and the so-called singularity. The conversation offers both approachable analogies and technical insights, making it valuable for listeners of all backgrounds.

Key Discussion Points and Insights

1. Rationale and Audience for the Book

Motivation: Domingos wanted to demystify machine learning for the general public, comparable to understanding how to drive a car rather than how its engine works.

"Machine learning was not just something for the experts to know, it was something that everybody needed to understand as private citizens, as professionals, et cetera." (03:12)
Popular Science Approach: The book adopts a quest narrative, exploring the search for a "master algorithm" uniting all machine learning traditions.

2. AI vs. Machine Learning: Defining the Relationship

Clarification:
- Machine learning is the automation of learning.
- AI is the automation of intelligence; machine learning is its core.
Domingos’s Analogy:

"Machine learning is the engine of AI, if you will." (06:45)
The public now conflates the terms, but they represent different scopes.

3. Early Inspiration & History

Spark for Domingos: Reading the first AI textbook, not playing Tetris as sometimes cited.

"The thing that actually got me into AI was that one day I randomly ran into a book called Artificial Intelligence in the bookstore… the book had a small chapter on machine learning towards the end... I immediately thought, number one, this is the linchpin." (08:09)
Machine Learning vs. Knowledge Engineering:
- Early AI focused on rule-based systems by manually encoding expert knowledge.
- This hit a "knowledge acquisition bottleneck"—collecting all possible human knowledge was impossible and expensive.
- Machine learning’s rise was tied to the growth of data and the realization that algorithms could learn autonomously from vast information streams.
"Machine learning is exactly the solution to that knowledge acquisition bottleneck." (14:34)

4. The Five Tribes of Machine Learning

(See [19:51] onward for this foundational segment)

a. Symbolists

Inspired by logic, mathematics, and philosophy; mimic scientific method.

b. Connectionists

Inspired by neuroscience; reverse-engineer the human brain (e.g., neural networks).

c. Evolutionaries

Modeled on biological evolution; use algorithms that evolve programs/circuits (e.g., genetic algorithms).

d. Bayesians

Based on Bayesian statistics; focus on handling uncertainty through probability (e.g., Bayes' theorem).

e. Analogizers

Take inspiration from psychological reasoning by analogy; example algorithms include nearest neighbor.
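
Of the five approaches listed above, the Bayesian one is the easiest to illustrate in a few lines. The following is a minimal sketch of Bayes' theorem, not an example from the episode: the function name `posterior` and the numbers (1% prior, 90% detection rate, 5% false-alarm rate) are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) via Bayes' theorem.

    prior               -- P(H), belief in the hypothesis before seeing evidence
    likelihood          -- P(E | H), chance of the evidence if H is true
    false_positive_rate -- P(E | not H), chance of the evidence if H is false
    """
    # Total probability of the evidence, summed over both cases.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: a condition with 1% prevalence, a test that detects it
# 90% of the time and raises a false alarm 5% of the time.
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

Even with a fairly accurate test, the posterior stays well below the detection rate because the prior is small. This kind of uncertainty bookkeeping is what the Bayesian tribe treats as foundational.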

"Each has its own master algorithm… all solve different important problems. But at the end of the day, you don’t have a master algorithm until you solve all of them." (20:10)

5. Defining the Master Algorithm

The "master algorithm" is an algorithm that can learn anything, given the right data.
It must unify structure learning (finding the shape of a model) and weight/parameter learning (optimizing parts of that model).
Analogy to Lord of the Rings: one algorithm to rule them all.

"At an abstract level, a master algorithm is an algorithm that can do anything if you give it the right data to learn from." (26:44)
"It is only one algorithm that solves the problems that all these five different algorithms solve." (24:31)

6. Modern Advances & The State of AI

Transformers and LLMs: The emergence of large language models (LLMs) and transformer architectures marks a major advance; we are "way closer" to the master algorithm but not there yet. (26:12)
Size Comparison: The human brain is still ahead of any LLM regarding neuron and connection count and efficiency—energy use is a major gap (42:22).

"The brain is still more powerful... at this point, we’re maybe at the level of a mouse brain or a cat brain." (41:40)

7. Core Problems in Machine Learning

Overfitting: Learning the training data too well, failing to generalize to new data.
Curse of Dimensionality: In high-dimensional spaces (today's models have trillions of parameters), human intuition and visualization fail; many issues of current AI originate here.
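
One standard textbook illustration of why intuition fails in high dimensions (not an example given in the episode) is to compare the volume of a ball with the cube that encloses it. The sketch below assumes the usual closed-form volume of a d-dimensional unit ball; the helper name `ball_to_cube_ratio` is hypothetical.

```python
import math


def ball_to_cube_ratio(d):
    """Fraction of the cube [-1, 1]^d occupied by the inscribed unit ball."""
    # Volume of the d-dimensional unit ball: pi^(d/2) / Gamma(d/2 + 1).
    ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    # Volume of a cube with side length 2.
    cube_volume = 2.0 ** d
    return ball_volume / cube_volume


for d in (2, 3, 10, 20):
    print(f"d={d:2d}  ball/cube volume ratio = {ball_to_cube_ratio(d):.2e}")
```

In two dimensions the circle fills most of the square, but the ratio collapses toward zero as d grows: almost all of the volume ends up in the corners, far from the center. Distance-based reasoning that works in two or three dimensions stops being reliable as a result.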

"In some ways [the curse of dimensionality] is even more important now than it was 10 years ago..." (39:30)

8. Key Algorithms & Concepts

Nearest Neighbor: Intuitively simple but powerful analogizer method.
EM (Expectation Maximization): Central in statistical learning, especially with hidden variables.
Markov Networks: These subsume Bayesian networks and offer more generality.
Relational Learning: Moving beyond independent data points to interconnected, relational data—vital for social networks, knowledge graphs.
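
Of the algorithms listed above, nearest neighbor is simple enough to sketch in full. This is a minimal one-nearest-neighbor classifier under stated assumptions: Euclidean distance, a tiny made-up dataset, and a hypothetical function name `nearest_neighbor`. It is not code from the book or the episode.

```python
import math


def nearest_neighbor(train, query):
    """Return the label of the training point closest to `query`.

    train -- list of (features, label) pairs, features being equal-length tuples
    query -- feature tuple to classify
    """
    # math.dist computes Euclidean distance (Python 3.8+).
    _, label = min(train, key=lambda pair: math.dist(pair[0], query))
    return label


# Hypothetical toy data: (height_cm, weight_kg) -> species.
train = [
    ((25.0, 4.0), "cat"),
    ((30.0, 5.5), "cat"),
    ((60.0, 25.0), "dog"),
    ((70.0, 32.0), "dog"),
]

print(nearest_neighbor(train, (28.0, 5.0)))
print(nearest_neighbor(train, (65.0, 30.0)))
```

There is no training step at all: the "model" is the stored data, and prediction is a lookup for the most similar known case. That is the sense in which the method reasons by analogy.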

9. Reinforcement Learning and the Master Algorithm Debate

Domingos disagrees that reinforcement learning is the "master algorithm" (contra Rich Sutton).
Analogy by Yann LeCun: “Unsupervised learning is the cake, supervised learning the icing, reinforcement learning the cherry on top.” (48:47)

"It may well be part of [the master algorithm], but not the whole thing." (51:36)

10. The Triad: Data, Compute, Algorithms

Progress in AI is gated by these three interlinked resources—data, computational power, and algorithmic advances.
At present, all three are simultaneously bottlenecks, each crucial for advances.

"If the algorithms aren’t good enough, no amount of data and compute will get you there." (54:12)

11. Alchemy and Markov Logic Networks

Domingos’s own contribution: Alchemy is a system combining logic and Markov networks, integrating lessons from all five tribes. It's called "Alchemy" to acknowledge that the field is still proto-scientific (like alchemy before chemistry). (56:34)

12. Jobs, Automation, and The Economics of AI

Fear of mass unemployment is a "basic economics fallacy": automation reduces costs and increases demand, shifting rather than eliminating work.
Human intelligence and AI are complements, not direct substitutes.

"There are almost no jobs that can be completely done by AI… In your job, you should think about automating the things that AI can automate, which will improve your quality of life and your productivity." (58:04)

13. AI, Consciousness, and the Skynet Myth

The "rogue AI" scenario is largely science fiction, based on anthropomorphizing and misunderstanding of AI as algorithms with objective functions, not will or consciousness.
Giving AI rights is "absurd."

"[AI] is an algorithm with an objective function and all it does is maximize that… the idea of AI suddenly deciding to do something else and taking over is… just science fiction." (61:26)

14. Singularity vs. Phase Transition

A technological singularity (infinite intelligence) is physically impossible—the reality is "S curves" and phase transitions.
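
The shape of an S curve can be sketched with the logistic function, which is one common mathematical model of such growth. Domingos does not name a specific formula, so the choice of the logistic function and the parameter names `ceiling`, `rate`, and `midpoint` are assumptions made for illustration.

```python
import math


def logistic(t, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Logistic S-curve: slow start, rapid middle, flattening toward `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


# Growth per unit step is small early, largest near the midpoint, small again late.
for t in (-6, -3, 0, 3, 6):
    step = logistic(t + 0.5) - logistic(t - 0.5)
    print(f"t={t:+d}  level={logistic(t):.3f}  growth per step={step:.3f}")
```

Unlike a curve that diverges at a finite time, this one approaches its ceiling and never passes it, however large t becomes.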

"Progress… starts off slow and then it speeds up… then it slows down again until it flattens. That’s called an S curve… that is going to be the same thing with AI, because it has to be." (65:25)

15. What Would an Updated Edition Include?

The foundational concepts haven’t changed; what’s new is the technical landscape (e.g., transformers, LLMs).
Would add two chapters: one on technical developments of the last decade, another updating societal and ethical implications.

"The book is about…the things in machine learning… that have been the same since the 50s… those are all still current. But I would add… a chapter about the technical developments… and a chapter… bringing the societal implications up to date." (68:16)

Notable Quotes & Memorable Moments

On why a popular science book about ML was needed:

"Textbooks are boring. Students are forced to read them. It has to have a theme. It has to have a story." (03:12)
Machine Learning as the core of AI:

"If you had a system that was as good as humans at everything, but didn't learn, the following day it would already be worse, and it would only keep getting worse." (06:45)
The future of jobs:

"What AI does fundamentally, from economic point of view, is greatly decrease the cost of intelligence, because now it can be done by a computer instead of a human… the demand for… what you can do with those things will go up." (58:04)
On the "AI apocalypse":

"It's just science fiction. It makes for good movies, but it really has nothing to do with reality." (61:26)
On singularity hype:

"That is not going to happen because it's physically impossible. For the physicists, they know this very well. If they see a singularity in their equations, they know there's something wrong with them." (65:25)

Timestamps for Major Segments

Introduction & Book Origins — 02:55
AI vs. Machine Learning; Definitions — 05:44
Domingos’s Inspiration & Early AI — 08:02
Knowledge Engineering vs. Machine Learning — 13:17
Minsky, Perceptron, and AI Winter — 17:24
The Five Tribes — 19:51
Master Algorithm Concept — 24:21
Technical Advances (Transformers, LLMs) — 26:12
Overfitting & Curse of Dimensionality — 37:43, 39:30
Human Brain vs AI — 41:07
Nearest Neighbor — 42:52
EM Algorithm, Hidden Markov Models — 44:27, 46:21
Relational Learning — 46:56
Reinforcement Learning Debate — 48:35
Race: Data vs. Hypotheses — 53:06
Data, Compute, Algorithms — 54:12
Alchemy & Markov Logic Networks — 56:34
Jobs & Economic Impact — 58:04
AI Apocalypse and Consciousness — 61:26, 63:07
Singularity vs Phase Transitions — 65:25
What Would Be Updated? — 68:16

Final Takeaway

Pedro Domingos’s The Master Algorithm remains timely and foundational despite fast-paced advances in AI. His vision of a unifying master algorithm, the interplay of five machine learning paradigms, and sober analysis of the societal impact of AI provide a prescient and levelheaded roadmap for understanding where AI has been—and where it is heading.

Transcript

[01:26]
A
Welcome to the New Books Network. I'm your host, Gregory McNiff, and I'm thrilled to be joined by Pedro Domingos, the author of The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. The book was published by Basic Books in 2015 as a hardback and later, in 2018, as a paperback. I chose this book, despite its age, because time has only confirmed its value. Written nearly a decade ago, it offered then, and I believe still does, the clearest explanation of machine learning's five traditions, or five tribes, as Pedro articulates in the book, as well as a strikingly prescient map of where the field was headed at the time and candidly seems to have arrived at. Essentially, the book is a blueprint that has held up remarkably well. Pedro Domingos is professor emeritus of computer science and engineering at the University of Washington, where he is a pioneer of machine learning and a developer of Markov logic networks. He is the recipient of, if I'm pronouncing this right, the SIGKDD Innovation Award and the IJCAI John McCarthy Award, two of the field's highest honors, and is a fellow of both the AAAS and AAAI. He's written for numerous periodicals and he is based in Washington. Pedro, thank you for joining me today to discuss your book.
[02:56]
B
Thanks for having me.
[02:57]
A
You know, as I mentioned in the introduction, Pedro, the book is somewhat, I don't want to say old, but it's had some breathing time. With that, could you talk about why you wrote this book back in 2015 and who was the audience at that point?
[03:12]
B
I wrote the book because I felt there was a dire need for a book on machine learning that really explained clearly to people what it's all about. We were at the point then where I felt machine learning was not just something for the experts to know, it was something that everybody needed to understand as private citizens, as professionals, et cetera. Not understanding at the level of understanding how the engine of a car works, that's for the mechanics, but at the level of understanding how to drive the car to take it where you want to go. And in fact, ever since I was a PhD student in the 90s, I always felt that someone should write a popular science book about machine learning. I've always been a fan of popular science books, but it didn't seem that urgent back then. And I also didn't have a very good idea of how to write the book. Right. Because a popular science book isn't just a textbook. Textbooks are boring. Right. Students are forced to read them. It has to have a theme, it has to have a story. And then these two things happened. In 2012, when I started writing the book, there was the big data explosion and there was all this nonsense being written in good publications. And the books that had come out were all, in my view, kind of weak in different ways. But also at the same time, I had this idea of the master algorithm, which really is, I think, the central idea around which all of machine learning revolves. And hence it's the thought of the book, it's the core of the book. And so I wrote the book as, there's different ways you can write a popular science book. One of them is the mystery, one of them is the quest. And this book is the story of the quest for the master algorithm that we in machine learning, some of us without knowing it, some others very knowingly, are on, and of how that's going to affect the world.
[05:13]
A
Yeah, no, I thought it was great. I love the map at the end, that sort of tied it all together. I guess, Pedro, to ask you the obvious question, as I mentioned, the book has been at least 10 plus years since publication. What has changed the most in AI and machine learning since you wrote it? And I should note, throughout the book, you refer to it as machine learning. Do you consider those two terms comparable or synonymous? And if not, how would you differentiate them?
[05:44]
B
Yes, people use them interchangeably today. But that's a mistake. At the time that I wrote the book, the public had no idea what machine learning was. And I remember thinking, if the only thing the book did was put that term machine learning in people's vocabulary, it would have been a success. And now, of course, things are on a different plane. But to answer your question, also, at the time, AI was a dirty word, right? We did it in academia, but when I talked to people in industry about what I did, I wouldn't say AI because AI was bad. Again, that has changed 180 degrees. But to answer your question, the relationship between them is actually very simple and it's important to understand it. AI is the automation of intelligence. And intelligence has different aspects. There's vision, there's speech and language, there's planning, there's reasoning, there's common sense, and there's learning. Machine learning is the automation of learning. And when I got into AI, got interested in the 80s, got my PhD in the 90s, people thought machine learning, even in AI, very few people believed in machine learning. They thought it was too hard or just a bad idea. But my view, which now has been massively vindicated, was that machine learning really is the core of AI. AI is downstream of machine learning. If you can't learn, nothing else will happen. If you had a system that was as good as humans at everything, but didn't learn, the following day it would already be worse, and it would only keep getting worse. So to my mind, machine learning was really the core around which AI would revolve. And that is very much what the case is today. All of these things that you see in language, in vision, in everything, they are all driven by machine learning. So another way to put it is that AI is a rocket taking us to some future, and the fuel is machine learning. And the data, of course; I guess you could say the fuel is the data.
But machine learning is the engine of AI, if you will.
[07:41]
A
No, I really want to dive into that and let's go there. And I should say the initial spark, if I can give you the analogy, I think was you playing Tetris in college. Right. That was at least some point where you started thinking maybe about how to solve these problems or how to think about solving, I guess, programming.
[08:03]
B
Unfortunately, to be honest, Tetris was not the spark. Tetris was actually something that I wasted a summer of my life playing. But I bring it up in the book because Tetris is actually a beautiful example of what is called an NP-complete problem, which is a problem that is hard to solve but whose solution is easy to check. Again, these days, this is particularly relevant, but it's a core concept in computer science. And AI is the field that deals with solving these intractable problems, and all its pros and cons derive from it. And Tetris, while being just a simple computer game, the beauty of these NP-complete problems is that they're all equivalent at some level. So if you can solve any one of them, you can solve all the others. So if someone found the optimal solution for Tetris, that would revolutionize biology, physics, medicine, which is very counterintuitive. That's a little game. So in some sense, maybe playing it is not such wasted time. But the thing that actually got me into AI was that one day I randomly ran into a book called Artificial Intelligence in the bookstore. It was the first AI textbook. It was a very small and short book at the time. And I looked at it and thought, what could that be? It seemed almost like an oxymoron, artificial intelligence. And I read that book and immediately I thought, okay, I see what they're doing. This is an interesting field. And the book had a small chapter on machine learning towards the end, because AI back then was all about knowledge representation and reasoning and problem solving. And I immediately thought, number one, this is the linchpin, as I had just described. And number two, everything is in such a primitive state, it's much easier to have an impact in AI than in physics or biology, which are very mature. Right. And then number three, if you can solve this problem, the impact would be spectacular. And number four, which I think comes up a lot in the book, it's such a rich field.
It has all these connections to psychology, to evolution. There's almost nothing it doesn't relate to. So I felt I could spend my whole life working on this and I would never get bored or tired. And indeed, by now I have. And that was correct.
[10:15]
A
Well, yeah, it might be time to update or provide a second edition, but I'll ask you about that. I will say you hit on two points. One, all I kept thinking was that the master algorithm is the physics equivalent of the final equation or the Holy Grail. And, yeah, I should have mentioned I really enjoyed the asides. I mean, you talk about solving cancer, evolution. You have a whole chapter on how brains think. So it's not a very narrow discipline or a book that's very focused on specific machine learning. It's quite broad and deep, which is nice. Pedro, I just want to ask you some questions a little more along this line. I think in the prologue, you say we don't have to program computers. They program themselves. Do you still believe that's the case today?
[11:02]
B
Absolutely. That is indeed the essence of machine learning. We've known since Turing that a computer is a machine that can do anything, but you still have to program it. And programming it is very expensive, very time consuming, very difficult in many cases. In many cases, we don't even know how to program what we want. I know how to ride a bike. Nobody knows how to program a computer to ride a bike, even now. Machine learning is where the computer actually learns to program itself. Right. The computer actually comes up with the programs itself by looking at data. And again, this is very difficult, and success is not assured. But if you can do it, it's amazing. And indeed, what you can see today is computers programming themselves to a level that was unimaginable even 10 years ago, but was exactly what I was predicting in the book. As machine learning progresses, the computers will be able to program themselves to do more and more by learning from data. These days, everybody can converse with a chatbot and have it solve problems, which it learned to do by learning. No one knows at all how to program these chatbots to do what they do. It's all machine learning.
[12:12]
A
Yeah, I do want to go into that, because I know, I believe there's a field called mechanistic engineering where we're trying to reinterpret it. You would know better than me how these large language models, how these AI models work. And I want to ask you, is that a concern that we don't understand how they're doing what they're doing? And I've read in other books that the more data we give it, the more accurate it actually becomes. And mathematicians like Terence Tao have said it's helped me with proofs. I think it contributed to winning the Nobel for a protein model in early 2022, 2023. I mean, these things have gone way beyond the search algorithms of a Netflix and an Amazon and a Google. So they're definitely getting more complicated. I want to ask you about that, but I Do want to start at the beginning with machine learning and you trace sort of the history between knowledge engineering and machine learning. And maybe knowledge engineering is more rules based. Could you briefly talk about how machine learning overtook knowledge engineering and basically won the race, won the argument?
[13:17]
B
Yeah. So in the early days of machine learning, going back to Turing and Shannon and von Neumann and the pioneers, in the early days of AI, we're talking about the 50s, people thought learning would be the way. I mean, Turing says that quite explicitly. And why not, right? Humans learn and machines had better do it as well. But then what happened was that at the time it was too hard. People weren't able to get very far with it. And so in the 60s, towards the end of the 60s, people just came to the conclusion like, no, machine learning isn't going to work. And so what dominated AI for the next 20, 30 years, and it had some successes, was knowledge engineering, where what you do is you code by hand, you encode all the knowledge that the system has to know. You interview doctors and lawyers and whatever, right, whatever experts you want, and you build the so-called expert systems or knowledge-based systems. And there were these efforts to build knowledge bases that would contain all of humanity's knowledge, right? The problem, however, is that they failed. The reason they failed is that there's always more knowledge, and acquiring it takes interviewing people or people working full time. The cost is high and the payoff is uncertain because there's this long tail where you never know if that knowledge is going to be used. And then having more knowledge actually makes your system more expensive and slows it down. So that whole approach to AI ran into a number of problems, one of which was the so-called knowledge acquisition bottleneck. You can never acquire enough knowledge and it's too expensive. And machine learning is exactly the solution to that knowledge acquisition bottleneck. The way you solve it is you don't interview all those experts and do all of that hand coding anymore. You just have the system learn from data.
And the beauty of that is that as the amount of data in the world grows, and of course it's exploded in the last decades, and that's a crucial part of this equation, the machine learning just gets automatically more powerful. Not completely automatically, because you have to scale it up and that's not trivial. But at some level, once you have machine learning, you can just ride this data rocket to all the planets that you want to. And so around the 90s, precisely when I got into machine learning, there was this sea change where the whole knowledge engineering approach kind of died out, and there was a so-called AI Winter where people didn't believe in AI and AI got a bad reputation, which lasted until not that long ago. But then machine learning started scoring these successes, bigger and bigger ones, and that's where we are today.
[17:24]
A
Yeah, I'm curious. How much was Minsky, I think you call him the villain, how much was he maybe responsible? I think he really took down Rosenblatt's Perceptron and, you know, could one man have really created the AI Winter for so long there? Or how do you view him?
[17:40]
B
So, I mean, he was not the only one responsible for this. Again, there were different schools in AI. One school was the Symbolists. The leaders of the Symbolists, and often considered the founders of AI, were Minsky, McCarthy, Newell and Simon. Right. But Minsky wrote with Seymour Papert this book called Perceptrons, which was the takedown of the neural network approach. And that was really deadly. That killed it dead for another 10, 20 years. That didn't lead to an AI winter. I mean, there was an AI winter around that time, partly caused by that. But what that led to was the machine learning paradigm and connectionism, which used to be dominant, being replaced by knowledge engineering and symbolic AI for the next 20 years. So it did have a big effect. To the connectionists, he kind of is the villain. But there was a sociological, well, let's say sociological reason behind that book and that criticism, which is that these people were competing for funding, in particular from DARPA. And the symbolists were getting very jealous that all the money was going to neural networks and their wild promises. So even though they don't like, or didn't like, because he's passed away now, to admit that this is what he was doing, his co-author actually said very explicitly, this is what we're doing. We just set out to kill neural networks.
[19:12]
A
Yeah, it's funny, the people involved. I wanted to ask you about Hinton and I always find it fascinating. He's a descendant of George Boole. That almost seems too coincidental to be a coincidence. But let's maybe focus on the structure of the book. I know there are. You talk about these five tribes, and this is what's interesting. The master algorithm isn't. It is one algorithm, but it reflects these five, I guess, groups and kind of a debate within each of the groups. But maybe you could talk a little bit about the five tribes of these symbolists, the connectionists, the evolutionaries, the Bayesians and the analogizers. Could you just briefly touch on those five? Yeah.
[19:52]
B
There are five main paradigms in machine learning, and each one of them draws on a different area of knowledge. Again, that's, you know, those areas don't come up at random. They come up because literally the machine learning approach was inspired by that area. And a lot follows from that. And then each of these areas has its own master algorithm. And the more fanatical believers in that area think that that's all you need, that one algorithm will solve everything. And what I say in the book is that actually, no, because these algorithms all solve different important problems. And at the end of the day, you don't have a master algorithm until you solve all of them, in the same way that you don't have a real model of the universe in physics until you have a model of all the forces. And right now we have one for three of them, and then the other one we don't know what to do with. So my argument is that the same thing applies to machine learning. So what are those five paradigms? Well, two I already touched on. There's the symbolists. These are the people who say AI should be based on logic and math and first principles, ideas from philosophy and so on. The connectionists are the people... In a way, you could say that symbolic AI is a little bit like saying AI is the automation of science. What we're trying to do is we're trying to build an artificial scientist who will apply the scientific method, formulate hypotheses, reject them, refine them, test them, et cetera, et cetera, but just do this at the speed of a computer. That, in a way, is the rough idea of symbolic AI. The connectionists also have a very natural idea. They're inspired by neuroscience. They say, hey, the brain is the competition. We're far behind. What do you do? You reverse engineer the competition. That's where you start. So let's figure out how the brain works. It's a neural network, et cetera, et cetera. And so what we're going to try to do is build AI by imitating the brain at some level.
And this has always been a very seductive idea in AI. Back in the 50s, almost everybody was working on neural networks; even Minsky's PhD thesis at Princeton was on neural networks.
[21:56]
A
Neural networks.
[21:57]
B
Everybody was doing neural networks. John Holland, who was the founder of the evolutionaries, as we'll mention in a second, also started out in neural networks. The problem again was that they never got it to work for real. But now fast forward to today. Boy, does it work. Right? So there's those two tribes. Then there's the Bayesians, who take their inspiration from Bayesian statistics, as the name implies, right? And they also believe in doing things from first principles. They would say: biology is a mess. Why should we imitate the brain? The brain is just one damn thing after another. To them, the foundation is dealing with uncertainty. And axiomatically, it has to be done using probability, and the right calculus is Bayes' theorem. So to them, everything derives from Bayes' theorem, and all the AI algorithms should be based on Bayes' theorem. Then there's the evolutionaries, who again have another very natural idea, a very powerful one, which also comes from biology. They say, no, no, no, the master algorithm is not what your brain does, because your brain already has a very sophisticated architecture, and all that changes is the strengths of the connections between neurons. Hence the name connectionism. The real master algorithm is evolution. And remarkably, going back to the 19th century, before there were computers, there were already people effectively saying evolution is an algorithm. One of these people, I forget which one, said evolution is the algorithm by which God creates all animals and plants. So we know how evolution works, and so we can implement that on a computer, except that instead of evolving animals and plants, we evolve programs, circuits, robots. That's the evolutionaries. And finally, there's the analogizers, who take their inspiration mainly from psychology. There's a lot of evidence in psychology that what humans do all the time is reason by analogy.
There's even this book written by Douglas Hofstadter that basically says all of cognition is nothing but analogy. Prove me wrong. From the smallest, everyday things to Einstein and Galois and the most advanced kind of science, we're all just reasoning by analogy all the time. And in fact, in my experience, it's actually, of the five tribes, the one that lay people find most intuitive, because of course we reason by analogy. Right. So those are the five tribes, and they each have their own master algorithm. And we'll see how this all turns out in the end.
[24:22]
A
If you were rewriting the book today, would you frame the master algorithm still as a composite of those five tribes?
[24:31]
B
I think the master algorithm is more than a composite. Right. So in the book I talk about the five tribes. You know, in our quest, we go visit the five tribes and their individual quests, but then we talk about unifying them, and I deal with a few ways of unifying them that are somewhat immediately apparent but are not the right one. One, which is very popular, and that's part of why I address it, is that you build some system that has a subroutine that's symbolic, and another that's connectionist, and whatnot. But I don't think this is the real solution. Number one, because it's overcomplicated. And number two, again, let's do the analogy with physics. It's not like the universe is a program that sometimes calls electromagnetism as a subroutine and sometimes calls the weak force. That's just wrong. Right. And so I really do think, and a lot of my research in the last 40 years has really been about this, and we in the area have made enormous progress, that these paradigms that are superficially very different are, when you dig down deeper, actually remarkably similar. So the master algorithm is not an algorithm with five sub-algorithms. It is only one algorithm that solves the problems that all these five different algorithms solve. So I think that is still entirely true. There's been tremendous progress in the 10 years since I wrote the book, and we are closer. Looking at the closest thing to the master algorithm that we had then and what we have now, we're in a very different place. But we haven't found it yet. Again, some people might say that we have, but my view is we're still not there.
[26:08]
A
Pedro, I think the answer is obvious, but are we closer than when you wrote the book 10 years ago?
[26:13]
B
Yeah, as I said, I think we are way closer. Like for example, 10 years ago we didn't have transformers.
[26:17]
A
Yeah. Oh, you stole my thunder. I wanted to ask you about those as well as GPUs, but maybe I should ask you to define the master algorithm. In the book, at one point you say the master algorithm is neither genetic programming nor backpropagation, but that it must include the key elements of both: structure learning and weight learning. So, Pedro, how would you define the master algorithm? One algorithm to rule them all?
[26:45]
B
Yes, that's part of it. I do this analogy with The Lord of the Rings: there's all these rings, but then there's one to unite them all. And the master algorithm is kind of the same evil plan, but for AI. So at an abstract level, a master algorithm is an algorithm that can do anything if you give it the right data to learn from. This is the amazing thing about machine learning. Let's take backprop, which is the master algorithm of the connectionists. The same algorithm is behind all of these different things that we see today. It can learn to play Go and chess better than humans. It can learn to speak to you and answer your questions. It can learn to do math and program. It can learn to predict the structure of proteins. It's amazing. It's one algorithm. The master algorithm is in many ways for machine learning what a Turing machine is for computation in general. A Turing machine is a machine that can do anything, right?
[27:45]
A
Yeah.
[27:47]
B
The master algorithm is an algorithm that can learn to do anything if you give it the data from that problem. Of course, the second problem is much harder than the first one and builds on it, but that's what it is. Then, to get a little bit more into it: so what is that algorithm? Right? If you look at all these different paradigms and all machine learning algorithms, the models that are learned always have these two aspects. One of them is the structure of the model and the other one is the parameters. In neural networks, the structure is given. We call it the architecture. I give you the architecture, and now backprop learns the parameters. But the evolutionaries and symbolists and others correctly say, well, then you're only solving half the problem. Where did that structure come from? So the master algorithm, at a minimum, has to learn both the weights, or the parameters more generally, and the structure. And one way to do that, which I touch on in the book, an obvious one, right, because it's the one that happened in nature, is that evolution learns the structure and then experience learns the parameters. So if you just wanted to copy nature, that's what you would do. And indeed that is an approach that some have followed. It is very expensive. But even now, papers continue to come out with people doing precisely this combination of things. It's one approach. Right. If you want to be inspired by biology, that's probably what you should do. You could also say, like the symbolists and the Bayesians say, ah, forget all of that, let's figure this out from first principles. But at the end of the day, I think those two components will always be there.
[29:17]
A
That's interesting. Assuming evolution is an algorithm, what is the role of the fitness function?
[29:23]
B
The fitness function is the objective function of the algorithm. So one of the ways in which all these different paradigms, superficially so different, are similar, is that every machine learning algorithm has the same three components. And really there's this vast zoo, these days, of tens of thousands or more learning algorithms, with more coming out every day. And it's easy to get lost in that tide. But if you understand these three components and what they can be, then suddenly you have a map of the territory. Really, part of the goal in the book is to give people a map of the territory. And as you alluded to earlier, at one point I do kind of cartoonishly give them that map, right, with these three components and the different versions of each one. So what are the components? The first one is representation, which is the language in which the model is written. English or French are human languages. Python and Java are programming languages. And then AI has its own languages. Right. You can think of these paradigms as being different languages. And then there's the objective function, which is the metric that the learner is trying to optimize, and that drives everything else. All the learning, all the power, the compute, they are all at the service of this metric. And finally there's the search, which is the optimization procedure by which you find, in the space defined by the representation, the model, the set of statements, if you will, parameters and structure, that will maximize that metric for you. So all machine learning algorithms essentially have these three components. The question is which version of each one they pick and how they combine them.
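To make the three components concrete, here is a minimal Python sketch with one possible choice for each. The straight-line model, the squared-error metric, the gradient-descent search, the toy data, and every function name are assumptions made for illustration; none of them comes from the book or the conversation. The sketch minimizes an error rather than maximizing a score, which is the same idea with the sign flipped.

```python
# Illustrative only: one concrete choice for each of the three components.

def predict(params, x):
    """Representation: a straight line y = w * x + b."""
    w, b = params
    return w * x + b

def mean_squared_error(params, xs, ys):
    """Objective function: the metric the learner tries to minimize."""
    return sum((predict(params, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Search: walk downhill on the objective, starting from w = b = 0."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [predict((w, b), x) - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]  # toy data generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3), mean_squared_error((w, b), xs, ys))
```

Swapping any one piece (a decision tree instead of a line, likelihood instead of squared error, a genetic search instead of gradient descent) gives a different learner built from the same three slots.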
[31:00]
A
Got it. If I were to say, I think Bayes is the father of AI and Bayes' theorem is the closest we have to a master algorithm, how would you respond to that?
[31:10]
B
I would say, you must be a Bayesian.
[31:13]
A
I'm extremely attracted by it. And maybe it's Laplace; maybe it should be Bayes with an asterisk from Laplace. But it feels very uncanny. I think he lived hundreds of years ago, a minister in England. And yet, even if it's not the entire master algorithm, it seems to be a very big piece.
[31:34]
B
Right, so there's a couple of things. You're in good company. The Bayesians are a perennial school of machine learning that has its ups and downs, but continues today. Right. So there are people who truly, even fanatically, believe that. As I say in the book, they're the most fanatical of all the tribes, partly for historical reasons, because in statistics they used to be an oppressed minority when the field was dominated by the frequentists. But there are several reasons why I think that is not the case. And again, I appreciate that it's seductive. I got interested in it as a grad student because I kind of individually rediscovered some of the basic ideas, one of which is that you shouldn't just pick a model. You don't know what the true model is. So what you should do is compute the probability of each model given the data, and then to make predictions, you average over all of them. How can this not be the right way to do things? And the other guys are just doing approximations to the right thing. And again, there are the axioms of probability: if you want some basic properties of dealing with uncertainty, you unavoidably have to be using probability, which disposes of a lot of stuff that people have done in AI over the decades. So at that level, I'm on board with the Bayesians. Right? But now there's a few very important issues. One is that Bayes' theorem itself is such a simple thing that it's almost a tautology. Bayes' theorem is just a restatement of the definition of conditional probability, right? Calling that the foundation of AI is almost like saying, oh, addition and multiplication are the foundation of AI. Sure, we use GPUs to do a lot of them, but that doesn't get me very far. So number one is that just saying Bayes' theorem doesn't get you very far. Right. But number two, and more seriously, then comes the problem of how do I find those models and average over them? And AI is already intractable.
Machine learning specifically is already an intractable problem. And the problem with going Bayesian is that it makes things even more intractable. So Bayesians wind up spending all their time, and I have spent some of mine before deciding there are better things to do, just trying to do this at all. And again, part of why frequentists used to dominate was that before there were computers, Bayesianism was just completely hopeless. You end up having to make simplifying assumptions that basically make it useless, or you have to make approximations like Monte Carlo inference that basically wind up making it as heuristic as all the other approaches. So at the end of the day, Bayesianism is very attractive, but doesn't solve the problem. And then most importantly, I think, and again, I think the last 10 years have really eloquently illustrated this, uncertainty is one of the key problems that the master algorithm has to solve. It is indeed the one that the Bayesians are fixated on. And indeed they know how to deal with it very well, up to these issues. But it's only one of them. And at the end of the day, if you had the perfect solution to it today, you'd still be very far from having the master algorithm.
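The two ideas in this answer, that Bayes' theorem is a restatement of conditional probability and that a Bayesian predicts by averaging over all models, can be written out as follows. The symbols are assumptions chosen for this note: M is a candidate model, D the observed data, x a new input and y the quantity to predict.

```latex
% Bayes' theorem follows directly from the definition of conditional probability:
P(M \mid D) = \frac{P(M, D)}{P(D)} = \frac{P(D \mid M)\, P(M)}{P(D)},
\qquad
P(D) = \sum_{M'} P(D \mid M')\, P(M').

% Bayesian model averaging: predict by averaging over all models,
% weighting each one by its posterior probability given the data:
P(y \mid x, D) = \sum_{M} P(y \mid x, M)\, P(M \mid D).
```

Both sums range over every candidate model, which is where the intractability mentioned above comes from: for any rich model class the sums cannot be computed exactly and have to be approximated.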
[35:39]
A
Got it. You suggest in that chapter that Markov chains, or, I think I have it right, Markov chain Monte Carlo, might be replacing or could be a substitute for Bayes' theorem. Is that correct, and is that still something people are working with today?
[35:59]
B
No. So just to be clear, Bayes' theorem is just a fundamental theorem in probability, which, as you alluded to, really should be called Laplace's theorem, because he's the one who really formulated it. But there are so many things in Laplace, including frequentist ones, that the Bayesians wouldn't take that. But then the state of the art in machine learning and also in other fields is these things called graphical models, where there are random variables that are nodes in a graph, and the edges represent dependencies between them. And for a while, the dominant approach was something called Bayesian networks, where the graph is a directed graph: the edges have arrows. But there's a more powerful, more general representation called Markov networks, where the edges don't have arrows. It's more complicated, harder to deal with. So in the beginning, people shied away from it, but now we actually know how to handle that. And in some sense, Markov networks subsume Bayesian networks. So you could say in that sense they have replaced them. But I think the more accurate thing to say is that they have advantages and disadvantages. People still use Bayesian networks for some things, Markov networks for others. But a true master algorithm has to be able to do both and really needs to be based on Markov networks, because Bayesian networks really are just a special case. And that is indeed what the tentative master algorithm that I described in the book, which was something that I worked on for about 10 years, does. It combines Markov networks with aspects of the other paradigms.
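For readers who want the distinction in symbols, the standard textbook factorizations of the two kinds of graphical model are shown below. The notation (pa for a node's parents, phi for a potential function, Z for the normalizing constant) is the usual convention and is an assumption of this note, not something stated in the conversation.

```latex
% Bayesian network (directed graph): a product of conditional distributions,
% one per variable given its parents \mathrm{pa}(x_i):
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)

% Markov network (undirected graph): a normalized product of nonnegative
% potential functions \phi_c, one per clique c of the graph:
P(x_1, \dots, x_n) = \frac{1}{Z} \prod_{c} \phi_c(x_c),
\qquad
Z = \sum_{x_1, \dots, x_n} \prod_{c} \phi_c(x_c)
```

One way to read the "special case" remark: each conditional distribution in the first formula can be treated as a potential in the second, in which case Z equals 1. In a general Markov network Z has to be computed or approximated, which is part of why the undirected form is harder to work with.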
[37:32]
A
Nice. I want to go to a few problems you call out in the book. The first is the problem of overfitting in machine learning. What is that and how do we address it or how has it been addressed?
[37:43]
B
Yeah. So overfitting really is the central problem in machine learning, although, interestingly, maybe less so today than it was 10 years ago, for interesting reasons. But the problem when you learn from data is this. For example, I'm learning medical diagnosis; I'm learning to diagnose, I don't know, lung cancer from X-rays of lungs. Right. Quintessential example. It's very easy to learn to predict the X-rays in the data very well. You could even just remember them. Right? And there's one type of algorithm, called nearest neighbor, where, to a first approximation, that's all it does. And you say, like, oh, wow, look, see, I can predict perfectly now. But the whole question in machine learning is how well do you do outside of the training data? And when you do well on the training data, but poorly on what we call test data, which is new patients, new cases that I haven't seen, that's overfitting. And overfitting is really what makes machine learning much harder than other optimization problems. Because in normal optimization problems, if you've optimized it, you know you succeeded. In machine learning, when you optimize, you actually could be going in the wrong direction, because the problem that you're optimizing on the training data is only a surrogate for the real one. So really the essential thing in machine learning, and what makes it so hard, is the relationship between what you learned on the training data and what it really means in terms of the deeper, more general patterns. And so when you fail at that, you say you've overfit. And overfitting really is the central problem in machine learning. If we solve overfitting, in some sense, everything else gets a lot easier.
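The gap between training and test performance can be seen in a few lines of Python. This is a sketch under stated assumptions: the one-dimensional data, the threshold rule, the 20 percent label noise, the fixed random seed, and all names are invented for illustration. The learner simply memorizes the training set, as in the X-ray example.

```python
import random

def nearest_neighbor_predict(train, x):
    """Return the label of the stored example closest to x (pure memorization)."""
    _, label = min(train, key=lambda example: abs(example[0] - x))
    return label

def accuracy(train, examples):
    """Fraction of `examples` whose label the memorizer predicts correctly."""
    hits = sum(nearest_neighbor_predict(train, x) == y for x, y in examples)
    return hits / len(examples)

def make_data(n, rng, noise=0.2):
    """True rule: label is 1 when x > 0.5; each label is flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data(200, rng)
test = make_data(200, rng)

train_accuracy = accuracy(train, train)  # every point is its own nearest neighbor
test_accuracy = accuracy(train, test)    # new cases the memorizer has never seen
print(train_accuracy, test_accuracy)
```

The training score is perfect by construction, because the noise has been memorized along with the signal; the score on fresh data is the number that matters.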
[39:22]
A
Got it. You actually say the second worst problem I think behind overfitting is, quote, the curse of dimensionality. Could you briefly talk about that?
[39:30]
B
Yeah, indeed. And that continues to be the case, again, although with some nuances in the last decade. So what is the curse of dimensionality? We live in a three-dimensional world, and we understand things in three dimensions very well, too well. The problem is that machine learning lives in far higher dimensions. For example, these days we have these large language models that have trillions of parameters, so they live in trillion-dimensional worlds. And unfortunately, this is not just a problem of computational cost; it's that all our 3D intuitions fail in high dimensions. We keep going wrong when we try to do machine learning because we don't understand what goes on. We're always visualizing things, right? We're visual creatures, but every visualization is a lie. Right? A famous AI researcher said many years ago that if our brains could see in high dimensions, we wouldn't need machine learning. If the points that you want to, say, classify are in 2D, you can kind of draw the boundaries, not hard at all, but when they're in a trillion dimensions, there's just no hope. And there's all these highly unintuitive phenomena that happen there. So I would say that in some ways, and opinions will vary, that problem has become even more important now than it was 10 years ago, because we're in even higher dimensional spaces. And even though in some ways we're doing very well, I think a lot of the opacity, the fact that we don't really understand what's going on, comes, not only but in large measure, from the fact that we don't understand what happens in high dimensions.
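One standard example of a 3D intuition failing, chosen here for illustration and not mentioned in the conversation: a ball inscribed in a square or cube fills most of it, but in high dimensions the inscribed ball occupies almost none of the hypercube. The function name below is hypothetical; the volume formula for a d-dimensional ball is the standard one.

```python
import math

def inscribed_ball_fraction(d):
    """Fraction of the unit hypercube [0, 1]^d covered by the inscribed ball of radius 0.5."""
    radius = 0.5
    # Volume of a d-dimensional ball: pi^(d/2) / Gamma(d/2 + 1) * r^d.
    ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d
    return ball_volume  # the unit cube has volume 1, so this is already the fraction

for d in (2, 3, 10, 100):
    print(d, inscribed_ball_fraction(d))
```

In two dimensions the circle covers about 79 percent of the square; by ten dimensions the fraction is already well under 1 percent, so nearly all of the volume sits in the "corners", which have no good low-dimensional picture.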
[41:08]
A
Got it. I want to ask you briefly about the brain. Clearly these large language models have more neurons, but in terms of connections, I think the brain has over a trillion. And the largest model has one to three, must be billion. But the brain still has, in terms of the ability to make these connections and think, is it still advanced? I mean, we don't have the computational power, but as you point out, the ability to recognize faces almost instantaneously. What I'm asking is how do you compare the brain right now versus a large language model?
[41:41]
B
No, the brain is still more powerful. A lot of people don't appreciate this. The brain is still bigger, if you will, than even the biggest language model in both the number of neurons and the number of connections, the number of parameters. The biggest language models have trillions of parameters, which sounds like a lot, except that the brain has orders of magnitude more than that. So we're maybe at the level of a mouse brain at this point, or a cat brain. And so we're still not, we're catching up rapidly. Right. And we'll get there. Right. If you put a bunch of these data centers together, you would, I think at this point, if we wanted to put together something the size of the brain, we could, but the expense and the energy consumption would just be prohibitive.
[42:23]
A
Yeah, I think you quote it's a light bulb that powers the brain or something equivalent versus a neighborhood for Watson or something. And I think that's still misunderstood somehow that these are smarter or more powerful than the brain. And I get it, computationally they can do the math quicker than we can, but the ability, I think, to make the connections, the synapses, is still far behind. I wanted to ask you, you referenced the nearest neighbor problem and you talk about that in the book. Could you briefly touch on that?
[42:53]
B
Yeah. So nearest neighbor is the simplest algorithm in the analogizer paradigm. It's a very old algorithm, goes back to the 50s, but it was actually the first algorithm in history that was truly a master algorithm, in the sense that it didn't just learn a restricted class of functions, like, say, linear regression, which basically only learns lines, or straight lines in some space. Nearest neighbor is a dreadfully simple algorithm. It's like this: I'm not going to do anything with the data, I'm just going to remember it. So in the simplest version of nearest neighbor, the computational cost of learning is actually zero. Great stuff. But then what happens when I need to generalize to new examples? I look for the most similar example in my training data, the so-called nearest neighbor. So for example, if I want to fake being a doctor, and I don't know anything about medicine, but I have a nice file of past patients, when a new one walks into my office, I say, okay, so what are your symptoms? And then I just look for the patient with the most similar symptoms and I give the same diagnosis. And this is remarkably good, despite its simplicity. In fact, if you refine this slightly to the so-called k-nearest-neighbor algorithm, where I average over the k nearest neighbors, then in the limit you can learn any function as well as is mathematically possible using nothing but this algorithm.
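The "fake doctor" procedure translates almost word for word into code. The sketch below is illustrative: the patient file, the three symptom scores, and the function names are all made up, and it uses a majority vote among the k neighbors, which is the classification counterpart of the averaging described above.

```python
from collections import Counter

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def knn_predict(train, query, k=1):
    """Majority label among the k stored examples most similar to `query`.

    "Training" is just keeping `train` around; all the work happens here.
    """
    neighbors = sorted(train, key=lambda example: squared_distance(example[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical patient file: (fever, cough, fatigue) on a 0-to-1 scale -> diagnosis.
patients = [
    ((0.9, 0.8, 0.7), "flu"),
    ((0.8, 0.9, 0.6), "flu"),
    ((0.7, 0.7, 0.9), "flu"),
    ((0.1, 0.6, 0.2), "cold"),
    ((0.2, 0.5, 0.1), "cold"),
    ((0.0, 0.4, 0.3), "cold"),
]

new_patient = (0.85, 0.75, 0.8)
print(knn_predict(patients, new_patient, k=1))
print(knn_predict(patients, new_patient, k=3))
```

Learning costs nothing here, but every prediction scans the whole file, so the cost moves from training time to query time.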
[44:16]
A
Yeah, yeah. Very powerful. You also talk about the EM (expectation maximization) algorithm as one of the most popular in machine learning. Is that still the case?
[44:27]
B
So EM is probably the most important algorithm in statistical learning, which encompasses not just Bayesian learning, but all learning where the model is a probability distribution, if you will, and EM really is the central algorithm there. It's less important today than it was 10 years ago, just because neural networks have taken over the field. Right. So there's few people doing EM, but there's still some things for which it's the best thing to do. And in fact, some neural networks are probabilistic models, and you can use variations of EM to train them. And I think some of the ideas in EM are important. We'll see what happens in the future. The basic idea in EM, to just try to state it very briefly, is this. In Stats 101, for example, I have a biased coin and I figure out what the probability of heads is just by flipping it a number of times and counting. That's all I have to do. I estimate my parameters. That's the M part of EM, expectation maximization: I find the maximum likelihood parameters. But in the more interesting problems, which are almost all the problems in the real world, I don't see some of the variables. The variables are hidden from me, or latent. Right. And so I have to infer those, and that's the expectation part. So what the EM algorithm does is it alternates: I infer the missing variables, and now I have complete data, and now on the complete data I can just count or average or maximize likelihood. And there's a very non-obvious proof, which is really the foundation of that algorithm, that says if you alternate between these two things, you actually wind up converging to a local optimum of the likelihood or the Bayesian posterior or some such measure.
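Extending the biased-coin example by one step gives a small, complete instance of EM. The setup is an assumption made for this sketch: each trial is ten flips of one of two coins, which coin was used is hidden, and both coins are equally likely to be picked. The data, starting guesses, and names are illustrative.

```python
import math

def log_likelihood(trials, theta_a, theta_b):
    """Log-likelihood of the head counts, up to an additive constant.

    Binomial coefficients are omitted because they do not depend on the parameters.
    """
    total = 0.0
    for heads, flips in trials:
        like_a = theta_a ** heads * (1 - theta_a) ** (flips - heads)
        like_b = theta_b ** heads * (1 - theta_b) ** (flips - heads)
        total += math.log(0.5 * like_a + 0.5 * like_b)
    return total

def em_two_coins(trials, theta_a=0.6, theta_b=0.5, iterations=20):
    """Estimate the heads probability of two coins when the coin identity is hidden."""
    history = [log_likelihood(trials, theta_a, theta_b)]
    for _ in range(iterations):
        # E step: infer the hidden variable, i.e. how likely each trial came from coin A.
        heads_a = flips_a = heads_b = flips_b = 0.0
        for heads, flips in trials:
            like_a = theta_a ** heads * (1 - theta_a) ** (flips - heads)
            like_b = theta_b ** heads * (1 - theta_b) ** (flips - heads)
            resp_a = like_a / (like_a + like_b)
            heads_a += resp_a * heads
            flips_a += resp_a * flips
            heads_b += (1 - resp_a) * heads
            flips_b += (1 - resp_a) * flips
        # M step: with the data "completed", re-estimate by (weighted) counting.
        theta_a = heads_a / flips_a
        theta_b = heads_b / flips_b
        history.append(log_likelihood(trials, theta_a, theta_b))
    return theta_a, theta_b, history

trials = [(5, 10), (9, 10), (8, 10), (4, 10), (7, 10)]  # (heads, flips) per trial
theta_a, theta_b, history = em_two_coins(trials)
print(theta_a, theta_b)
```

The `history` list records the log-likelihood after each round; the convergence result mentioned above implies it never decreases, although the optimum reached depends on the starting guesses.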
[46:17]
A
Yeah. Is a hidden Markov model an example of that?
[46:22]
B
Yeah, a hidden Markov model is a classic example of that. So in a hidden Markov model, say, for example, you're recognizing speech. What you see is the sound signal, the waveform. What you don't see is the actual words that were being said. But what you want to do is predict the words from what you've seen. So now you have precisely this combination: I need to both predict the words from the sounds, which is the E part, and then estimate the parameters of the model, which is the M part.
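The "predict the hidden words from the observed sounds" half of this can be sketched with Viterbi decoding, which finds the single most probable hidden sequence. Note the substitution: the E step of EM for hidden Markov models normally uses forward-backward probabilities rather than Viterbi, so this is a related inference step, not the E step itself. The two-word vocabulary, the sound labels, and all probabilities below are invented for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable hidden-state sequence for the given observations."""
    # best[t][s] = (probability of the best path ending in state s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
            column[s] = (prob, prev)
        best.append(column)
    # Trace back from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))

# Hypothetical toy model: hidden words emit coarse "high" or "low" sounds.
states = ("hi", "bye")
start_p = {"hi": 0.5, "bye": 0.5}
trans_p = {"hi": {"hi": 0.7, "bye": 0.3}, "bye": {"hi": 0.3, "bye": 0.7}}
emit_p = {"hi": {"high": 0.9, "low": 0.1}, "bye": {"high": 0.2, "low": 0.8}}

decoded = viterbi(["high", "high", "low"], states, start_p, trans_p, emit_p)
print(decoded)
```

In a full training loop the transition and emission tables would themselves be re-estimated from the inferred hidden states, which is the M part described above.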
[46:48]
A
No, that was an interesting chapter on Siri. I have one or two more definitional questions. Relational learning, what is that and how important is that today?
[46:57]
B
It's very important and it continues to be very important. Things have mutated in the last 10 years. There's different approaches now in addition to the ones that I talked about back then. But relational learning, at some level, is an almost blindingly obvious thing, which is that in statistics and also in machine learning, for a long time we just assumed that the objects that we see are all independent. There's a so-called IID assumption. I'm going to learn to do medical diagnosis by looking at a stream of independent patients. This really simplifies the math, but it's a big lie, because in reality nothing is independent of anything. So for example, contrast classical economics, where the market is a bunch of independently acting people, with the reality of the market, where it's a social network and we influence each other and so on. So in relational learning, you want to model the relations. You don't just want to model, for example, each individual's properties, but you want to say, oh, who are my friends and how do we influence each other? And so on. So this is tremendously more difficult. But again, we have made spectacular progress, some of which I talk about in the book, where at this point, at some level we know how to solve those problems just as well as we knew how to solve the easier ones. It's computationally more expensive, but we have the computers to do that. And for example, when you hear a term like knowledge graph, right, what is a knowledge graph? It's precisely a relational model with concepts, relations between the concepts and so on. And these days we have things like so-called graph neural networks. These are all different approaches to doing relational learning.
[48:35]
A
Got it. In the book you quote, I think it's Rich Sutton, who believes reinforcement learning is the master algorithm: solving reinforcement learning amounts to solving AI. Did you believe that then, or do you now?
[48:48]
B
No, I did not then and I do not now, although I have great sympathy for Rich and the people doing reinforcement learning. Reinforcement learning is one of those fields that has grown tremendously in the last decade. But first of all, we need to understand why people like Rich would think that, right? Reinforcement learning is a solution to the problem of sequential decision making. A lot of machine learning is one-shot decisions. Give me an X-ray, tell me where the tumor is, or tell me there is no tumor. But real life isn't like that. Real life happens in time. I exist, I do one thing after another. The things that I do kind of set the stage for what happens next. I run away from the tiger, the tiger chases me, et cetera, et cetera. And so clearly a master algorithm must be able to deal with this, otherwise you're nowhere near, right? So in machine learning, the people who have been more into AI (now it's everybody, but I remember when most people in machine learning didn't care about AI) were always very attracted to reinforcement learning. Right? Now the thing is that you can't confuse the problem with the solution. Reinforcement learning is one particular solution, or set of solutions, if you will, to the problem of sequential decision making. And I'm not convinced that they are the solution. And in fact, after decades of work in reinforcement learning, the fundamental problems still haven't been solved. You might get the wrong impression from all the hype that you hear these days. But a lot of that reinforcement learning is actually only so-called: it's other things that are being called reinforcement learning because it's sexy, or it doesn't work nearly as well as it's announced to. So I hope reinforcement learning succeeds, but I really wouldn't put my bets on that horse as being the master algorithm.
But also, more importantly, every reinforcement learning algorithm is really something that you wrap around a supervised learning algorithm. Inside that reinforcement learning algorithm that does the sequential behavior is backprop or some other learning algorithm. And that's where the hard problem is. So saying that reinforcement learning is the master algorithm is like saying the skin of the orange is the orange. Or as Yann LeCun famously put it, reinforcement learning is the icing on the cake. Supervised learning and unsupervised learning, they're the cake. Sorry, sorry, I correct myself. Unsupervised learning is the cake. Supervised learning, which is learning from labeled examples, is the icing, and then reinforcement learning is the cherry on top. Right? I think this has become very famous and it's very apt. The cherry on top is nice, but the cherry isn't the cake. And likewise, I don't think reinforcement learning is the master algorithm. It may well be part of it, but not the whole thing.
[51:57]
A
Okay, so let me, as a follow-up, ask you: you describe learning in the book as a race between the amount of data you have and the number of hypotheses you consider. Is that how you would define learning today?
[53:07]
B
Yeah, I mean, this is one way to look at it. There are others, but I think this is a very important one, which is that if I don't have a lot of data, I cannot distinguish among a lot of different hypotheses. It's just not possible. Right. I have to start with a lot of knowledge and then learn a more restricted model, which is what people did in statistics and in machine learning for decades. But you shouldn't be under the illusion that that's going to get you to AGI or the master algorithm or any such thing. You can't have more powerful learners without more data. But conversely, having more data is no use if your learner isn't very powerful. With the traditional types of statistical learning algorithms, at some point there's just no point in giving them more data. They've hit the wall. They're not getting any better. So more data and more hypotheses go hand in hand.
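One way to make the "race" between data and hypotheses concrete is the standard textbook sample-complexity bound for a finite hypothesis class: a learner that outputs a hypothesis consistent with the data needs roughly m ≥ (1/ε)(ln|H| + ln(1/δ)) examples to be within error ε with probability 1 − δ. The sketch below is only an illustration of that bound, not something taken from the book or the conversation; the function name, the choice of Boolean conjunctions as the hypothesis class, and the values of ε and δ are all assumptions.

```python
import math

def samples_needed(num_hypotheses: int, epsilon: float, delta: float) -> int:
    """PAC bound for a finite hypothesis class and a consistent learner:
    m >= (1 / epsilon) * (ln|H| + ln(1 / delta))."""
    return math.ceil((math.log(num_hypotheses) + math.log(1.0 / delta)) / epsilon)

# Assumed hypothesis class: conjunctions over n Boolean features.
# Each feature can appear positively, appear negated, or be absent, giving 3**n hypotheses.
for n_features in (5, 10, 20, 40):
    num_hypotheses = 3 ** n_features
    m = samples_needed(num_hypotheses, epsilon=0.05, delta=0.05)
    print(f"features={n_features:2d}  hypotheses={num_hypotheses:.3e}  examples needed={m}")
```

The richer the hypothesis space, the more examples the bound asks for, which is the sense in which more data and more hypotheses have to grow together.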
[53:57]
A
Got it. And Pedro, in the news, we hear there's such an appetite for more data. Is that still the case? Or is the weak link in the chain, if you will, the algorithms and the models? I mean, what's holding back the next inflection?
[54:13]
B
In machine learning, there are really three things that you have to keep in mind. There's the data, there's the compute, and there's the algorithms. Each one of these could be a bottleneck at any given point in time. And indeed, at different points in time, different things were bottlenecks, right? If you have a lot of data and a lot of compute, but the algorithms are dumb, as I just alluded to, well, you're not going to get there, right? But if you have smart algorithms but not a lot of data, you won't get there either. And compute: if I have a lot of data, I need a lot of compute to handle it. Also, machine learning is a very expensive search, right? So you've got to have all three. And what you see these days, at this point in the history of the field, is all of these things simultaneously being bottlenecks, right? People are like, hey, we learned these large language models from the entire Internet, from everything that everyone ever wrote, to a first approximation. And now what do we do? We've run out of data. There are some people who say we haven't, who say, oh, we will generate simulated data. But clearly, if you could give AI researchers more data, they would love it, right? And they keep trying to come up with ways to have more data. But at the same time, you see compute being a big bottleneck. You need more GPUs, you need more data centers, you need more energy. That's a bottleneck too. If you ask Jensen Huang, energy is the bottleneck, right? If you ask other people, the availability of GPUs is the bottleneck. So compute right now is also a bottleneck.
But then, and maybe it doesn't get as much attention in these discussions, I'd say the most important of these is that the algorithms are the bottleneck. You keep hearing every day that someone has come up with an algorithm that learns better and uses 10 times less data. Let me put it this way: if the algorithms aren't good enough, no amount of data and compute will get you there. And so each of these things is a bottleneck today, and we're seeing progress on each of them, and we'll see where each of them lands in the next five, 10, 20 years.
[56:16]
A
Got it. I want to ask you about the coming AI apocalypse here, but two more questions. In one of the later chapters in the book, "The Pieces of the Puzzle Fall into Place," you spend a fair amount of time talking about Alchemy and the master algorithm. Could you briefly explain that relationship?
[56:34]
B
Yeah. So Alchemy is a system that my group and I originally implemented, and that a bunch of other people then added to. It is an implementation of something as close as possible to the master algorithm that we developed, which is called Markov logic networks. As the name implies, it's a combination of logic and Markov networks, but it also combines the key features of the other paradigms. Again, as I describe in that chapter, it is one single algorithm that solves the key problems of all five tribes. And then we implemented it in an open source library called Alchemy. It is called Alchemy tongue in cheek, because what I say very clearly at the outset of the book and in that chapter is that we're still in the alchemy stage of AI, right? We don't have chemistry yet; we're still alchemists. And so we called this system Alchemy just to make the point that we don't think this is AGI, right? This is just the best we can do right now.
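To give a rough sense of what "a combination of logic and Markov networks" means, a Markov logic network attaches a weight to each logical formula and scores a possible world x as P(x) = (1/Z) exp(Σᵢ wᵢ nᵢ(x)), where nᵢ(x) counts the true groundings of formula i. The toy sketch below enumerates all worlds for a single formula by brute force. It is not the Alchemy API and does not reflect how Alchemy does inference; the people, the predicates, the formula Smokes(x) ⇒ Cancer(x), and the weight are all hypothetical choices made for illustration.

```python
import itertools
import math

# Hypothetical toy domain: two people, two predicates, one weighted formula.
PEOPLE = ["anna", "bob"]
WEIGHT = 1.5  # assumed weight for the formula Smokes(x) => Cancer(x)

ATOMS = [(pred, p) for pred in ("Smokes", "Cancer") for p in PEOPLE]

def true_groundings(world):
    """Count the people for whom Smokes(x) => Cancer(x) holds in this world."""
    return sum((not world[("Smokes", p)]) or world[("Cancer", p)] for p in PEOPLE)

# Enumerate every possible world (a truth assignment to all ground atoms).
WORLDS = [dict(zip(ATOMS, values))
          for values in itertools.product([False, True], repeat=len(ATOMS))]

# The partition function Z normalizes the unnormalized scores exp(w * n(x)).
Z = sum(math.exp(WEIGHT * true_groundings(w)) for w in WORLDS)

def probability(world):
    """P(world) = exp(w * n(world)) / Z for the single weighted formula."""
    return math.exp(WEIGHT * true_groundings(world)) / Z

no_smokers = {atom: False for atom in ATOMS}          # formula holds for both people
violation = {**no_smokers, ("Smokes", "anna"): True}  # anna smokes without cancer

# A world that violates the formula once is less likely, but not impossible.
print(probability(no_smokers) / probability(violation))
```

Unlike pure logic, a violated formula only makes a world less probable (here by a factor of exp(WEIGHT) per violation) rather than ruling it out, which is where the Markov network side comes in.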
[57:30]
A
Yeah, no, I thought that was a very interesting analogy, because eventually chemistry was birthed, if you will, from alchemy. So, I mean, you're very optimistic in the book, and it sounds like you still are. Okay, I want to ask you just a few big-picture questions here, because I'd say the last chapter or two of the book are, again, quite prescient. You start asking questions that I think are on the front pages of the papers today. But I'll just start with the obvious one, on jobs. You address that head on. Is AI going to take over 90% of the jobs?
[58:05]
B
Well, yes, so that last chapter was about the future. What will the AI revolution, what will even just the current state of the art, and then the master algorithm, do to the world? And what's going to happen going forward? In a way, that was the riskiest chapter to write 10 years ago. Looking back, it's uncanny the extent to which everything I said in that chapter has come true. In fact, I would say the main failing of that chapter is that things have come true in spades. It's one thing to predict something, it's another thing to see it happening. And a lot of those problems where I said, hey, look, we're going to run into problems A, B and C: boy, we've run into those problems, but they're 100 times bigger. One such problem is the jobs apocalypse, right? There's this fear, which these days is on the front pages of newspapers, that AI is going to destroy all jobs. In the section on that, I try to go briefly through some of the main reasons why I don't think there's going to be a jobs apocalypse. And in the last 10 years, the economists have gotten on the ball and they agree, because it is basic economics, that AI will not be a jobs apocalypse, for a very simple reason. What AI does fundamentally, from an economic point of view, is greatly decrease the cost of intelligence, because now it can be done by a computer instead of a human, with all that that implies. And this is just Econ 101. What happens to the demand for a good or a service when the price goes down? The demand goes up. There's also this famous thing in economics called the lump of labor fallacy, which is the idea that there's only so much labor to be done, and once you do it better, there's nothing else left to do, right? Yes, AI will automate a whole bunch of things. But now the demand for the things that you can do with those things will go up.
The demand for the things that you couldn't do before, that you can now do on top of those things, will go up. And this is another piece of economics: the demand for the complements will go up. If people eat bread and butter and the price of butter goes down, the price of bread actually goes up, because now people can eat more bread and butter. And what is the complement of AI? It's human intelligence. This has been studied quite well by now: there are almost no jobs that can be completely done by AI. Each job consists of multiple tasks. AI can do some of them, but not all. So what's going to happen in your job is that you should think about automating the things that AI can automate, which will actually improve your quality of life and your productivity. And at the end of the day, you'll be making more money than you were before, and there'll probably be more people in that profession. This is the same thing that has happened with automation over and over again since the Industrial Revolution. People thought ATMs would be the end of bank tellers. There are more of them now than there were then. People thought machine translation would be the end of translators, right, when Google Translate came along 20 years ago. There are more translators now than there were then, right? Et cetera, et cetera.
[61:09]
A
Another topic you hit head on, again very timely, is AI and, I guess, the ultimate apocalypse, or Skynet. Should we be concerned that AI is going to slip the leash, view us as a threat, and try to create an extinction event?
[61:26]
B
We shouldn't. Since the book came out, and as AI exploded, in the early years, if you will, circa 2022, 2023, post-ChatGPT, this whole idea of Skynet was very much in people's minds. But luckily, at this point this idea has started to become discredited, which doesn't surprise me and gladdens my heart, because what I thought would happen is that, as people start to interact with AI and see AI for what it really is, these notions of AI as this thing that will take over lose their hold. The problem with them is that they anthropomorphize AI. They say that AI, because it's like humans in intelligence, is like humans in other respects: it has consciousness, it has a will, it wants to take power, it wants to blah blah. None of those things are true; they're just anthropomorphization. And one of my goals with the book was precisely to demystify AI, to let people at least see AI for what it really is, instead of imagining AI as being an artificial man, because it's not. If AI was an artificial human, then we'd have to worry about it wanting to take over or not, but it's just not. It's an algorithm with an objective function, and all it does is maximize that. So we need to worry a lot about what that objective should be. But the idea of AI suddenly deciding to do something else and taking over is, you know, just science fiction, right? It makes for good movies, but it really has nothing to do with reality.
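The phrase "an algorithm with an objective function, and all it does is maximize that" can be illustrated with a very small optimizer. The sketch below is a generic hill climber, not any particular AI system; the objective, the step size, and the function names are hypothetical and chosen only to show that the procedure pursues whatever objective it is handed and nothing else.

```python
def objective(x: float) -> float:
    """Hypothetical objective chosen for illustration; it peaks at x = 3."""
    return -(x - 3.0) ** 2

def hill_climb(f, x: float, step: float = 0.1, max_iters: int = 1000) -> float:
    """Repeatedly move to whichever neighbor scores highest under f; stop at a local peak."""
    for _ in range(max_iters):
        best = max((x - step, x, x + step), key=f)
        if best == x:  # no neighbor improves the objective
            break
        x = best
    return x

print(hill_climb(objective, x=0.0))
```

Swapping in a different `objective` changes where the search ends up, which is why the choice of objective is the part that deserves the scrutiny.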
[62:55]
A
Let me ask you a follow up question there, and I think you may touch on it indirectly in the book. What about AI gaining consciousness or having rights at some point? Is that something you think is even possible or would you just laugh at that?
[63:07]
B
No. So, very good. Those are different but related things. Now, people debate whether existing AIs are conscious, and for the most part they say heck no, or whether they will ever become conscious. And honestly, my answer to this question is: who knows, and who cares? You can't even tell if I'm conscious. Why do you think I'm conscious? Because I kind of look and act like you. So since you know you're conscious, you make the inference about me. But that seems like not a very well-founded reason to think someone is conscious. Right? And there's the so-called easy problem of consciousness, which is to find the neural correlates of consciousness. Very good. But that doesn't answer the hard problem, which is: why do you perceive red as red and green as green and pain as pain? Right. And we don't even know how to answer that question for humans, let alone for AIs. And at some level it doesn't matter. AIs are machines that we build for certain purposes, and what matters is whether they serve that purpose. Now, I think in practice, if you have a housebot, you will treat it like it's conscious. Right? There's nothing terribly wrong with that. The problem is when people say, oh, conscious things should have rights and therefore AI should have rights. There are several fallacies in that. Number one: why should conscious things have rights? Right? Is a cockroach conscious? Should the cockroach have rights because of that? And then, you know, how can you even determine if the AI is conscious? And even if all of that was the case, that still doesn't imply that AI should have rights. So this idea of giving rights to AIs is absurd in my view, whether or not you base it on the AIs being conscious.
[64:45]
A
We didn't touch on this so much, but you talk about the impact of AI on biology, evolution, DNA. We can now splice the genome and effectively write our own language and, I guess to some extent, maybe take over evolution, which I think most people would say is the singularity, that Kurzweil phrase. But you actually suggest we're heading for a phase transition. And I thought that was really interesting, because if there's one word people know, it's singularity. They don't know what a singularity is, but everyone knows we're here, or coming to it, or may have been there. There's some debate. So why do you say we are going through, or will go through, a phase transition, not a singularity?
[65:25]
B
Yes. So first of all, what is a singularity? Right. I think we should start there. A singularity is a term from mathematics: it's a point at which a function goes to infinity. So for example, the function 1 over x goes to infinity at zero, because as x gets smaller and smaller, 1 over x gets bigger and bigger. Right. And the idea of the singularity, which goes back to von Neumann and the forties, is that when we see technology accelerating exponentially, it's hard not to come to the conclusion that this is going to go to a singularity at some point. Right? And the basic thing that I say in the book, which again I think is even more relevant today, is that that is not going to happen, because it's physically impossible. The physicists know this very well. If they see a singularity in their equations, they know there's something wrong with them. For example, relativity gives rise to singularities in black holes, which is why we know relativity needs to be improved. It's not the final theory, even ignoring quantum mechanics. It's the same thing in AI, right? The amount of intelligence, let's just put it that way, can't go to infinity because we don't have access to an infinite universe. So that's physically impossible. Now, here is what can happen, and does happen all the time in practice, if you look at technology evolution curves. What Kurzweil does, in a nutshell, is look at a whole bunch of technology curves and say: ah, exponential; ah, exponential; this one is an exponential too. But if you actually continue those curves forward, all of those exponentials eventually flatten out, because they have to, right? Progress in any measure, the speed of your cars, the adoption of different technologies, starts off slow, then speeds up and gets very fast, and then slows down again until it flattens. And this is why it's called an S curve, because it's in the shape of an S.
And all I'm saying in the book is it's going to be the same thing with AI, because it has to be. There's a term from physics for this, which is a phase transition. You transition from a phase where, for example, there were no cell phones to a phase where everybody has a cell phone. And all I'm saying is that what we were going through 10 years ago, and are going through now at a later part of the curve, is a phase transition. Now, the early part of an S curve is mathematically indistinguishable from an exponential. So these days, even more than then, you see this all the time, even from people at some of these labs, like OpenAI and others: oh, look, the curve is an exponential, so it's going to continue. And, you know, Dario Amodei says, like, we're going to reach the singularity [unclear]. This is complete nonsense, right? For obvious reasons.
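The claim that the early part of an S curve looks like an exponential can be checked numerically. The sketch below compares plain exponential growth with a logistic curve that has the same starting value and growth rate; it is only an illustration, and the starting value, growth rate, and ceiling (`cap`) are assumed numbers, not figures from the book or the conversation.

```python
import math

def exponential(t: float, x0: float = 1.0, r: float = 0.5) -> float:
    """Unbounded exponential growth from x0 at rate r."""
    return x0 * math.exp(r * t)

def logistic(t: float, x0: float = 1.0, r: float = 0.5, cap: float = 1000.0) -> float:
    """Logistic (S-curve) growth from x0 at rate r, flattening out at cap."""
    return cap / (1.0 + (cap / x0 - 1.0) * math.exp(-r * t))

# Early on the two curves nearly coincide; later the S curve saturates near cap.
for t in range(0, 31, 5):
    e, s = exponential(t), logistic(t)
    print(f"t={t:2d}  exponential={e:14.1f}  s_curve={s:8.1f}  ratio={s / e:.4f}")
```

While the curve is far below its ceiling, the ratio stays close to 1, so data from that stretch alone cannot tell the two models apart; the difference only shows up as the ceiling is approached.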
[68:03]
A
Yeah. You had a great section on the S curve in the book. Pedro, I'll just ask you the last question, which is probably the most obvious of them. If you were to release an updated edition today, what would you change?
[68:16]
B
Two things. So, first of all, to my great satisfaction, nothing in the book is outdated. One of my preoccupations in writing that book, precisely because it was in reaction to the stuff that was available at the time, was to write something that would last. Because I saw not just the things that were happening that day; there are certain things in machine learning, certain themes, certain problems, certain paradigms that we've been talking about, that have been the same since the 50s. This is amazing: in this field of perpetually rapid and ever-accelerating change, some things actually don't change at all. And the book is about them, and those are all still current. So I'm glad that part worked. Having said that, of course, a lot has happened since. So if I was doing a second edition of the book, I don't think I would change much of anything about what's there, but I would add two chapters, actually. I would add a chapter about the technical developments of the last decade, like Transformers and things like that, which clearly deserve a chapter of their own, just to bring things up to where we are now. And honestly, I think there is a real need for that. There are a lot of attempts, some of them very valiant, to explain to people what LLMs are. I think they don't succeed, and I think somebody should write that. And then, I correct myself, I would change some things in the last chapter. Not that they are wrong, but they need to be amplified and brought up to date. So I would write a chapter that maybe would be an amplification of the last chapter, about how things look now and how the future looks going forward. Again, I don't think there's anything in that chapter that was off the mark, but certainly things are on a completely different level now.
[70:03]
A
Yeah, that's interesting. The only thing I could think of is, and again, this predates the current political climate, I think you ask who's going to win between Hillary Clinton and Jeb Bush. A different age. But I don't think that's on you; you didn't have a crystal ball. Pedro, this has been a fascinating discussion. Again, the book is The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. Timely as ever, and as great a read today as it was when it first came out. Thank you so much for the conversation, Pedro. It was wonderful.
[70:35]
B
Thank you. This was fun.
[70:42]
A
Thank you for listening to this episode of the New Books Network. We are an academic podcast network with the mission of public education. If you liked this episode, please share it with a friend and rate us on your preferred podcast platform. You can browse all of our episodes on our website, newbooksnetwork.com, connect with us on Instagram and Bluesky with the handle @newbooksnetwork, and subscribe to our weekly Substack newsletter at newbooksnetwork.substack.com to get episode recommendations straight to your inbox.