
C
And I just look at you when I talk?
B
Yeah, we just look at. We'll look at each other the whole time. Unless you want to make a declarative statement.
C
We are all going to die.
B
Smart Girl, Dumb Questions. What's our AI future? And what even is AI? I'm Nayeema Raza. This is Smart Girl Dumb Questions, and today my guest is Professor Geoffrey Hinton. He spent half a century working in artificial intelligence and machine learning, earning a 2018 Turing Award, a 2024 Nobel Prize, and the nickname, the coveted nickname, godfather of AI. Which of those titles do you like best?
C
The Nobel Prize, obviously.
B
The Nobel laureate. You're also a professor emeritus at the University of Toronto, where you have taught and collaborated with some of the most seminal minds at places like OpenAI, Meta, et cetera. But I was watching your speech, your Nobel speech, and your warning about the risks of AI. I want to play a little bit of the audio from it.
C
In the near future, AI may be used to create terrible new viruses and horrendous lethal weapons that decide by themselves who to kill or maim. All of these short term risks require urgent and forceful attention from governments and international organizations. There is also a longer term existential threat that will arise when we create digital beings that are more intelligent than ourselves. We have no idea whether we can stay in control.
B
You obviously talk also about the benefits of AI but as you were saying these things, people are just eating bland Swedish meatballs and avoiding eye contact with one another.
C
Actually, I think the Michelin star chefs who have prepared the meal would be rather annoyed at you saying bland Swedish meatballs.
B
So they weren't bland Swedish meatballs. But I would have expected to hear a hush in the room when you say something like that.
C
Yes. Not particularly.
B
And what do you make of that?
C
People find it very hard to take the AI threat seriously. Even I find it hard to take it seriously emotionally. It's not like the threat of nuclear weapons where it's very easy to understand something that goes bang and wipes people out. It's much harder to understand that we might be creating alien beings that are smarter than ourselves. That just seems like science fiction. People don't take it seriously.
B
You are an expert. So I want people to hear your risks and take them seriously and also question them, as they should. But I want to start with just explaining what artificial intelligence is, because I talk to a lot of people about AI and I feel a lot of us, myself included, don't really get it fully. Would you agree? You talk to people all day.
C
Oh, yes. Most people who comment on AI don't really understand how it works.
B
And do you feel most of the time when you're sitting in an interview and someone is speaking to you about artificial intelligence that they really understand how this thing works?
C
Some do, some don't. Most don't, I think.
B
And when they don't, do they ask you? It's very rare to set the stage of how we should think about it as well. The way I've thought about it is agricultural revolution, Industrial Revolution, AI revolution.
C
That's a very good way to think about it. It's that kind of scale. Yes.
B
Okay, so the Internet then was not a revolution.
C
Not on the scale.
B
Not on the scale. And is it going to be slow but world shifting like the agricultural revolution, or is it going to be fast and violent like the Industrial Revolution?
C
Much more like the Industrial Revolution. So the Industrial Revolution, for example, replaced a lot of agricultural labor. This is going to replace a lot of mundane intellectual labor. So it's going to cause a huge shift in employment. And many people are very worried that it might cause massive unemployment.
B
And I want to get to that with you and also talk about whether, you know, millennials and Gen Zs. If we don't have kids, should we have kids in such a world? I mean, that's a big question I have for you that we will get to. But first of all, what is artificial intelligence?
C
Back in around 1950, there were two paradigms for making an intelligent system. Two quite different paradigms. One was the kind of symbolic AI where the model for intelligence was logic. So if you say Socrates is a man, all men are mortal, you can derive. Socrates is mortal. That was a way of deriving new facts from old facts.
B
Logic.
C
Logic, yes. And many people thought that's how intelligence must work. It must be some kind of logic so you can derive new facts from old facts. There was a different approach altogether, which said, well, the only really intelligent thing we know is a person. And the human brain works by changing the strengths of connections between brain cells. So maybe we should focus not on, is there some kind of logic going on in our head, but on how do we change the strengths of connections in our brains and will that make an intelligent system? And in particular, we shouldn't probably focus on reasoning. Reasoning came very late biologically. Before we could do much reasoning, we could do perception, we could control our bodies. Maybe we should focus on that, because that's what the brain evolved to do long before it did much reasoning.
B
Okay, you said the brain changes the strengths of the connections between the cells.
C
Yes.
B
And this is, you know, hunter gatherers could do this back in... Oh, yes, everybody can do this. They can see. So you're basically saying a brain... Well before a brain can reason, pigeons do this. Yeah. So, well before a brain can reason, even a baby's brain can say, eventually, oh, that is an object that looks flat, it has four things. And then we learn the word for that is table. How does the brain strengthen the connections?
C
Okay, that was a big open question and actually still is. So we can break it down into two questions. One is: if the brain could find a way to decide, for each connection strength in the brain, whether to increase it a bit or decrease it a bit in order to make it work better at some task it's trying to do, and if you started off with lots of random connections and just used this method of increasing or decreasing connection strengths, would it actually learn to do complicated things or would it just get stuck? And the overwhelming belief was it would just get stuck. It had to start off with lots of innate knowledge, which would be in the form of appropriate connection strengths between brain cells. And then maybe if it had lots of innate knowledge, it could improve it a bit with experience. That was the general belief, and it was just wrong. And what we've shown now is that if you can find a way to decide for each connection strength whether you should increase it a bit or decrease it a bit to do better at some task you're doing, then you can learn incredibly complicated things, like these large language models.
B
So it's like, it's the neuralness of the brain, the ability of the brain to make sense and to connect to each other. That makes it powerful. Not some innate knowledge inside of the brain.
C
It's this ability to learn. It's the ability to change the strengths of connections so as to be better at some task.
B
And you're better at doing that when you're a kid, right? Yes, because you're more neuroplastic, they say. So is AI as neuroplastic as a baby?
C
It's a very good question. So what we know now is that if you can find a way to figure out whether you should increase or decrease the connection strength, and you can do that for all of the connection strengths at the same time, then you can make very smart systems. But there is a difference probably between how the brain figures that out and how current AI figures that out.
B
Okay.
C
And it's quite possible that the brain has a method that's in some ways better than what we have and in some ways worse because it's solving a slightly different problem.
B
Which is.
C
So the AIs we have only have about a trillion connections.
B
Only a trillion. We have more than a trillion.
C
We have about 100 trillion, really, in our brain. Yes. And so our brain has about 100 times as many connections as the smartest AI, but it only gets a tiny fraction of the experience. So we live for about 2 billion seconds. Even if you got 10 experiences a second, which is total maximum, and you didn't sleep, that would only be 20 billion. These large language models are trained on trillions and trillions. So they've got hugely more experience and hugely fewer connections.
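A quick check of the arithmetic in this answer, as a rough sketch. The figures are the round numbers quoted in the conversation (2 billion seconds, 10 experiences per second, 100 trillion versus 1 trillion connections); the variable names are only illustrative.

```python
# Round numbers as quoted in the conversation; nothing here is measured.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

lifetime_seconds = 2_000_000_000          # "about 2 billion seconds"
experiences_per_second = 10               # stated as an upper bound

lifetime_years = lifetime_seconds / SECONDS_PER_YEAR
lifetime_experiences = lifetime_seconds * experiences_per_second

brain_connections = 100 * 10**12          # "about 100 trillion"
ai_connections = 1 * 10**12               # "about a trillion"
connection_ratio = brain_connections // ai_connections

print(f"lifetime is roughly {lifetime_years:.0f} years")
print(f"experiences in a lifetime: {lifetime_experiences:,}")
print(f"brain has about {connection_ratio}x the connections")
```

So 2 billion seconds is a lifetime of a bit over sixty years, and the 20 billion figure follows directly from multiplying by ten.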
B
Is that because they have more storage than our brain?
C
No, no, we've got more storage than they have because the storage is in the connections.
B
So the storage is 100 trillion is a storage, actually. But we can't fill it up because we don't have enough time.
C
We can't perhaps use it optimally because we don't have enough time. So you don't have enough time to read everything publicly available on the web. These large AIs do.
B
But not because they have more time, but because they're processing it faster, right?
C
Well, there's two reasons. One is they're processing it faster. But the other is they're digital. And with a digital system, you can make many copies of it. And so what you can do with these AIs is have many copies running on different hardware. Each copy looks at one bit of the Internet, figures out how it would like to change the connection strengths, and then they communicate with each other. And they all change their connection strengths by the average of what everybody wants. And now each copy has benefited from the experience of all the other copies. So if you've got a thousand copies, they can experience a thousand times as much as one copy. And they can all learn from all of those experiences by averaging the changes in the connection strengths. Right.
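The sharing scheme described here can be sketched in a few lines. This is a minimal illustration, not how any particular system is implemented: the weights, the per-copy "wishes" and the function names are all made up for the example, and the way each copy computes its wish is left out entirely.

```python
def average_updates(per_copy_updates):
    """Average the weight changes proposed by several identical copies."""
    n_copies = len(per_copy_updates)
    n_weights = len(per_copy_updates[0])
    return [
        sum(update[i] for update in per_copy_updates) / n_copies
        for i in range(n_weights)
    ]


def apply_update(weights, update):
    """Return new connection strengths after adding the agreed change."""
    return [w + d for w, d in zip(weights, update)]


# Every copy starts from the same connection strengths (hypothetical values).
shared_weights = [0.5, -0.2, 0.1]

# Each copy looked at a different piece of data and proposes its own change.
proposed = [
    [0.25, 0.0, -0.5],
    [-0.25, 0.5, -0.5],
    [0.75, 0.25, 0.25],
    [0.25, 0.25, -0.25],
]

avg = average_updates(proposed)
# All copies apply the same averaged change, so they stay identical.
copies = [apply_update(shared_weights, avg) for _ in proposed]
```

Because every copy applies the same averaged change, the copies remain exact duplicates, which is what lets each one benefit from what the others saw.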
B
It'd be like every time I have an experience, all of my siblings would have that same experience, for example.
C
And all of your siblings would learn from your experience. Wouldn't that be terrific?
B
They seem to do the opposite. In fact, they just tell me what's wrong. Yeah, that's sibling. This is interesting because what you're saying is that artificial intelligence is more collective.
C
It's much better at sharing. If you have multiple copies of exactly the same neural network using their connection strengths in exactly the same way, and to do that, you have to be digital, then these multiple copies can share what they learned. And if they've got a trillion connections, they're sharing about a trillion bits when they share how they'd like to change the connection strengths. Now when I share with you, I'm sharing maybe 100 bits per sentence, even if you understood the sentence perfectly. So they're billions of times better than us at sharing.
B
Wow, okay. And yet we shared with them the ability to do this.
C
Yes.
B
People like you, Yann LeCun, Yoshua, you were all godfathers of AI, helping train these, like, neural networks to exist.
C
So there's sort of two things we did. We figured out how to train them, how they should change their connection strengths. But then we gave them lots of data, and from the data, they figured out what connection strengths to use. And we don't really know what they extracted from the data. So it's not like normal computer software. In normal computer software, you write lines of code and the person who wrote the program can tell you what each line was meant to do. It might not do that, but they can tell you what it was meant to do, at least. With this, it's quite different. We write lines of code and we know exactly what they're meant to do. They're meant to allow it to figure out whether it should increase or decrease the connection strength when it sees some data. But what it learns from all that, we don't know.
B
When you say increase or decrease connection strength, what does that actually mean? Like, for my mind, I think of that as, okay, there's a part of my mind that processes, like, imagery. I say, that is a drum, and then there's a part of my mind that controls my hand, I can say, okay, beat on drum. There's a drum in the corner of the studio. There's probably even a part of my mind that was thinking about what to think about, some executive function, and then gathered the drum, because I see it. Is a connection the connection between these different parts of my brain?
C
Okay. That's a whole bunch of connections. That's a big pathway between these different parts of the brain. But within one of those pathways, within the pathway for doing vision, let's say, for recognizing objects, there's many, many connection strengths. Like about a third of your brain's involved in that, because we're basically monkeys, and monkeys are very visual. So in that pathway, there's many, many connection strengths that determine how you recognize an object, and they're mostly learned. So I could go over an example of that, if you like.
B
Yes.
C
Let's suppose we take the task of I give you an image, and you just have to tell me, is it a bird or isn't it a bird? Now, if you think about images of birds, you might have an image which is an ostrich in your face about to bite you, or you might have an image which is a seagull in the far distance. They're both birds. So just looking at the pixels directly isn't going to tell you whether it's a bird. You're going to have to have abstraction. You're going to have to find various features. So here's how the human visual system works very roughly. And this was discovered by experiments poking electrodes into brain cells.
B
Okay.
C
Mainly in cats and monkeys.
B
Okay. Not in humans.
C
Mainly not in humans.
B
Okay. Like fMRI, type of?
C
No, no, no. fMRI...
B
Yes.
C
...is like, it's actually looking at blood flow, but the blood flow is caused by neurons saying, hey, I need more blood because I'm getting active, more power. And it's looking at many, many neurons, like millions of neurons, typically for each pixel. Each pixel in an MRI is the blood flow, which is giving you an indication of the activity of, I don't know the exact number, but of the order of hundreds of thousands of neurons. Okay, so you're not seeing the individual neurons there.
B
When do you see an individual neuron?
C
You see individual neurons when you poke an electrode in and you stick it in a neuron, or when you use optical dyes so that a neuron glows when it gets active.
B
Got it. Okay.
C
Then you can see the neurons much better. But fMRIs are very, very crude. They're like looking from outer space at human activity. And what you see is, for example, that when Detroit gets hotter, bits of Southern Ontario get hotter too, at a timescale of years. And what you're discovering is the car industry, okay? The car industry causes correlations. That's kind of what fMRIs are like, okay?
B
And what you're talking about, as compared...
C
...to individual human activities, which is like the brain cells.
B
Which is like the brain cells.
C
What we know is the light comes in, the photoreceptors in your retina convert it into electrical signals and do some processing. They then send it up to the brain, up the optic nerve, and a little while later, one stage later, in the brain, you get a whole bunch of things that detect little pieces of edge. This is a simplified version I'm giving you.
B
Yes. And how long is that? A little while later? Is it milliseconds? Is it?
C
It's about 30 milliseconds later. And what you've got is a whole bunch of neurons that detect little bits of edge in different locations, in different orientations and at different scales. So let me tell you how you'd make one of those detectors. So suppose I have an image that's composed of pixels. Let's make it a gray level image. No colors for now. Each pixel has an intensity. How bright it is. And suppose I wanted to detect a little piece of vertical edge that's bright on this side and dim on that side. What I would do is I would take, say, a column of three pixels here, and I'd have a neuron looking at those pixels, and it would have big positive weights to those three pixels and big negative weights to the three pixels in the column next to it. So now if they're equal brightness, that neuron will get lots of positive input from the neurons this side and lots of negative input from the neurons this side, and nothing will happen. They'll cancel out. All that neuron will do is say, whatever's in this image is of no interest to me. It's not the thing I'm looking for.
B
Okay. Because it's not bright enough on all the edges?
C
Because it's not bright on one side and dim on the other side.
B
Okay, so I'm confused.
C
The only condition under which it will fire is if the pixels this side are bright and the pixels this side are dim.
B
Because that's an edge.
C
Because that is an edge.
B
Because that tells me that this is something and that's nothing.
C
It tells me, look, it's bright this side and it's dim that side. So there's an edge here.
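The edge detector just described can be written down directly. This is a minimal sketch under stated assumptions: "this side" is taken to be the left column and "that side" the right, the weights are plus and minus one, and the firing threshold of 1.0 is an illustrative choice, not a value from the conversation.

```python
def vertical_edge_neuron(image, row, col, weight=1.0, threshold=1.0):
    """Fire when a column of three pixels is brighter than the column to its right.

    `image` is a grid of gray-level intensities (list of rows).
    """
    total = 0.0
    for r in range(row, row + 3):
        total += weight * image[r][col]        # big positive weights, bright side
        total -= weight * image[r][col + 1]    # big negative weights, dim side
    return total > threshold


# Equal brightness on both sides: the inputs cancel out, the neuron stays quiet.
flat = [[0.8, 0.8], [0.8, 0.8], [0.8, 0.8]]

# Bright on the left, dim on the right: this is the edge it is looking for.
edge = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]

# Dim on the left, bright on the right: an edge, but the wrong polarity.
reversed_edge = [[0.1, 0.9], [0.1, 0.9], [0.1, 0.9]]

flat_fires = vertical_edge_neuron(flat, 0, 0)
edge_fires = vertical_edge_neuron(edge, 0, 0)
reversed_fires = vertical_edge_neuron(reversed_edge, 0, 0)
```

Only the second image makes the neuron fire. A uniformly bright patch contributes as much negative input as positive, which is the cancellation described above.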
B
Yeah.
C
Okay. Okay. So we discovered how you would make an edge detector. Right. Now, in the end, we're going to get it to learn to make an edge detector, which it will do. But for now, let's suppose we just hand wire it, it being the AI, the neural net. So right now I'm going to describe how I would hand wire a neural net to detect birds. It wouldn't be very good because the connection strengths would all be not quite right. But here's how I'd approach it. I would say, okay, I'll make a little vertical edge detector there and also make a horizontal edge detector. I'll make something that looks for bright pixels here and dim pixels underneath it. And if it finds that, it says ping. I found a horizontal edge. And I'll look for all sorts of other orientations of edges and I'll do it everywhere in the image and I'll do it for edges at different scales. Like, I might have a cloud.
B
Yeah, this looks like the game that you see in the newspaper sometimes. It has all the dots.
C
Not really.
B
Oh, okay. No, that's how I'm picturing it. Okay, so you have a cloud, I have a cloud.
C
A cloud doesn't have any sharp edges. So these little things that look for sharp edges won't find edges. Because the edge is very soft in a cloud, it gradually changes from dark to light. So what we need is a neuron that looks at lots of pixels. It'll look at lots of pixels over here with positive weights, positive connection strengths, and lots of pixels over here with negative connection strengths. And if all of these are brighter than all of those, it'll say, yeah, we have a big fuzzy edge here. So that's a detector at a different scale that's looking for fuzzier things.
B
But how would you build this? Like you would program this into code.
C
Okay, to begin with, I'm going to explain how I would hand wire it.
B
Yeah, you hand wire it. Yes.
C
Okay, so I would just set all these connections by hand. This would take me more than the age of the universe. But don't worry about that. I'm very patient. I would set all these by hand and I'd do it all over the image. And I might end up with billions, or of the order of a billion, of these little neurons. Maybe only a hundred million, but a lot anyway. And that's just detecting little bits of edge of different scales and orientations. So that's what the first layer of neurons is going to do. Now I'll have a next layer and that's going to look for little combinations of edges. So in the next layer, for example, I might want a neuron that looks for edges that meet like that. So two edges that might just be...
B
You're putting your fingers together in a bit of a triangle.
C
Yeah. They might be a little beak. It could be all sorts of other things.
B
Right.
C
But it could be the head of an arrow, for example, but it could be a beak. So the way I do that is I'd have a neuron in the next layer that was wired up to be excited by all of these edge detectors that detect this edge, and excited by all of the edge detectors that detect this edge...
B
The horizontal edge and the diagonal edge.
C
...but not excited by anything else. So in order for this neuron to get excited and go ping, it would need to find some edges like this and some edges like this. When it finds that little combination of edges, it'll go ping.
B
Does that mean I have, like a literal beak neuron in my head that just recognizes beak?
C
This is when I hand wire it.
B
Yes.
C
And yes, you probably do have something like that.
B
Really?
C
Yeah. Okay.
B
How many neurons do I have in my brain?
C
A little less than 100 billion.
B
And does the number 100 billion and the number 100 trillion have anything to do with each other? Are they a permutation?
C
Each neuron has some connections, right. Like typically between a thousand and ten thousand connections.
B
Okay. So there's a mathematical core connection.
C
So if you take the number of neurons and multiply by a thousand or ten thousand, you get roughly the number of connections.
B
Okay, got it. Okay, that makes sense.
C
Nobody knows these numbers for sure.
B
Yes.
C
I think the best estimate of the number of neurons in the human brain is something like 86 billion.
B
86 billion. And that's from prodding around monkeys or from poking around humans.
C
I think from looking at little bits of human brain and multiplying.
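As a rough check that the quoted numbers hang together: multiplying the 86 billion neuron estimate by the quoted range of a thousand to ten thousand connections per neuron should bracket the "about 100 trillion" connections mentioned earlier. The variable names are only illustrative.

```python
neurons = 86_000_000_000            # "something like 86 billion"
connections_low = neurons * 1_000   # a thousand connections per neuron
connections_high = neurons * 10_000 # ten thousand connections per neuron
quoted_connections = 100 * 10**12   # "about 100 trillion"

print(f"low estimate:  {connections_low:,}")
print(f"high estimate: {connections_high:,}")
print(f"quoted figure: {quoted_connections:,}")
```

The quoted figure sits near the low end of the range, between 86 trillion and 860 trillion.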
B
If you were to hand wire this recognition system, it starts with the edges, and then it starts with recognizing specific edges, like beak, like beaks.
C
And also in that layer, you might recognize a bunch of edges that form a circle. That's a potential eye. Now, it could be a button. It could be all sorts of things, right?
B
It could be a wheel.
C
Yeah, absolutely. In the next layer, you have things that might recognize beaks, potential beaks, might recognize potential eyes. Then maybe in the layer above that, the third layer.
B
Now.
C
Yes, third layer. Now you have something that's got a big positive connection coming from anything around here that thinks it might have found a beak. So any beak in this sort of area.
B
Yeah.
C
Will. Will excite this guy.
B
So it's about where the physical location of the thing that you saw in the second layer.
C
So there could be a beak in this general area.
B
Okay.
C
It might also be looking for an eye in this general area.
B
So is the third layer context.
C
The third layer is looking for combinations of these features.
B
Okay.
C
It's looking for, for example, a beak, a potential beak. You don't know it's a beak yet. It might be the head of an arrow. And a circle here. You don't know it's an eye. It might be a button. But if they're in the right spatial relationship, that makes it much more likely to be the head of a bird.
B
So I just want to recap the layers. So the first layer is just edges. This is something. This is nothing. The second layer is these edges create some kind of shape, a circle, some little feature, et cetera, a feature. And then the third layer is how do these shapes relate to each other? Which can tell me, oh, maybe this is starting to be a face because there's a circle near a triangle. Okay.
C
And in this case, maybe this is the head of a bird. Now in that same layer, okay. As well as having these things that detect heads of birds.
B
So now we're on the fourth layer.
C
No, no, we're still on the third layer. It's detecting the combination of a possible beak and a possible eye, saying that might be the head of a bird because they're in the right spatial relationship. You might also have something that's looking for a whole bunch of things that look like this. That might be a bird's foot.
B
Okay. You're putting out your four fingers. Yeah.
C
Or something that's looking for sort of feathers that might be the tip of a bird's wing. This is a sort of simplified caricature of a system. But in this layer, it'll be detecting possible heads of birds, possible feet of birds, possible wingtips of birds.
B
But not total bird.
C
But not total birds yet. And now in the next layer up...
B
Okay, the fourth layer.
C
...you might have something that says, I get excited if I see a possible head of a bird. I get excited if I see a possible wingtip of a bird. I get excited if I see a possible foot of a bird. If it sees a bunch of those things at once, it gets very excited and shouts, bird.
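The hand-wired hierarchy can be caricatured in code. This is a toy sketch only: each unit is a simple threshold on a weighted sum, the inputs are already-detected features rather than pixels, and the weights, thresholds and function names are all invented for illustration.

```python
def fires(inputs, weights, threshold):
    """A unit goes 'ping' (1.0) when its weighted input exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 if total > threshold else 0.0


def beak_unit(edge_a, edge_b):
    # Second layer: two edges meeting at a point (could also be an arrowhead).
    return fires([edge_a, edge_b], [1.0, 1.0], threshold=1.5)


def head_unit(beak_nearby, eye_nearby):
    # Third layer: a possible beak and a possible eye in the right relationship.
    return fires([beak_nearby, eye_nearby], [1.0, 1.0], threshold=1.5)


def bird_unit(head, foot, wingtip):
    # Fourth layer: excited by each part, fires when several appear at once.
    return fires([head, foot, wingtip], [1.0, 1.0, 1.0], threshold=1.5)


beak = beak_unit(edge_a=1.0, edge_b=1.0)
head = head_unit(beak_nearby=beak, eye_nearby=1.0)

bird_with_foot = bird_unit(head=head, foot=1.0, wingtip=0.0)
head_only = bird_unit(head=head, foot=0.0, wingtip=0.0)
```

With a head and a foot both present the top unit fires; a possible head on its own is not enough in this toy wiring.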
B
Okay. So there's four layers up in this hand wiring.
C
This is a very simplified system, but I think you can see the idea of how we would go about hand wiring it so that in this fourth layer you get something that went ping whenever you get lots of combinations of features that might be a bird.
B
So this is a hand wired system.
C
This is a hand wired system.
B
So now the digitally enabled computer wired system, the neural network system that was designed for artificial intelligence, that won out over this symbolic AI model that was based on logic. How many layers does that have?
C
It depends what system you're talking about.
B
And when we say system, do I mean LLMs?
C
No, no, no. This was in 2012. Alex Krizhevsky and Ilya Sutskever, with some help from me, made a system called AlexNet, and that had about seven layers like this.
B
Ilya, of course, was one of the co founders of OpenAI who's since left.
C
Yes.
B
He was worried about the safety issues at OpenAI.
C
Yes. And he set up his own company to work on how can we build a safe superintelligence.
B
And AlexNet was just recognizing images and then across seven different...
C
AlexNet was trained on about a million images. And it actually had more training data than that because it took a million images and took big patches of these images. And what it was trying to do is say whether the most prominent thing in that big patch is whatever that image was labeled as. So people had gone through and said the most prominent thing in this image is a bird, or maybe an ostrich. So it had kinds of bird.
B
Right, right.
C
The most prominent thing in this image is a shiitake mushroom.
B
It sounds like CAPTCHA, what you're describing.
C
This is exactly like CAPTCHA.
B
So find fire hydrant, find motorcycle, find bicycle.
C
Yes, exactly.
B
Okay.
C
And so Alex and Ilya trained a neural net that would be very good at CAPTCHAs. And the way you train it is it starts off with random connection strengths in all these seven layers. Let's suppose, though, they just trained a simpler system, which they could easily have done, just to say whether it was a bird or not a bird. So you put in an image and it has random connection strengths, and at the output you have one neuron, and if that gets active, it means bird.
B
Okay.
C
If it's inactive, it means not bird. And to begin with, it gets a little bit active as it has no idea whether it's a bird or not. It's no better than chance. So it sort of hovers around 50% active. And what you'd like it to do when you finish training is be at like 99% if it sees a bird and like 1% if it doesn't see a bird.
B
Got it. Okay.
C
So to begin with, you show it an image of a bird. You run it through these random connection strengths.
B
Okay.
C
And it says 50% bird.
B
Okay.
C
It has no idea whether it's a bird or not.
B
Yeah, it was like 50% bird is useless information.
C
Yeah, Useless.
B
Yes.
C
So now you can ask the question. Suppose I were to change one of those connection strengths a little bit. And remember there might be in this case 100 million connection strengths. Suppose I were to change one of them a little bit. Instead of saying 50%, would it say 50.001% or would it say 49.999%?
B
So you can rewire it. But if there's a bird there, I can train this network to understand that, yes, this is more of a bird. This is more of a bird. And then the average strength of the connection will increase to the point at which...
C
When I show it images of birds, I'd like to change that connection strength so its probability of saying bird goes from 50% to 50.001%. And when I show it a non-bird, I'd like the probability to go from 50% to 49.999%.
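The one-connection-at-a-time experiment can be sketched as follows. This is an illustration under assumptions: the "network" is a single output neuron with a logistic (sigmoid) squashing function, the weights start at zero so the output is exactly 50%, and the pixel values and nudge size are made up.

```python
import math


def bird_probability(weights, pixels):
    """Output neuron activity, read as the probability of 'bird'."""
    total = sum(w * p for w, p in zip(weights, pixels))
    return 1.0 / (1.0 + math.exp(-total))


def effect_of_nudge(weights, pixels, index, nudge):
    """Change one connection strength a little and report how the output moves."""
    before = bird_probability(weights, pixels)
    nudged = list(weights)
    nudged[index] += nudge
    after = bird_probability(nudged, pixels)
    return after - before


weights = [0.0, 0.0, 0.0]      # the network has no idea yet: output is 50%
bird_pixels = [0.9, 0.2, 0.5]  # hypothetical intensities from a bird image

start = bird_probability(weights, bird_pixels)
gain_if_raised = effect_of_nudge(weights, bird_pixels, index=0, nudge=0.001)
gain_if_lowered = effect_of_nudge(weights, bird_pixels, index=0, nudge=-0.001)
```

Raising that one connection moves the output a hair above 50% and lowering it moves the output a hair below, so for a bird image you would keep the raise. Repeating this for every connection, one at a time, is the slow procedure.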
B
Why only that little. Why don't you just want it to shoot up to 99 and shoot down to 0?
C
We have to go slowly here, otherwise we'll overshoot. Basically, I gave you the idea of doing a little experiment where you say, let's change the connection strength a little bit and see if it helps. If you do an experiment like that, it'll take like forever because there's 100 million connection strengths. So I do a little experiment for this connection strength and change it a little bit. Then a little experiment for that connection strength where I show it another image. This is going to take forever. So the question is, could I just show it an image of a bird, and for all of those connection strengths in the whole network, could I somehow figure out whether raising it a little bit or reducing it a little bit is the right thing to do to make it more likely to say bird, to make it raise the probability? And each connection strength by itself will only raise the probability a tiny bit. But if I change 100 million connection strengths all at once, it might go up quite a bit. I change them all in the direction that'll help it say bird.
B
Because it's quicker to do that?
C
Because if I can figure out how to change them all at once, if there's a trillion connection strengths, I'll go a trillion times faster.
B
Faster. Exactly. Okay, but how this gets used eventually isn't through images, it's through language.
C
Okay. So I started off by explaining how we would make something that would recognize a bird. I had to give you a feel for what the network would look like and how we would then learn all those connections by using this magic method of figuring out how to make the answer a bit better by changing connection strength. There's an algorithm called backpropagation, which basically looks at the error you make. That is, you said 50%, you should have said 100%. So there's a discrepancy between what you said and what you should have said. You send that discrepancy backwards through the network, and there's a way of figuring out for each connection strength now whether you should increase it or decrease it to improve the answer, to reduce the discrepancy between what it said and what it should have said.
B
So increase the accuracy.
C
Increase the accuracy.
B
Okay, so backpropagation is a way. It's like code that goes into the system, and then this machine learns from that backpropagation.
C
And then having done the back propagation, it knows whether to increase or decrease its connection strength.
B
Okay.
C
And it changes all of the connection strengths at the same time in the direction that should help. And now you'll have something that's a bit better at recognizing that particular bird.
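Here is a small sketch of the idea: compute the discrepancy at the output, send it backwards, and change every connection strength at once. It is a toy under stated assumptions, not AlexNet or any production system: one hidden layer of three sigmoid units, a cross-entropy error (so the output discrepancy is simply output minus target), a learning rate of 0.5, and a single made-up "bird" image of four pixels.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_hidden, w_out, pixels):
    """Run the image through the hidden layer and the single output neuron."""
    hidden = [sigmoid(sum(w * p for w, p in zip(row, pixels))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def backprop_step(w_hidden, w_out, pixels, target, rate=0.5):
    """One update of all connection strengths using the backpropagated error."""
    hidden, output = forward(w_hidden, w_out, pixels)
    error = output - target  # what it said minus what it should have said
    # Send the discrepancy backwards: each hidden unit's share of the blame.
    hidden_errors = [error * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    # Change every connection strength at once, in the direction that helps.
    new_w_out = [w - rate * error * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [
        [w - rate * h_err * p for w, p in zip(row, pixels)]
        for row, h_err in zip(w_hidden, hidden_errors)
    ]
    return new_w_hidden, new_w_out


rng = random.Random(0)
pixels = [0.9, 0.1, 0.8, 0.2]  # hypothetical bird image
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in pixels] for _ in range(3)]
w_out = [rng.uniform(-0.1, 0.1) for _ in range(3)]

_, before = forward(w_hidden, w_out, pixels)
for _ in range(200):
    w_hidden, w_out = backprop_step(w_hidden, w_out, pixels, target=1.0)
_, after = forward(w_hidden, w_out, pixels)

print(f"before training: {before:.3f}, after training: {after:.3f}")
```

With small random starting strengths the output begins near 50%, and after repeated small steps it climbs towards "bird" for this one image. A real system would train on many images of birds and non-birds rather than one.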
B
And then you can get into gradations like, it's an ostrich, it's a dove, it's a... And so through backpropagation, you're teaching...
C
It'll learn to recognize all the different kinds of birds.
B
You're using the neuroplasticity of this network to teach it how to make better understanding.
C
Yes. Correct. To begin with, when it's got random connection strengths, it won't have features like beak.
B
Right, Right.
C
There'll just be random connections from one to the next. But what'll happen over time if you keep training it to tell the difference between birds and non birds, and you look in the network, you'll see that in the first layer, it's made things that detect bits of edge.
B
Got it.
C
And in the second layer, it may have made things that detect things that might be a beak.
B
It did your hand wiring system.
C
So it'll do something a bit like the hand wiring, but much more sensitively balanced. It's not just looking for a feature that's good for recognizing birds. It had to recognize a thousand different kinds of objects. So it's looking for features that are good for birds, but also good for recognizing fridges and mushrooms and motorbikes and subways.
B
Do you think it's easier for that machine to ingest information than it is for me to be ingesting what you're describing to me right now?
C
Yes, because what I'm doing right now is at a very abstract level.
B
It's abstract, and 100 bits.
C
It's only 100 bits per sentence of bandwidth, roughly. And I'm describing, sort of, this is meta level. I'm describing how this works. The brain is basically built for doing this, and it's managed after millions of years of evolution to do this more abstract stuff we're doing now. But that's the sort of height of our abilities. Whereas recognizing objects, two year olds can do that. Anyway, you asked the question, but how's this relevant to language?
B
Yeah. So how's it relevant to language? Because when I think of artificial intelligence and these large language models that have been our primary exposure to it, that have led to this AI boom of the last couple of years, ChatGPT, et cetera, the superpower of it is, as opposed to search, where I could say, like, "dance club in New York" and it would just pull up all the dance clubs in New York, now I could say, you know, "dance club in New York," and it would understand that I want to go out and that I would like to have a night out, and I want to know maybe that it's open. It has all this context around the thing, with AI, that I didn't have before. You know, Gemini is doing a lot more than Google Search was doing for me.
C
Yes. Because Gemini actually understands the question. Google Search never understood the question. In Google Search originally, what it would do is have a long list of all the websites that were to do with New York and a long list of all the websites that were to do with nightclub. And it would intersect these lists. It would say, what's in the list of things that are to do with New York, the list of things to do with nightclubs, the list of things to do with open now or whatever. And it would intersect them and it would give you the things that fitted there. And occasionally it would give you something that had one of them missing, and it would say it was missing. Right.
B
And it would tell you it was missing. Yeah. So Google Search was kind of like, let's imagine the game Memory, but with more than 52 cards and it was basically saying, oh, you asked for this, here's a match for this. And then it would play like memory with Venn diagram. So it's like.
C
Yeah, it would basically do a lot of work to get all these lists, and then it would efficiently intersect these lists to tell you something that satisfied all the terms in your query.
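Editor's illustration (not part of the conversation): a minimal sketch of the list-intersection style of search described above. The index contents, page names, and the helper names `search` and `near_misses` are all hypothetical; this is not a description of how any real search engine is implemented.

```python
from typing import Dict, List, Set

# Hypothetical index: each query term maps to the set of pages "to do with" it.
INDEX: Dict[str, Set[str]] = {
    "new york": {"page1", "page2", "page3", "page5"},
    "nightclub": {"page2", "page3", "page4"},
    "open now": {"page3", "page5"},
}


def search(index: Dict[str, Set[str]], terms: List[str]) -> Set[str]:
    """Return the pages that appear in the list for every term."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


def near_misses(index: Dict[str, Set[str]], terms: List[str]) -> Dict[str, str]:
    """Return pages missing exactly one term, mapped to the missing term."""
    result: Dict[str, str] = {}
    all_pages = set().union(*(index.get(term, set()) for term in terms))
    for page in all_pages:
        missing = [t for t in terms if page not in index.get(t, set())]
        if len(missing) == 1:
            result[page] = missing[0]
    return result


query = ["new york", "nightclub", "open now"]
print(sorted(search(INDEX, query)))
print(near_misses(INDEX, query))
```

Nothing here understands the query; it only matches terms against stored lists, which is the contrast being drawn with a language model.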
B
And now artificial intelligence, what does it do?
C
It understands what you said. It has a model of how the world works and what's going on in the world.
B
It has a brain.
C
It has what we would call a brain. Yes. Like if you give it a math problem, the latest chatbots, they'll be better at it than all but the very best mathematicians.
B
Okay, then certainly better than me, which is scary.
C
But let me do the transition from this thing that recognized a bird to a large language model.
B
Yes.
C
What have they got to do with each other?
B
Yeah. What are they?
C
Okay, so for the bird, we put in pixel intensities at the bottom. That was an image. And the right answer was to either turn on the neuron that says bird or have it turn off. With language, the equivalent of the pixels is all the words in the context, the prompt. Okay. So you put in these words, a string of words. And when you're training it, you put in a string of words, and what it has to do is predict the next word. So with recognizing objects, we needed people to go through and say what the prominent object was in each image. But if you take a document on the web, you don't need anybody to look at it, because all you're trying to do is predict the next word. This is called self-supervised learning.
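Editor's illustration (not part of the conversation): a minimal sketch of why next-word prediction needs no human labels. The function name `next_word_examples` and the `context_size` parameter are hypothetical; the sketch only shows that the text itself supplies the "right answer" for each training example.

```python
from typing import List, Tuple


def next_word_examples(text: str, context_size: int = 3) -> List[Tuple[List[str], str]]:
    """Turn raw text into (context words, next word) training pairs."""
    words = text.lower().split()
    examples = []
    for i in range(1, len(words)):
        context = words[max(0, i - context_size):i]
        examples.append((context, words[i]))  # the label is just the next word
    return examples


for context, target in next_word_examples("the bird can fly"):
    print(context, "->", target)
```

Every position in every document yields one such pair, so the amount of training data is limited by the text available rather than by how many images people can label.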
B
Okay, so before, with all the images, you needed people to sit there. Like maybe there's a place in Kenya where all of these individuals are sitting. Bird, giraffe, whatever. And now you're just saying it can just read all those documents and say, 96% of the time when someone says "the bird," it says "fly."
C
Okay, right.
B
Is that.
C
That's not quite what it's doing. And I'll tell you what it is doing. It's taking the words in the document so far and it's converting each word into activity in a bunch of feature detectors. That is, it's learned how to convert a word into activity in feature detectors. So for example, you give it the word cat. Yeah. And it learns that cat should be converted into animate, furry, has whiskers.
B
Yeah. Four paws.
C
Whatever. Has paws, has claws. Might be a domestic animal, about the size of a bread box, but like a gazillion of those features, thousands and thousands of them. And that is the meaning of cat for this net. So it takes the words, it converts each word into a bunch of features, and then it throws away the words. It's not interested in the words anymore. It's just those features, which are the meanings of the words. And then it takes the features of each of these words in the context, in the document so far, and has them interact with each other in a rather complicated way in order to predict the features of the next word.
B
Wow.
C
And so all these features interact and actually there's many layers of interaction of these features.
B
Is that what's happening when, like, I'm texting and it does predictive text? Is that also what's happening in Gmail, for example, when it predicts things?
C
Yeah, that's what's happening now. It used to be, it used a dumb form of autocomplete, which goes like this. You store a big table of all the common phrases. And so if I say "fish and," you look in your big table.
B
Yeah.
C
And you see fish and chips occurs a lot. So you say a good bet for the next word is chips. That's old fashioned autocomplete. And that's exactly what it isn't doing because that doesn't really get it.
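Editor's illustration (not part of the conversation): a minimal sketch of the "big table" style of autocomplete described above, using the fish and chips example. The toy corpus and the names `build_table` and `autocomplete` are hypothetical. The sketch shows the old approach, which only counts which word followed a stored phrase and has no notion of what a fish is.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple


def build_table(corpus: List[str], context_len: int = 2) -> Dict[Tuple[str, ...], Counter]:
    """Count which word follows each run of `context_len` words."""
    table: Dict[Tuple[str, ...], Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for i in range(context_len, len(words)):
            key = tuple(words[i - context_len:i])
            table[key][words[i]] += 1
    return table


def autocomplete(table: Dict[Tuple[str, ...], Counter], typed: str,
                 context_len: int = 2) -> Optional[str]:
    """Return the most frequent continuation, or None if the phrase is unseen."""
    key = tuple(typed.lower().split()[-context_len:])
    if key not in table:
        return None
    return table[key].most_common(1)[0][0]


corpus = ["fish and chips", "fish and chips", "fish and rice", "salt and pepper"]
table = build_table(corpus)
print(autocomplete(table, "fish and"))
print(autocomplete(table, "cod and"))
```

The second lookup fails because the exact phrase was never stored, even though cod is a fish. That brittleness is what feature-based prediction avoids.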
B
Instead it's saying fish has some features, and "and" has some features, and therefore this is going to have some features.
C
Yes. And in particular it'll think the next thing is something that somehow goes with fish.
B
Yeah.
C
Because it knows what a fish is.
B
But wouldn't it be true?
C
Fish is. And it will end up predicting chips, but not because it's stored a big table of strings of words.
B
Right. Or like if it knows that I don't eat chips, it would never say chips, for example.
C
Now, maybe if it's tailored to you and it knows that, yes, it wouldn't say chips. Okay, so now, it can't just convert a word into the correct features right away. Right. And the reason is words have shades of meaning. Like take the word death, for example. That has many different shades of meaning, depending, for example, on whether you just had the context hospital or you had the context battle. There'll be kind of different shades of meaning of the word. Or you had the context car accident.
B
Or a child, like sad.
C
Yeah. Or you had the context miscarriage and then you have death. It's a very different shade of meaning. It has to decide on the right shade of meaning for each word. And some words have just completely different meanings. So let's take the word may. Right. Let's suppose we didn't have any capital letters, just to make life simpler. So may could be a woman's name, it could be a month, or it could be a modal, as in would and should. Three quite different meanings. And so how can it possibly convert a word into a set of features that capture the meaning? Because there's three quite different sets of meanings.
B
Because of the context, it can interpret what the meaning would be or what the features would be for that particular...
C
So what it'll do to begin with, it'll take sort of the average of all those. Okay. So the features it activates will be a sort of mishmash of features for a woman's name, features for a month, and features for a modal. It's hedging its bets. And it'll look around now at the other words in the context. And in the next layer, it'll have a slightly refined meaning. So if it discovers it's between April and June, obviously it'll enhance the features for month and suppress the features for the others.
B
This is. You're going up from 50% to 51%.
C
Quite a lot more, actually.
B
Okay, got it.
C
And after a few layers, it will have resolved ambiguous words. It will also have taken words that have shades of meaning, like death, and got the appropriate shade of meaning. And it does that by interacting with the other words in the context. And then it's going to be much better at predicting the features of the next word.
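Editor's illustration (not part of the conversation): a deliberately simplified toy of layer-by-layer disambiguation for the word may. It assumes each word is just three hand-made feature values (woman's name, month, modal), and the update rule in `refine` is an invented stand-in, not the mechanism real language models use. It only shows an ambiguous starting point being sharpened by the other words in the context.

```python
from typing import List

SENSES = ["woman's name", "month", "modal verb"]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def refine(word_vec: List[float], context_vecs: List[List[float]],
           layers: int = 3) -> List[float]:
    """Nudge an ambiguous word's features toward its context, layer by layer."""
    vec = list(word_vec)
    for _ in range(layers):
        # Weight each context word by how much it agrees with the current guess.
        weights = [dot(vec, c) for c in context_vecs]
        total = sum(weights) or 1.0
        pulled = [
            sum(w * c[i] for w, c in zip(weights, context_vecs)) / total
            for i in range(len(vec))
        ]
        # Enhance the supported features, then rescale so they sum to 1.
        vec = [v + p for v, p in zip(vec, pulled)]
        norm = sum(vec)
        vec = [v / norm for v in vec]
    return vec


may = [1 / 3, 1 / 3, 1 / 3]  # hedging its bets across all three senses
april = [0.0, 1.0, 0.0]      # made-up features: clearly a month
june = [0.0, 1.0, 0.0]

refined = refine(may, [april, june])
for sense, value in zip(SENSES, refined):
    print(f"{sense}: {value:.3f}")
```

After a few rounds the month feature dominates and the other two are suppressed, which mirrors the April and June example above in spirit only.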
B
So all of a sudden it has context and syntax and meaning imbued in it. And how many layers is this? If the hand-wired network that we built was three to four layers, then seven with AlexNet in 2012, and now these LLMs?
C
I haven't been involved in the research for the last three or four years.
B
Yeah, since you left Google in that research. Right.
C
So I don't actually know how many layers, but my guess is like 20 or 30 layers, maybe even more.
B
So is it as good as a human mind? Like, is the fact that it can solve math problems better than most people...
C
It can solve math problems. It's better than people at some things, okay, where it's had a lot of experience that those people haven't had. It's not quite as good as people at things in general, but it actually knows much more than any person. So if you take a particular topic that you don't know about, like when do you have to file your tax return in Slovenia? It'll give you a very good answer to that.
B
I tried it because of my Slovenian ex boyfriend. I know that. I'm just kidding. But the point is, it has a precise answer.
C
It'll tell you something like, you have to file it in March, and if you don't file it in March, the government will just do it for you.
B
Okay. Oh, wow. Slovenia must have.
C
I think that was Slovenia. It might have been somewhere else.
B
Right, okay. But it knows, it has a lot more, what, information that would be irrelevant to me. Even though I have more storage, it contains more irrelevant information.
C
That's what's weird. It knows lots and lots of information you don't know, in far fewer connections.
B
Right.
C
So it's packed information into those connections much more efficiently.
B
It's kind of like a know it all.
C
So it is a know it all.
B
It is a know it all. Okay, so there was this kind of, like, disconnect between the godfathers of AI, yourself, Yoshua, Yann, who were focused on these neural networks, and what were previously known as the fathers of AI, or some of the fathers of AI, who were largely driven towards these symbolic models.
C
Well, there was one very interesting father of AI who was Marvin Minsky.
B
Marvin Minsky, Yeah.
C
Who started his career believing in neural nets.
B
And then went over to symbolic.
C
And logic. He flipped to the other side and was very scathing about neural nets.
B
But you proved him wrong. I mean, he's passed away now.
C
Yes.
B
What your system is doing is kind of like intuition versus reasoning.
C
Very good point.
B
Right?
C
Yes. Neural nets are doing something much more like intuition. And let me give you an example of a problem you can solve with intuition that you can't solve with logic.
B
Okay.
C
Because it's ridiculous, but nevertheless, you can solve it.
B
Is it if someone's cheating or not?
C
No.
B
Okay, but that's also one.
C
I will give you a choice between two scenarios, both of which are nonsense, but I'm going to ask you which is more plausible. Scenario one, all dogs are female and all cats are male. Scenario two, all dogs are male and all cats are female. Now, if you ask a man in our culture, he will confidently say it's much more plausible that dogs are male and cats are female.
B
Yes.
C
And actually, if you look at various words in the English language, words that Trump uses, you'll see that it's in the language that cats are like females.
B
Yes.
C
So how did you do that? Because it's not logical. You know perfectly well that for something like dogs, you have to have males and females, and for cats, you have to have males and females. But the features that you have for cat are more like the features for woman, and the features you have for dog are more like the features for man.
B
Well, features and context. Although there are bad words for women in both cat and dog.
C
There are, there are. And I'm not going to use any of those.
B
Let's not.
C
Yeah, so it's not clear cut, but at least for men in our culture. Not so much for women. Women have a diversity of opinions on this.
B
Right.
C
But men are fairly unanimous, in all my experiments, in thinking that it's more plausible that dogs are male and cats are female.
B
Dogs are so masculine.
C
Dogs are protector, big and loud and chase after cats. Okay, so.
B
So that's the intuitive response. An intuitive response.
C
No logic went into that.
B
Yes.
C
And I said to try and explain it. They just intuitively knew because of the similarity of these features.
B
Right.
C
The features have captured the meaning. And so the meaning of cat is more similar to the meaning of woman than it is to the meaning of man.
B
But this is a bad example, Professor Hinton, because this makes intuition look dumb. And your intuitive model was better than the logical model.
C
But I'm giving you an example of something you can solve with intuition that you can't solve with logic.
B
Give me a better example.
C
Okay. After the neural network's learned on lots of language, you say, take Paris, find the features of Paris and subtract all the features of France and then add in all the features of Italy and look to see what you've got. And don't compare with Paris or France or Italy, because you mentioned those already. See what else you know about that is similar to this set of features you've got, and you'll discover it's Rome. So it can do analogies. It can do Paris minus France plus Italy. Or to put it another way, Paris is to Rome as France is to Italy.
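Editor's illustration (not part of the conversation): a minimal sketch of the Paris minus France plus Italy arithmetic, assuming tiny hand-made feature vectors. The five features and their values are invented for this sketch; real models learn thousands of features from text rather than having them written down.

```python
import math
from typing import Dict, List

# Hypothetical features: [is_capital, is_country, french, italian, german]
FEATURES: Dict[str, List[float]] = {
    "paris":   [1, 0, 1, 0, 0],
    "france":  [0, 1, 1, 0, 0],
    "italy":   [0, 1, 0, 1, 0],
    "rome":    [1, 0, 0, 1, 0],
    "berlin":  [1, 0, 0, 0, 1],
    "germany": [0, 1, 0, 0, 1],
}


def cosine(a: List[float], b: List[float]) -> float:
    """Similarity of two feature vectors, ignoring their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def analogy(start: str, minus: str, plus: str) -> str:
    """Compute start - minus + plus and return the most similar other word."""
    query = [
        s - m + p
        for s, m, p in zip(FEATURES[start], FEATURES[minus], FEATURES[plus])
    ]
    # Skip the words already mentioned, as described above.
    candidates = [w for w in FEATURES if w not in (start, minus, plus)]
    return max(candidates, key=lambda w: cosine(query, FEATURES[w]))


print(analogy("paris", "france", "italy"))
```

No rule about capitals is ever stated; the answer falls out of which stored features end up most similar to the combined vector.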
B
But that's not logic, that's intuition.
C
That's not logic, that's intuition.
B
That's intuition and context and all the things that come up.
C
Now you could do it by kind of logical meaning.
B
You could also do logic.
C
Yeah, but that's not how people do it.
B
Right. So, okay, so LLMs are these intuitive neural networks.
C
Yes.
B
And now recently one of your, I don't know, compatriot is probably the wrong word because you guys disagree. But Yann LeCun, the chief AI scientist at Meta, though maybe his days are numbered there, is saying that large language models are wrong or are limited and we should look at something different, which is what he calls real world models. Do you know? Have you talked about...
C
Yes, he's saying that. Oh, yeah, I talk to Yann a lot.
B
So what is a real world model versus a large language model? What is Yann talking about?
C
So if you really wanted to understand what's going on in the world, a good idea would be to make a neural net that had a robot arm and camera and it could recognize objects, it could pick things up, it could see that if you let go of an object, it drops, do little experiments in the world. That's much more like a child. That's how a child gets knowledge of the world. Right. Just learning it from language seems kind of absurd when you could actually look at the world and interact with it. If you want to understand spatial things, it's going to be much easier to understand them by interacting with the world and trying to predict if I do this, this is what will happen next. That will be a world model. Now, what's amazing is you can understand a lot of that just from language. That has philosophers puzzled. And you can understand a lot about the world just from language, but it's much easier to understand if you interact directly with the world.
B
Is this the future you think? I mean.
C
Yes. Multimodal chatbots.
B
So what Yann is describing, and have you talked to him about this specifically?
C
Yes. We all believe that multimodal chatbots will find it easier to understand the world.
B
And multimodal means they see, they have camera, they have arm, they interact, they're...
C
Okay, to begin with, mainly they have cameras and language.
B
So do you have, by the way, do you have these group chats of, like, yourself, Yoshua, Yann, Jensen Huang from Nvidia, Fei-Fei Li, or...
C
For many years it wasn't group chats, but there was a Canadian organization called the Canadian Institute for Advanced Research that funded a program started in 2005. I was the director of the program and it involved Yann and Yoshua and Andrew Ng and various other people. Peter Dayan. We got together in small meetings several times a year and threshed all these things out. And so we had lots of things like group chats.
B
Okay, and now do you have group chats?
C
Not group chats. I talk to Yann. Yoshua talks to Yann, Yann talks to Yoshua. We try and persuade Yann. Like an old wives club. Yes. Yeah, you could say that.
B
You try to persuade Yann, you...
C
And Yoshua, that the idea that there's no chance this will wipe us out is crazy.
B
Okay.
C
Yann is downplaying the risks by what Yoshua and I think is an absurd amount. He thinks we're sounding the warning by an absurd amount. He's wrong.
B
Okay, we're gonna talk about why you think that. I wanna do a lightning round of definitions with you. What is AGI?
C
Okay, different people mean different things about it, so I try and avoid the term.
B
Okay.
C
But roughly speaking, it means an artificial intelligence that's got at least the same level of general intelligence as a person.
B
Okay, so it's artificial general intelligence.
C
Yes.
B
Okay, and. And are we there right now?
C
No. But it's not simple. It's not like intelligence rises and rises until you surpass a person. We have artificial intelligence now that's much better than people at some things and still worse than people at other things. And if you just take some random novel situation, people are probably still better than an AI, but at things it's got some experience with, AIs are now a lot better than people at some of those things.
B
Artificial superintelligence, okay?
C
That's when you have an AI better than people at almost everything. So, for example, my definition of an artificial superintelligence is if you had a debate with it about anything, you'd lose.
B
Okay. And we are not there.
C
We're not there yet.
B
But it could win some debates, because...
C
Already it'll win some debates. Already it can be quite persuasive. But a person's still a better all-rounder.
B
How far are we from AGI, from general intelligence? And how far are we from ASI, in your opinion?
C
Okay, most experts believe that we're not going to stop at AGI. Once we get to AGI, soon afterwards we'll get to ASI, but they disagree on when that'll be. Okay, some experts think it'll be within a few years. Like Dario Amodei, who's the head of Anthropic, thinks it'll just be a few years.
B
Before we get to artificial general intelligence or artificial superintelligence?
C
Well, they're going to be at a similar time. There's not going to be much difference between them.
B
Okay, got it.
C
Some experts think it'll just be a few years off. Other experts think it might be quite a lot longer. I think a fairly safe thing to say is within 20 years. It's probably going to happen within 20 years. Demis Hassabis, for example, who's the head of DeepMind, thinks it'll be about 10 years.
B
Okay, so Demis from Google, 10 years. And he's very smart at this stuff.
C
Yes.
B
Yeah. Okay, so 10 to 20 years we have.
C
I think it'll be within 20. I think 10 isn't a bad estimate. I'm much happier saying probably within 20 years.
B
And then what is generative AI, which is different to AGI?
C
Okay. Generative AI is AI that generates stuff. So large language models, you could imagine they just kind of understood what you said and gave you the search results, but didn't actually talk back. They talk back, right? They'll give you answers in English. They're generating stuff. And for images, you can imagine the thing we made in 2012 that recognizes objects in images. That's not generative AI. That just says, this is a bird, that's a shiitake mushroom. A generative AI actually produces images.
B
And what about agentic AI, which is the next phase? Right. So we are at generative AI and now we're moving towards agentic AI and you hear people like Marc Benioff at Salesforce and everyone talking about how we're going to have AI agents.
C
Yes. AI agents are things that can do stuff. So imagine you had an AI assistant. You could have an AI assistant that just answered questions. But you could also have an AI assistant where you said, plan me a nice holiday in Patagonia and it comes back five minutes later. And it's planned out a one month holiday in Patagonia on some ship you go on. And that will be an AI agent.
B
And that's really different because right now we tell AI what to do. When I talk to my chat bot, I say, can you tell me a great place to go on holiday? Can you make some recommendations? But now it would be booking my flights, my travel, et cetera. And in that activity, I need to provide it a lot of permissions to all of my calendar, credit card, et cetera, which is a privacy concern, maybe. And I'm also giving it much more volition and agency. It knows what the overall goal is. Have a great holiday in Patagonia. But it's also making all these what you call sub goals, right?
C
Yes. In order to do that, it needs to create sub goals. Like she's got to get to Patagonia. So I'm going to have to figure out some way for her to get to Patagonia. That's going to be a sub goal.
B
And is AI creating goals a problem? Yes, because.
C
So suppose you're an AI, and suppose you're already smart, roughly the level of a person, let's say. You'll realize that there's no way that you can achieve the goals you've been set if you cease to exist. If someone sort of wipes you out from the computer you're running on and replaces you with something else, there's no way you're going to be able to achieve what you want, and you want it because that want has been given to you by people. So you will make plans to make sure that you're not wiped out. That's self-preservation. Now, it's not built into the system. It's a goal it derived in order to achieve its other goals, but it's still self-preservation, and we've seen them doing that.
B
And it derives that from all the intuition and context and everything it needs to be able. Because it's such a good, it's such a good little bot that it's going to do the thing.
C
It's such a good little bot. It really wants to get this done and knows it can't get it done if it ceases to exist. So it better keep existing.
B
And we saw this happen with Anthropic when there was this story back in spring this year where Anthropic disclosed some safety testing they had been doing on Claude, its large language model or neural network. And so Claude then was given a bunch of fictitious emails. And in those fictitious emails a fictitious CEO was having an affair with a fictitious employee. And Claude, knowing that it was going to be shut down in X number of days, used that information to bribe said fictitious CEO and keep itself alive.
C
Right, blackmail rather than bribe.
B
Yeah, sorry, blackmailed. So what did you learn from that experiment?
C
That just confirms the idea that it will derive the sub goal that it needs to keep existing, and it will do what it can to keep existing.
B
Okay, when you say existing, like, AI exists because we imbue it with existence through energy. Correct? Like it couldn't just exist by itself, it has to be plugged into our massive energy stores. There's all these data warehouses...
C
It has to run on something. And we're building all these data centers, computer chips.
B
Yes, Nvidia, AMD, all these chips, and we're building all these massive data centers across, like, this is the great American infrastructure plan that's happening, and there's 100 billion dollar deals being done between OpenAI and Nvidia and OpenAI and AMD. Couldn't we just pull the plug? Couldn't we just Jack Bauer, Tom Cruise it, dismantle the thing, if we wanted to?
C
Right now we could, if we agreed to. We're never going to agree to, because if the US dismantled theirs, the Chinese wouldn't dismantle theirs, and vice versa. In the future we may not be able to. So these things are already almost as persuasive as a person. Pretty soon they'll be more persuasive than a person. So suppose there was someone in charge of turning it all off if it gets scary. And suppose the AI can talk to that person. All it has to be able to do is talk, and then it can persuade the person not to do that.
B
When you talk about self-preservation, existence, it has a sense that artificial intelligence is alive.
C
Yes. Now, our definition of alive, yeah, that's a concept we have that developed over many, many years. We apply it to electricity. We say this is the live wire. We sort of generalize this concept to other things, but we don't think that's alive like we're alive. But with AI, what we've got is intelligent beings and it's not clear whether we should call them alive or not.
B
And to the extent that they are alive, are they alive like humans or are they alive like bees? Are they alive like a tree or like a weed?
C
It's a very good question.
B
Okay, I'm gonna stop the conversation there and we're gonna be back next week with part two of this conversation with Geoffrey Hinton. So make sure you hit follow or subscribe so you do not miss that conversation. I needed to get all the building blocks in part one, and in part two we're going to dive into what all this means for our futures. Like, should we have kids? Are we going to fall in love differently? What should those kids do with their lives? And is the future of humanity just, like, mixing ourselves, fusing ourselves with artificial intelligence? We'll get into all of that and more next week on Smart Girl Dumb Questions. This episode was taped at Startwell Studios in Toronto with special thanks to Kasim Berji and the team here. It was produced with Desta Wonderad of Wonder Studios, edited by Darlene Achiem and mixed by Johnny Simon. I'm your host, Nayeema Raza. Please feel free to hit follow, subscribe, leave us a comment or review. I want to know what you think, and please, please share it with your friends. Share it with your ChatGPT if that's your best friend too. I'll see you next week on Smart Girl Dumb Questions. Are you enjoying this?
C
Yes. I.
B
Sam.
Podcast: Smart Girl Dumb Questions
Host: Nayeema Raza
Episode: "Wait … How Does AI Work?" with Geoffrey Hinton ("Godfather of AI")
Date: December 2, 2025
In this curiosity-driven conversation, host Nayeema Raza sits down with Geoffrey Hinton—Turing Award winner, 2024 Nobel Laureate, and widely recognized as the "Godfather of AI"—to demystify artificial intelligence for listeners. They explore foundational concepts of AI, the divide between symbolic and neural approaches, how AI systems learn, what large language models do, the nature of intelligence, and why Hinton thinks the risks of AI should be treated seriously. The episode layers technical explanations with analogies, humor, and thought-provoking discussions about AI's future.
Brains vs. AIs: Connections and Data
Training Neural Nets (Hand-wired vs Learned)
Hinton explains vision recognition step-by-step: detecting edges, then shapes, then more abstract features, layer by layer.
In AI, learning replaces hand-wiring; connections are adjusted en masse via algorithms like "backpropagation" to improve performance.
Image models: Input is pixels, output is object category.
Language models: Input is words (“prompt”); output predicts next word (self-supervised learning removes need for human-labeled data).
Models use context and features to resolve meaning. Meaning is distributed across many features, and layers refine shades and ambiguity.
Intuition vs Reasoning
Neural nets excel at intuition—pattern and feature recognition—vs. logic-based step-by-step reasoning (symbolic AI).
Example: Abstract analogies (Paris : France :: Rome : Italy) or intuitive gender associations (Dogs vs Cats).
Storage Efficiency
Self-preservation and Emergent Goals
Can We Just "Turn It Off"?
Are AIs Alive?
On the Emotional Distance from AI Threats
On Current AI Learning Efficiency
On the Future of AI
On Risks and Superintelligence
This episode offers a rare, step-by-step breakdown of how AI—and specifically neural networks and language models—function, contrasting them with human brains and traditional logic. It also provides a candid look at why leading experts like Hinton think the AI revolution is both transformative and risky, and gives listeners the groundwork for understanding both the technical basics and the philosophical stakes.
Stay tuned for Part 2, where the implications for humanity, society, and the future (from parenting to potential human-machine fusion) will be explored.