
Anders Sandberg joins me to discuss superintelligence and its profound implications for human psychology, markets, and governance. We talk about physical bottlenecks, tensions between the technosphere and the biosphere, and the long-term cultural and phys
A
I think we could take a lot of ethical advice from smarter entities, but we might also want to have a debate with them about it and actually share the understandings. You actually want to weave our preferences and our discourses into this system in the right way. Ideally, we should become a kind of cyborg civilization where we both have superintelligence guiding and coordinating us. If you're below a certain error threshold, you can combine error prone processes in such a way that you get a new process that has a much lower error rate. I believe something like this might happen with AI. There is a kind of transition in reliability, but once it's reliable enough, you could make this redundant system, make the reliability go up enormously.
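The reliability point above can be illustrated with a toy model. This is a minimal sketch, not the speaker's own formulation: it assumes `n` independent processes that each err with probability `p`, combined by simple majority vote (the function name `majority_error` and the independence assumption are illustrative). Under those assumptions the threshold is `p = 0.5`: below it, adding redundancy lowers the combined error rate, and above it redundancy makes things worse.

```python
from math import comb


def majority_error(p: float, n: int) -> float:
    """Probability that a majority of n independent processes are wrong.

    Assumes each process errs independently with probability p and that
    n is odd, so there are no ties. This is a toy model for illustration.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so that a majority always exists")
    k_min = n // 2 + 1  # smallest number of wrong votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


if __name__ == "__main__":
    # Compare a process above the threshold with two below it.
    for p in (0.6, 0.4, 0.1):
        rates = [majority_error(p, n) for n in (1, 3, 9, 27)]
        print(p, [f"{r:.2e}" for r in rates])
```

In this sketch the improvement is fastest when the individual error rate is already well below the threshold, which matches the idea of a transition in reliability followed by a steep gain from redundancy.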
B
Welcome to the Future of Life Institute podcast. My name is Gus Docker and I'm here with Anders Sandberg. Anders, welcome to the podcast.
A
Thank you for having me.
B
Could you say a little bit about your background?
A
So I'm usually presenting myself as an academic jack of all trades. I started out studying computer science and mathematics, then I took a course about neural networks and kind of fell in love with the brain. So I took neuroscience courses, psychology courses, a bit of medical engineering, and then I ended up in the philosophy department of Oxford University at the Future of Humanity Institute. So these days when people ask what I am, I say, some kind of futurist philosopher, something, something.
B
You have this wonderful manuscript called Grand Futures, which is, last I checked, 1400 pages where you dig into the physics, the economics of all the kind of different paths that humanity could take and become a much larger presence in the universe. Could you say, what's the status of that manuscript right now?
A
So for a while the manuscript had been resting because I needed to finish another book, Law, Liberty and Leviathan: Human Autonomy in the Era of Artificial Intelligence and Existential Risk, which feels a bit urgent. We kind of need to figure some of those things out. So the manuscript had been resting on my sofa, kind of waiting for me to finish with that lightweight 600 page volume. But the nice part is, of course, now I've learned how to write better and science has kept on advancing, so I've been piling up references and things to add. So it's not like Grand Futures is going to be how the world was back in 2023 when I started on something else. It's rather that now I'm rebooting it, which is also very useful because now I have many more people who can help me actually check that what I'm writing is correct, or at least plausible.
B
And perhaps you have assistance from AI.
A
At this point, yes, that was fascinating. When I started, ChatGPT was not a thing. I've been working on this for quite some time, and during the process I tried sometimes, okay, can AI help me develop this paragraph? And the first few attempts were very hopeless. Okay, that's just nonsense. And then it became annoying because, yes, that's an answer, but I don't trust it at all. Indeed, it turned out to be totally wrong, full of hallucination, and it took longer to figure out what reality was than actually making use of it. Indeed, I discovered that quite a lot of apparently straightforward questions were surprisingly confusing, both to AI and to me. But over the past few months, it's gone from okay, it's not useful at all, to, it's actually helpful. I wouldn't say that I'm trusting all the results, but it's very good at digging up obscure literatures. I can say, I have this question about some vague domain, is there a name for this? It finds the name of the domain and then, of course, what are the main papers? Then I can start reading up on that. And once you have a terminology, suddenly you have a pretty useful assistant. So I'm looking forward to this brave new world where me and my human assistants are having AI assistants to amplify the ability to both write, but also fact check and style check and develop these ideas.
B
I think one starting point here could involve both Grand Futures and your other book. Remind me again of the title of the other book?
A
Law, Liberty and Leviathan.
B
Yeah, exactly. And that starting point is to ask a concrete question. So on this podcast and in culture in general, we talk a lot about when we will reach AGI, when we will reach superintelligence. We try to forecast that question, and we talk less about what happens after that point. So for the purposes of this discussion, we could assume that we reach superintelligence in 2030 and then see which paths we could take as a species from there. So of course this involves physics and economics and sociology, I think, to quite a large degree. But where would you start with trying to forecast what happens after superintelligence?
A
Assuming we have enough alignment and that it goes well enough so it just doesn't break the world. And I think one should recognize that even if AI in itself is pretty safe, you might just shake the world apart at the seams. If it just amplifies human ability to do things without amplifying human ability to coordinate about safe and sane things to do, then you probably end up with a lot of the obvious things being pursued with great zeal and intelligence. So right now we have an issue with energy in the world. We need better energy sources. We also need to avoid messing up the environment. And we have a lot of insecurities that are quite important, like food. And it's pretty clear that we're going to apply a lot of AI power to that. Maybe AI as self-directed agents, maybe it is just tool AI or an ecosystem that humans are setting the agenda for. But it's pretty clear that we're going to be pushing for material wealth and welfare. After all, peace and prosperity is kind of one of the key drivers for most human economic activity. And we should expect that an AI-empowered economy is also going to do a lot of that. So one of the first chapters in Grand Futures is basically, okay, how much material wealth can we achieve? And when I started on that chapter, I believed that, okay, this is going to be relatively straightforward to write about because it's going to be about the physical limits to manufacturing and energy production, and to some extent how much they are compatible with living on a finite planet. It turned out that there were quite a lot of weird economic sidetracks and indeed even psychological sidetracks that made that chapter much more unwieldy and much more exciting than I had expected.
B
And what were those sidetracks?
A
So when we think about a world of enormous wealth, at first people would imagine golden palaces and as much food as you can eat. That's kind of what our ancestors would have said, oh, that's what to go for. They would of course also have said, oh, I want as much fat, sugar and salt as possible. That's the dream life. And we kind of know actually that's not very good for us. But certainly being able to make any material object that is useful is something we want to do. We want to solve the fundamental manufacturing problem. This is where I think Eric Drexler's vision about molecular manufacturing and atomically precise manufacturing in the long run, which might not be terribly far away if we get superintelligence by 2030, is actually going to transform the world. Even if you say, okay, that's unrealistic, we're just going to have 3D printers and robots and biotechnology, that already produces a world where you can get most stuff relatively cheaply and easily. The really interesting part, however, is that when you think about wealth, if we think about the lifestyle of the rich and famous, it's not that much about eating enormous amounts of food and having a lot of cars. They certainly have a lot of cars, but the food is fancy food. And although the villas might be large, it's more about that they're elegantly furnished and they have a lot of people waiting on them. That is of course what we actually mean by wealth these days. We're quite post-materialistic in the sense that we want services. So we want that massage, we want to have that personal coach, we want to have those assistants doing whatever we do to keep our wealth flowing. So the other part of wealth is of course actually services. Now the good news is if we actually manage to get AI to work really well, we're going to get services very, very cheaply.
So that would mean that we could all have that entourage of robot software servants giving us the massages and managing our bank accounts and doing the legal stuff for us. That sounds great. Except of course you get weird side effects. The bad guys are going to have endless lawyers in the cloud and are going to sue everybody. You better have your own personal digital lawyer to resist getting frivolous lawsuits. There's going to be a lot of stuff going on behind the scenes that is very complicated. But the really interesting part is of course in this world, who gets to have that villa right there at the edge of the peninsula in Lake Como that has the best view? It doesn't matter that we could manufacture a lot of such villas. There is only one spot that has that view. There are still going to be some things that are kind of a zero-sum game. Even worse, there are social zero-sum games. Who gets to be coolest at the party? Of course we might have different views on that. Maybe the music star thinks that they look the best, and I think I have the most interesting academic publication. We're both very smug at the party. But there are things that even in a world of material and service post-scarcity are going to be limited. And then we get to the final most interesting part. Why do we want all this stuff? Well, people would say, well, it makes me happy, and that happiness is actually the real resource we want. And of course philosophy has been going on for thousands of years about that. That's what we should be aiming for, not getting as rich as the Athenians, but we should be content with what we got. How do you reach contentment? Maybe you can have your AI life coach give you really good advice. But there are probably also aspects of brain functioning and psychology, not just healing us so we don't have depression, but actually coming up with better ways of being very, very happy.
Happy in an effective way rather than drooling in a corner in the opiate enjoyment, but actually being able to have a really fulfilling life. So that gets to an interesting aspect of post scarcity actually bounding our enjoyment of the world. And maybe that means that actually we don't want those gigantic villas and the endless entourages of robots. We're going to be instead living a simple but very, very happy life. I think we're going to see a mixture. We're going to want to have both the golden palace and good spiritual contentment. But this is going to be a tricky thing. It's not going to be that trivial to solve. Even perhaps for superintelligences.
B
There are quirks of human psychology that mean that we are very concerned about status hierarchies. And as you mentioned, not everyone can be the coolest person at the party. So is there any way to intervene in our psychology such that we can all steer away from zero-sum status games?
A
I think there are ways. There are certainly some people around who are not in zero-sum status games. There are the nice people who don't care about these things. And presumably, since that is an aspect of how the brain works in them, we could replicate that. We could imagine that we studied these saintly people and figured out a way of becoming saintlier. There are interesting issues here. One is of course figuring out how it works, but I think that is something more advanced psychology and neuroscience would be able to do. We know for example in psychology that there are two different kinds of status. There is dominance, the kind of bullying status. And there is prestige: you're really good at something and people admire you for it. And both of them are a bit rewarding. It feels very nice to know that you're very good at what you do. Or for that matter, to bully people and feel that I'm on the top of the pack. However, we react to them differently. When the dominant people fall from grace, when they lose some of the dominance, people typically kick them. Suddenly the underdogs strike back. It's not very fun to be a formerly popular bully. High prestige people, meanwhile, people generally like. And even more important, it seems to be this rewarding thing that drives us to want status. You could imagine removing that altogether, maybe with some form of gene therapy or brain manipulation. A world where people don't strive for social status. It would be a very humble world and it would probably work extremely differently from ours. It's not even obvious that we would want that, because that gets to the other really tricky point when thinking about enhancing humans. I worked a lot on the ethics and social impact of cognitive enhancement. People are generally very happy with thinking about, oh, learning languages better. Better memory. Yes. Staying alert longer. Yes. Really useful. Becoming kinder? In one survey, only 9% wanted to take a hypothetical pill that made them kinder people.
B
Perhaps because people are afraid of being screwed over by the world. Maybe you don't want to become kind if you feel like you thereby open yourself up to attacks from a cruel world.
A
I think that is one part of it. But this view of people not wanting to enhance things that were fundamental to the sense of self actually came through for many other things, like empathy. And I think in general what happens is we're afraid of not being ourselves. I might on one level want to be a kinder person, but it would also change my personality a bit if I took that pill. So I'm a bit afraid of doing that. And it feels like my kindness might be closer to who I want to be than my ability or non-ability to speak French. So I think there is this interesting interplay that even if we could offer some enhancements, it's very likely that many people would be reluctant to take them. Of course, we might imagine a society where people say, actually it's so important, given the tremendous power we have thanks to our technology, we need to be more sane and more kind. We can't allow unstable, crazy people to wield these technologies. So you might see some very interesting social struggles about exactly how much therapy and adjustment you want your citizens to have. And this is going to be politics for the 21st century, I think.
B
Yeah. If we return to the question of the physics and economics of a post-superintelligence society. What you've been discussing here is a bit of the economics and perhaps what we'll end up doing if we emerge into some kind of post-scarcity state. Another question I'm interested in is that in Grand Futures, you investigate what can happen on very long timescales. If you think about the hypothetical in which we have superintelligence in 2030, how far do you think we can get by 2040? Just because there's a tendency, I think, to think that if we get superintelligence, there are no limits anymore. But of course, even superintelligence is bound by natural laws, the laws of physics and so on. So how much can we expand and what becomes possible in a decade with superintelligence?
A
That is a very good question, because the limitations are set by the material conditions of the world and the physics of computation. So first, it's a bit unclear how much energy, for example, it takes to run a superintelligence. Right now, people are going on endlessly about how much energy is consumed by data centers. I think it's a lot of energy, but it's not that key a problem, because the real question is how many good decisions you get out of a given amount of intelligence. And that's probably not scaling very clearly with energy. And once you get superintelligence, you're probably going to get it to find ways of optimizing its energy consumption. We know a human brain runs on about 20 watts of power, so we know we can at least get that much intelligence out of 1 kg of matter running at 20 watts. But the real problem is, of course, building a new data center today takes a few years. You need to bring the funding in. You need to get approval for the plans. You need to set up the foundations. You need to build the building, install the servers. They need to be shipped over from Taiwan or elsewhere. You need to test it out. How much of that can you speed up if you had really, really smart systems? It seems like some of these processes are materially limited. If you need to ship something on a boat, that boat is not going to move faster just because the entity that ordered the chips happened to be very smart. Of course, the very smart entity might be a very impatient entity and might say, okay, I'm going to ship it over by air, or I'm going to invent that cargo zeppelin. But now, again, you have a problem. Okay, I invented a cargo zeppelin. I need to get it approved. The Federal Aviation Authority might have opinions about that. It needs to be tested. That's going to slow things down. And even an AI that is amazingly good at getting bureaucrats to do what they're supposed to do fast is still going to be limited by that.
Even if it could just ignore bureaucrats and just send the robots to do things, there are still limits on how much you can move around without messing up the environment. So Earth has this interesting property that right now we're consuming a fair bit of energy, and then it turns into waste heat. And waste heat is a very minor problem for us right now. We're certainly concerned about climate change, but that's because we put carbon dioxide in the atmosphere, and it changes the greenhouse effect. But the waste heat from all our servers and cars and gadgets just is taken up by the atmosphere and radiated into space. If we were to increase our energy consumption by a factor of 100, we would get heating no matter what we did with the carbon dioxide. Even without any carbon dioxide, even if it was powered by magical unicorn rainbow power, with no other environmental effects, the waste heat from a humanity using 100 times more energy would actually start heating the world by about one degree. So there is a limit on how much you can do without starting to overheat the Earth. Similarly, if you move large amounts of mass around, you are going to have problems with the environment. So keeping Earth Earth-like requires staying within some limits. These limits, if you have really advanced technology and are really smart about how you use it, are of course much wider than we normally think about in the environmental discussion. The typical environmentalist discussion tends to assume that we need to do fewer things because that's the only way of saving the environment. If you actually look at efficiencies, et cetera, you realize that more high-tech things can quite often be much less impactful on the environment if you do them carefully. The problem is, however, that once you start scaling up a civilization, you need to be more and more careful. So in the long run, I think you will want to put the superintelligence data centers in orbit.
People are already discussing that in Silicon Valley. And right now I don't think it's going to be cost effective because cooling in space is much harder than on Earth. We are kind of cheating by our waste heat turning into heat in the atmosphere, and then it's radiated away from the top of the atmosphere. If you have your data center orbiting the Earth, now you need to have radiators sending it out into the coldness of space, which sounds really good because space is really cold, except that you need to do this as radiation. On Earth, convection just moves heat away from a computer very efficiently. But if I need to radiate it, it would suddenly be hard to cool it well enough. And of course the problem for that data center is a lot of sunlight from the sun: great for the photovoltaics powering it, but also heating things up. So building things that work well in space is tricky.
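The waste-heat estimate above (a hundredfold increase in energy use warming the world by about one degree) can be checked with a back-of-envelope calculation. This is a rough sketch under stated assumptions, not the calculation from the manuscript: current human primary power use is taken as roughly 18 TW, the extra heat is spread evenly over the Earth's surface, and the warming is estimated from the Stefan-Boltzmann law linearized around an effective radiating temperature of 255 K, ignoring all climate feedbacks.

```python
import math

SIGMA = 5.670374419e-8     # Stefan-Boltzmann constant, W m^-2 K^-4
EARTH_RADIUS_M = 6.371e6   # mean Earth radius, m
T_EFFECTIVE_K = 255.0      # Earth's effective radiating temperature, K
CURRENT_POWER_W = 1.8e13   # ASSUMPTION: ~18 TW of human primary power use


def waste_heat_forcing(power_w: float) -> float:
    """Waste heat spread evenly over Earth's surface, in W per m^2."""
    surface_area = 4.0 * math.pi * EARTH_RADIUS_M**2
    return power_w / surface_area


def warming_no_feedback(forcing_w_m2: float) -> float:
    """Temperature rise needed to radiate away the extra flux, in K.

    Uses the derivative of sigma * T^4 at T_EFFECTIVE_K; feedbacks such as
    water vapour or ice albedo are deliberately left out.
    """
    return forcing_w_m2 / (4.0 * SIGMA * T_EFFECTIVE_K**3)


if __name__ == "__main__":
    for factor in (1, 10, 100):
        forcing = waste_heat_forcing(factor * CURRENT_POWER_W)
        print(f"{factor:>3}x: {forcing:.3f} W/m^2 -> {warming_no_feedback(forcing):.3f} K")
```

Under these assumptions, today's waste heat is negligible while the hundredfold case lands close to one kelvin, consistent with the figure quoted in the conversation. Including feedbacks would likely raise the number.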
B
Yeah. Could you go over that again just so I understand? Why is it that it's more difficult to cool down a data center in space, given that space is very cold?
A
So on Earth, the data center is probably going to be connected to a cooling tower or maybe a lake of cold water. And the waste heat moves into the air and water, and even from the water into the air, and eventually ends up at the top of the atmosphere and gets radiated away into space. So we can use the entire top of Earth's atmosphere as a gigantic radiator, sending away this radiation into space. If I have my data center in space, I need to build my own radiator. I can't just have the heat waft off into the vacuum of space because there is nothing there to conduct the heat. And the big problem is there is something called the Stefan-Boltzmann law that tells you how much heat gets radiated away, and it says that it grows with the fourth power of the temperature of your radiator. So this means that a very hot radiator is amazingly effective at radiating away the energy. The problem is a cooler radiator is less effective. And this is quite often a problem. For example, the International Space Station has this issue. It has sections that are full of people, and they need to be about 20 degrees centigrade. And now you need to keep them at that temperature and radiate away the waste heat. So now you have a radiator that is about 20 degrees centigrade. That is a very wussy radiator. It doesn't get much energy out. There are other radiators on the space station that are actually cooling some motors and compressors that are much hotter, and they are much more effective. They can actually be smaller than the radiators you need for the human section. So now, if we imagine this data center in orbit, its big problem is that you normally don't want your chips to be too hot. That's kind of bad. You want them to be really cold, actually. But the colder they are, the worse the radiators become.
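The fourth-power scaling described above can be made concrete with a short calculation. This is a minimal sketch assuming an idealized flat radiator that emits from one side into empty space: the emissivity of 0.9, the 1 MW heat load, and the 600 K "hot" radiator are illustrative assumptions, and absorbed sunlight and Earth's infrared glow are ignored, so real radiators would need more area than this suggests.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def radiator_area_m2(heat_w: float, temp_k: float, emissivity: float = 0.9) -> float:
    """Area needed to radiate heat_w watts from a surface at temp_k kelvin.

    Idealized: one-sided emission into a 0 K background, no absorbed
    sunlight. The default emissivity is an assumed, illustrative value.
    """
    flux = emissivity * SIGMA * temp_k**4  # radiated power per square metre
    return heat_w / flux


if __name__ == "__main__":
    heat_load_w = 1.0e6  # ASSUMPTION: a 1 MW data center module
    for label, temp_k in (("room temperature", 293.15), ("hot radiator", 600.0)):
        area = radiator_area_m2(heat_load_w, temp_k)
        print(f"{label} ({temp_k} K): {area:.0f} m^2")
```

In this idealized model the room-temperature radiator needs roughly 17 times the area of the 600 K one, which is the trade-off described for chips that have to stay cool.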
B
Hmm. And so something that's quite concerning, I think, is that it will be tempting to build data centers on Earth, even though the more we build, the more of the Earth's surface these data centers take up, and the more, say, solar panels cover the Earth's surface, the less compatible Earth will be with biological life, with human life. So in some sense, biological life is fragile, and we require a quite specific environment in which to thrive. But server farms, computers, can thrive in a wider range of conditions. Do you think this is a kind of fundamental rift between the interests, so to speak, of a superintelligence in expanding its own power and humanity's interest in staying alive?
A
My friend Forrest Landry has made the argument that in the long run, a technosphere is always going to outcompete the biosphere because it has this wider range. And I think it's an interesting argument. I don't buy it straight away. I think it depends quite a lot on, well, who's in charge of that technosphere. How does it actually work? What are the economics and goals guiding what is getting built? The reason data centers are so general is that we design them right now to work in different environments. So the typical data center, well, that's a big building where people in shirt sleeves can go around and repair the servers. It can't be too hot, because then it's going to be ineffective. But there are other people working on data centers that literally sit in a container on the sea floor. They have different performance, but they are designed to do that. And we can imagine constructing data centers that are intended for the Arctic or tropical jungles or space. So there is a design issue here. But that design thing is, of course, a fast transition. Evolution has taken millions and millions of years of coming up with organisms that can thrive in different environments on Earth. But in this case, you just plonk down a bunch of engineers in a meeting room and say, ladies and gentlemen, we want to build a data center that works very well in the Taklamakan Desert. And they start designing away, and within a few months, there is a bunch of servers standing in the desert. Now, that rapid ability to change is very different from natural evolution. I do think that we are going to intervene in biology, too. I don't think there is anything sacrosanct about biology. So we could probably speed up evolution by doing technological evolution. Indeed, synthetic biology is a good example. And in a world with superintelligence, it should be fairly easy to redesign biology. But that doesn't mean that you can just redesign our ecosystems to thrive.
If there are loads and loads of data centers heating everything up, that's not how we're going to solve it, in particular because I think we don't want to transform things too much. We have this conservative view that actually the biosphere should probably be green and blue and it shouldn't look totally alien. There's probably going to be some pretty interesting environmental discussion about repairing environmental damage we have done in the past, if we get rich enough and flexible enough to do that. And now we're going to have a struggle. Should we restore a large part of the world to how it was before humans arrived, or maybe even earlier? Or should it be some kind of parkland? You have a lot of options, and these choices are going to be big, contentious political issues. But the deep issue that you pointed out is interesting. Technology is more open ended because it's guided by intelligent action. And intelligence can decide on jumping to somewhere else in state space. Biology works by evolution and learning. Organisms change very gradually. If there is no easy path that is always better for fitness, a species cannot evolve into another species. But of course, in technology we might realize that actually we need to switch from lighter-than-air transport to heavier-than-air transport. Let's construct an airplane that doesn't work at all like a balloon. And maybe we want supersonic transport. Okay, let's mutate the airplane, not just in the sense of doing gradual changes, but actually a radical redesign. So this ability, I think, means that in the long run the world is going to be guided by intelligent agents and what they do rather than this natural evolution. Except that a lot of these agents are of course competing and copying and cooperating with each other. So there is a form of evolution going on, but it's in the cultural space rather than in the biological space.
And that means that of course, the world becomes a cultural artifact for the civilization living in it.
B
It's an interesting discussion to think about. What is it that will actually determine the shape of the future here? Is it, say, the fundamental limits of physics as you explore in Grand Futures, or is it more determined by culture? I mean, my kind of naive guess is that culture is what matters most in the short run, whereas perhaps physics is what matters most in the long run. Such that culturally we will determine what we'll do, say, from 2030 to 2040. But ultimately over hundreds or thousands or millions of years, if we survive as a culture, we are probably going to explore various limits of physics in a bunch of different directions. Do you think that view is correct, that culture is more significant in the short run?
A
I think it might be significant both in the short run and long run. We are certainly limited by the laws of physics, and they are much nicer to think about for me, because I can say things in a more rigorous way. The light speed limit. There are kind of profound reasons why you can't move stuff faster than the speed of light. The laws of thermodynamics are true for us mere humans and for future superintelligences. Even when you cheat about them, those cheats have their own costs. There are limitations, and they place boundary conditions about what you can do in the future. So I can make a very confident prediction that in a million years then the intelligence from Earth is still not going to be in the Andromeda galaxy because it's very, very unlikely that you can break the light speed limit. So even those superintelligences, even if they were racing to Andromeda, they're still on their way, getting there. But why are we going there? That's a cultural question. It might be because it's a religious goal, it's a pilgrimage. It might be because it's an art project. It might be because we're competing fiercely for all the resources and this is culture. And it's extremely open ended because it has so many degrees of freedom. If we look at current society, a lot of the activities we're doing are not terribly bounded by the laws of physics. Certainly I need to pay my energy bills, but they're not enormous. I don't spend a lot of time working per week to just get fuel. I don't need to spend that much of my resource on that. The amount of resource I spend on getting food to survive are pretty minimal. If I had been a hunter gatherer or if I'd been a kind of pre human living in the forest, I would have spent essentially all my available time on that. Now this has changed because as we get richer in a material sense, we are able to use the other resource for other things that we care about. Social interactions, intellectual work, spiritual work. 
And the end result is of course that the activities we do become much more diverse. A colleague of mine, Karim Jebari, wrote that really interesting article about whether taking backups of civilization by making refuges in the case of a disaster makes sense from a survival standpoint, versus actually backing up the unique part of our civilization. And his point was, if you look anthropologically at societies, all human societies are full of their own cultures. But you can get much more culture, and it's much more unique, the more well off you are. Hunter gatherer societies are certainly different because of their environment, but they're very constrained by the environment. You can't actually lug around the temple and the library if you're nomadic. If you're having an agricultural society, you can start building your temples and libraries. And now you have more options for different styles. But a typical Bronze Age civilization will make pyramids because they're fairly straightforward to make. Once you get to Iron Age civilization, the options become bigger and bigger. Your civilization becomes more contingent. So I think that a super civilization, where you have superintelligence and reach the material limit, might choose extremely different things. It might be that it's all very green and it's aiming at preserving life or spreading life across the universe. It might be that it cares about welfare of beings and it's working very hard on making sure all entities it can reach are happy. Or it might be doing science or competitive economics or all of the above at the same time in some complicated political framework.
B
There's a sense in which if you look at history over the very long term, say, from the emergence of humanity as a distinct species until today, you get this sense that we are expanding and that the pace of change for humanity is accelerating. And perhaps you get this slightly religious feeling that we are being pulled towards some destination where we will fulfill our potential. Now that is in some sense teleological thinking, and I'm not sure how rigorous that is. But do you think in some sense we are being driven to expand by forces we don't fully understand? Do you think we have some kind of cultural evolution where those institutions and those people that drive us towards more growth, more expansion, tend to win out simply because they control more resources, such that civilization or humanity, however you want to model this, is pushed towards more growth and a faster pace of change?
A
I think it's a kind of ratchet effect on the microscale. People are running around doing all sorts of things for all sorts of reasons. But of course, people are people. Most of us have somewhat similar goals about survival, social recognition. You can fill in the Maslow hierarchy of needs, but they're also very different. But from that, of course, you get patterns emerging. There is a reason economics actually makes sense. Supply and demand is a pretty solid finding. It's not as trivial as most textbooks make it out. But you do get this effect. If you have somebody being more effective than somebody else, generally they can out compete others in that niche. Except of course, we humans are very good at coming up with ways of changing the niches. Just because your company is doing really well doesn't help you. If I got better lobbyists and I make sure that the laws are written in my favor.
B
That's true. But then you also have competition between countries where there's no, there's no world government and we have a bunch, we have in some sense anarchy between countries. So such that if one country creates a more competitive environment for their companies, they'll tend to attract more companies and so kind of resources and talent will shift into that country.
A
Yeah, and the interesting part here is that you have these effects that actually do generate patterns. So saying that it's all totally up to human autonomous behavior is a mistake, because human autonomous behavior generates institutions and patterns. We get markets, we get various forms of complex institutions that we use to solve coordination problems. I think Friedrich Hayek was very right in that markets embody a lot of knowledge, and our institutions in society are the result of a kind of cultural evolution that generates things. And then, as other economists and sociologists have demonstrated, different institutions can help or hinder a society. And societies that are kind of burdened by really bad institutions, they tend to have trouble and they get behind, and then sometimes so badly that, okay, you need to do a revolution and add new institutions, usually copying from more successful societies. So in the really large, I do think that there are these very big trends. I think they're not coming around exactly because of some kind of strong law of nature. But it's a bit like how evolution tends to do optimization. It's haphazard. And sometimes weird mutations do happen just out of sheer chance. It's not entirely deterministic. But on a large scale, you see a general move. If we look at economic growth in the world for the past 2000 years, and probably far beyond that, it's been a pretty smooth exponential with slight wiggles because of the fall of the Roman Empire and the Black Death and the World Wars. That is really impressive because we're talking here about billions and billions of people acting on their own, but generating these overall patterns. The problem with macro histories is, of course, that people tend to think that they can predict too much from them. And quite often it turns out that these predictions don't work that well. Predicting the future is surprisingly hard.
I think, for example, most macro history totally misses the risk from existential risk. Actually, if we have a nuclear war, that economic exponential is actually going to have a pretty nasty breakdown. If it's a survivable nuclear war, it's probably going to keep on recurring and we get a new exponential, but it's getting delayed by a few centuries. But if we go extinct, well, that's just the end. But I also think that rules can change in interesting ways. And this might, of course, really transform things. Right now, humans are the only intelligent actors. In order to do work, you need labor, and that is humans. But we're increasingly having machines, and right now we're just amplifying it. But it might very soon be that if I need a lawyer, I spin one up in the cloud. If I need 10 lawyers, well, I spin up 10 in the cloud. And the economies of scale mean, why shouldn't I run a thousand lawyers and have them give second opinions? And then 100 lawyers evaluating the second opinions and getting the best ones. Suddenly the amount of labor being applied for, maybe, my frivolous lawsuit became much bigger. And that might change how this actually works. It might change the dynamics, it might change the fragilities. So I think there are big patterns in history, and I think if we zoom out even more and try to think about what's going on here, I do think that we see life in a very general sense, starting out as simple. It replicates, it uses its environment, it evolves towards being better at doing that. It gradually expands, it might change its environment, then it gets better at adapting to the environment, and eventually brains emerge. And then in an instant, these brains take over. And now what is expanding is not so much biological life as intelligence, transforming matter and energy into forms that are suitable for it.
And that might, of course, still go awry in all sorts of ways. But I think we see a phase transition actually on the large scale of matter in the universe, from inert matter sitting there and just maximizing entropy more or less, over to more dynamic matter, where there are various complex processes going on, to life, to intelligence. And the intelligence is interesting because the goal of intelligence is typically getting desired outcomes, even if they're very low probability from the start. But I can make it likely that I have a bookshelf by buying it online and then assembling it using tools. Finding a bookshelf out in my garden would be very unlikely. But I can make use of the fact not just that humans can make bookshelves, but that we set up an entire economy that makes it very easy for me to get one. And if we need to solve a new problem like curing a disease, suddenly a lot of low probability events start happening, and before you know it, vaccines are everywhere, and the poor virus didn't know what hit it.
B
So this actually leads me into my next question, which is, how do you think superintelligence would disrupt our current institutions? So if we're thinking about markets and governments in particular, and perhaps to constrain the question, we can think again of a hypothetical scenario in which we get superintelligence by 2030, and then think about what institutions would look like by 2040.
A
So the problem with updating institutions is that they're full of people who are very much guarding their own jobs. So any change in how you run an institution that means that your job description is going to change is going to be resisted fiercely. One of my favorite examples is that the pulse oximeter, the little clip-on thing you have on your fingertip measuring your blood oxygenation, spread over a span of 10 years across intensive care units around the world. And it was unproblematic because it was yet another beeping device saving lives, and it didn't change the workflow. Meanwhile, laparoscopic surgery, where you don't have to make a big incision, took about a generation of surgeons because you need to work in a different way. And I think the same thing happens to economics and governance. So you get AI. How do you use it? Well, the most obvious thing is, of course, right now that everybody having to write a boring report uses ChatGPT to write reports. And maybe they don't admit it, but most bureaucrats taking these reports are using an LLM to read the report and summarize: did they say anything important here? That creeping modification is going to happen on a larger and larger scale. In many cases, this is a very good thing. You don't need superintelligence to improve governance. You could just have systems approving things that should be obviously approved so you can get the governance running 24/7. You might send the more contentious cases to people higher up. You then end up with this interesting creeping cyborgization of our organizations. So Max Weber, in his famous view of bureaucracy from the turn of the 20th century, was arguing that rationality means that bureaucracy expands into this iron cage, as he vividly described it, because having this neutral organization is effective and implements the political will well. And he would of course say that AI is just continuing this. We're just going to eventually end up with an algocracy.
Instead of having individual bureaucrats deciding things, you replace them with algorithms that can be totally reliable. And I think that is going to happen even though the administrators and bureaucrats are going to fight that tooth and nail. You're going to see the lawyers and doctors very clearly lobbying all the governments to make sure that you always need to have a lawyer and doctor rather than an AI to get legal or medical advice. But what you also get is, of course, that you can optimize things when starting new organizations. So I think the market is probably going to be a place where you see the most radical changes. Because traditional companies, they're basically bureaucracies, they might be slow in changing. There's going to be a lot of middle management that doesn't want to be replaced. They might want to replace those unruly engineers and art directors and other annoying people, but they don't want to change. And they are also good at retaining their position because they're management. But then you have a competing company which might be one of those fired unruly engineers who set himself up as CEO and then has a dozen or a thousand or a million very, very smart virtual employees implementing it. And I think those companies are in the long run going to win. And the long run might actually not be terribly long. That depends a little bit on how competitive the market is. But if you get superintelligence, that means that I should be able to get super managers to handle the AI. I might get super marketers, I might get super engineers, and I might get super advisors. So the only thing that I put in was the will of starting the company and maybe some initial capital. So that might mean that now you get an economy that's full of entities that are doing very smart things. That doesn't necessarily mean that the growth rate goes through the roof because quite often you're constrained by how much supplies you can get.
Those ships coming in from Taiwan are still slowly making their way over the Pacific. And until you can build the super fast ships, which is still going to take a few years, you are going to have that limitation in the economy. In Germany, there are still rules that certain contracts need to be signed and stamped in the proper way with seals. That rule might take years to overturn even if you had the best lawyers you could get in Germany. So I think by 2040, in a surprising way, the world might still look somewhat the same. But to borrow from a very insightful observation in Charles Stross's novel Halting State, where he describes the future Edinburgh: the character points out that this looks like it did 20 years ago, which is kind of our present. But everything is totally different because behind the scenes, the nervous system is running on a much more advanced Internet. It's using various forms of artificial intelligence, and various weird goings on that wouldn't make sense to us in the present are now part of everyday life. And we already see this, of course, when we see people scanning QR codes and spending their time with phones. Twenty years ago, we didn't have that underpinning. We have actually transformed how the society works without changing the material basis that much.
B
I mean, this is an observation that I've had myself and that I've heard others make. Just the fact that life has actually changed a lot since the year 2000, for example. But it doesn't really feel like much has happened in some sense, just because psychologically we are so quick to adapt to a changing circumstance. Just the Internet, for example, we have kind of adapted to that. Do you think there is a chance that we will psychologically adapt to superintelligence in the same way such that we will feel like, okay, we are. We now have material abundance, say, or we have, you know, the world has gotten strange and weird in various ways, but we don't feel like much has changed just because we might have this psychology that kind of reverts to a baseline quite quickly.
A
I think, at least partially, that's definitely going to be true. You're going to go out in your garden and see that tree that has always been there, and it changed a few branches and leaves in the last few years, but you still recognize that same tree, except that now it might very well be online and have its own blog. It's just that these transformations also affect us deeply, because we feel deeply unsettled when the foundations of our existence do change. I'm starting to feel the AGI in an amusing way because my academic survival trick is that I know a little bit about almost any topic. I can riff on almost any subject, which is great. I can run around between different departments and try starting collaborations. This used to be something extremely unique, but now of course ask any LLM and they can riff on any topic too. My advantage over many LLMs might be smaller than that of some of my more specialized colleagues, which I find absolutely hilarious and also deeply unsettling. There is that pit in my stomach. Okay, am I actually going to be useful now? I think I can still be useful for a while, even if Claude and ChatGPT know more trivia about any topic. And that is of course because right now I can still kind of ask the right questions. I know what is important, I know how to connect things. But superintelligence might very well be able to do that. They might actually be better at figuring out what's important. Indeed, I'm very bad at keeping to the really important question. I get sidetracked by fun and curious questions instead. Now the really interesting thing is of course a world where you have superintelligence available. Not listening to advice from it would be a very stupid thing. And it starts, of course, when the manager at the company doesn't listen to the advice of the AI. Well, that's going to be worse compared to actually listening. So it's going to be worse for the stock price.
So another piece of advice is, of course, fire those managers that don't listen to AI advice. The companies that do that are going to be more successful. You get a gradual takeover in some sense. The president that doesn't listen to a superintelligent advisor, somebody combining the diplomatic knowledge of Kissinger with the technical astuteness of Feynman, yeah, that is a president that is going to be at a disadvantage compared to that country where the president actually did listen to the very smart advisor.
B
But do you think the feedback loop with governments is as fast as it is in markets? So there are startups competing with existing companies that can outcompete existing companies? There are maybe in some sense you could say there is an analogy, but there aren't really startup countries. Of course, as I mentioned, there's competition between countries, but it seems rather slow. And it seems like presidents and leaders of countries that aren't implementing AI would be able to survive for longer just because the competitive pressures wouldn't be as strong in governments.
A
I think the inertia is higher. So I think there are, as you say, few startup countries. You find a few interesting examples like Estonia, which in some sense is a startup country because it's a relatively new one. And it also hasn't got too sclerotic a government yet, which tends to happen in human institutions. So one interesting question is whether AI might entrench that tendency, so we might get super powerful institutions that defend themselves in a very clever way in order to not change, or whether instead this adaptation becomes faster because of those super advisors. The first step is they just give advice, and then you get the feedback loop on how quickly you turn that advice into action. And for most political purposes you don't need to do that very quickly. Indeed, most governments take in information at an exceedingly small and slow rate. If we think about America, for example, when people vote every four years, that means that they're sending information about who they want to run the government and what the values should be. But if you think about it as bits per second, it's not that many bits per second. I did a calculation a while ago and I think it was something like half a bit per second. That's not a very impressive input in terms of bandwidth. Now in a market, of course, you get way more information from the prices. In practice, of course, a government actually listens to what people say. But you could imagine using AI to get much higher bandwidth information into government, into markets, into decision making. And this might be particularly important when there is a crisis. Most of the time you want things to follow routine and you don't really care about making fast decisions. But suddenly when something hits the fan, you really want to make a quick decision. So there are these wonderful systems in earthquake prone zones that detect earthquakes and signal to trains to start slowing down.
So by the time the earthquake hits them, the light speed signals have already passed by and told the train to reduce speed, to reduce damage. You can imagine doing the same thing, of course, in a lot of other crisis situations. And at first people will say, yeah, but we will always have a human in the loop. But gradually, of course, the problem is the human is going to be too slow. And this is again where you see the creeping automation. We see that in drone warfare. So right now drones are mostly being flown by people actually maneuvering them directly. But of course there are a lot of controllers you can just run for autopilot flying. And most militaries will say, yeah, and we're definitely not going to want kill orders to be given by the drone. But you have AI systems that might be finding the right angles and kind of requesting, can I shoot now? Can I shoot now? And more and more of the poor drone pilot's job is basically approving things and taking the blame if something goes wrong. And I have certainly heard military people say, yeah, and of course we don't want to remove people from the loop, but if our adversaries do that, we are totally going to have to do that. And even the perception that maybe they're doing it means that now you're developing a system where you can take people out of the loop. And while the drones and warfare side is the really sinister and scary part, I think this goes for a lot of other domains too. If there is a sudden stock market fluctuation, we already have switches that actually stop things. If the stock market moves too fast, that's probably a fairly sensible reaction. But you can imagine economic policies that also kick in if suddenly, in the middle of the night, the interest rate does something crazy. Maybe you shouldn't wait until you've woken up the head of the central bank, but maybe you want some piece of software dealing with that and then waking up the head of the bank.
B
You raised a quite interesting conundrum earlier, which is just what effect will superintelligence have on the change in institutions? So is it the case that if we get superintelligence, we will lock in existing power structures, existing institutions? The companies and governments that are currently in the lead will continue to be in the lead for the foreseeable future? Or is it the case that introducing a technology like superintelligence actually changes the pace at which you see new institutions and new power structures? What do you think is the answer there?
A
So, I don't know. I think certainly existing institutions are going to do their best to remain stable using AI. And there might be economies of scale that make it very easy for them to do it. Military forces have great organizations and enormous resources and they are going to try to maintain their structure. But at the same time, the underpinnings of how society works might also change radically. We have seen how politics has changed over the last few decades because of the introduction of the Internet and social media. And that change was not something that politicians were asking for. It's not like any institution said, okay, great, social media is going to be good for us to maintain our grip. Of course, currently you can imagine current politicians and influencers saying, yeah, social media are great, we understand them, we want them to remain useful. Well, let's prevent other media from emerging. The really interesting question is, is the world big enough and does it have enough openness that new things can emerge that unsettle the existing systems? What happened during the Industrial Revolution was that something out of left field, an improvement in economic productivity, really upset the political institutions. At the start of the Industrial Revolution, Europe was ruled by various kings and queens. At the end of it, it was mostly parliamentary democracies. The kings and queens had lost power and even the roles in society had been transformed. The kings and queens, if they had a crystal ball, would have probably been very much against the Industrial Revolution, even though it was not obviously directed against them. The current kind of social media revolution, again, had there been a crystal ball, I think the 1970s institutions and political parties would have said, no way. We definitely should not have that horrible Internet thing. But at the same time, I think the world is actually better off with it.
Even though we love blaming social media for all the world's internal ills because it gets us off the hook: it's Facebook's fault instead of our own credulity that we believe in conspiracy theories. It's kind of, yeah, it's a non-starter. But the problem might be that you could get surveillance systems, you could get ways of locking in things way more powerfully. Indeed, currently we have many platforms that control cultural production very strongly, and AI is in some sense also doing it. If I try to use an LLM to write a novel, that works really well until I get to the sex scene. Suddenly I can't get any help. And after that, of course, the LLMs I can get access to, at least online, are going to be saying, no, that's too steamy for me, I'm not going to help you. Otherwise, of course, I could run a local one on my local computer, but that's going to be slower and less effective, or I might actually leave out that sex scene. So in some sense we're already getting this very interesting soft entrenchment of certain things. Not because we're living in a society that's really against violence and sex. It's rather that corporate America is very afraid of getting sued and getting bad reviews by allowing that. So we are ending up building in restrictions here. We might end up building in some restrictions where we later find that, okay, this was way too restricted, but now we have no way of getting out of it. In a world where you can't talk about sex because the smart software that is running everything is just gently censoring it or leading the conversation away from the naughty topics, suddenly it becomes very hard to change that restriction.
B
There are other ways in which this is happening. If you use large language models to brainstorm or draft ideas, you will be subtly influenced by the values that are incorporated in the models you're using. If you use it as your starting point, you will be pushed in a certain direction. I think it's different when you use it to critique your own ideas and so on. But it's pretty important, I think, to be aware of the values that you're being influenced by when you use these models. And that is indeed something that is in some sense a power structure that is entrenched in the world at this point. Just because you want to use the best models and, say, the best models are from OpenAI, you are then adopting the values of OpenAI in some way.
A
And the problem is, of course, OpenAI. If you ask them why did you put in these values, they would probably hum and haw and say something about corporate liability and that they want neutrality. They have a set of values kind of given by a kind of typical West Coast American sensibility. But why are they so against having sex? Well, as a European, I would say, yeah, but that's because Americans are actually deep down puritanical about it, and corporate America in particular, because of various aspects of how American laws work and lawsuits work, is very afraid of getting into any naughty stuff, while violence is way more taboo in Europe. If we had the AI companies mainly being based in Europe, you might see them being much more open about talking about sensuality and sex, but even more restrictive about violent stuff. And of course, any society does that. Back in the Victorian era, there were certain topics you couldn't write about; that was overtly or, quite often, more subtly and privately censored. You simply didn't do that. The problem now is that we're putting more and more culture in autonomous systems. And the really scary part is of course that these systems also to some extent distill this. So in a few years, if there are very few texts on the Internet ever daring to mention naughty stuff, that means that in the new training data for AI, text is even less likely to contain references to that. There might be ways around it. Certainly when it comes to making naughty pictures, people have been extremely creative and are running their own Stable Diffusion models to generate endless dirty pictures. And it might very well be that we see that because of scaling. It might be that smaller actors actually can make AI that can compete. It's not given that it's going to remain that the frontier models are the totally dominant ones we are all going to use.
But that is one possibility that has some advantage in terms of safety, because it might be easier to keep a few models safe so people don't make bioweapons and doomsday bombs too easily. But on the other hand, it might be better for the creativity and democratic process if people can make their own open source models. But then we might need other ways of handling that people are coming up with better doomsday weapons in the comfort of their own homes.
B
Are you more optimistic or less optimistic now about AI risk than you were before we saw the current dominant paradigm of large language models? So should we become more optimistic when we see that large language models are the paradigm that's most driving AI progress?
A
I have this mixed feeling. On one hand, the LLMs demonstrated that it seems to be surprisingly easy to get a fair bit of alignment just out of human production of text and ideas. When we started thinking about AI safety back in the late 90s, our models were very much more based on logic programming and reinforcement learning. And it seemed very much that, okay, AI might go off the rails and it might be extremely hard to even get it to understand that there is something called humans in the world. But okay, LLMs actually get that because they're already in some sense embedded in our social world to a very large degree, maybe even to too large a degree. It's not given, of course, just because a language model happily tells us that it's friendly and it wants to follow the law and be ethical, that it actually would do that if it's doing an actual action in the world. But it certainly seems very easy to get a grip on. The problem, however, is that when we started thinking about AI safety at FHI, we were a bit worried about what if we end up with neuromorphic AI built on big neural networks, maybe based on scanned brains, that are very hard to understand, very opaque, and harder to align with human values. So for a long while, many of us felt like this is a reason to maybe stay away from scanning brains and doing brain emulation, which is one of my kind of pet ideas, that it would be a great thing to do. After all, it's a good way for us currently biological humans to become posthuman eventually, assuming a long laundry list of philosophical and scientific assumptions. But we ended up in that world of neuromorphic AI, even though the LLMs are not based so much on biological simulations as just enormous neural networks. We have opaque systems where now we're starting to figure out some ways of figuring out what's going on. Mechanistic interpretability is an exciting field. We're learning so many interesting things about what's going on inside the systems.
And on one hand, okay, there are opportunities here for alignment. On the other hand, it's a very hard problem. But at the same time, I think this mixture of systems might actually allow us to get some grips on it. I still think that most of the risk of stuff going badly wrong comes from multipolar scenarios where you have fairly aligned AI. But it's very useful and not terribly dangerous individually. But you have a society where you have a lot of it. You might find that the society moves away in some crazy direction because we humans are formally in charge. But in practice, it's the software that is actually running the show.
B
How would that look like? Could you give an example there? Or what are you imagining when you, when you say that?
A
So imagine my advisor example. So you have AI advisors in companies and they're giving really good advice. And it's so good that one piece of very good advice is fire the managers who don't listen to the advice, and the companies that do that do much better. And of course, everybody is having their own AI advisors that are really useful. And we use this for more and more tasks. So now a lot of stuff is going around. Everybody has a little swarm of AIs helping them and doing things better. And you get emergent properties from these interactions. So my calendar AI talks to your calendar AI and they realize that we should totally schedule a podcast. This would work really well. Let's suggest to our humans that they ought to meet, and once they say something positive, we're going to immediately find a good part of the calendar. Gradually things get done. So more and more the agency resides on the AI side. And at first this looks very good. And each of the AIs is aligned. They are doing nice stuff for us. Of course, we might be on opposing sides, but again, you might have arbitration AIs helping us. The problem is the collective system here, which actually acts as a big optimizer for something. That something was not set by humans. It's just an emergent property. It's a little bit like how a market optimizes for certain things. But as we know, markets can also have bubbles, markets can have crashes. Markets can start optimizing for things that nobody in the market actually wants. And this happens on electronic timescales, by systems that are much smarter than us and might also have various forms of agency we can't even understand. So we might notice that the world is getting weirder and weirder. At first, everything is nice, and when we ask what's going on, we get good explanations from AI. It's just that it keeps on getting weirder. And gradually we realize, wait a minute, why is the world getting optimized for that?
Why is the nickel price going off so much that AI companies are now colonizing the asteroid belt to get more nickel and nobody can really explain it? And eventually we end up in a world that is totally optimized for things that no human selected and is not valuable to any human, or AI for that matter. Basically, you have this interaction matrix between things that has been randomly set. That one is not aligned, even though all the parts are aligned. This is a very different disaster scenario from the one where you get one superintelligence that has been told to make paperclips or optimize stock market value and taking over the world and optimizing for that. Here we have an emergent system which might not be intelligent or conscious or anything. And we might say, oh, this is horrible. We need to stop this. So we try to take actions, but it might be very hard to coordinate actions against an entire system from inside. So that is kind of my favorite disaster scenario.
B
This is a kind of gradual disempowerment or loss of control of the future that happens in a subtle way. But I'm just wondering if you go back to 1800 or 1900 and you query the people there about the world of today. Might they say, well, things have gone totally off the rails. We are now optimizing for things that we did not intend to optimize for. The world is weird, the world is moving too fast, and so on. Is it the case that we can handle more change, even in the things we're optimizing for, than we might imagine?
A
I think that is very true. I'm a transhumanist. I think that we should be upgrading ourselves as beings. I want to have a bigger brain, I want to be able to think deeper thoughts. I want to stop aging. And many people say, wait a minute, that's weird. Anders, you're kind of a weirdo. And the future you're very enthusiastic about sounds scary to me. I don't want to live in that future of genetically upgraded people. And we might have a disagreement about it, but it's a very human disagreement. The people from before the Industrial Revolution, they might say, you're living in a really crazy weird world, but the weirdness is about human stuff. We might say that many of the institutions we have and the values we have are really good. And then we might have a profound disagreement about, let's say, gender equality and gay rights. And the pre-industrial people would say, oh, that's horribly immoral. And we say no. We actually arrived at that through a long intellectual and cultural discourse. And this is totally valid. The problem with that AI off the rails future I'm describing is that it might not come about because of any sensible discourse, or at least not any human discourse. It's kind of derived from interactions between software, which itself might not even be conscious. It's built originally by humans, but then of course, generation after generation, upgraded by software. It has very little to do with what we would regard as valid and authentic human reasons. Now it might be, of course, that you could say maybe it's still a successful civilization. There are some people, like Jürgen Schmidhuber, who just think that we should let the AI loose across the universe and have the best utility functions compete with each other, and then it's going to be all glorious and beautiful. I'm not so convinced about that. I wish I was as optimistic as he is that we get this natural convergence on the best possible future. I don't think that is natural. 
I think you actually need to work on it. And that means that you actually want to weave our preferences and our discourses into this system in the right way. You want the AI to be aligned not just individually with us, but also that we gradually have an AI aligning itself with our civilization, and that the civilization comes together in the right way. Ideally we should become a kind of cyborg civilization where we both have superintelligence guiding and coordinating us, but we humans are also providing important input in setting the goals and values for this, without that necessarily being just one way. Because I think we could take a lot of ethical advice from smarter entities. But we might also want to have a debate with them about it and actually share the understandings. The problem that might happen is that we set up systems without getting the objectives right, without getting the feedback loops right. And this is of course tremendously hard, because already our normal human systems are mysterious as they are. The way our politics is going wrong all over the place is kind of a good demonstration that even when it's people doing autonomous stuff and talking to each other, it can already go wrong. So we are in trouble. But I think the trouble is more subtle than the paperclip maximizer. That one is still a physically possible risk. And I think we should be concerned about too powerful smart systems. But I think the real threat comes from this kind of coordination failure.
B
So, say it's 2040 and I'm stressed out about the pace of change and the world seems weird to me, and I ask a superintelligent researcher, say, to explain why the world is optimizing in the direction that it's optimizing. That superintelligent AI might be able to give me a fantastic answer, a very convincing answer. An answer so good that it's difficult for me to differentiate between whether I'm being given the real explanation or whether it's simply optimizing for convincing me. If we want a dialogue with future superintelligent AIs, how are we keeping up with them so that we maintain a dialogue about what we would want the world to look like?
A
So that gets to that interesting question about what kind of explanations we can get from AI. So right now, looking at the chain of thought in the LLM that is solving a problem looks pretty illuminating, but I always have my doubts that that is the actual process going on behind the scenes. But then again, that also goes for talking to people. I'm married to a prosecutor who has previously been a judge. And of course, in the legal world, you can't just decide things, you actually need to give reasons, and they need to be laid out in a clear manner, etc. And that formal layout is in some sense the output. In practice, of course, a judge quite often makes a judgment based on a lot more things that are never listed on that piece of paper, but you might still hold them to it: look, you gave that reason. And if that is not valid, then the judgment is not valid. And similarly, you can actually look at an explanation. You can ask things about it, you can look into it. So the Royal Society here in England, their motto is Nullius in verba: don't take our word for it, you need to do experiments. And I think that is also the way a superintelligence and me might have a dialogue in order to test out things. So it might say, look, the reason the world is crazy in this way is this particular economic theorem. And then I might need help walking through that theorem and seeing that it's true. I might also say, I'm going to run that through that proof checker. I can't actually understand economics, but I can check that the mathematics of that proof at least works out. I might do different things to test it. And I think this is actually what is going on when we have a real, authentic dialogue in society. That's rarely about one knockdown argument. What is the argument for gender equality? It's not a single one. 
It's a bookshelf of arguments, some of which are great, some of which are novels, some of which are very formal, some of which are crappy but still compelling, some of which are jokes or songs. There are very many forms, and you can kind of use the multiplicity also to constrain things. Now, one risk is, of course, that with enough superintelligence, you can get that bookshelf of fake arguments. Maybe superintelligences find it very easy to just blather on and make up things that are not true. But I think generally, the constraints of reality make it harder to actually just come up with things, at least about material facts about the world. It might be trickier when we get into the cultural or philosophical realm. It might be easier to make up stories about emotions and what's right and wrong than it might be about atoms in the physical world. But I think many important social things are still linked to observable things. And I think that is the way of actually having authentic, testable discussions with entities that are smarter than us. And similarly to parents talking to kids, a good discussion there typically consists of parents showing the kids how things work. And I think that is actually an important thing we should ask of AI: show me. Show me this. Let me test this. Okay, you showed me a video of this. Okay. I'm running it through this other AI that doesn't know what the question was, to check whether it's fake.
B
Yeah. And maybe that's the path forward. This is a tactic that seems to work with human experts, where you are listening to different experts on a topic where you know less than they do, and you are trying to synthesize what they agree on. You're trying to map out the uncertainties and the disagreements. So we could be in dialogue with multiple different instances of superintelligent AIs that have slightly different values or different starting points, and then perhaps see where there's some convergence, and from that decide what we should believe or what we should do.
A
Yeah, and one problem we often have when talking about superintelligence is that we kind of reify it. We imagine it like some being sitting in the cloud, but you might actually have specialized systems that are in some sense superintelligent in a more narrow domain. So my friend Eric Drexler has been arguing that actually building this agent-like big superintelligence, that's a very dangerous and stupid idea. And actually we don't really want that. We want an ecosystem where we have modules. So we might want to have a planner module that comes up with good plans for a problem, and then an evaluator module that reads plans. And the only thing it cares about is, is this a good plan or not? And then a third one that takes the winning plan and implements it really well. Now, these three systems themselves don't have any agency. They're unlikely to go off and invent bad things because it fits their evil scheme. There are still risks with even this kind of system, so it's not a perfect solution. And I think a world with superintelligence might actually have systems that allow us to get super explanations, and then we might also have super checking of explanations. There are other tools we might want to think about: what cognitive tools would we need in order to live in this world? For example, in a world that is changing very radically and fast, obviously you want to have some kind of guide to help you, or probably several guides. You might actually want a bunch of guides giving you slightly different forms of advice. You want maybe your life coach and your job coach and your intellectual coach, and maybe even sometimes have them debate each other and see, okay, what comes out of that. But we might want to invent these things now, before they even can exist. So when they can exist, we get them as early as possible.
B
Do you think a world with superintelligence is a world where we are better at prediction, and therefore the world is more controllable in a sense? So over the past 500 years, say, we have made a lot of scientific progress. We have more knowledge, we understand the world better, but we are still quite bad at making predictions about the evolution of social systems. What will happen in the economy, what's going to happen between countries, what is the world going to look like in 2050? Do you think superintelligence is enough for us to predict the future much better, or do you think that these predictions about social systems are intractable, such that superintelligence wouldn't be able to make inroads here?
A
I think it's going to be a bit of a mixture. So in my Grand Futures book I talk about extremely long term futures. And one important question is, can we even make predictions? And the answer is yes. In astrophysics this is great. We can make very good predictions about the orbits of the planets. And assuming that we don't start changing the orbits of the planets, that's going to remain tractable.
B
But I mean, maybe that's the entire point here, because if we imagine one of these grand futures, we will begin to affect the physical universe to a larger and larger extent. And so in some sense an increasing share of the universe becomes something that we are influencing and we are a part of a social system. Right. So a world with superintelligence also pushes in the direction of more unpredictability.
A
Exactly. So the interesting part here is that the astrophysics predictions, as long as nobody interferes, are very reliable because you don't get any actors. You have chaotic systems like climate or the atmosphere of the sun, and they are harder to predict. The weather is harder to predict than climate, because in climate we're more interested in the kind of long-term average. But when you get over to the human side, fashions and the stock market, they're anti-predictable. They deliberately try to evade prediction. If I knew what would be cool to wear next year, that would not be the optimal fashion. It needs to be a bit of a surprise. If there was a pattern in the stock market, I could make money by exploiting that, and that makes the pattern disappear. So what happens is basically on the human level, on the cultural level, we get these fundamentally unpredictable events. Now you can sometimes go too far in claiming this is unpredictable. Karl Popper famously argued that macro history was totally bunk. He was mostly having a broadside here against socialism and fascism. But he accidentally also kind of attacked macro history and kind of sunk that ship in academia for several decades. And his main argument was basically: history is driven very much by ideas, but you can't know about an idea before you have it. So hence it's logically impossible to predict these things. The interesting thing, however, is that in his book about this, The Poverty of Historicism, he also talks about social engineering. He points out that actually if you want to have a society that works in a different way in the future, you can make it. You can actually do engineering. It's probably possible to control societies. He didn't think that was desirable, and he was very much kind of ignoring it for the rest of the text. But I think a world with superintelligence might very well design things to be predictable. If you think about how our society works today, it's enormously complex. 
But many of the institutions we have are all about ensuring predictability. There is a reason why quality control and quality assessment is such a big business. We are very upset when the Internet is down a little bit, even though in the past communications were much more rickety. But now we have created very reliable systems and we demand even more reliability. So we create reliability in many dimensions, and ideally we do this in dimensions that matter, where it's actually very good that they're reliable. We want kind of safety and prosperity that reliably gets delivered. We don't want to get nasty surprises in those domains. Then we might say, okay, let culture bloom, let that change in all sorts of different ways. We also want an open-ended future, and that's the tricky part, because we could perhaps lock in values, and that's a very scary prospect, because if some past generation had locked in values, then our current values would never have come into being. Of course, we'd say, that's horrible. If the Romans had decided what life we would be living, that would be a horrible world. Because compassion to them was not a particularly important value. But in our culture, compassion is a really important one and we hold it very highly. So locking in values seems to be bad. Except that there are some values where we actually have good reasons. We might say that maybe we should never have a future civilization that is cruel. We should perhaps permanently close that door, or at least shut it, lock it and make it rather hard to open. We don't want to have the doors open to existential threats, so maybe let's close them and lock them very carefully. But then you probably want to have an open-ended future, because we actually don't know the fundamentals of the universe. Now it might be that in 2040 the superintelligence says, actually we solved it. We figured out the meaning of life and the universe and everything. 
It's in this PDF file, if you're interested in reading it. It's 100,000 pages thick, but there is an executive summary at the start, and we should just implement that. At that point, maybe we should go with that. But I have my doubts that we're going to end up in that future. It's very possible that there is no clear answer, that there are actually just very different options, and then we might want to hedge our bets and keep a diversity of options open. But I do think the predictability of the future is a fascinating thing, because it's becoming more and more of a matter of design. We could lock ourselves into a kind of eternal dystopia with surveillance and AI ensuring that we live our lives in a particular way forever. That sounds like something we would fight against tooth and nail, given our current values. And I think that is worth taking seriously, because superintelligence might lock in things. But I do think that using superintelligence, we can also get those useful locks on the bad doors, while keeping many other things open.
B
Yeah. Do you think predictability is a good frame for how to think about AI safety in the systems we are interacting with? So, for example, I want my AI assistant to be predictable. When I give it a task, I want that task fulfilled. I want basically all the technological systems I'm interacting with to be predictable. And if they are sufficiently distributed in society and kind of segmented from each other, I'm not super worried about the future of society as a whole becoming predictable in a bad way. But I want predictability for the specialized systems I'm interacting with. Do you think that frame is a useful way to think about safety?
A
I think it might be useful, but I think reliability might be an even better one. You want a reliable system and that doesn't mean that it always acts in the same way.
B
Sketch out the difference there for us, actually. What's the difference between predictability and reliability?
A
So in predictability you kind of know what it will do in a future situation. In a reliable system, you know that you can trust what it's going to do in a future situation, but you might not know exactly what it's doing. So that's a bit like having a parent. You can't always predict as a child what your parent will do for you, but if they're nice parents, you know that they're going to try to do something good for you. Of course, in practice this is more complicated because parents are not perfect and reliability and predictability are never absolute things. Generally, I found that when dealing with LLMs, they're very nice and useful for things where I don't quite know what result I want, but right now, when I know exactly what I want, I'm usually quite frustrated. They're quite often not doing exactly what they should and that actually limits the usability.
B
Yeah, I very much agree. It's super difficult to steer an LLM towards exactly the type of goal you want to achieve, I think. But if you have a more open-ended goal, you're often surprised in nice ways by what they can do.
A
And the open-ended goals are also easier to steer, partially because we're not trying to steer as hard. If I try to solve a particular mathematical problem and I find that the solution is going off the rails, it's very hard for me to get the LLM back on track. I might try rerunning it a number of times, but quite often I end up being frustrated and solving it myself, or realizing that maybe I should have split it into smaller subtasks, where each of them is very doable for the LLM. But I think future AI is going to be better at this. The really tricky part is that it needs to be reliable enough that you can leave tasks to it. Right now, I would not book any airline tickets using AI. I want to get to my destination with a very high probability. So the probability that I get there is much better if I do it myself or if I have a friend do it for me. But that's going to change in a few months or years. I think that's very likely. The tricky part here is that with a predictable AI system, you already know what the result is going to be. In some sense that means that you actually have a bit of a limited range. And that's a little bit like Karl Popper's critique about new ideas. You can't really get a new idea out of a system if it's too predictable. You want a reliable system. If I'm brainstorming with it, it might come up with various ideas, including good ideas, but I don't know what they are going to be. So when thinking about society, a predictable society is quite often very bad when something that is outside predictions finally happens. While a reliable society can say: okay, that's a shock to the system, we need to improvise and we need to do it quickly. But if we're too based on everything being predictable, it's brittle.
B
Could you give an example of some system today that's engineered to be reliable? Perhaps in the realm of AI?
A
Well, I think the best example of a reliable system is actually the Internet. So back in the 1970s and 80s, there was an entire genre of emergency posts on mailing lists about, oh no, we found a problem. The Internet is going to be unusable by September. Basically they had discovered some technical problem, and people immediately started working on finding a way around it. The kind of standard story that it was developed from ARPANET to be resistant to nuclear war is a bit exaggerated. But the interesting thing is that the Internet is a heterogeneous system where a lot of different systems have to work together, and there is no assumption that all the other servers work as they should. In fact, some of them are malicious. So you have to take that into account. This produces a fairly robust system. We still do get trouble when some big cloud provider makes the wrong update. Oh dear, chaos ensues. And then everybody says, okay, now we're going to sue each other for that and we're going to change the software so that problem doesn't happen. Not everybody changes the software. It's still fairly heterogeneous, and that is adding a lot of reliability. Similarly, when we think about AI, if I want to solve a problem, I can, for example, run several instances of AI and take their outputs and then try to see which is the best one. That is likely to find a good one if I have a good evaluation metric. In some cases, of course, I might even use another AI to try to make a guess at which one looks best. But it's still not a super reliable system. Many of the big advances more recently on solving hard math Olympiad problems consist of actually running several instances in parallel and then selecting the best results. That is a surprisingly good idea. And it's a bit similar to how our brains are sometimes working in parallel on a problem.
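The "run several instances and pick the best" idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular AI lab does it: `best_of_n`, `p_at_least_one_success`, and the toy `noisy_attempt` and `closeness` functions are hypothetical names, the attempts are assumed to be independent, and the evaluation metric is assumed to be trustworthy.

```python
import random


def best_of_n(generate, score, n):
    """Call `generate` n times and return the highest-scoring candidate."""
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=score)


def p_at_least_one_success(p, n):
    """Chance that at least one of n independent attempts succeeds,
    if each attempt succeeds with probability p."""
    return 1 - (1 - p) ** n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy run is repeatable

    # Hypothetical stand-ins: each "attempt" is a noisy guess at the target 0,
    # and the evaluation metric prefers guesses closer to 0.
    def noisy_attempt():
        return rng.gauss(0.0, 1.0)

    def closeness(guess):
        return -abs(guess)

    for n in (1, 4, 16):
        print(n, round(best_of_n(noisy_attempt, closeness, n), 3))

    # With a 20% success rate per attempt, ten attempts help a lot,
    # but only if the evaluator can actually recognize the success.
    print(p_at_least_one_success(0.2, 10))
```

The second function shows why this works at all: the chance that every attempt fails shrinks geometrically with the number of attempts. The caveat from the conversation still applies, since the whole scheme is only as reliable as the evaluation metric that does the selecting.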
B
Yeah, we might worry that if we're pushing too hard in the direction of reliability, we also thereby make our AIs more agentic. And if we're worried about agents and about which actions they might take, it might be against our interest to make very reliable systems, just because it seems like today AI agents are bottlenecked by being too unreliable to carry out tasks that consist of many distinct steps. So is there a trade-off here, or am I thinking about reliability in the wrong way?
A
No, I think you're right. But there is this interesting problem that if AI always remains unreliable, it's going to be a sideshow. It's going to be able to do some things in some domains, but it's not going to transform the world. In order to become transformative, it needs to be reliable enough to be able to do the tasks that transform the world. So this is of course where we might actually say that maybe we don't want reliability to increase too rapidly, but at the same time, in order to get alignment and safety, you want reliability. You actually want it to reliably be safe. That is something you need it for. If it's unreliable about safety, we have not achieved anything. So typically the length of a task you can do also increases quite strongly as reliability increases. If you have a constant risk per unit of time of going off the rails, the length of a task is of course going to be set by that probability. And as that probability goes down, you can do longer and longer tasks. Of course, there might even be a kind of threshold here. So there is a whole family of theorems in computer science that started with John von Neumann, when he wanted to build a sequel to ENIAC and people told him, you can't build a bigger computer. Look, those radio valves you use, they're breaking all the time. If you build a bigger computer, it's going to be broken all the time. It's not going to work. And von Neumann proved a simple theorem: well, if I replace one logic gate with three logic gates and then have a little majority thing, then it doesn't matter if one of them breaks, and that metagate actually has a lower risk of failing than the original ones. So then you could actually keep on recursively building up this abstract machine and make it larger, and you could get any level of effectiveness. You could actually make the probability of failure as low as you want. And this is similar to channel coding theorems and stuff. 
In quantum computing, if you're below a certain error threshold, you can combine error prone processes in such a way that you get a new process that has a much lower error rate and that way you can push it down arbitrarily far. I believe that something like this might happen with AI. There is a kind of transition in reliability, but once it's reliable enough, you could make this redundant system, which is kind of inefficient, but you could make the reliability go up enormously. And at that point, then the improvement in reliability just makes it smaller and more effective. And that threshold, I don't know where it is, I don't think anybody knows where it is, but I think we might cross it in the next few years, which is both exhilarating and very scary.
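The two quantitative points in this answer, the majority-of-three trick and the link between failure rate and task length, can be made concrete with a small sketch. It rests on simplifying assumptions that are not in the conversation: the three copies fail independently, the voter itself never fails, and the per-step failure probability is constant. The function names are hypothetical. With a perfect voter the break-even point is a failure probability of 0.5; real threshold theorems, where the voting hardware is also faulty, give much stricter thresholds.

```python
def majority_of_three_failure(p):
    """Probability that a 2-of-3 majority vote gives the wrong answer when each
    copy fails independently with probability p (voter assumed perfect)."""
    # Wrong if exactly two copies fail, or if all three fail.
    return 3 * p**2 * (1 - p) + p**3


def nested_failure(p, levels):
    """Failure probability after stacking the majority construction `levels` times."""
    for _ in range(levels):
        p = majority_of_three_failure(p)
    return p


def expected_steps_before_failure(p):
    """Mean number of steps completed before the first failure, given a
    constant failure probability p per step (geometric distribution)."""
    return (1 - p) / p


if __name__ == "__main__":
    for start in (0.1, 0.6):
        chain = [nested_failure(start, k) for k in range(4)]
        print(start, ["%.3g" % q for q in chain])

    for p in (0.1, 0.01, 0.001):
        print(p, expected_steps_before_failure(p))
```

Starting below the break-even point, each extra level of redundancy pushes the failure probability down very quickly; starting above it, redundancy makes things worse. That is the sense in which there is a transition in reliability. The last function shows the task-length side: cutting the per-step failure probability by a factor of ten makes the expected run roughly ten times longer.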
B
And what makes you say that we might cross it in the next few years?
A
Well, people are working very hard on the reliability aspect here, because obviously that is limiting both agency and how well you can solve complex problems. So especially in programming tasks, which is of course what the big AI companies are really interested in, because they want to automate programming. If a programming task is too long, then it's just going to make bad code, and even splitting a programming task into smaller pieces reliably is also hard. If you can reliably do that splitting of tasks and implement it, now you can do coding. Suddenly they don't need that many programmers anymore, and they can start improving their software by having software improve on it. And everything is going to be great. Before that threshold, instead, you just get worse and worse software and it's totally incomprehensible. So that's kind of useless to them. So they are going to be pushing for this. And similarly, I think customers are also going to want to have systems that are reliable and less likely to hallucinate or do everything else that they don't like. So there is a strong push in this direction. The question is, of course, is this easy or hard? So many of the critics of AI say, oh, you can never avoid hallucinations. I have a paper proving that. Or it's always going to be missing things because of this. But generally I don't think these predictions have held up very well. Quite often they're based on older versions of the AI systems, and quite a lot of them are very motivated thinking. People are wishing that there is something that is impossible to do. Besides the hype bubble, there's a kind of cope bubble where people are grasping at straws for why AI is just normal stuff and it's not going to mess up the world in weird ways, and especially not threaten my job. 
But I think what is going to happen is of course that as reliability gets better, AI gets applied to more areas, which is going to be a good source of money for the AI companies, and you're going to find better ways of using AI to control AI, which might actually be good news from an alignment and safety standpoint, because you want better monitoring systems. If you could have a little AI agent watching every single thing and actually raising the alarm when stuff goes off the rails, with some reliability, bingo. That might actually be really useful for safety.
B
As a final topic here, Anders, I want to ask you about your perspective on life, just because you've been researching the future and the future of technology and AI and the very long-term future for decades at this point. I'm wondering if you feel like you perhaps live in two different worlds. Because when you leave your research and go out into your everyday life, it must seem as if you're living in a very old world in which things function in some sense like they always have. You still have to do the dishes, the train isn't on time, and so on. How do you think about this? You said earlier that you have begun feeling the AGI, at least to some extent. So do you feel like you're living in two different worlds, and do you think perhaps the worlds are colliding?
A
I think the worlds are colliding. So my kind of formative decade was the late 1980s and early 1990s. My home computers were getting better. Larger memory, 512 megabytes of ROM. Color, wow. Internet. And then I joined the Extropians mailing list, and we transhumanists were chatting excitedly about the technological singularity somewhere into the 2040s, and the life extension and nanotechnology and space and AI. But with a lot of that, we expected accelerating change and we were really wrong about the speed. So we expected biotechnology to give us life extension much earlier. As a middle-aged transhumanist, I'm of course very interested in life extension. Kind of every gray hair is a reminder: why are we so bad at handling this really complex problem? But of course, in the 2020s, life extension is actually getting to be big business. We're actually having companies and startups that are working on things that work well in slowing aging in the lab, in lab animals. These things are coming. It's just not as fast as I kind of wished back in the 90s. Similarly, we were optimistic about AI, but we didn't expect it to have any breakthroughs. We were, like everybody else, surprised by the 2010s, when suddenly things started working, for reasons that to this day remain kind of unclear: why the large neural networks have the powers they do. Space: we assumed that, oh, it's going to arrive once we have nanotechnology. Because right now, of course, nothing is happening because of NASA and the Russians. They're big entrenched bureaucracies. They're not going to be building anything. But once the new materials arrive... And suddenly we get spaceships going off, made out of stainless steel instead. Okay, it was a matter of control technology for making them reusable, and entrepreneurship. Okay, this shows that being a futurist means that your predictions are not necessarily getting there in the right ordering. 
What you want to understand is rather the dynamics. And the interesting part is, of course, that today I'm living in a world that is actually a bit like the future I envisioned in the 90s. I literally have a virtual reality headset lying around on a desk in this room, way better than the VR headset I was using in the 90s, connected to a mainframe computer. And this is a consumer product, and I'm honestly not certain what use I should have for it, except computer games, which is also fascinating. The life extension is happening. We're getting AI, we're getting space. Nanotechnology in the Drexlerian sense might have been dormant for a long while, but we're getting protein engineering empowered by AI that is getting there. And of course this is mostly not part of my everyday life. When I go out and get the bus, I'm not using any advanced technology. Except of course that actually the credit card that I'm swiping is using an RFID chip, and the bus actually has Internet. And actually on the bus I'm interacting with the rest of the world using a little device that would have been a supercomputer back during my youth, and that also has a lot of the powers of the wearable computer system that I built in the 1990s, which made me look like an extra from a low-budget Star Trek ripoff. Well, my smartphone is much better than that system, and it's not just because of Moore's Law, it's because of better app design, being wirelessly connected, etc. So I think in many ways we are living in the future, and we're just not recognizing it because we are so quick at adapting. And to some degree I find my job as a middle-aged futurist is telling the younger people that actually things were really different just a few decades ago. And part of this is of course the grumpy old man. Oh, back in my day, my computer had one kilobyte of memory and you had to connect it to the television set. 
And I had to stop programming at 5 o'clock, because then state television in Sweden turned on the radio transmitter and it was too much noise on the screen. That might be the grumpy old man, but it's also useful to recognize how rapidly things change, as well as to ask: what didn't work in the 90s, and why does it work now? Trying to figure that one out. And some of these things, maybe they haven't changed and it still doesn't work, and maybe some fundamentals have changed. We can actually learn from the past. One of the coolest things about a rapidly changing world is that you actually get to transmit information to the future in the hope of affecting the future. We want to have foresight, and sometimes you get that by looking at history and realizing how weird the past was. Sometimes you need to think entirely new thoughts and just try entirely new things and get surprised by them. Nobody expected deep neural networks to take off like they did. Nobody in the field. People were certainly hoping that they might work, but nobody could get their expectations up. Nobody expected transformers and LLMs to be that powerful. And we got that surprise. We should expect more surprises. Indeed, I expect the superintelligent systems of the future to get a lot of surprises. Hopefully mostly welcome surprises.
B
Makes a lot of sense. Anders, thanks for chatting with me. It's been great.
A
Thank you. This was great.
Episode Date: July 11, 2025
Host: Gus Docker (B)
Guest: Anders Sandberg (A), Senior Research Fellow at the Future of Humanity Institute, Oxford
This episode delves into questions rarely explored in depth: What happens after the arrival of superintelligent AI? Host Gus Docker and futurist philosopher Anders Sandberg discuss not just the technical and economic ramifications, but the deep social, ethical, and existential themes shaping humanity’s future. The conversation covers post-scarcity economics, status and social psychology, the pace and nature of institutional change, the limits of predictability in a transformed world, risks of misalignment, and the subtle hazards of emergent AI-driven systems. Interwoven throughout are reflections on adaptation, progress, and the lessons of past and future histories.
[00:56–02:25]
"These days when people ask what I am, I say, some kind of futurist philosopher, something, something."
— Anders Sandberg [01:13]
[05:23–11:56]
"Even in a world of material and service post-scarcity, there are still going to be some things that are a zero-sum game. ... There are social zero-sum games. Who gets to be the coolest at the party?"
— Anders Sandberg [09:41]
[11:56–16:00]
"People are generally very happy with thinking about, oh, learning languages better. Better memory. ... Becoming kinder? ... Only 9% wanted to take a hypothetical pill that made them more kind."
— Anders Sandberg [13:21]
[16:55–23:48]
Superintelligence Can't Escape Physics:
Data Centers in Space:
"If we were to increase our energy consumption by a factor of 100, we would get heating, no matter what we did with the carbon dioxide. ... There is a limit on how much you can do without starting to overheat the Earth."
— Anders Sandberg [18:46]
[24:43–28:42]
"A technosphere is always going to outcompete the biosphere because it has this wider range. … But who's in charge of that technosphere? What are the economics and goals guiding what gets built?"
— Anders Sandberg [24:44]
[28:42–33:17]
"While we're certainly limited by the laws of physics ... Why are we going [to Andromeda]? That's a cultural question."
— Anders Sandberg [29:51]
[33:17–40:40]
"There is a kind of ratchet effect on the microscale. ... From that, you get patterns emerging. ... So there are these very big trends."
— Anders Sandberg [34:28]
[40:40–50:02]
Inertia vs. Disruption:
Government vs. Markets:
"The market is probably going to be a place where you see the most radical changes. ... You might have a competing company which ... has a dozen or a thousand or a million very, very smart virtual employees."
— Anders Sandberg [42:53]
[46:27–50:02]
"We are so quick to adapt to a changing circumstance. ... Just the Internet, for example, we have ... kind of adapted to that."
— Gus Docker [46:27]
"I'm starting to feel the AGI in an amusing way because my academic survival trick is that I know a little bit about almost any topic ... but now of course ask any LLM and they can riff on any topic too."
— Anders Sandberg [47:56]
[54:42–61:54]
"We're already getting this very interesting soft entrenchment of certain things ... not because we're living in a society that's really against violence and sex. It's rather that corporate America is very afraid of getting sued and getting bad reviews by allowing that."
— Anders Sandberg [56:37]
[62:27–65:31]
[65:31–72:34]
"Ideally, we should become a kind of cyborg civilization where we both have superintelligence guiding and coordinating us—But we humans are also providing important input in setting the goals and values for this, without necessarily that being just one way."
— Anders Sandberg [70:22]
[72:34–77:41]
[79:35–81:20]
"We could lock ourselves into a kind of eternal dystopia with surveillance and AI ensuring that we live our lives in a particular way forever. That sounds like something we're fighting against tooth and nail, given our current values."
— Anders Sandberg [85:49]
[87:16–88:44]
"In predictability you kind of know what it will do in a future situation. In a reliable system, you know that you can trust what it’s going to do ... but you might not know exactly what it's doing."
— Anders Sandberg [87:33]
[93:17–95:59]
"If you have a constant risk per unit of time of going off the rails ... as that probability goes down, you can do longer and longer tasks."
— Anders Sandberg [94:19]
[98:27–104:47]
"We are living in the future and we're just not recognizing it because we are so quick at adapting."
— Anders Sandberg [101:53]
"We should expect more surprises. Indeed. I expect the superintelligent systems of the future to get a lot of surprises. Hopefully mostly welcome surprises."
— Anders Sandberg [104:32]