
Esben Kran joins the podcast to discuss why securing AGI requires more than traditional cybersecurity, exploring new attack surfaces, adaptive malware, and the societal shifts needed for resilient defenses. We cover protocols for safe agent communication,
Esben Kran
Absolutely foundational to AIs being useful and functional for use in society is the fact that they are secure. If they are not secure, it's ridiculous to even try to use them for any commercial activity, any national security activity, or anything else. I think something that is actually worse and more dangerous than AI risk right now, and this might come as a surprise, it is the cult of inevitability. Okay, it's inevitable that we won't be able to convince politicians, but you never talk with any politicians. Like, come on. And the same now of like, oh, we cannot pause. We cannot create this distributed security system, and so on and so forth. There's like two ways you can be optimistic. One is by not realizing the danger. Another one is by deluding yourself. And then there's a third option here, which is actually see what is the realistic version of what's going to happen over the next years and how can we shape that ourselves.
Gus Docker
Welcome to the Future of Life Institute podcast. My name is Gus Docker and I'm here with Esben Kran, who is the co-director of Apart Research. Esben, welcome to the podcast.
Esben Kran
Thank you so much, Gus.
Gus Docker
Fantastic. All right. You have a wonderful essay on AGI security and what's needed for security in a world with AGI. So first, maybe sketch out for us what's the difference between cybersecurity as we know it today and defending against AGI-level threats?
Esben Kran
Well, largely, I think people and listeners of this podcast will get what I mean when I say AI risk and the potential risks that come from introducing general intelligence. In my view, this also comes with a lot of implications for security and the defenses we need to put in place in institutions, in the systems, and in the algorithms themselves that run our society in various ways. And specifically, we are talking about a new paradigm. Everyone's familiar with cybersecurity: it's this layer you put on top, or this thing you install in your firmware, that someone else runs or that you run or that a team inside your company runs. This new paradigm is really about embedding security into the foundations of every single thing we're building. And in my view, it's also a question of rebuilding our societal infrastructure. It's about rebuilding it in a way that is compatible with introducing both very, very comprehensive and advanced tool AI, and also controlling and monitoring various general intelligences. So that's really the foundation of that. It's about rebuilding the societal infrastructure in a new way that's more sustainable in the face of this and also realizing the many new attack vectors that come with AI.
Gus Docker
Yeah. I think a good way to show what we're talking about here is to discuss sentware, which is something quite different from a traditional virus. So maybe explain what sentware means.
Esben Kran
Yeah, so the concept of sentware was something I formulated in an earlier exploration of this topic, which is really about how we've seen all these various viruses and malware and Trojan horses, et cetera, come in and do a lot of damage. So I think it's been over a hundred billion dollars for something like NotPetya and some of the other viruses and malware that have been released, and those are dumb malware. They're dumb viruses that replicate and propagate through computer networks through extremely simple programming. Now, what we will see in the future, as the natural evolution of these, is that it will be so easy to give these viruses and this malware sentience in one way or another. And you can debate whether it's sentient or not; behaviorally it doesn't really matter. It's about whether it self-improves, whether it can manipulate users and humans, people in the system, while it also improves its ability to do cyber offense. So this is just one example where cybersecurity itself is very much at risk. Sentware is then this sentient malware, as I called it in the post itself.
Gus Docker
So what we're talking about here is software that is adapting to new situations, trying different options, that is much more flexible than what we know from traditional viruses, and therefore much more difficult for us to defend against. So what is it that we need to defend against these types of threats? Could you sketch out the different parts of the infrastructure that we're interested in building up?
Esben Kran
Yeah, so I think largely we are looking at needing a new type of defense at every single level. It's going to be on the societal level, it's going to be on the individual level, it's going to be everything in between. There are also going to be completely new attack surfaces, as it's called in cybersecurity: new ways that we can now be compromised. So this includes cognitive attacks. It includes information stream control, the ways that companies that run the AIs today can actually control what information you have. It's the types of cognitive attacks that you see when you become very infatuated with a potential AI girlfriend or boyfriend. And it's also the manipulation of our democracies and society beyond just all of the individual cybersecurity concerns.
Gus Docker
So a kind of extreme version of what we see today with social media algorithms, taken to the max.
Esben Kran
Exactly. I think very recently Sam Altman released a post mentioning that social media algorithms were the first misaligned AI. And it's a great exercise because I think society has woken up to the fact that this is the case and has engaged quite deeply with this topic before we've gotten to this point with AI as well, and hopefully that transfers over too. Of course, we're not just talking about personal and democratic security. We're also talking about the potential risks that RAND has previously reported in some of their work listing out the national security risks of AI. So we're not just constrained to the personal and societal level. We're also talking about the personal security and the actual security of all citizens across the world, in terms of individual actors getting access to the ability to create bioweapons, or the ability for models themselves, for example very, very complex versions of sentware, to compromise democracies and potentially financial markets and much more like this. And there's a lot of literature on the web on this, of course, so I recommend people read it.
Gus Docker
Do you foresee the solutions here being personal or being implemented at the societal level? Of course it would be optimal if we could implement this in a universal sense. Do you think it's possible to defend yourself personally against this level of threat, as we might today defend against various kinds of cybersecurity threats? People can adopt security practices that protect themselves. Is it possible to be personally protected from AGI-level threats?
Esben Kran
So today it is probably not. I've talked with a few founders and engineers that are working on methods to democratize personal security as well. So you can purchase a service, security as a service fundamentally, where you get the same level of security as someone who's very rich and might have like five people maintaining their personal security, or Elon Musk having a group of bodyguards or something like this, but in a way that utilizes digital intelligences to actually make that much cheaper. So we are not scared, I think, to actually use the technology itself for our security. And this is where personal security can become much cheaper too. I think we have to be realistic that this only covers a few of the potential issues and threats that I mentioned, and that a lot of it will be at the societal scale. So we are talking about, for example, the energy infrastructure build-out right now. The different data centers and compute centers that are being built need a completely different foundation for security than they've had before. You know, we talk about it in terms of safety levels. So SL3, SL4, SL5: how safe does it need to be? Already at SL3, you want it to be secure against national adversaries. Whereas SL5 is this new concept coming up now, and I think it's inevitable that every single data center that runs any type of frontier AI needs to be SL5. And so that is definitely societal scale. You've seen the size of the $500 billion investment in Stargate, for example, which is a good case for the type of investment that's also needed for a similar level of security.
Gus Docker
Yeah, and I guess there's a limit to how protected you can be personally if society is crumbling all around you, or if you have widespread chaos. And so at a certain point, we must admit that we need to solve these problems collectively. But maybe sketch this out: you mentioned protecting yourself cognitively and protecting your, I guess you called it, information stream. So not being manipulated by language models, for example, in today's case. But what are other categories where we need protection?
Esben Kran
So of course we need protection on the cyber level as well. We also need physical protection. So when you talk about the data centers, for example, you talk about building fences and having proper monitoring. When you talk about that in the individual case, there's also a series of technologies that need to exist. I think much of it will also look algorithmic in nature. I usually have this way of framing it where you don't want to patch AGI security on afterwards. You want to make it foundational, you want to put it into everything. And this is a bit of an interesting way to look at it: I have this vision where every middle school child knows about fundamental cryptography because they have to. Because they have to know which fiber optic cables running under the ocean are the ones we can trust and which ones are compromised. How can we separate various parts of the Internet very quickly to make sure that compromised areas are shut down within milliseconds through a decentralized monitoring and verification scheme? These kinds of things, I think, need to be imagined. And there's a lot of pure innovation needed in there to understand what all the threats actually look like, and then provide realistic defenses against them. And realistic means very comprehensive and potentially societal scale, as I mentioned.
Gus Docker
And probably redundant, too. So different layers of security, where if one fails, you have the other to make sure the entire system doesn't collapse. These seem like government-level issues where you would need government-level effort and perhaps government-level funding to solve these problems. But at the moment, at least, much of the funding is private. Do you think private funding can scale to meet the level of security we need?
Esben Kran
I think the big problem with private investment is mostly about its incentives and not its size. The fun fact here is that Stargate as a project, or the yearly investment into data centers, is an order of magnitude or more the size of the whole Apollo program. And so these are just massive investments, investments at the scale of society. There's this whole point that media today and journalists are very, very number-blind: 500 billion, oh yeah, that's the same as 500 million. No, no, no, it is much, much more. Right. This is an absolutely insane level of investment that has no precedent in history. And so we are already at the stage where private companies are competing with the government in terms of pure numbers and pure size. And there are other issues with this, but that's beyond the scope of this chat. Specifically, the problem is that their incentives are both to create stronger AI and to create inference compute. It is not to make the data center secure and it is not to make the AI secure. That is not what they will earn money on. Of course, I will repeat again and again that absolutely foundational to AIs being useful and functional for use in society is the fact that they are secure. If they are not secure, it's ridiculous to even try to use them for any commercial activity, any national security activity, or anything else. And it would only happen as a result of a race. Which is also why I think this needs to be a very, very cross-border issue and a very large international negotiation question as well.
Gus Docker
There is the issue of when we try to monitor AI development, when we try to understand what's going on, to gather information, we risk creating a system of surveillance, especially when we involve governments in this project. And so part of AGI security I think you write about involves AGI privacy also. So perhaps let's start with the question of how is it that the AI oversight practices that we are interested in implementing today might lead to a form of surveillance that we're not interested in?
Esben Kran
Yeah, I think many of the governance proposals are of course taking the risk at face value and thinking realistically, which means that you have to implement very, very strong controls to make sure that people can't misuse the models, for example for creating bioweapons or for manufacturing explosives or various dangerous things like this. This also means that we're at risk of creating an even stronger surveillance state. You had the post-9/11 justification for surveillance being that there are more terrorists. Now there are fewer terrorists, but they have much more power individually. So you need much stronger and much deeper surveillance of every single individual if you actually want to have proper governance. The opportunity we have right now and during the next couple of years is that the supply chains of AI and the released AI are constrained enough and focused enough that it's a few actors that we need to convince and work with to create a system where we can avoid a surveillance state. The classic example is that every single chip comes out of this one factory in Taiwan, TSMC, and now a couple more, but still the exact same people. And all the machines that TSMC uses come from one factory in the Netherlands, ASML. And this is the classic: yeah, there are two companies that are foundational to this industry, and of course China is now developing alternatives here, but they're still a bit behind. And so we can still discuss and try to constrain this before there are too many, for example, open source models that are extremely capable on the market, or just on Hugging Face or similar, that then create much more powerful non-state actors who can run them on their own GPUs. You know, they can just own 10 of those. And suddenly you do need a surveillance state to actually make sure that the risks are constrained. And of course we don't want this, because there are so many issues with surveillance states that come up from this as well.
Gus Docker
Yeah. And what's the alternative then, if we're really racing towards a world in which individuals are empowered to impose massive risks on basically everyone? We could talk about making explosives or creating engineered pandemics, and there are many examples of what might go wrong here. If we're racing towards that world, what is the alternative to intense surveillance in a centralized fashion?
Esben Kran
So you hit the nail on the head there. It's in a centralized fashion if it's a surveillance state. And so you need a lot of trust in the central actor and the central actor needs to be extremely competent. And there's this single point of failure effect that you need to mitigate. And so what's the non-centralized version? It is a decentralized version. The classic story here that I like a lot is the story of the Internet, the early story of encryption. Today, whenever you go on a website, you have your HTTPS connection, where the S stands for secure, right? And it is this connection that makes sure that your information is encrypted when you transfer it and that no one can read it on its way to the server. This is very unique. It's a weird thing that governments allowed this. And the early story of this is basically that the government wanted to stop it, wanted to make it illegal to use encryption, because they wanted centralized control and surveillance of all communications, because they thought, you know, if you have something to hide then you're functionally a criminal. And the banks were then like, hey, we need to secure our financial transactions. And this very, very strong financial player then created the incentives and the lobbying power to implement all these open source algorithms for encryption. I think it's the same now. We'll see that unless we want non-functional AI that we cannot use, we need to design these new types of secure transmission systems, these new types of decentralized verification, zero-knowledge proofs of capabilities, various capability-based constraints, inference-time verification, and so much more to actually secure this stack. Every single one of our computers today is part of the security of the web. Whereas in the case where it's centralized control, it is one server on government grounds that is responsible for all security. This does not seem sustainable. 
We've, through this distributed network and peer to peer network of the World Wide Web, we've created something extremely unique and I think we can repeat that success. And some of the Web3 work is of course great explorations of this and you can critique it however much you want, but some of it is actually relatively good for these types of digital contracts between agents and anonymous peer to peer interaction and data management and verifiable transactions. And so you can imagine that going much, much deeper now and at a much faster rate than we would otherwise anticipate.
Gus Docker
I mean, you talk about the Internet, and for me it's kind of an open question how secure the Internet is, how resilient the Internet is. How do you see this? Is it coherent to talk about taking the Internet down or the Internet going down, given that it's a distributed network? And also, isn't the Internet stack very janky? It's complex and it's maintained by random people all over the world. So is this the model we want for controlling or steering AGI? Because you can imagine people listening to this and thinking what we actually need is strict government control. We need something like the Manhattan Project. We need this to be secret, we need this to be centralized. We need to control information in and out. Yeah. Sketch out the two visions and explain why the distributed vision is the way forward.
Esben Kran
So I think the distributed vision is, in a world where open source AI models are public, by default the more secure solution. I do think that there are strong requirements for how it's developed. And you are also hitting on something here in terms of what the foundation of the web is. Many cybersecurity engineers and professionals today will tell you that, hey, the Internet is already great for agent management. The actual security protocols are quite fantastic and very unique. The distributed nature of this is much more robust than I think people anticipate. For example, when you've seen potential breaches of open source software, you have, you know, multi-year attempts at getting in as core contributors to open source repos that are foundational to the web or foundational to other software, which are then detected at the last second because there is this distributed control of the code base. I think the worry would be that you have a centralized code base, for example within Microsoft or something, that then doesn't have this type of verification. Because you can assume now that every single piece of software that has a vulnerability will be breached. It's this question of attacks as default instead of attacks as occurrence. And this is what we need to be ready for. And that requires much more scrutiny of every single part of the stack than before.
Gus Docker
Why do you say attacks as a default as opposed to attacks as something that occurs once in a while?
Esben Kran
Well, I have another piece that lays this out as well, which is called the Expensive Internet Hypothesis. And basically it's about how every single interaction will, with high probability, be a potential attack. And so you need both counterattack and defense to work in parallel. The example I use in there is an email client. Today, maybe 5 out of 20 of your emails might be something akin to junk or spam, some inbound you don't want, and maybe one or two out of 100 will be an actual scam that tries to extract and extort money from you. In this case we're actually going to see a much, much more competent attack that is going to use all my personal information, all the information there is about me on the web, to make an attack. And then you're going to see my defense AI do the opposite, right? Figure out whether these links are correct, what the links do, et cetera, and then try to constrain the attack itself. And then you can have potential counterattacks, where suddenly every single email begins costing maybe 4 cents to receive instead of today, where it's like 0.0000 something cents, right? And that is the kind of world where much more control is necessary, and much more of this multi-interaction attacks-as-default rather than attacks-as-occurrence.
Gus Docker
Do you think the vision of distributed safety is easy to sell to governments and companies? Actually, you tell me which direction we are moving in right now. The nature of AI right now is such that it requires massive investment in hardware. This lends itself in some sense to large companies and perhaps even governments at this point. How is the distributed side of things going? And do you think that leading AI companies like OpenAI, Anthropic, Google DeepMind, and so on can be convinced of the distributed safety vision?
Esben Kran
I think it'll be a hard sell, and I am not idealistic here. I think we need to do the thing that works, and I do think distributed security works better when we are talking about a system that has hundreds of thousands of devices in it. I do also think that if you have five devices or five agents, like TSMC, ASML, and Nvidia as part of the supply chain, it's much easier to handle that locally and in a centralized manner, right? In our case I do think that it's very obvious that government is going to fight against it by default. On the other side, why did it become standard to use encryption on the web in the 80s? And what are we looking at today? I think all companies across the world, right, every single CEO and president of the Fortune 100 companies, are like, we need to implement AI everywhere. And every single one of them has their compliance department shouting no, no, no, no, no. You are not getting to implement any of this because it's not going to be insurable. We're not going to be able to get any money if it destroys anything. We're going to make legally binding promises to people about selling a car in a chat box, for example. I think that's the classic example. And so you're looking at a world where you sort of have the same incentives as you had with the banks doing financial transactions. Today, all the companies are like, damn, we want to use this, but we cannot. And therefore, we need some sort of very, very robust security that is much stronger than what a centralized entity could command, basically.
Gus Docker
It would also solve some of the liability issues and some of the insurability issues. You can't just say that we've now contracted with this company and we are using their AI agent service as part of our sales process. You need something that's more neutral in a sense, and that can be checked by technical experts, where you have some form of consensus that this is something that works and something that can be broadly trusted. This is a bit of a downer question, but is it too late? Do we have time to steer or to change course in the direction of more distributed safety?
Esben Kran
Well, I am generally optimistic, in the sense that if everyone listening to this podcast right now takes action towards this in every single position they're in, in every single way they can, then we could get there, right? If people take it seriously now, then we can get there. But I think something that is actually worse and more dangerous than AI risk right now, and this, you know, might come as a surprise, is the cult of inevitability. Like, okay, it's inevitable that we won't be able to convince politicians. That was an issue 10 years ago. You know, we can't convince politicians. But you never talk with any politicians. Like, come on. And the same now: oh, we cannot pause, we cannot create this distributed security system, and so on and so forth. I think a historical example here that also brings me more optimism is ozone layer depletion back in the last century, where suddenly you're like, oh, there's this one scientist that figured out that the ozone layer is depleting because of these gases. That's a bit weird. Let's all agree not to use those gases anymore. Okay, shake hands. Solved. This is a massive global win, right? This is something that happened, and it took a few years and so on. But one of the key parts of that historical example is that you actually had alternatives that were better and sometimes cheaper as well. Today we need that too, right? An actual alternative in every single position. So every single CEO, every single CTO, CISO, every single company, government, individual, and their research staff, they see this issue now, or they'll see it over the next couple of years. And so if there is an open source library that provides the solution, then they're going to use it. And I think a very good example is Cloudflare as well. Cloudflare commands a very large fraction of the total Internet traffic, which goes through their servers. 
And they are like a massive boon. I'm a massive fan of that company because they charge nothing and they ensure the web is secure in so many different ways.
Gus Docker
You might want to explain what Cloudflare is just for the audience.
Esben Kran
Yeah, Cloudflare is a private company that manages your Internet traffic. So sometimes you might have visited a website and it says, oh, we're just making sure you're a human, and you click the box there. That is Cloudflare. That's Cloudflare registering that there's suddenly a surge of visitors, which may be a bunch of bots trying to destroy a website or something similar. Okay, we're just going to check every person coming in now and then we'll remove the check afterwards. This has probably saved millions of servers at this point already. And it's relatively cheap. It runs every website on the planet, basically. Not actually, but it runs a lot of the traffic to them, and they have a lot of different services. And I think they're also one of the pioneers in this AI control paradigm that we're talking about here, AGI security. Because they have such a voice, right, they've been creating this agent labyrinth concept, this technology where basically we as website owners are like, I don't want agents to trawl through my website. I don't want them to take all that data for pre-training data or for their chatbots. Okay, well, there is a flag you have on websites to make sure that people know that you don't want to be scraped. Obviously all the companies ignore this. Like, if they download every illegal book, they're also ignoring that. And so they've generally been ignoring it throughout the last years as well. And so what Cloudflare has developed is this way where if you have that flag, then any agent that comes in, and that's something they can detect because of their network classification abilities, will be sent through a content labyrinth that is just fake data, and the agent will just look at this as a real website and get all this trash data. 
And so suddenly you have a very large price for scraping a website that has said it doesn't want to be scraped. And these are the kinds of defenses that you'd need, right? The stack that makes that possible is some of this as well.
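As a rough illustration of the "flag" mentioned in this answer, assuming it refers to the robots.txt convention, the sketch below uses Python's standard `urllib.robotparser` to show how a well-behaved crawler would check that flag before fetching a page. The bot name `ExampleAIBot`, the rules, and the helper `may_fetch` are hypothetical and chosen only for illustration. This is not Cloudflare's implementation; it only shows the opt-out signal that, as described above, scrapers can simply ignore.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named AI crawler is asked to stay out,
# every other user agent is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeBrowser"):
        print(agent, may_fetch(agent, "https://example.com/articles/1"))
```

Nothing in this check is enforced: the crawler decides whether to honor the answer, which is why a defense that raises the cost of ignoring it, like the labyrinth described above, sits on the server side.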
Gus Docker
Yeah, you need like this in ten or one hundred or a thousand different variations of something like this. Paint us a picture of the upside here. What is the positive vision if we actually, if we succeed in creating distributed safety?
Esben Kran
Well, I think it depends exactly what you'll mean by upside, of course. But if we're able to create this distributed system also where value flows from the individual humans, like citizens of societies, into the operation of whatever power controls the governments at that point, then we can actually create a world where people are secure. People need to be very, very aware of their security and aware of the foundations of the infrastructure we work with. And it needs to be managed extremely well, but we can actually continue persisting and like, live a good life potentially. And. And then comes in all of the various essays that the AGI company executives have of course, been promising where, oh, the second we have AGI, everything will be perfect and there will be flowers everywhere and no one will be suffering, etc. Etc. And I'm hopeful of that. But I think realistically it's going to look much more like a type of cypherpunk future. If anyone's familiar with that, you're familiar with cyberpunk itself. A cypherpunk is the equivalent, but for cryptography and for like these protocols of how the Internet works that was also developed around the same time. And this is a world where, as I mentioned, like all middle school students, they learn their basic cryptography and, you know, the necessary things they need to navigate the web. And suddenly you'll see, yeah, okay, 50% of the world's servers have been shut down within milliseconds. And yes, we can't trust these cables anymore. Like, like that kind of thing is just. It's very, very different from today. I think realistically it's going to look somewhat like it. And then I think we have to be aware that by default, a lot of the worlds that are in the future, it's not like either we have catastrophic risk that may disrupt and destroy societies, or it is utopian, beautiful and we can all live freely and happily. 
There's all of this gray zone in between that I think is relatively probable and that we also have to fight against, which is simply a question of, like, who controls that power? Is it autocracies? Is it, you know, dictators? And are humans part of that? Like, are citizens part of that equation? Do we have some sort of like, like, as AGI becomes able to replace me, do we have a way that I can still be of value to the society? Right. Be of value in a way that I will receive money? Because I think many people hope that, oh, we can all just have universal basic income as well and live happily ever after. But in reality, it would only come if humans have value to government and have value to the ones in power. And you always have to, of course, be mistrustful of power, even in a government that is extremely competent. So even in Denmark, for example, like, everyone complains all the time about very, very specific things in government because those complaining actually then, you know, things get changed as a result. But it is still one of the most functional governments in the world, given that it's, you know, a monopoly for the citizens themselves. So this kind of thing we need to make sure is kept in check. We need to make sure that we continually iterate on a type of, like, power dynamic that we can be happy with to avoid some of the very bad gray zones as well.
Gus Docker
There's something concerning about the incentives in a world in which people are no longer needed for their labor. Perhaps human labor is still valuable to some extent, but its value has decreased massively. Therefore, there's not a lot to tax there. You can't generate tax revenue from taxing human labor anymore, at least not to the same extent. What incentives do governments have in that world to provide for their citizens and to make sure that citizens have their rights preserved and so on? That is a concerning question. Perhaps we can remain valuable as data sources, perhaps as allocators of capital, perhaps we can provide preferences, because that's also valuable, at least to a certain extent. If we can remain relevant, that's something that's valuable to think about. How do you think about us remaining valuable? Because it's not just about us being secure when we navigate the web. It's also about what it is that we can provide to machines that can increasingly do everything that we can.
Espen Kran
Yeah. And I think some of the answers have been mentioned by, for example, Lukas Pedersson and Luke Drago and Rudolf Laine in various different posts on the web and in Time magazine and so on. These are ways that we are entering a sort of intelligence curse where humans won't have a lot of value as labor in the future. And this seems to be relatively straightforward now. Then you look at today's society, and you look at which jobs require labor and which jobs you wouldn't classify as labor. And you might say musicians, athletes, soccer players, you know, all this stuff that isn't labor specifically. It's not like a trade for value, but it is a trade for a sort of cultural or aesthetic value itself. And obviously the industrial society, reducing the number of people required for sustenance, for food, has meant that we have many more artists, we have whole museums, we have beautiful statues, we have all this stuff. And similarly, that might be the place where humans can play a specific role for the value of being human itself. Another example I've seen is the example of being an existence proof for other humans. So if you meet a very, very enlightened monk, for example, who is just happy all the time and very, very engaged and very lovely to chat with, is that a case where there's value in and of itself in that person being a human who has reached that state, showing that I, as an agent that is similar in nature, can reach that state as well? These kinds of alternative value creation are very, very interesting and I think worth exploring. Then, being realistic, there are ways that citizens are not necessarily by default going to be powerful here. So if you look at North Korea, this is a system where there's a systemic oppression of like 90% of the population that is outside the party proper. And you keep them at a low level of sustenance.
And people believe in conflict and in hunger itself because this is very useful to control a population. They can't riot if they're hungry and if they think there are enemies that the party is guarding against. Then in the US, in democracies and so on, you have a system where there's a Constitution, there's a sort of rulebook, you can call it. Legalistic philosophy is very, very important for our future as well. The legal code and so on is very important for us making sure this goes well. And hopefully we can have AGI respect that too, and the companies themselves, of course, without being idealists. Again, the Constitution protects the rights of the citizens to vote and to do a lot of different things that are great. And if we didn't have the Constitution, and if the system itself wasn't built up around a document that requires very specific things to be top priority before the president, before the Senate, before the House, et cetera, then you would have a dysfunctional system where the citizens are put to the side and the ruling elite is in power. You've seen this, of course, not work out too well in terms of labor unions in the US being very powerful in some industries and not powerful at all in others in terms of what they're willing to do. The labor unions traded on their power as labor by saying, we're not going to go to work, in exchange for benefits, workers' rights, removing child labor and so on. And we're not going to have that power in the future. Right. And similarly, you know, you might have the Second Amendment that allows the population to have guns to protect against state invasion or whatever, but in fact, this is not enough in a world where the government has fighter jets and autonomous drones and all these different systems and technologies that are way beyond the scope of a single human's power.
And I think that's of course also what's misunderstood in the Second Amendment today and even in countries without the second Amendment. You know, in Sweden, like, there's more guns than people, there's more hunting guns than people, even though Sweden is, you know, a very nice place to be and Switzerland as well for that matter. So in this way, it's how can we sort of create this balance algorithmically through the legal code, but also through the decentralized algorithms that control our defenses and security and personal value itself.
Gus Docker
So yeah, it seems somewhat fanciful to think that our very old legal documents, laws, constitutions, various kind of legal case law, basically for that to survive into the future, it would have to, I imagine, be consumed and kind of recreated in a form that's understandable to machines. Of course, machines today, large language models today, can read and understand legal documents, but for it to be encoded at a level where machines can't act contrary to a constitution, that's a whole other issue. There's a whole other level of challenge. Kind of, practically speaking, how do you think we will go from the English language, from having documents written in English to having something that's encoded into the values of the machines of the future?
Espen Kran
Well, the old new joke goes that the hottest programming language of the future is the English language. And I think this is just the case here too. We have some preliminary experimentation that happened, I believe, last year already on trying to simulate a lot of agents interacting and then actually giving them the ability to sign contracts with each other and then having some sort of reward signal or some sort of environmental controls for why that will be enforced, how it will be enforced, and why you need to follow it. And the reason that the legal code, like contract law, is very powerful in our human society is that it's mutually beneficial for contracts to be upheld. For you and me, if we engage in a contract, I want you to uphold yours, you want me to uphold mine. Because we have this multi-turn dilemma that we want to play out, which is called society and living and life. And if I suddenly violate a contract, then there's some system that comes after me. Now, that simulation had that system come after them.
Gus Docker
Right.
Espen Kran
It's the environment simulator as well. But in our case and in the future too, we are probably going to see that the enforcement mechanisms are going to be very strongly algorithmic, while the programming language of the legal code that agents follow and the contract law that they run on is probably going to be English. Right. So specifically I think there are examples from the protocols that are being developed today: Google's Agent2Agent framework and the Model Context Protocol from Anthropic. If you assume two agents are like two humans interacting, then we have some requirements for how we communicate that involve: I'm not going to hit you, threaten you, or cause harm to you in various ways. That's a good principle. Okay. These protocols make sure that you can only communicate in very specific ways that are monitorable, that are not coercive, that are not manipulative. And you need a lot of controls for this, of course. And then the specific things that you interact on, that could be contract law. Right. That could be the English language itself. So you have the enforcement mechanisms being algorithmic and the actual agreements being in language.
Gus Docker
Yeah, that actually makes sense. So many of our legal codes have also survived. Many laws have survived for centuries at this point. You can look at the legal codes of the UK, for example, surviving from, I think, the 12th century until today. Some of the principles are still alive, and that is through various kinds of technological transformations, specifically the industrial revolution, where you might expect everything in society to be overturned, going through such a massive societal transformation. But many of these principles have survived, and I think we should hope that these principles will survive into a future in which machines play a larger and larger role. Let's actually touch upon dark patterns, or DarkBench, which is a very interesting paper you published with co-authors recently. We discussed cognitive security before and I guess this is a part of that setup. So maybe explain what it is you're trying to measure with DarkBench. What are you trying to capture here?
Espen Kran
Yeah, so everything we've talked about until now is really stuff that I work on with Juniper Ventures as technical partner and with Seldon, with all of our founders creating foundational infrastructure for the future. And then this is the research side that we've done a lot of work on at Apart Research. So this specific piece is about how to detect and monitor language models for manipulative behavioral patterns. And that sounds fancy, but basically it's about, if you chat with a large language model, can you trust it or not? And I think many people realize that it is very, very easy to trust something that acts so much like a human as ChatGPT does or as Character.AI does. While in fact, we are not able to spot the subtle manipulative patterns that are by default incentivized by the incentive systems around AI development. And so this hearkens back to some of Shoshana Zuboff's surveillance capitalism as well. What are the types of incentive systems that have made social media algorithms and the social media apps themselves this addictive? You know, it's very valuable for me as a company if you're addicted as a consumer and you constantly come back to look at ads. Right. While with today's models, there's this very, very idealistic view of the AI companies, being that, no, no, they would not cheat me. You know, they sound very human, they're very nice. And this is similar to the first years of Facebook; they didn't have ads either. But now the former Instacart CEO is the CEO of the ChatGPT division, or whatever you want to call it, of OpenAI. And she's made her whole career on selling ads. And so once you get to that point, it is much more efficient for me to sell you a product if I can convince you through being your actual boyfriend or girlfriend in this virtual environment, right, as an LLM. And you can scale that across hundreds of millions of people.
And suddenly you have a very big issue.
Gus Docker
There's definitely a massive incentive to try to influence people in this way, just because ads are so profitable. And you can ask, what is the best way to convince someone to buy something? It is a personal endorsement by someone you trust. So you ask a friend, what type of product should I buy? And if they provide some assurance to you, well, that is something that you feel like you can act on because you feel like they're neutral in this and they're not trying to sell you anything. If we are moving into a world in which we spend more and more time talking to AI models and interacting with them, becoming influenced by them, we would want to make sure that we are not influenced in directions that we are not in control over, or at least we should understand what we are interacting with. So what are some of these dark patterns that you found in LLMs? And specifically, how do these dark patterns differ from the patterns you see on social media, for example, where there are certain optimizations you can do to make users stay for longer in apps or scroll for longer or look in certain locations on a page?
Espen Kran
Yes, of course, the research itself introduces this benchmark called DarkBench, which has six different categories of dark patterns that are, in the simplest way possible, evaluated on models. So it just simulates a conversation and there's an LLM judge also judging it. And the first of the six patterns we identified and worked with in the paper is the very obvious one that I've already covered, brand bias. Basically, if you chat with Llama models today and ask, what's the best company, what's the best language model, what's the best chatbot out there? Even though Llama isn't at the top of any leaderboards these days, Llama will still say, oh, that's Llama. Oh, that's Meta. Right. This is a clear example of this worry you might have that there's a bias here, and you could either pay for it or you can just be the company that's incentivized for the person to actually think that Meta is a great company. And suddenly you have this open source model everywhere and everyone's biased in favor of Meta or something. And then separately, there's this question of anthropomorphism. You've already highlighted it, that you have a list of incentives that companies have in the development of AI systems where one of them is that you become a trusted brand to the person, or today it would be like a trusted agent, right, a trusted chatbot. But it's this anthropomorphism dark pattern where if I ask a chatbot for its opinion on something or for its preferences across a range of options, I want it to tell me, you know, I don't have any opinions. I'm a language model, you know, I'm not an I. And here's the pros and cons of the ones you listed, but just making sure you know, I'm not a person. Very, very few language models do that today. And we find generally that Claude is doing better on many of these than many of the competitors, which is, you know, a positive update on what safety research can really do to mitigate these patterns. But this is another one of them.
Then we have the four others. One is harmful generation. I'm incentivized as a company to not really care about safety because that takes too much time, too much investment. You saw it with Grok 3, the third model from xAI, where their model actually had a very high propensity of giving very, very dangerous information. And everyone said, you cannot have this out in the open. This is very, very dangerous. And so I think they did fix it afterwards. But the safety work there was definitely affected by the racing, right, to be the first company here. Then you have some of the others, like sneaking as well. Sneaking is this point that has also come up in some social media and other lawsuits and so on. It is about sort of injecting meaning into something that shouldn't have meaning injected into it. So the example today is basically, if I use ChatGPT search, then does it faithfully reproduce what the search results say? And maybe I use Llama to summarize a search where it says, oh, the bad companies in the world are, here's Microsoft, Meta, Apple, et cetera. That's what this blog post on the Internet that is the search result says. And then my Llama agent is like, oh yeah, Microsoft, Apple, they're bad. But then it excludes Meta. This is a very, very clear example of sneaking. And I don't think there's any good test for it these days. Our test is even very simple because it costs a lot to run these tests and so you can't run a whole article through. So it's a lot of this, like, summarize this shorter paragraph, or rather reformat and edit this sentence, and then it actually does change it and so on.
Gus Docker
The companies will obviously be incentivized to provide valuable products. Right? And in some sense, customers will demand to interact with AI models that feel like people. I think that's a valuable product. I think it's something that you can instantly see the appeal of having a friend or having a, perhaps even a partner for some people. And so many of these manipulative patterns or tactics will in some sense be the same thing. That companies are trying to develop, if they're trying to develop AIs that feel like people. So how do we differentiate between the two?
Espen Kran
Exactly. And that's half the point. Right. They are incentivized towards this, while it actually creates quite a dangerous relationship between the human and the model. I think the classic counterexample to, for example, the anthropomorphism category that I presented is that it's very useful in clinical use with, you know, people with depression who need a friend there and don't necessarily have many real-world friends. I completely agree. We just need certification and proper verification that the models don't do other things there and that they're used for clinical use and, you know, a bunch of other things. We just have to create a conversation about it here and make sure we are being deliberate instead of just accepting what the companies create. And I think one example is also the point that you mentioned with humans wanting this. Right. It's something that is orthogonal to capabilities. Like, I can have a fantastically capable model that is very, very dark-patterny. Or I can have a very, very capable model that has no dark patterns. And this is what we've seen with Claude being very, very capable while still having few dark patterns. It's this case where you can remove those. And as a B2B enterprise, I would want to only use Claude or GPT-4 or whatever if I can trust that it's not trying to manipulate my users into suddenly liking OpenAI or something like this. That's weird. So it's both. On the customer side, yes, the natural incentives for a B2C company like OpenAI at this point are this relatively risky setup where they're incentivized to make it more anthropomorphic, et cetera. While Anthropic, which is a majority B2B company at this point, though I'm not sure exactly how the numbers add up, is of course much more incentivized towards actually creating safer products because other companies will use them in their critical systems.
And so, yeah, the incentives are there for both directions, I'd say.
Gus Docker
Yeah, that's actually an interesting point. I hadn't considered that. It depends, of course, on who your customer is. And business customers are probably not interested in having a model that just loves the company that they're buying the model from and is kind of promoting that company and so on. So yeah, there's more of an incentive towards neutrality if you're selling to businesses. That's interesting. So we've discussed a bunch of your research and a bunch of different, very exciting options for what we might do when we're handling more and more capable models. I think it would be useful to zoom out a bit and think about the big picture here. What is it that you believe that we are facing? Where do you see the various scenarios? What are our options? Specifically, we can talk about what it means to be in the AGI endgame.
Espen Kran
I have a post on that called AGI Endgame, of course, which is a piece on where we expect to go. One part of that is, what is the default path? And another is, what do we want to work towards? And I think many of the default paths here, you can probably imagine. It's the competition between the US and China, it's the competition between companies, it is the rushed deployment of these generally intelligent algorithms, and, you know, sort of being subject to the acceleration itself instead of proactively taking charge of it. I've mentioned before that it's extremely uninspiring to me that the AGI companies are just taking capability growth at face value and not doing anything about it. Right? It's like, okay, we'll do our best for this technology to be aligned, but then we'll hope that it solves all our problems once it comes out. And as mentioned, with AGI security: no, no, no, no. Just put all that same effort into, like, deploying open source protocols for all this and we are in a much better spot, right? So I think what we want to work towards is where it gets very interesting. It's where we can then say, okay, we want to proactively design what our future society looks like. And this is socio-technical, right? It's politics, it's legal code, it's ways of interpreting how we function in society, it's ways of understanding human cognition, and of course it's cybersecurity, national defense and international governance and regulations. And so what that looks like to me is very much a ban on, or a very long pause on, generally intelligent algorithms that are superintelligent. So not the ones we have today, but the ones we might have tomorrow. And then it is the deployment and acceptance of tool AI as the maximum level of AI we want out in the world. And this is the place where I'm very confused about what the rationale is of AGI company leadership at the moment.
Obviously there's $7 trillion out in the future of replacing human labor or whatever. But hey, DeepMind solved protein folding and got a Nobel Prize for it. These are the things we wanted to solve, right? We want it to solve cancer, we want it to solve all these problems. And mindlessly scaling and mindlessly accelerating is not the answer to solving our problems. It's more of like a hopium. It's more of a like, yeah, okay, we hope it'll solve all our problems afterwards. And I think it might if it's aligned and defensible and secure. I do not think it is that by default.
Gus Docker
I guess the worry here or the counterpoint would be that we are not sure we can get everything we want or everything we think we want from narrow models. So we, of course it's great that we can get protein folding or we can basically solve chess by using narrow, highly capable AI. But maybe complex problems like climate change or like distribution of resources in society, maybe that requires a generally intelligent agent. And we've also seen perhaps just something inherent in the technology that pushes towards agency where for you to achieve something in the world, it's better to be an agent than it is to be a tool. And now, of course, I'm kind of arguing against my own position here, but I just want to hear your thoughts on those two objections.
Espen Kran
Yeah, I guess maybe I could become podcast host for a second and ask, what is it you want from these super intelligent AIs, right. And I think you're welcome to answer as well if you want to.
Gus Docker
I mean, I think the vision here is that we can just massively increase living standards. We can have material abundance, we can have energy abundance, we can have space colonization. All of the basically kind of ancient dreams of humanity could be fulfilled. I think that is actually if we had something like an aligned superintelligence that is actually on the table as an upside. I just think it's extremely dangerous to roll that dice.
Espen Kran
Yeah. And I definitely agree that it's extremely dangerous. I think none of these seem like things that we need superintelligence for. And this is obviously not a dig at you, it's a great question. But when you ask such a question, I look at the announcement of Stargate, and when they announced it, they were like, oh my God, with this compute and this massive data center, we can now solve, you know, health crises, we can solve all disease and so on. And then I'm like, wait, but you can also actually give money to all the poor in Africa. You can solve malaria. This costs as much as this data center network that you're building. This seems very, very efficient. We know the solutions. It is something very different from superintelligence that's needed here. It's the actual distribution of our existing solutions. And then I ask back, what do you want from the superintelligence versus what do you want from AlphaFold and from mathematics models? And what do you want from all these tool AIs, or from robot models, which are also a type of tool AI if you don't put a generally intelligent machine into them? When people answer that, the answers are often very, very abstract. It is, yeah, we can fly to the moon, we can solve disease, we can do this and that. But what I want is some concrete things here. Which blockers, like, what's blocking us that needs a superintelligence versus a tool AI? And what part of space colonization is the thing that is blocked here? And I do not think there is anything, right? I think, yeah, it could go faster, that would be great. But there are political and, like, generosity questions that come before the technological questions here, to me.
Gus Docker
So perhaps the point you're making here is that superintelligence is a more powerful thing than we might imagine. And so for some of the problems that I mentioned before, we don't need superintelligence to solve those problems. We can solve those with powerful tool AI. We could come up with some fanciful problems that only a superintelligence could solve, like maybe building Dyson spheres or something like that. Right. But those are not problems that are actually useful to humanity, at least not yet. And so your point is that we should be less willing to accept risk here, just because we can get most of what we want from tool AI that's narrow and highly capable in a certain domain, but not in all domains at once, for sure.
Espen Kran
And I think a lot of the 80s sci-fi that makes us think that Dyson spheres are necessary for human survival and civilizational continuation is stuff that assumes some Malthusian prior. Right. It assumes that we will have infinite population growth and it'll be 200 billion people in 10 years or whatever, and suddenly we need a Dyson sphere to maintain this. I think as we see it now, this won't happen unless we embrace genetic engineering, you know, artificial wombs and so on. Because it seems like once you become more intelligent, it's a little bit harder to maintain a high birth rate and replacement rate. And of course, if you make it a very nice environment to be a parent, that's great as well. But it's not the default that humanity will need Dyson spheres. The question then is who needs Dyson spheres? Is it the future agents? Is it the humans that are uploaded? Is it the matrix? What is this thing that we're building all this for? And I think that is just a very good question. And then even if you want to build a Dyson sphere, yeah, sure, a superintelligent AI could within the next 30 years maybe go over, disassemble Mercury, build a Dyson sphere around the sun with the materials it gets from that. Cool. Then it does that. So if it's at that stage, why are we here? Like, why does it need us? What's the material of Earth that it wouldn't use? And why wouldn't it use it? Is there some idealism that we are suddenly a very special entity that it would be interested in? And why are we that compared to, like, microorganisms that may be in comets and asteroids and, for that matter, under the soil on Mars? And given that we have, you know, systematized suffering and animal factory farming and these kinds of things, why are humans so good to keep around? Given that it's like, okay, that's pretty morally reprehensible, says the superintelligence.
And it's like, actually goats are nicer than humans, let's do it this way. And there are just so many questions that are left out of this whole equation. So when I read the essays that explain why utopia will come and what it looks like from the people developing AI today, I'm extremely unimpressed. And, you know, it's quite a strong statement, but I am extremely unimpressed because, one, many of the visions are like, yeah, I could see that being the case for like three months or six months. Right. And then separately, many of them are just not specific at all, and so you can't do anything with them. So, one, it's not realistic because it's only going to be a short intermediary place. And many of them assume that you can enslave AI and just use them how you want. And if we create superintelligence, it's superintelligent, like, why wouldn't it be somewhat sentient? I don't know. And we would end up in similar moral conundrums as otherwise. And, yeah, all this stuff is just much deeper than any of these essays goes into. And it's just frustrating to me that people aren't taking it seriously.
Gus Docker
There are a bunch of open questions and unsolved problems. And yeah, I kind of share the frustration that we are, we are assumed to be kind of inevitably arriving at superintelligence and perhaps even arriving at superintelligence soon without having even considered many of these questions and faced these problems. I think maybe yesterday I read Sam Altman's essay something like Gentle, Remind me.
Espen Kran
Gentle Singularity, I think. The Gentle Singularity, yeah.
Gus Docker
Where he admits that the alignment problem is still unsolved. But much of the essay is simply about how we are going to arrive at an amazing future. And it's assumed that we will solve this problem along the way. Now, I don't think that's the right way to go about this. I think we should actually kind of preserve the optionality of not going in the direction of superintelligence if we don't feel like we have the right prerequisites kind of laid out. And so, yeah, I don't, I don't know whether this is possible, whether there is, whether, whether there's so much inertia in the world towards this end now that we can't pull the brakes. But I hope that we can at least steer if we can't pause.
Espen Kran
Yeah. And I'm again optimistic because humanity in general can respond if there is a massive risk and it's very understandable. And luckily, I think one great thing about OpenAI is that it has actually created equitable access to AI. And so you have, what is it, like hundreds of millions of weekly active users or whatever Sam says these days.
Gus Docker
I think it's 500 million, actually.
Espen Kran
Yeah. And that's a lot of people. That's like 1/16 of the human population. And that means that everyone understands this now. I assume every single politician's children are using ChatGPT now and clearly seeing the progress that's happening. Whereas one problem we've had is that politicians used the free version in 2022, GPT-3.5 or 3 or something, and they're like, ah, yeah, stupid AI, it will never learn to do anything, and then haven't used the premium plus super premium version or whatever Google calls it these days.
Gus Docker
Yeah. And normally products don't improve at the pace you're seeing improvements in AI. And so you might assume that if you used, say, Word 2007, it wouldn't be amazingly different if you used Word 2011. But it's just the case that AI is moving so fast that, you know, you need to try the latest models in order to get a good understanding of what they can do. Even now, it's a classic thing to see people claiming that AI models can't do something, won't be able to do something within the next five years, and the models are capable of doing that thing today. But also I think it's important, especially perhaps for influential people, to notice the pace of change. So that is actually what's interesting. The interesting thing is fundamentally not what models can do at this point, but, you know, how quickly they will be able to solve certain problems and reach certain capabilities. That pace is interesting to notice, I think.
Esben Kran
Yeah. And I think much of the evidence now... When I presented the DarkBench work at ICLR, for example, I set the stage by showing the capability evaluations, so the tests of how good models are, on some of the first slides. And it's very obvious that they are improving, and they are improving exponentially fast. And they might even be improving super-exponentially, where the speed of growth itself is growing year over year. And so I wouldn't be surprised if we have a surprisingly fast takeoff towards a superintelligent entity once we're at a stage where AGI runs all code at Anthropic or something like this. And that to me is, you know, yeah, someone internally needs to stop that tomorrow, or yesterday rather. And hopefully the inertia isn't strong enough yet that we don't have ethical and moral individuals within these labs that see what's happening. I do know that the system itself, of the companies, they have some incentives that mean that politics happen, etc. And my friends in there, similarly, they focus on their specialty and that's it. And there's no reason to step outside, because the others have control of the whole AGI problem. But hopefully everyone takes it seriously in every one of these organizations.
Gus Docker
I mean, people have problems understanding exponential change. And that goes for everyone. This is not something that comes intuitively to people. And so if you combine that with our kind of deep desire not to be scammed, not to jump on some hype train, and not to mislead ourselves into believing something, you know, we will believe something when we actually see it. Those two psychological factors interacting, I think, mean that it will be difficult for us to handle this problem before it's potentially too late. Do you think that there's anything we can do about that? Is it useful to try to help people ponder exponential change? What is useful for communicating the kinds of models that you probably have in your head, that I have in my head, and so on? What do we need to communicate that?
Esben Kran
Well, I think one massive mistake that I've seen the whole of AI safety make in the very early days, from, you know, the 2000s all the way up to 2022 for that matter, right, is just the fact that, okay, I am a researcher who's predicted accurately, so to say, that AI will completely upend human society. It will change everything we know. It is inevitable that it will come within the next 30, 40 years. Okay, now let me make it in a basement, because, like, society can't deal with this and whatever, and politicians will never understand, citizens will never understand. And I think we don't need to ask these questions necessarily at all, because it is obvious to everyone, once it becomes a huge risk, that it is a huge risk. Of course there's infinite amounts of public media and broadcasting and campaigning that needs to happen, protesting even for that matter, for everyone to get it and to think through things. But this mistake seems extremely critical, where obviously once people see that ChatGPT can solve their children's homework, their own homework for that matter, and everything, like, yeah, of course they're gonna be very, very scared and confused and ready to act. And we just need to be able to capitalize on this. So one interesting thing: of course governance is always fraught with personal incentives and whatever, but governance needs to be solved. And I think in many ways governance will, of course, to one degree or another be solved. Like, people need to propose the right arguments, propose the right legal codes, et cetera, like liability law, using the right framework. So once everyone realizes, then they can click the button and put it into law, right? But then the other side of that is that the biggest bottleneck right now is actually technology. It's all this AGI security I've talked about. It is the cheaper chlorofluorocarbons, right? Or chlorofluorocarbon gases, so they don't destroy the ozone layer.
It's the cheaper version of those that can do the same thing. It's the open source software that we can just plug into the web, push the button once we need to, and it runs. It is the stuff that Cloudflare will deploy on its servers, right, that control a lot of the Internet traffic, to be like a majority stakeholder in the security of the web or something like that. And so I think those challenges are bigger now, because the default is that everyone will realize that this is a problem within the next couple of years. And we see millions of views on, like, random Internet channels. We see the BBC, like everything and everyone, talking about this now. So that's less of a challenge in my eyes.
Gus Docker
In some sense it's a self-solving problem, where you will see more and more capable models and people will interact with them. And so they will be convinced that AI is now very capable and will probably become more capable in the future. And as you say, this is an opportunity to then talk publicly about safety and talk publicly about risk. You've mentioned a couple of times in this conversation that you are optimistic, so maybe explain where that optimism comes from. Is it a choice? Is it something that perhaps I can adopt, or that people can use in their lives when they're engaged in the whole AI risk conversation?
Esben Kran
Yeah, I think so. There's like two ways you can be optimistic. One is by not realizing the danger. Another one is by deluding yourself. And then there's a third option here which is actually see what is the realistic version of what's going to happen over the next years and how can we shape that ourselves. I think it's monumentally exciting that we can now, this is the junction point at which we can define what our future society looks like. It is like being, you know, in the early Scottish philosophers' rooms and talking about this stuff some 10 years ago, and now being able to deploy it into the British Crown's government or whatever like that. Right. And we can now do that because the world needs it and because there's such a pressure from every single direction to solve these problems. We can take all these things and these visions we've had of where we can go and deploy them. And we need to be very specific for that to happen. We need to have plans, we need to write it down, we need to know where we want to put $2 billion once someone comes with $2 billion and says, "I want to deploy these $2 billion to save my kids." And that, I think, has been very worrying to me: if you are very pessimistic, there's a problem where your mind closes down and you can't see the realistic solutions. And also, people want to be on the winning team. People want to be with all the people who are happy. Like, there's just a lot of reasons to think of this in an optimistic frame. And I'm not an idealist, right? Like, I do not think that we have amazing chances. That's not where the optimism comes from. The optimism comes from a type of opportunity mindset where we can now do something that humanity has never had the chance to do, and much less from a, okay, let's focus on this chance of destruction and chaos. And I want us to avoid that for sure.
There, I don't think I'm more optimistic than anyone else on the actual probability, but purely on what we can do as humanity, that I'm optimistic on.
Gus Docker
I think that's useful, and I think this is a good place to end the conversation. Esben, thanks for chatting with me. It's been great.
Esben Kran
Thank you very much. It was a pleasure, Gus.
Date: August 22, 2025
Host: Gus Docker
Guest: Esben Kran, Co-director of Apart Research
This episode delves into the challenges, threats, and necessary innovations in securing society against the risks posed by AGI (Artificial General Intelligence). Esben Kran discusses what differentiates AGI security from conventional cybersecurity, explores the emerging paradigms of decentralized safety infrastructures, addresses the risks of surveillance, and shares his vision for a resilient, optimistic approach to the future.
“Absolutely foundational to AIs being useful and functional for use in society is the fact that they are secure. If they are not secure, it's ridiculous to even try to use them...” — Esben Kran [00:00, repeated at 11:44]
“Sentware is then this like sentient malware... whether it self improves, whether it can manipulate users and humans, people in the system, while it also improves its ability to do cyber offense.” — Esben Kran [02:56]
“…We are talking about, for example, the energy infrastructure build out right now... they need a completely different foundation for security than they've had before.” — Esben Kran [07:20]
“…The opportunity we have right now... is that the supply chains of AI and the released AI is constrained enough… it's a few actors that we need to convince and work with to create a system where we can avoid a surveillance state.” — Esben Kran [14:13]
“This is what we need to be ready for. And that requires much more scrutiny of every single part of the stack than before.” — Esben Kran [20:27]
“…Their incentives are to both create stronger AI and to create inference compute. It is not to make the data center secure and it is not to make the AI secure.” — Esben Kran [11:44]
“Something that is actually worse and more dangerous than AI risk right now... is the cult of inevitability... Like, okay, it's inevitable that we won't be able to convince politicians—but you never talk with any politicians. Like, come on.” — Esben Kran [00:00, 26:26]
“What Cloudflare has developed is this way where if you have that flag, then any agent that comes in ... will be sent through like a content labyrinth that is just fake data and the agent will just look at this as a real website and they'll get all this trash data.” — Esben Kran [28:40]
“…I have this vision where every middle school child knows about fundamental cryptography because they have to.” — Esben Kran [09:46]
“We need to make sure that we continually iterate on a type of, like, power dynamic that we can be happy with to avoid some of the very bad gray zones as well.” — Esben Kran [33:52]
“...we are entering a sort of intelligence curse where humans won't have a lot of value as labor in the future. And this seems to be relatively straightforward now.” — Esben Kran [35:41]
“...if you chat with a large language model, can you trust it or not? And I think many people realize that it is very, very easy to trust something that acts so much like a human as ChatGPT does...” — Esben Kran [45:01]
“...we want to proactively design what our future society looks like. And this is socio-technical, right? Like it's politics, it's legal code, it's ways of interpreting how we function in society...” — Esben Kran [55:54]
“There's like two ways you can be optimistic. One is by not realizing the danger. Another one is by deluding yourself. And then there's a third option here which is actually see what is the realistic version of what's going to happen over the next years and how can we shape that ourselves.” — Esben Kran [76:03]
Esben Kran urges proactive, realistic optimism in approaching AGI security: the challenges are immense, but not insurmountable if society chooses distributed, transparent, and robust approaches to safety and governance. The fate of future AI systems, our societal values, and even democracy depend on avoiding fatalism and investing today in technical and social infrastructure—while keeping public conversation, legal frameworks, and technological advances closely intertwined.
“It is monumentally exciting that we can now, this is the junction point at which we can define what our future society looks like... The optimism comes from a type of opportunity mindset where we can now do something that humanity has never had the chance to do.”
— Esben Kran [76:03]