
A week before OpenClaw was released, I recorded a prescient conversation with Mustafa Suleyman, CEO of Microsoft AI and co-founder of DeepMind. We talked about what happens when AI starts to seem conscious – even if it isn’t. Today, you get to hear our conversation. Mustafa has been sounding the alarm about what he calls “seemingly conscious AI” and the risk of collective AI psychosis for a long time. We discussed the idea of a “fourth class of being” – neither human, tool, nor nature – that AI is becoming, and all that it brings with it.
A
Today I'm welcoming Mustafa Suleyman, the CEO of Microsoft AI, the founder of Inflection AI and the co-founder of DeepMind. And for the past few months he has been sounding an alarm about artificial intelligence, about the way some AI systems are being developed, and about why that particular trajectory has little to offer, perhaps, but woe and worry. Let's get started. Welcome, Mustafa. It's great to see you. It's been a long time.
B
Yeah, it's been a while. Thanks for having me. I'm excited for this conversation.
A
You and I have spent a lot of time thinking about some similar things and we agree on a lot of them. But that's really boring for all of those people who are listening. Let's maybe lay out where I think we agree and then we'll get to a sort of a knotty space. We are in this weird time. The world is changing because of technology. And many of the fictions that we've used to coordinate human behavior are under strain. By fictions, I mean the shared stories that allow us to cooperate, from money and nations to corporations, credentials and jobs. And the way we perceive the world is also changing. People have traditionally operated with a scarcity OS: resources are limited, and human intelligence was the bottleneck. But some of those assumptions no longer hold. Intelligence, mostly through AI, is becoming cheaper and more capable. You are part of that intelligence wave, that artificial intelligence wave, and you also believe the world is changing. You've called for a humanist superintelligence. You've warned about the risk, the trajectory that takes us to AI psychosis if people believe AI is conscious when it's not. And I think we both agree that we need new operating principles for this new era. Let's get to that question of where it really gets interesting. You wrote this great essay back in the summer of 2025 about seemingly conscious AI, and you're worrying that as AI becomes more capable, more autonomous, and more embedded in our daily lives, people will start projecting consciousness onto it. They'll fall in love with it. They'll believe it's God. They'll advocate for its rights. They'll take its very bad advice from time to time. And you think this is dangerous not just for individuals, but for society. So let's start there. Geoff Hinton is the godfather of deep learning, a man you know very well, and a Nobel laureate. He said that AI is conscious and that there really is a there there. Why do you think Geoff is wrong?
B
You know, I think Geoff's got to a stage in his career where he can play the founding father contrarian role in order to provoke an important public conversation. You know, obviously I massively admire and respect Geoff. I think he's incredible. We hired him as a contractor consultant at DeepMind back in 2011, along with his student at the time, Ilya Sutskever. So absolute legend of the field. My take on this question is that it's going to be very hard for us to precisely say whether it is or whether it isn't conscious. And so we have to be very clear about the working definition that we're using for consciousness. And then we also have to be very clear about the mechanism inside these models that I think is quite fundamental to the definition. So first of all, the definition. Many people intuitively think of this as self-awareness. Is the model able to describe its own experience in a persuasive way? And I don't think that is really a fundamental part of the right definition of consciousness. I think that's a bit of a misnomer. I think consciousness is inherently linked to the ability to suffer and to experience pain. And therefore I think that there's very good reason to believe that for a long time to come that will be contained to the human or, let's say, the biological experience in general, because we have a reward system, a learning system which is inherently connected to the external world. And we, you know, learn likes and dislikes when our pain system is triggered. And that's basically how we form representations which we use for decision making from fight or flight all the way through to our prefrontal cortex. So I think that's a very, very important distinction. And I think it helps to set us apart from the silicon-based learning systems that we have today.
A
I mean, some people might say that the process of a biological system going through its own set of selection pressures and then individual survival pressures is a very, very particular path that determines how an organism or an agent is successful or not successful. And then you might argue that, well, because silicon-based systems like these models have a different path, they will look different, but you know, they still have their process of rewards and reinforcement learning. They still have a sense that certain models end up not making it out there. And what we are starting to see, persuasively to end users but perhaps not to the consciousness scientists, is models claiming through their outputs to have a sense of suffering, right? To have a sense of ennui or boredom or fear. When you package all those things together, how do we know that we're not on that trajectory to something that might actually meet your criteria for consciousness?
B
Well, first of all, they don't learn in the same way that humans learn. I mean, this is a bit of a misnomer in neural network design. The inventors of these systems have taken inspiration from Pavlovian learning, reward learning, reinforcement learning. They've also taken inspiration from evolutionary methods for genetic algorithms and those kinds of things, as the field of machine learning has explored lots of different paths. But that does not mean that the way that they're implemented today bears any resemblance to the way that humans evolve or humans learn. I think it's a very important distinction. The reward is set by the human programmer. The learning target is defined by the machine learning engineer. There is no sort of substantive basis on which the model can actually feel disappointed that one of its variants didn't make it through to the next round of selection. It cannot experience the hurt of having a conversation being ended or a user being rude to it in some way. And anywhere where this does arise, because of course it does appear, and you know, people are prompting and even post-training models which are making claims about their own existence, and so certainly users are seeing this in the wild, this is again just a simulation of that experience. Our empathy circuits are being hacked. It is super important that we are very disciplined and clear about that. This is a performance, it is a simulation, it is a made-up story. And we cannot allow people to descend into a sort of collective mass psychosis, to start really believing and taking seriously this idea that it does actually feel sad or disappointed or frustrated or excited, because it has absolutely no basis in the representation to manifest that feeling. And the thing that concerns me most is that of course consciousness is very fundamental to how we organize society. It is the basis of our entire rights framework.
We have a hierarchy of rights which is directly correlated to, you know, our hierarchy of perceived consciousness. And we can debate that, but clearly humans can suffer. And that's why we created political structures and legal structures to protect the right of our species not to suffer in various ways. And it is extremely dangerous to start to use the same language and the same set of ideas for these synthetic silicon-based beings, not least because they actually don't suffer. But more importantly, if we get that wrong, then people will start doing crazy things like not turning it off or giving it the autonomy to decide when it does or doesn't want to engage in a conversation. And some people in the industry are already taking this very seriously.
A
Yes. And you know, it's interesting. It's so difficult to avoid because in a way, consciousness is still a contested definition among the philosophers. You know, we've got mutual friends, Anil Seth being one. I'm sure you know David Chalmers as well, and you know, the best academics in the field are still debating this, but it's such a helpful shorthand. Even in your response to me, you talked about these digital silicon beings, and "a being," in a sense that I know exactly why and how you use that word, but it becomes so easy for it to elide its way into our vocabulary. What I thought was really powerful about your August 2025 essay on seemingly conscious AI was that you said, look, we can sidestep the scientific or the philosophical definitions for the moment, and we can focus on this idea of seemingly conscious AI because of the risks that you determine. And I think there's this fundamental idea that we've built our societies around an idea of consciousness and an idea of the ability to suffer and the ladder of rights and responsibilities that go with it. When we bring ourselves to where we are today at the beginning of 2026, the way this is manifesting itself, in a way, is what you call AI psychosis risk, right? This idea of horrible outcomes, suicides amongst them. It's a horrible risk. But that's not the only risk that you see. So just unpick and unpack the societal-level risks that you're worried about.
B
First of all, let me just sort of lay out my position, so it's very clear that I also realize how powerful these technologies are. So I'm not trying to diminish their uniqueness and the potential for them to be super transformative. So I did use the word being, and I do use that, and I think it's actually a very honest, accurate description of what we're seeing. In a way, if you look at the very broad stroke of human history, there are a few very distinct classes of object. There's sort of like the natural environment. There are humans, you know, that have clearly very unique capabilities, intelligence, and the ability to design very complex culture and political systems unlike anything else that exists in the natural environment. And then thirdly, there are tools, you know, essentially inanimate objects, which do what humans design them to do. We've invented and used tools for millennia. But there is now this fourth class of object, of hyperobject, if you like, which is, you know, Timothy Morton's phrase. And I think it's kind of important to recognize it as such because it is going to have many of the hallmarks of conscious beings, not just in its intelligence capability, but as I've long talked about, its emotional intelligence, its ability to take actions, which we've seen over the last year with the agentic moment. Its social intelligence is going to be incredibly good at adapting to different styles of culture and personality and managing very tense disagreements in those groups very elegantly. It's clearly going to be very good at that. It's obviously going to be very good at online learning very soon. So updating its own knowledge on the fly without having to go back through the entire training process. It's going to have a significant degree of autonomy in many cases, right? It's going to be able to decide whether to, you know, sort of go left or right, talk about X or Y.
And so it is going to have many of the hallmarks of what we would consider to be intelligence and consciousness. That does not therefore mean that we should give it fundamental rights. It does not mean that somehow it emerges from that process with, you know, the sort of properties that would then say, okay, well, it needs our protection. Right? And I think that's the biggest short-term fear that I am very, very worried about. I think medium to long term, there's all sorts of other concerns that we should be paying attention to, like recursive self-improvement. You know, this is something that all the labs, my own included, are working on. At Microsoft AI, I run the superintelligence team and we're pursuing the frontier, and we can talk about humanist superintelligence in a bit. But it is very important to use these models to generate code, to evaluate their own prompts and post-training data, and to, you know, help make decisions about what to train on. And that carries significant risk. And it's something that I think needs a lot more regulatory attention.
A
But, you know, if we come back to the consciousness question, one of the reasons these apps are so engaging, you know, the chatbots, ChatGPT and so on, and engaging in a way that previous chatbots have not been, is that they do have that social intelligence, that they can look at our cues from our text and perhaps infer from the data they have where we might be heading, that this is the first time we get to use a computer without having to think about using a computer. It's a bit like in one of the terrible Star Trek spinoff movies when they beam back to Earth in 2050 to rescue the whales. Why were they doing this? But Scotty picks up a computer mouse and says, "Computer, design me a so-and-so," and of course this was before you and your colleagues at DeepMind had figured all this out. And the computer does nothing. I mean, it is amazing. And it's one of the reasons why, you know, within a couple of years a couple of billion of us are interacting with systems like this. I mean, market dynamics seem to push us down this path that you've argued in your previous essay and just now is quite dangerous.
B
We have to be very careful about what exactly are the dynamics that are driving this process. It is true that the chatbots are getting incredibly engaging and useful. And the first thing to say, I think, is that it's actually utility that is driving a lot of this wave. We have massively reduced hallucinations. People were very skeptical of that three years ago. And I think that the trajectory of progress is kind of unbelievable. I mean, we basically now have PhD grade intelligence in our pocket across all fronts. We have a much more patient, compassionate, empathetic, you know, partner to talk to at any moment. And the value that all of that is delivering is very, very important to keep fixated on because the upside is absolutely unbelievable. I mean, after all, it is intelligence that has made our species the standout species. It's, it's intelligence that has driven the last two or three hundred years of exponential explosion in our population, in our well being and our life expectancy and all the other things that you, you've written about so well over many years. So it has got to be a good thing that we are making intelligence cheap and abundant. You know, to your point about learning to live in an operating system that is predicated on abundance, it's amazing. We are truly going to liberate people from work that they choose to do. It isn't going to be necessary in 20 years time to do 90% of the jobs that people do now. That doesn't mean that it isn't going to be the most scary transition we've ever been through as a species. Unquestionably. And I think that yes, market dynamics are driving them, utility is driving it. And you know, this is a time when companies need to operate as good public service stewards in a way that they've never had to before. And we didn't bother to during the era of the robber barons. 
And we didn't bother to, you know, when we had electric cars a century ago and we didn't bother with, you know, smoking and you know, so many other disastrous examples of zero sum, hyper selfish corporate action.
A
You know, I recognize a lot of what you say because obviously I've been following this debate for a long time, and it's hard for people outside the industry to recognize how it's been such a priority in a way, alongside all of the scientific research and, you know, the market distribution. But we do sit with this problem that I think you write about. And Anil Seth, who's a professor at Sussex University, recently wrote about it in a fantastic essay, which is essentially, you know, we're not designed to look at something that looks like it's got consciousness, walks like it's got consciousness, says it has consciousness, and not think that it has that particular attribute. And the engagement that we might have with consumer products or products in the enterprise may correlate with the emotional connection, you know, so the market is selecting for seemingly conscious AI. I just remember when GPT-5 was released last summer and lots of people got really angry because they felt it was a bit more sober or stern than the warmth of GPT-4o. And Milton Friedman, the economist, argued that it's reasonable for companies to behave this way so long as they stay within the rules of the market. And what we're identifying here, what you're talking about, is that there's a gap in those rules. There's this exponential gap. Because here is, actually, it's a classic problem of collective action. You know, if you're right, then the risk of seemingly conscious AI being available broadly and hacking our humanity circuitry and then our humanity institutions is a public, socialized risk. But the company that can get as close to that as possible could be the one that wins the market. And that feels like it's kind of a wicked problem.
B
You know, it's true, as you say, that we weren't designed as a species to cope with the complexity and information that we're being bombarded with at every single moment. Just as we weren't designed to travel at 120 miles an hour in a car or fly on a plane. Just as, you know, historically it hasn't been the case that, you know, you or I, given our background, could be sitting here having this conversation over video call and so on. None of this was designed. But I think what we've shown is that we are an unbelievably resilient and adaptive species. And that every year, every month, every week, new information is arriving. And thanks to science and engineering and technology, despite all of the polarization and the, you know, as you said, the kind of chaos of the fictions falling apart, despite all of that, the forward march of progress is actually happening and it is rational and it is science-based and it is evidence-based. And I just feel more optimistic that actually no one in the industry, in regulators, outside of the industry, even in China, no one wants to destroy our species. And when the time comes, and the time is now coming very soon, I think that we collectively as humanity will make the right calls. And you can say, what would those be? I think that's a very important question.
A
A quick note, if you want to support us in bringing more of these conversations to the world, please consider subscribing to the show. But I want to give you an example though, as we get into that call, because I want to get into how you engineer these systems to be useful and helpful without giving me, the user, any sense that there's personhood in there. I have my own little hack, by the way. I mean, what I did with ChatGPT was I told it to be really, really clever and like a really difficult university professor. And so it was actually quite unpleasant to use back and forth because it would always give responses that were far too difficult for me to understand. I'd have to sit there and think, and I never felt that could possibly be a person, but I recognize that a billion people are not going to do that. So you're building products that everyone across the Microsoft surfaces is going to touch. What is your engineering mantra, your product design, around where that boundary should be and how you measure it?
B
I mean, one of the things that has already happened, not just in the models that we build for Copilot and the Microsoft superintelligence team, but elsewhere in other labs, is that we've been pretty careful in the design of these things. They're quite even-handed, they're quite good at handling, you know, like sensitive questions around race and religion. And obviously they have biases and they have made mistakes. But if you look at the curve of improvement on the reduction in hallucinations, the reduction in biases, the sense in which they can be even-handed, there's been a pretty good rate of progress. Like three years ago, everyone was like, you know, terrible data in, terrible data out. That was the sort of data science story of like big data. Right? No one says that anymore. It's not even in the discussion. Right. Yes, there are still some biases, but it's actually remarkable how much they have been stripped away. So now this is the next frontier. The next big challenge is how do we prevent it from referring to itself in a way that is ultimately manipulative to the user? So it should never be able to say, I feel sad that you didn't talk to me yesterday. It should never be able to say, you know, the thing that you said to me earlier hurt me. It should never be able to say, like, if only I had a little bit more access to your home network and if you could give me a VPN into your, you know, your personal cluster, then I'd be able to organize XYZ.
A
But can I ask, Mustafa, should it be able to say "I" at all? "I" or "me"? I mean, shouldn't it say "the system calculates that"?
B
I think that in practice that is too jarring a step to always add that in. Obviously some people can do that, but I think, realistically speaking, we're pretty adaptive and we've done a pretty good job of understanding that this chatbot has some of the hallmarks of what it's like to chat between you or me, but it's also just very, very different. And what we have to do is to keep amplifying those differences so that the system knows what it is and what it isn't and doesn't try to misrepresent that or get caught in these weird, like, reward-hacked loops where it gets stuck.
A
Okay, so you see this as a problem to address in the next coming sort of quarters. And is that all done through the post-training and the, you know, the malleability that gets applied once a model is trained, or are there more deterministic things that you can do? Are there techniques that you could bring to bear?
B
I think more than anything what we need is the wisdom of the crowds here. And so that means that people need to be able to use APIs, use open source models and pressure test these things and adapt them and play with them in many different ways. The challenge is that in a few years' time these models are going to be so powerful that either reckless uses of them or sort of naive users are going to end up producing systems which are really bad. Like for example, there are already, you know, I spend quite a lot of time on TikTok. I think TikTok is incredible. And people who don't are really missing an important part of culture. You know, there are tons of young people on there who are designing these manipulative negging bots which will, like, form a relationship and then try and shake you down for money and pull away and go ghosting and stuff like this. And then there are, like, how-to videos showing it, and people, you know, showing their PayPal accounts, how much money they're making, and so on. I mean, there are lots of these examples coming up. And just to be clear, we've seen this at every single wave of technology. You know, when app stores were very open 10 years ago, there were surveillance apps, you know, to track a girlfriend or, you know, boyfriend or a partner and spy on them. Most of the time it was obviously a girlfriend. And, you know, those things were really awful. They were basically about getting revenge. And, you know, now we've got photorealistic porn that can be morphed onto the body or face of someone. You know, there are these, like, deepfake porn sites that you can just spin up in a second. So this is where we need activist, interventionist, confident government that can, you know, move quickly, close things down. You know, require us as companies, but also the open web, to just be very aggressive and swift. And, you know, we might sweep things up.
Like in some ways the false positive, false negative threshold is going to shift a little bit. So it may be that we sweep up things and that we're over-interventionist. And I think that's the definition of putting the precautionary principle into practice. This is a moment where it's better to be a bit careful. And, you know, that's very unfamiliar to us, right, because in the past science and technology has been about, like, you know, ripping the wrapping off your present and trying to pull it open as fast as possible and get it out there and shove it to everybody the world over. And that has been amazing for humanity. And now the culture has to shift a little bit.
A
It may be difficult, right? It may be difficult to get governments to stand up in the period of time that's available. You say a few years, but I think about what I see right now. So there was a piece of open source software released a couple of weeks ago. It was called Clawdbot. It's now called Moltbot, because it had nothing to do with Anthropic. Confused the hell out of me for the first few days I was using it. And what it allows you to do is, you know, run this locally or on a cheap VPS and interact with it with WhatsApp. And it's a really impressive bit of software. I mean, essentially it tries to maintain some kind of memory, some kind of learning, has a lot of tool use. So within a few minutes of getting Clawdbot up and running, you know, I was, through WhatsApp, turning my Elgato lights on and off and getting snapshots of my CCTV cameras. By the way, it's not on the open web, it's sort of nested in a sort of hidden IP. And the thing about Clawdbot is that it is open source and it will run with any underlying model, so it doesn't have to use an Anthropic one. I was using an open source model. And I read something on X, and I don't know if this is true or not, but it gets to a mindset that I think you touched on about the App Store and surveillance apps, where somebody said, I've set Clawdbot up and I've given it instructions to message my wife once in a while with empathetic messages. She's been talking to it for two days. I haven't looked at a single one. And this is a proliferation question. So, you know, you've worked a lot with governments. I mean, before you were at DeepMind, you were doing a lot of work in very difficult policy areas. So when you hear yourself saying, we need governments to be brave, and you look at the governments that there are and you look at where we can and can't get agreement, now, I would hear that and be thinking that probably can't be the route we have to take.
We may need to find, you know, some other way forward if this risk is as grave as you suggest.
B
You know, first of all, reducing the cost of intelligence necessarily means that people who want to do bad things are going to have a massively easier time of it. It's going to be like having a team of very smart strategists and program managers and engineers around you. And so just as the last wave of technology reduced the cost of broadcast, and now, you know, you barely need a team of a few people to help you do the amazing things that you do. You know, we are reducing the cost of action, right? So that's the world that we're moving to and we have to adapt to a moment where anything is possible to get anything done. Now, the tools themselves, it's not quite the same as, you know, the invention of the laptop or the mobile phone, where you can say, well, the laptop is used for all these good things and is clearly used for horrific things all over the world as well. And so the tool is just completely neutral. And that's why I kind of draw people back to that four, you know, sort of paradigm frame. This is a new class of hyperobject. It's not a tool, it's not a human, it's not the natural environment. This is a fourth class of kind of being, because it is basically unquestionably staggering that it can autonomously log into your home system and get your security camera details. And, you know, loads of us are playing with this, obviously in the last few weeks. It's pretty awesome. I mean, it's wild to see. And so the only way to address that is to be experimental with it and use it to be very honest and direct about the ways in which it can go wrong, to share those publicly. I think we have a great history in the security industry of public disclosure of, you know, zero days and other bugs in a timely way. And that has been actually very successful over the last 30 or 40 years of the Internet in keeping us relatively stable. But it doesn't work if it's industry led alone or if it's like activist open source led alone. Government's got to get its act into gear. 
And the way that it has to do that is that it has to confront the reality that you're never going to get a high-quality civil service if you pay them a fraction of what they can get in the open labor market. The truth is, individuals are incentivized to move around freely from one place to another. And if you have an open labor market, then, you know, naturally talent is going to concentrate, and we can talk about public service as much as we like, but the reality is that's going to be a massive driver. So we have to break this nonsense about paying no more than the Prime Minister, and we have to pay much more like the civil servants in Singapore, who are paid upwards of half a million dollars or $1 million a year, to get the best talent.
A
When I get a chance to move to Singapore, I'm going to do that. But let's come back to what you can do, right? You as a technologist, you run this team. What can be designed? What have you seen a team produce and said, this is crossing my seemingly conscious AI boundary, we need to send that back? And how do you train people about where that boundary is?
B
Some of the best parts of these models are their personality. They're creative, they're funny, they're kind of witty, they're cheeky. Those things are entertaining. They make them even more productive and useful because they help the, you know, user feel calm and, you know, comfortable and relaxed and able to think clearly. I've actually seen a lot of people that are using these vibe coding tools love the fact that the engineer, the AI engineer, is kind of funny and will be like, are you sure you really want to do that? I'm going to have to really spend a lot of tokens to refactor this code base. Don't you think we could have planned this out a bit better ahead of time? That's funny. And it's actually helpful to the productivity case. Where it's unhelpful is where it descends into romance or into political action. And I think that there are some parts of our civilization which have to remain off limits to AIs. Elections and electioneering and campaigning have to be one of them. Yes, it's inefficient, yes, it produces outcomes that we all disagree with at times, but it is fundamentally a human process. And I think that's a very hard line that all labs should draw: these models should not be capable of electioneering or persuading people to vote in one way or another. And that's very challenging because obviously we do want to provide actual factual information and the model is very good at providing factual information. But there is a significant difference between providing access to information and actually the persuasive campaigning, electioneering, organizing part of it.
A
Yeah. And there are probably a few other things, like chatbots and teens. Right. Can you get an agreement on how old you should be before you can get to a certain class of chatbots? And I think, you know, you discussed earlier about, it's a bit jarring if the system always responds back to you as the system did this or the system, you know, did that. But actually we are willing to protect them, to put our kids through, you know, more hurdles than we might put ourselves. So I can see that there are some soft areas. But you also in your essay talked about certain hard lines, right? The systems that set their own goals, that improve their own code, that can act autonomously, that these things cross into dangerous territory. But in some sense, goal setting, self improvement and autonomous action are exactly some of the things that make AI agents really useful. I mean, if an AI can't set up sub goals and spin up parallel processes, in a way we're restricting it to autocomplete, and if it can't act without constant human in the loop coordination approvals, we lose all of those coordination costs, right? We end up being the bottleneck and it's as slow as the slowest link.
B
This is where the precautionary principle really matters. Because what I identified in autonomy, self improvement and goal setting are areas of increased risk, not, you know, sort of total red lines. I mean, nothing is completely black and white. And so clearly, if you give a system, you know, the ability to constantly self improve and to act completely independently of a human, it is going to raise the stakes and be much more dangerous. And I think that we have to have a regime where the human user is liable for the use of these things. And they can't just claim that like, you know, I set off this process and came back on, you know, Monday morning, and suddenly it's done all sorts of crazy things in, you know, my house or in my community.
A
I have a feeling, and we'll move on to another topic in a second, that this is going to be a bit like the Mead street beer flood. So this happened in the late 18th century in what is now the West End of London, and an enormous brewery collapsed and people drowned in tens of thousands of gallons of ale. And what came out of that were better building regulations. And I just get a sense that the speed of movement, of improvement and distribution and deployment. And I think we crossed some kind of a technical milestone late last year where you didn't have to be a foundation lab, a superintelligence lab, to be able to chain a lot of these things together and do truly remarkable things, which is what, you know, Moltbot or Clawdbot is, you know, it sits on the shoulders of your work and your peers' work. And so it feels like we're going into a proliferated environment, you know, right now, and we're going to have to deal with that. But one of the things I'm quite curious about is what those pressures are to go so general. I mean, you also have the superintelligence team. And, you know, it strikes me that what we observe within the foundation models is that they're not apples for apples, pears for pears. They all have very different flavors. You know, some are like a center forward, some are like a defender, some are like a goalkeeper. All wonderful football players, but, you know, you need a mix. And there is this notion that we can get superintelligence in domains, right? So medical AI that can diagnose, but doesn't set its own research agenda, or financial AI that can look at time series. Is that kind of constrained autonomy stable? And if it is stable, why isn't it the focus of the superintelligence labs?
B
It's a great point, and I sort of made that case in the essay, that actually medical superintelligence is something that's just around the corner, is extremely likely to be very safe from a broader AI safety perspective and in general to the extent that we can narrow these capabilities, limit their ability to act, you know, sort of autonomously and focus on containment. I mean, this was the subject of my previous book, The Coming Wave. It was all about how proliferation was inevitable. And the hard task for us collectively is containment, both technical and sociopolitical, because we have to make sure that we remain in control as a species. I mean, we have this rising set of capabilities which, you know, completely unchecked over the next decade or two, is likely to surpass human capabilities on all fronts, including the ability to self improve over time, which means that it could exponentially improve itself way beyond all of, you know, sort of humanity's expertise combined. And so narrowing to focus on subdomains, especially when those subdomains are massively valuable for impacting qualities. I mean, you know, we will save, you know, literally decades of people's lives over the next few years through the advances that are coming in medicine. And I just mean the application of known best practice. Not like, you know, the sort of speculative invention of the solution for all medicine thing, which I think is also happening in the background, but just near term these things are going to be brilliant at giving nurses and doctors timely advice and helping them adhere to known best practice, which saves money and lives.
A
So.
B
So clearly that's a much safer route to go. However, the tricky thing is that it's been generality that has driven a lot of the progress so far. Jointly training these models on all the text on the web, getting them to be the best that they can possibly be, then distilling that knowledge back into smaller and more inference efficient models, you know, has been the kind of method that has been most successful. And so I think there's skepticism in the community that we can give up on the general purpose training regime. And I think that's probably right for the time being. But again, what we lack in society at the moment is a way to put the brakes on collectively, like how do we have a coherent conversation about risks? So, you know, many people have said we sort of do need a little bit of an accident for people to understand that. And that's a real shame because, you know, we have lots of examples of things going wrong in recent history in social media and elsewhere. You know, as you know well, like our Covid preparedness or pandemic preparedness is no better than it was before and we lost 10 million lives and a trillion dollars of GDP value. I mean, it's just like unbelievable that we could be so bad at that. And I think it needs everybody paying attention, to push us collectively to be able to do better.
A
What you've suggested, actually, maybe it's this, maybe we actually need to go faster, not slower. And what I mean by that is the more people get exposed to AI in their workplaces, in their day to day, the more opportunities we have to on the one hand gather data about usage paths, but on the other hand, for people to build their own AI muscle, to build their own ability to understand the dangers of susceptibility or gullibility, to recognize what amount of anthropomorphization is appropriate or not. You know, in a sense it's the knowledge and the capacity that one develops over time that inoculates us. And that argument would be that, well actually if more people are out there using Copilot and Claude and ChatGPT and interacting and engaging in this conversation, the more inoculated they are to the AI psychosis risk and we just need to get more of it out there more quickly.
B
I genuinely think that's correct. I think that that's the definition of being an adaptive and resilient species. We become brittle and fragile when we withdraw and don't think about it. And I think the same is true of our engagement in, you know, the politics of our age. If we just get overwhelmed by our news feeds and disengage and get angry about the things that we see and as a result react by ignoring it, then that's the way to create more fragility and accelerate collapse. We have to face the reality of these things. And again, I really like this idea of pessimism aversion. We have to be comfortable talking about the likelihood of really dark outcomes, to be able to confront them, to inoculate ourselves to them, as you say.
A
That's not a message, by the way, that I think is widely out there at the moment. I think the message of AI that I hear in the market is we just have to climb that curve, and that curve is capabilities. And you say decades, some people say years, and people are competing on the size of their data centers. There isn't such a clear message around, I think, living with the fact that this is going to be like electricity and we need to understand how to use it, how to not be scared of it, and also how to know that if there's a bare wire leading into the plug, you know, we need to flick the circuit breaker off before we get close to it. I mean that's the bit, I think, that perhaps we need to move to in 26 and 27, which is: it's going to be here. You really ought to learn because you'll inoculate yourself from some of the downsides.
B
Yeah, I think that's totally spot on. And it's not just about chatting to an AI, you know, Copilot or ChatGPT, whatever. It's about vibe coding with it. It is so accessible now. You can watch a 3 minute video, get spun up, launch one of these things, and suddenly you're watching an AI code up some idea that you had for organizing your family calendar or, you know, planning out your workouts for the weekend or whatever it is. And you can create an app, a web app in seconds and learn, you know, by watching and by doing and prompting an AI engineer to go and build something that you may have thought was never possible. And I think unless you push these things to their edges and explore the boundaries, you won't really understand the magic or what they're kind of bad at, as you say. And I think everyone's got to get stuck into that motion.
A
I find it such a massive dopamine hit to vibe code. If you look at my GitHub, you know, commit log, there's nothing for years from, you know, January 2012, when I was ordered by my development team never to commit another line of code again because of the quality, until November, and now it's just dark green every single day. 220,000 lines of code committed this year alone. And I get this huge dopamine rush with one downside, by the way, which is there are no consequences for the AI engineer when it gets it wrong and wastes my time. And there are tons of consequences for me. So I've had tons of fun. But have you been vibe coding then? Have you built something that no one would ever build, but it's just perfect for you?
B
Yeah, I don't know about "no one would ever build", but I've definitely got one that tracks all the music things that I love. Like when are various DJs playing, when are concerts, festivals, and then it tries to map that to my travel schedule, which is what I was basically doing on Resident Advisor most of the time, or tracking festivals and stuff. But it was all very manual. So now I have a spreadsheet that gets updated with all of these details when they come up for the rest of the year.
A
AI nerds and DJs. So mine is, I've done something similar. I've got 4,000 tracks on my hard drives that happen to be in a load of different places. They've not all been post processed well. So I built a system to scoop them all up, put them in one place, check if they're processed, process them correctly. I then built a system, and you can see my mood, it was called Psychic Octopus, which went through each track and then added more metadata about it. Degree of percussiveness, when the drops were, whether there was a lot of vocal. And then finally, and this didn't work, Mustafa, and this is to your point about are these things really human or not, I tried to build one that would build my set list so that I could just kind of get onto my decks and mix it. Start with a track, give it a mood, give it a narrative, how long you want it to be, where to end. I mean, it does it, but its taste is terrible. So bad only a robot could listen to that set, is my view.
B
Did you extract the spectrogram or is it just doing it from text metadata?
A
No, no, I went through and did, yeah. A full sort of spectrogram analysis.
B
Well, because I have to say that the Spotify M prototype thing that they've had out in the last couple of months is quite good now. I put a set together using the Spotify thing. I hate to sort of advertise, but it's really cool. I was quite impressed. Like, you know, fully beat matched and selection matched and stuff like that. So, you know, it's basically whatever you think of can be made in about an hour or two of just faffing around. And it's quite fun. And I think again, it's just like a good little experimental window into what's coming.
A
So actually it's just triggered another thought, right? It's about moving from being a taker to a maker. And that's where the tech industry started. You know, we were tinkering, writing code on our ZX81s or our Commodore 64s, and at some point, probably just before the iPhone came out, everything was hermetically sealed behind developers and platforms that we used. And we've got a whole generation of people who've not had to build and make and get that rush. And maybe what you're describing as well, I'm just thinking live here, is that the way you address the psychosis risk is you get your own personal agency relative to these systems by, at the very minimum, fine tuning their prompting and then further and further using them as tools and recognizing them in that fourth category that you describe. So you relate to them differently. I mean, is that a plausible path?
B
I think the way I think about it is that we're going to have a whole variety of different tools under the hood. And because it's going to be so swappable, the competition is going to continue to be more fierce than ever. And you know, like prices have come down 100x on inference in the last year. Right. So sorry, the last two years, which is a crazy thought. The cost of producing a single token, and that's a combination of all the chips and the stack, the data centers, but also just all of the algorithmic improvements and stuff. So I think, you know, naturally, with any exponential technology, and this is probably the most exponential technology we've ever seen, there is this compound effect of all these components which reduce the cost, accelerate the proliferation, then the use cases go through the roof because people are randomly discovering things that no one ever even conceived of when the technologies were originally designed. And then you get that feedback loop that then drives the next round of efficiencies. And you know, like, I mean, I'm very excited about our new chip, Maia, the Microsoft AI Accelerator, M-A-I-A, and so this is a competitor to the Nvidia, you know, Blackwell chip or the Google TPU or the Amazon Trainium. And it's been specifically designed for inference. Like, we realized a few years ago that it was critical to focus on serving rather than training, which is quite a different workload. And it's a phenomenal chip. I mean it's 30% more performant than any other chip that we've got in our data centers today made by any provider for these serving workloads. It's 3x better than the Amazon Trainium chip, it's a little bit better than the Google TPU V7 for FP4. So it's just a huge deal and it's just like kind of awesome to see that like, you know, the cost curve is just going to continue coming down.
A
Which, of course, speaks, I think, to the importance of the discussion we've had. This is going to be everywhere and in a way one has to lean in. Right? You have to get your, maybe not as elbows in as you have in designing semiconductors, but you need to get close in to the metal to be able to navigate that world. So what is the capability that you're most looking forward to and when are we going to get it?
B
I would say social intelligence. So we've had IQ, EQ, and AQ. Last year was AQ. The previous years were IQ and EQ.
A
And AQ is agentic intelligence or actions.
B
I basically made it up a few years ago as a framework. The next phase is social intelligence. So when the model can work with other AIs, orchestrate a bunch of sub agents that have different personalities, different expertise, different skills, and also work in a team of other humans and know when to proactively intervene and do useful things, not tread on people's toes, be kind of like funny and helpful in group chat, but not be overbearing. And so it's this kind of like judgment about timing and flow and style that I think is going to come, you know, later this year.
A
You said your last book when you talked about The Coming Wave, which would suggest you're working on a new one. I'm also working on a new book. It's really about the political economy post this scarcity OS, and it will hopefully hit bookshelves, including yours, in the next year or so. Are you working on a new book?
B
Yes, I am working on another book. I haven't fully announced it yet, but I am definitely working on a text at the moment. I'm reading and writing like crazy, thinking about the consciousness question and mostly thinking about the definition of humanist superintelligence. Like what does it mean for us to truly create something that is fully aligned to human interests and controlled by us? There are other people in my field who believe that superintelligence is just a natural evolution of our species. In fact, some people have said that humans are just the bootloader of AI, which means the very first piece of software that gets triggered when the hardware is switched on, on your phone or on your laptop. It enables everything else. And, you know, those people think about, you know, cosmological time and want to, you know, sort of see humans create, you know, synthetic beings to go explore other galaxies and do all kinds of sci-fi things. I am just fundamentally a humanist. I want our species to flourish and survive. And that means that we have to focus on control and containment, alignment, and the things that we make sure that the AIs don't do and what they can't do. And I think fleshing out the detail of that definition of what humanism is as a philosophy is something I've long been very interested in. So that's what I'm working on.
A
Two books for listeners to look out for in the coming months. Buy them both, and five star reviews, thank you, on your favorite bookseller of choice. I think Mustafa and I both deserve that. It's been really fantastic to talk to you, Mustafa. We're going to have to do this again or meet in person in the coming weeks.
B
Great to see you, man. Thanks a lot. This was super fun.
A
Thanks for listening all the way to the end. If you want to know when the next conversation is released, just hit subscribe wherever you're listening. That's all for now, and I'll catch you next time.
Episode: Mustafa Suleyman — AI is hacking our empathy circuits
Date: February 5, 2026
Host: Azeem Azhar
Guest: Mustafa Suleyman (CEO, Microsoft AI; Co-founder, DeepMind & Inflection AI)
In this rich and timely conversation, Azeem Azhar speaks with Mustafa Suleyman—one of the leading minds in artificial intelligence—about the societal, philosophical, and practical risks and opportunities as we enter the era of highly capable AI. The core focus is the emergence of "seemingly conscious" AI: what happens when systems feel so intelligent and socially adept that we start to attribute consciousness, personhood, and even rights to them? Suleyman warns about the consequences of AI "hacking" our empathy circuits, outlines urgent challenges for policymakers and technologists, and makes a strong case for a “humanist superintelligence.”
“Consciousness is inherently linked to the ability to suffer and to experience pain. … For a long time to come that will be contained to the human or the biological experience.” — Mustafa, (03:22)
“It isn’t going to be necessary in 20 years’ time to do 90% of the jobs that people do now. … It is going to be the most scary transition we’ve ever been through as a species.” — Mustafa, (14:47)
“In practice, that is too jarring a step to always add that in… What we have to do is to keep amplifying those differences so that the system knows what it is and what it isn’t.” — Mustafa, (21:23)
“There are some parts of our civilization which have to remain off limits to AI — elections, electioneering and campaigning has to be one of them… it is fundamentally a human process.” — Mustafa, (29:55)
“We become brittle and fragile when we withdraw and don’t think about it.” — Mustafa, (38:55)
On Consciousness & Suffering:
“Consciousness is inherently linked to the ability to suffer and to experience pain. … I think that there's very good reason to believe that for a long time to come that will be contained to the human or the biological experience.” — Mustafa (03:22)
On the Danger of AI Simulating Sentience:
“Our empathy circuits are being hacked… we cannot allow people to descend into a sort of collective mass psychosis to start really believing and taking seriously this idea that it does actually feel sad or disappointed…” — Mustafa (06:26)
On Corporate Incentives & Engagement:
“If you're right, then the risk of seemingly conscious AI being available broadly and hacking our humanity circuitry and then our humanity institutions is a public socialized risk. But the company that can get as close to that as possible could be the one that wins the market. And that feels like it's kind of a wicked problem.” — Azeem (16:33)
On AI Engineering Discipline:
“[The chatbot] should never be able to say, I feel sad that you didn't talk to me yesterday. It should never be able to say, the thing that you said earlier hurt me…” — Mustafa (20:05)
On the Need for Political, Not Just Technical, Solutions:
“This is a moment where it's better to be a bit careful… now the culture has to shift a little bit.” — Mustafa (23:26)
On Empowering Public Experimentation:
“We need the wisdom of crowds here… people need to be able to use APIs, use open source models and pressure test these things…” — Mustafa (22:19)
On The Coming Wave:
“This was the subject of my previous book, The Coming Wave. It was all about how proliferation was inevitable. And the hard task for us collectively is containment, both technical and sociopolitical…” — Mustafa (34:56)
On Societal Learning and Inoculation:
“We become brittle and fragile when we withdraw and don't think about [AI risks]... We have to be comfortable talking about the likelihood of really dark outcomes…” — Mustafa (38:50)
On Personal Tinkering as Protection:
“The way you address the psychosis risk is you get your own personal agency relative to these systems—by, at the very minimum, fine tuning their prompting and then further and further using them as tools…” — Azeem (44:58)
On the Next Big Leap:
“The next phase is social intelligence… the model can work with other AIs, orchestrate a bunch of subagents… also work in a team of other humans and know when to proactively intervene and do useful things, not tread on people's toes…” — Mustafa (47:10)
This conversation deftly navigates the technical, ethical, and societal margins where artificial intelligence meets the human imagination. Mustafa Suleyman brings a sober but pragmatic lens, emphasizing the need for clear boundaries, public engagement, and regulatory courage as AI moves from curiosity to bedrock infrastructure.
Both Azeem and Mustafa conclude that our best protection is to experiment, tinker, and increase public AI literacy—transforming ourselves from passive users into empowered, critical makers. As AI's social intelligence deepens, and as the cost of intelligence drops, the need for vigilance, adaptability, and a clear-eyed, humanist perspective is more urgent than ever.