
Podcast Summary: Decoder with Nilay Patel

Episode: Microsoft AI chief thinks superintelligence is near, but won't take your job
Host: Nilay Patel (The Verge)
Guest: Mustafa Suleiman, CEO of Microsoft AI
Date: June 8, 2026

Overview of the Episode

This episode features a wide-ranging conversation between Verge Editor-in-Chief Nilay Patel and Mustafa Suleiman, CEO of Microsoft AI. They explore Microsoft’s evolving relationship with OpenAI, the company’s new focus on building its own AI models at the frontier, the reality and challenge of “superintelligence,” shifting public and enterprise attitudes to AI, debates over jobs and automation, data and copyright, product form factors, and the philosophical and ethical tensions as AI advances toward (or past) human-level intelligence.

Suleiman offers an insider view of Microsoft’s strategy, key technical choices, the difference between tasks and jobs when it comes to AI automation, and his deep skepticism of anthropomorphizing AI tools like Claude. The tone is open, candid, and at times confrontational, as Patel presses Suleiman on hard questions about value, public sentiment, and the future of work.

Episode Breakdown

1. Microsoft’s Shift in AI Strategy

- Structural Changes at Microsoft AI

In October 2025, Microsoft finalized a renegotiated contract with OpenAI that frees it to pursue superintelligence independently and build its own frontier models (03:53).
Suleiman’s current mandate: build in-house models, assemble a superintelligence team, invest in custom AI chips (“Maia 200”), and pursue self-sufficiency.

- Evolving Partnership with OpenAI

Microsoft intends to both use OpenAI models and build its own, ensuring it is not “structurally dependent” on a third party for core technology (06:33).
Patel cites a message from Satya Nadella that surfaced at trial: “I don’t want to be Intel and have OpenAI be Microsoft.” (08:49)

- Decision-Making Framework

Microsoft AI operates on a six- to eight-week cycle with mixed squads led by Directly Responsible Individuals (DRIs) separate from managers (15:01).
“...the optimal time for making very clear falsifiable missions.” (15:01)

2. Superintelligence, AGI, and the Singularity

- Defining Terms & Empirical Progress

Suleiman argues that “superintelligence is coming,” extrapolating from years of log-linear progress on model benchmarks, compute, and data (17:41).
AGI: Parity with humans at most tasks.
Superintelligence: Greatly exceeds humans, creates new knowledge, self-improves.
Singularity: Self-improving AI recursively grows capabilities, “decades away.” (78:05)

“A superintelligence is a general purpose learner that can basically immediately understand a brand new domain which is out of distribution. So it needs to be able to learn in a novel environment from scratch…” (69:13)

3. Building New Models & Technical Choices

- Build 2026 Announcements

Microsoft announced seven new models across modalities, including the flagship reasoning model “MAI Thinking One.”
Focus on high-quality, curated, and (where possible) purchased data, model efficiency, and integration between chips, models, and data (30:31).

- No Distillation from Other Models

Deliberately avoided “distillation” (training on the outputs of other models) so that its models keep the potential to surpass “teacher” models and to foster true in-house innovation (33:35).
On IP and distillation: “It’s a short-term win... We want to create a culture in the lab where we can come up with the next big thinking breakthrough...” (36:13)

4. Data, Copyright, and Open Web Controversies

- Data Curation

Microsoft uses a combination of open web and paid/licensed data, with significant focus on filtering, security, and third-party dependencies (38:18).

- Creative & Publisher Pushback

Acknowledges growing frustration from creatives over data scraping for AI training and recognizes it’s a “tricky one” still being worked out in the courts (37:47).

5. Enterprise vs. Consumer AI

- Enterprise Focus

Enterprise adoption presents near-term product-market fit: companies have proprietary data and clear repeatable processes for automation (38:55).
Consumer antipathy toward AI is rising—data centers get pushback, consumer-facing AI products are seen as underwhelming, with skepticism about value exchange (39:45).

- Is There a Killer Consumer AI Product?

Suleiman defends chatbots’ value at global scale, offers vignettes: “empathize a little bit with the small scale business owner or the kind of mom that’s helping her kid with the homework ... get feedback, get instructions, get essay questions set.” (40:38)
Patel pushes: Value not clear enough—public wants clear benefits to match the societal costs and resource demands (41:39).

- Healthcare as a Use Case

Announced partnership with Mayo Clinic to build a foundation health model: “That’s why I go in the field... to make a difference and leave a good legacy for everybody.” (42:21)

6. The Jobs Debate – Tasks vs. Roles

- Clarifying Automation Fears

Suleiman distinguishes between automating “tasks” vs. complete “jobs,” referencing labor economics: AI will free up time for more creative/judgment-centric work, not simply eliminate roles (49:41).
“There’s always these revenge effects of efficiency…” – Technology makes workers busier, not obsolete (49:41).

“Tasks will increasingly become digitized, automated... that does not mean the role goes away at all. It just means the work can be done faster and more efficiently.” (49:41)

- Did AI Overpromise?

Patel questions if public and political pushback has forced the industry to soften messaging and promises. Suleiman reiterates longstanding beliefs and points to consistent healthcare focus (55:26).

7. Resource Demands, Public Sentiment, and Governance

- Microsoft’s Net Zero Data Centers

Highlights commitments to sustainability and local community protection even as data center investments soar (58:32).
Suggests AI’s progress is driven by a “collective mesh” of incentives and pushback, arguing the process is working as intended (58:32).

- Enterprise Adoption and Value Realization

Acknowledges “token-maxing” excess, but claims real productivity gains in enterprise (61:46).

8. Form Factors and the Future of Devices

- Experimentation Beyond Phones

AI wearables, agents, and badges could eventually displace or disintermediate the smartphone (63:09).
Believes in a hybrid future: local device intelligence for “easy” tasks, cloud compute for complex reasoning, dynamic context switching (64:33).

“I think we’ve become so used to the phone that everyone just assumes that this is going to be an anchor device for the rest of history. But ... many of the features and functionality of your phone, I think, are going to get disintermediated, broken apart and stored on smaller devices.” (67:19)

9. Knowledge, Consciousness, and AI Philosophy

- LLMs and AGI

LLMs will take us far, but full superintelligence likely requires new breakthroughs; LLMs are a “highly compressed representation of knowledge” with useful ambiguity (69:13).

- On Embodiment and Consciousness

Contrasts sharply with Anthropic’s murmurs about AI consciousness: Argues it is “dangerous and philosophically failing” to imbue models with rights or consciousness (“suffering is inherently biological... that’s not how these models are trained”) (72:25).

“I think it’s very dangerous to project potential rights onto beings, tools, agents that have the potential to be significantly more capable than us in many respects.” (73:23)

10. Final Definitions and Outlook

Suleiman’s definitions:
- AGI: “As good as most people at most things.”
- Superintelligence: “Dramatically exceeds human performance, discovers new knowledge, self-improves.”
- Singularity: “A superintelligence that can actually self-improve itself ... infinitely accelerate ... wacky for my taste.” (78:05)

- Timeframe: Foothills of superintelligence, but true singularity is “decades away.”

Key lingering challenge: “How are we going to govern it, how are we going to control it? And how are we going to make sure that it serves humanity and not end up causing us more harm than good?” (77:53)

Notable Quotes & Timestamps

Nilay Patel: “Satya [Nadella] says, ‘I don’t want to be Intel and have OpenAI be Microsoft.’” (08:49)
Mustafa Suleiman: “Superintelligence is coming. I think it’s just around the corner... no way ... we could be structurally dependent on a third party for providing that IP for all eternity.” (06:33)
Mustafa Suleiman: “A superintelligence is a general purpose learner that can basically immediately understand a brand new domain...” (69:13)
Mustafa Suleiman: “Tasks will increasingly become digitized, automated ... that does not mean the role goes away at all.” (49:41)
Mustafa Suleiman: “We want AIs to be controllable, contained, accountable, aligned tools that serve humanity. That’s the project of humanist superintelligence.” (73:23)
Mustafa Suleiman: “I think it’s very dangerous to project potential rights onto beings, tools, agents that have the potential to be significantly more capable than us in many respects.” (73:23)
Nilay Patel: “I think for me the question is whether the industry as a whole misjudged the total amount of value it could provide to overcome the seeming recklessness that people are now reacting to.” (57:15)

Key Segments & Timestamps

[03:53] – Microsoft’s restructuring and mission shift
[06:33] – OpenAI partnership evolution and rationale
[15:01] – Decision-making and team structure at Microsoft AI
[17:41] – Superintelligence, AGI, trajectory
[30:31] – MAI Thinking One: technical approach and innovations
[33:35] – Model distillation, IP, and research ethos
[37:47] – Copyright, open web, creative pushback
[38:55] – Enterprise vs. consumer attitudes to AI
[49:41] – Task vs. job automation and the jobs apocalypse debate
[58:32] – Resource allocation, sustainability, and community impact
[63:09] – Post-smartphone device paradigm and AI hardware
[69:13] – LLMs, generality, and the road to superintelligence
[72:25] – Anthropic, AI consciousness, training philosophy
[78:05] – Definitions: AGI, superintelligence, singularity

Memorable Moments

Patel’s challenge on Suleiman’s “most white-collar tasks will be automated” quote and Suleiman’s detailed clarification.
The “Instagram thirst trap” analogy for Microsoft’s break from OpenAI (13:17).
Detailed, critical take on Anthropic’s approach to AI rights/consciousness, calling it a “philosophical failing” and “dangerous” (72:25).
Patel’s ongoing pushback about the gap between AI industry ambitions/communication and actual consumer value realization.

Tone & Closing Thoughts

The conversation is forthright and at times combative, with Patel pushing hard on industry inconsistencies and Suleiman offering thoughtful, sometimes nuanced explanations. Suleiman projects both ambition (superintelligence, in-house innovation, redefining computing) and caution (human-centered goals, ethical diligence, skepticism toward speculative claims of consciousness). The episode is a must-listen for anyone tracking the future of AI, corporate strategy, and the cultural and ethical dilemmas posed by rapid technological advance.


Transcript

[00:01]
Podcast Host / Advertiser
Support for the show comes from ServiceNow. AI is moving fast across the enterprise, but without visibility, it's just chaos. Different tools, different models, different teams using AI in completely different ways. ServiceNow turns that chaos into control. With the AI Control Tower, you see all your AI across the business in one place: what it's doing, what it's done, and what it's about to do. So you stay in control. To put AI to work for people, visit servicenow.com. Support for this show comes from Klaviyo. Imagine hiring two brilliant employees. The first takes your marketing from idea to full campaign (email, SMS, push) in the time it takes to describe it. The second handles every customer conversation 24/7: answering questions, recommending products, handling orders. Both on brand and always on. Your next hires: Klaviyo's AI agents. Get started at klaviyo.com.
[01:12]
Study and play come together on a Windows 11 PC, and for a limited time, college students get the best of both worlds. Get the unreal college deal: everything you need to study and play. With select Windows 11 PCs, eligible students get a year of Microsoft 365 Premium and a year of Xbox Game Pass Ultimate with a custom color Xbox wireless controller. Learn more at windows.com/studentoffer. While supplies last. Ends June 30th. Terms at aka.ms/collegepc.
[01:43]
Nilay Patel
Hello and welcome to Decoder. I'm Nilay Patel, editor-in-chief of The Verge, and Decoder is my show about big ideas and other problems. Today I'm talking to Mustafa Suleiman, the CEO of Microsoft AI, and I'm actually gonna keep this intro pretty short. First, if you're watching the video, you can tell that I'm working from the basement of my wife's family farm. But second, and way more importantly, this is a real burner of an episode. Mustafa and I covered everything from his approach to training new models to his deep criticisms of Anthropic talking about Claude as though it's conscious. Of course, we also talked about all of the AI announcements Microsoft just made at Build, its developer conference; the company's relationship with OpenAI, which is very different than it used to be; and the cultural and political pushback to AI across the country. I really wanted to know how Mustafa was thinking about it, and whether any of the consumer AI products available today are enough to overcome those objections. Like I said, this one's a burner, and Mustafa was down to talk about all of it. Okay, Mustafa Suleiman, CEO of Microsoft AI. Here we go. Mustafa Suleiman, you are the CEO of Microsoft AI. Welcome back to Decoder.
[02:55]
Mustafa Suleiman
Nilay, great to be with you again.
[02:57]
Nilay Patel
Yeah, I'm very excited to talk to you. I think our previous conversation was one of my favorite conversations about AI and how it should make us feel and what it's for that I've had, in all the conversations about AI that we've had. There are some big changes at Microsoft, maybe some very important recontextualization about how people feel about AI, that I want to talk to you about in particular. And then there's Microsoft Build, the big Microsoft developer conference. Lots of new announcements, lots of big ideas about what computers are for and maybe where they should be that I want to get into. Let's start at the very start. This is some deep Decoder stuff that is important to understand before all the rest of it. Since you joined Microsoft, you have restructured how AI works there. Your role has changed. The last time I talked to you, you were in charge of a bunch of consumer products that have since been set aside. You're now training new models. You're on the frontier. Explain how Microsoft AI is structured now and how it's structured inside Microsoft.
[03:54]
Mustafa Suleiman
Yeah, so, I mean, I guess the last 15 to 18 months or so, we've been on this journey to reestablish our relationship with OpenAI. And it's taken a minute. I think it culminated in a new contract that we got done finally in October of last year. And there were lots and lots of different provisions in that, including cementing and extending the partnership, but crucially freeing us up to be able to pursue superintelligence independently, as well as keep buying and licensing their models. So since October, I've been assembling the superintelligence team, building clusters of sufficient scale to train frontier models, you know, hiring a team focused on superintelligence. And so that was quite a big shift for us because it sort of enabled me to focus just on the superintelligence mission. And that has then culminated in a few things that we announced this week at Build. We have seven new models across all the modalities and so on. So it's been a pretty big shift and I think a long time in the planning and a great relief for us to now be in the game and pursuing the absolute frontier over the next few years.
[05:04]
Nilay Patel
Was this the plan when you were hired at Microsoft?
[05:08]
Mustafa Suleiman
It's been the plan certainly for the last 18 months. I mean, I think the relationship with OpenAI has gone through lots of ups and downs, and in many ways I think is going to go down as one of the most successful partnerships in history. It's been great for OpenAI and it's been great for Microsoft, and all good relationships evolve. And I think this is just the next stage in our evolution.
[05:34]
Nilay Patel
Let me ask you about that evolution specifically. We all just saw the trial between Elon Musk and OpenAI and Sam Altman. Microsoft was involved in that trial in the sense that every so often a lawyer from Microsoft would stand up and say, "And we weren't around," and someone would say, "Yes," and that was that. But obviously what came out during that trial, what has been clear during this entire time, is that the original notion was that OpenAI would be a research lab and provide models and that Microsoft would build the products. And Microsoft had expertise in going to market, it had expertise in enterprise, it was trying to regain a foothold in consumer in a variety of ways, and that this would be a platform shift and that the research work would be over at OpenAI and the product work would be inside of Microsoft. That's the thing that changed as OpenAI wanted to make more and more consumer products. Obviously, given your new role and your new focus, Microsoft more and more wants to make its own models. Why the split? What didn't work in that relationship?
[06:34]
Mustafa Suleiman
I mean, I think OpenAI is led by an incredibly ambitious founding team and Sam himself. And so naturally, as they started to get more traction and generate a ton of revenue, they saw opportunities to go full stack. So it wasn't just that they started working on consumer products. Obviously, ChatGPT was incredibly successful, but they also started working on their own data centers, they started creating their own chip. There's lots of rumors flowing around about their own consumer hardware devices. They started taking models direct to market through ChatGPT Enterprise. So across the stack, they were kind of broadening way beyond research over the last two, three, four years. And naturally, the same is also true for Microsoft. I think the partnership's now five or six years old and still has another four, five, six years to run. And likewise, we're one of the largest technology companies in the world. 493 of the 500 largest companies store and process most of their data on our systems, use Azure, use M365 and Teams. I think people often underappreciate how enormous we are and how big our distribution is in enterprise. And so long term, and I do mean over the 5, 6, 7, 10 years, we have to make sure that we're completely sustainable and we're not just a recipient of somebody else's IP that we then slightly modify and adapt and put into production for our products. But we actually have the ability to stand on our own two feet and create world-class models. I mean, superintelligence is coming. I think it's just around the corner. And so I think it's going to be basically the most valuable technology of all time. And there's sort of no way that long term we could be structurally dependent on a third party for providing that IP for all eternity. And so that's been the transition that, you know, obviously was triggered, you know, sort of when OpenAI and so on had their board issue. But then as I came in and my team came in, we started building that out. We're on that transition and I think we're in a great spot because we can take a fairly steady, you know, careful, long-term, optimal position both for OpenAI, which I think has done incredibly well out of this, and for us.
[08:50]
Nilay Patel
Yeah, I want to spend some time on "superintelligence is right around the corner." I just want to put a pin in it now because I just want to kind of understand the transition for one more turn here. There's a moment in the trial. It's sort of a very funny message from Microsoft CEO Satya Nadella. He says, "I don't want to be Intel and have OpenAI be Microsoft." Which is very funny in the context of Microsoft's CEO himself saying, I don't want to be the provider and have them be the platform that provides all the value and collects all the value, and maybe we'll be swapped out. Right. I don't want ChatGPT to run on Azure and then OpenAI will go get all the value and then maybe they can swap us out, just as happened with Windows and Intel over time. Is that a realization? Did Nadella come to you? What was that meeting like where he said, okay, OpenAI has had its board issues, we need to get back on the frontier and stand on our own two feet? What did that conversation look like and how was that decision made?
[09:44]
Mustafa Suleiman
I mean, obviously that's Satya's decision and, you know, as well as like Amy and Brad, many other people in the company. But I think it's as with anything, you know, these are slow-moving changes in the company as it comes to realize that a direction that we're taking, you know, needs a little bit of tweaking and adjustment. And so that was happening, you know, way before, you know, the sort of November board incident. And I think it just builds up over time as you look at, you know, the kind of constellation of different fronts, you know, around which we're competing directly, increasingly all the tension that comes from that, but also just knowing that partnerships like that don't last forever. I mean, OpenAI wants to be a trillion-dollar public company, has incredible revenues, is growing like crazy. They want to have the freedom to operate and be able to buy compute from all sorts of other places, build their own compute, partner with whoever they want. So the initial construct was the contract was formed at a time when the companies were very different in terms of size and scale and balance of needs and stuff. And so I think it made sense for that moment. But then it became pretty clear that this is something that we have to be able to own and control ourselves and do right by our own customers. Like I said, we have an incredible distribution on enterprise which I think is just completely unrivaled in the world. And so we have to make sure we're building the best things for our customers. And that looks slightly different to a company that has been jointly optimizing both for the consumer with ChatGPT and also for the enterprise and also for the fundamental science mission of superintelligence, which includes a whole bunch of different directions which are overlapping but could arguably be said to be orthogonal to the consumer and the enterprise directions too. So naturally I think that's how partnerships evolve and they get reset periodically.
[11:43]
Nilay Patel
Yeah, but building a frontier model is very expensive, I'm told, reliably told, this is a very expensive project to set about on. At some point Amy Hood, the CFO of Microsoft has to say, yep, you've got the budget. When did that happen? And was that just a text message? Was there a meeting? Tell me about the specifics there.
[12:01]
Mustafa Suleiman
I think, look, we sort of made the decision in the early part of last year, which obviously informed all the contract negotiations, which then all got, you know, resolved and signed in October. And you know, it is a significant investment, but we have a long time to make it. I mean, we've already made significant investments in our own self-sufficiency mission. Our Maia 200 chip is actually an outstanding chip, as one example. Right. I mean, we now are able to manufacture and ship a chip that is 30% cheaper than a GB200 inside of our own clusters. And now that we can co-design our own models with it, the MAI Thinking One model that we've just released actually delivers a 1.4x performance-per-watt improvement on top of the 30% improvement that you get from running on a Maia 200, once we co-optimize the models for our tasks. So the value of making sure that you own and control your own stack and direct the entire co-design effort end to end for the use cases that are most important to us, which is obviously agentic coding, our developers, our enterprises, I mean, that clearly pays the dividends that justify the investment that we have to make over the next few years.
[13:18]
Nilay Patel
Yeah, you said self-sufficiency mission, which is a very polite way of saying you want to stand on your own two feet, you want to do your own thing. I'm told there's some controversy inside of Microsoft about a line my colleague Hayden Field wrote in a piece describing Build. I'm just going to read this. This is from Hayden. It's a great line. She said, "This year's Microsoft Build had the vibe of a freshly single divorcee posting a thirst trap on Instagram." Right. The breakup has completed. It's time to flex. Here's our new model. We're gonna stand on our own two feet. You're out there saying you're gonna build models at the frontier and compete with the leading labs. Is that the feeling inside of Microsoft, that you're free, you're free to be on your own?
[13:53]
Mustafa Suleiman
Definitely not, no, no, not at all. Look, I mean, obviously that's a cool headline and a fun phrase, but the reality is we are in partnership for years and years to come. We're running way north of 2030. They still produce the best models in the world. Five Five is an outstanding model. The Codex, the cybersecurity models that are coming through are amazing and they're powering the majority of what we do. So naturally that's going to continue. And so I think that's just the natural course of these sorts of partnerships. I don't think it's anything untoward or surprising. I think OpenAI is very understanding and supportive of that. They've obviously been an incredibly fast-growing company and they understand that we have to, you know, pursue our own agenda as well. So it's very normal.
[14:38]
Nilay Patel
Let me ask you the other Decoder question, then I want to get into the announcements at Build and certainly superintelligence. The last time we spoke, you said your framework for making decisions operated on a six-week cycle. Given how fast AI was moving, that made sense then. Things have settled. Maybe some things are more in focus. What is your decision-making framework now?
[15:01]
Mustafa Suleiman
We still operate by the same cycle rhythm. At the end of each cycle we have a one-week meetup in person. I'm a real believer in this, even though we're still an in-office culture, four days a week. In fact, the week after next, my entire superintelligence team comes together in Boston in person for four days. And that is for all of our retrospectives on how Build went, what we learned, what we didn't get right, what we need to improve, our planning for the next cycle, which is going to run for eight weeks this time, with a one-week meetup afterwards. And that's all laid out for the entire year. So the whole organization knows that that's the rhythm by which we operate. And I think it's actually really important to emphasize that timeframe because quarterly planning gets a little bit blurry and a bit abstract. And I think six to eight weeks, depending on where it falls in the calendar, is actually the optimal time for making very clear falsifiable missions. In addition to the cycle rhythm of these six- to eight-week cycles, we also operate by squads. Squads are mixed interdisciplinary subgroups that are focused on a specific mission and they don't necessarily ladder up to the manager. They actually are run by a DRI, and the DRI is often an IC
[16:21]
Nilay Patel
And their job... that's directly responsible individual and individual contributor.
[16:26]
Mustafa Suleiman
Yeah, exactly. Thank you. And I think we've taken the approach of separating the role of the manager from the role of the DRI that executes on a specific mission. And I think that's because being a great DRI is exhausting. You're literally all in 24 hours a day and you're pushing as hard as you possibly can. Being a manager is often about being a coach, offering support, giving guidance, feedback, unblocking, all sorts of things, helping with people's career growth. And so I think keeping those separate allows us to rotate DRIs every two or three cycles so that some people can try sort of different positions and have rotation. And it's a great, very flexible structure that allows us to be pretty nimble, I think.
[17:11]
Nilay Patel
Let's talk about Build. I want to start with superintelligence. You've mentioned it several times now. I was just at Google I/O. Demis Hassabis, who used to be your colleague when you were at Google, ended that keynote by saying that we were in the foothills of the singularity and that AGI was coming, with all the power of Google. You're saying superintelligence is here. Are these all the same things? Are we using different language to describe AGI? Are there differences? How would you define superintelligence in your context versus the singularity in Demis's?
[17:42]
Mustafa Suleiman
Yeah, yeah. I mean, obviously I didn't say it was here. I say it's coming. And I think obviously there's a lot of fluidity around these phrases, but I think what we can clearly see that's happening right now is that there is log-linear hill climbing across all modalities. And that means that there is a very direct relationship between each order of magnitude of compute that we apply, each order of magnitude or each incremental increase in data, and climbing on benchmarks, whether they're public benchmarks, internal benchmarks, they're targets that we focus on with reinforcement learning environments. And that is a very important observation. Those predictions that I think we're all making, I understand why some people are sort of skeptical of them or raise questions, but they're very grounded in the sort of empirical observations of over a decade of increase in performance of these models. I mean, essentially the same general-purpose architecture has seen 12 orders of magnitude more computation applied, a trillion-fold increase in FLOPs over 15 years, and basically has worked in audio, in image, in text, in code, in many other time-series prediction tasks. And so we're basically extrapolating out that more orders of magnitude of compute will enable us to continue to climb in this log-linear way inside of other environments. And then it raises the question of, are we going to be able to train models that can invent new knowledge, not just sort of extrapolate from existing data that we have, but actually teach us things that we don't know and make new discoveries? And then the second thing is, do they have the capacity to self-improve and accelerate the process of deciding which hypotheses should be set, which ones should be pursued, how to generate training data for each of those, how to factor those into new runs, or even innovate on the actual architecture itself. So I think both of those things need to be true to be able to see this compounding progress. But I think we're going to continue to get massive gains just from applying the next few orders of magnitude of compute, and that probably does achieve parity with human performance on many, many more tasks, just as we've seen that happen in the last six months on coding.
[20:11]
Nilay Patel
Coding is really interesting because it's easily validated. You write the code, you ask the computer to run it, it runs or fails. We've seen some of the downsides, certainly around security. The downsides are obvious and we're seeing this regulatory approach to coding security play out in lots of ways. I've probably vibe coded some security disasters on my own phone and computer, and maybe that's just a risk I'm willing to take. Every other function, it doesn't seem that easy. I always pick on law because that's my background. But a judge doesn't validate legal writing the way a computer validates code. If you get it wrong, the judge can send you to jail. That is maybe the worst output validation error that you can probably run into. How do you measure the effectiveness across domains as easily as you can measure the effectiveness in coding? Because this seems to me where the metaphor or the analogy from coding to other domains falls apart very quickly.
[21:07]
Mustafa Suleiman
I'm not so sure. So, I mean, coding, obviously you can verify the correct execution of code. It runs or it crashes, but there's a ton of nuance in that. The quality of the code that gets written really matters. Its extensibility, how reconfigurable it is, how useful it is in practice. It's not just that a piece of code runs. It's like, how does a model actually use it as a DevOps or an SRE in production, or to kind of return to that piece of code that it's written and then use it in a practical and useful way? And then of course, you have to grade the quality of the output that has been produced. Like it may be high quality, functioning code, but is it actually the app or the website that you wanted? And there are aesthetic judgments in that, there's commercial judgments in that. So the challenge of internalizing non-verifiable rewards is present in code, even though code is still primarily a verifiable reward signal. And I think the other thing to observe is that chat is also a non-verifiable space. And yet we've managed to climb that to basically human level performance through interaction with real world usage. That provides a very strong...
[22:13]
Nilay Patel
Tell me how you measure. I'm very curious. How have you measured chat at human level performance?
[22:17]
Mustafa Suleiman
Well, so I think many people are having long conversations, meaningful conversations with AIs at human level performance. I mean, the quality is exceptionally good. It has very good emotional intelligence. It's broadly very accurate. We've minimized the hallucinations. We don't talk so much about bias anymore. It's grounded in real world observations. I think by most people's measures, we've got to human level performance in conversation for quite a wide range of tasks.
[22:47]
Nilay Patel
Now, your measures, I'm actually sure most people's measures, I would disagree with almost all of this, but those are my measures. What are your measures?
[22:56]
Mustafa Suleiman
I mean, my measure is when I turn to my assistant and ask it to provide me with a daily briefing summarizing all the conversations that have happened on Teams and on email, the updates that have happened to documents. And I get basically a synthesized summary with a set of actions that I should take next, which is basically better than what my chief of staff can produce. I would say that's human level performance in synthesis, analysis, proposed actions, and chat. I mean, there are many, many millions of people every day that are using it for emotional support, for counseling, for therapy, for coaching, for advice. I think it's one of the most popular use cases inside all of the chatbots. So that's a pretty robust measure, I would say, to make the claim.
[23:39]
Nilay Patel
I know you've spent a lot of time thinking about this, particularly the emotional connection to some of these chatbots. These are products that you have built and deployed. I would draw a pretty big distinction from this thing is really, really good at summarizing my email and task list and providing me a brief about what things to prioritize. And this thing is an emotional coach for somebody undergoing some kind of crisis. Like, those are not similar tasks. Those are not similar kinds of intelligence, even in people necessarily. I know some people who are very good at making lists and are very bad at emotional support. How do you put that all together in your brain and say, okay, this is broadly human level performance in chat?
[24:18]
Mustafa Suleiman
Well, I mean, I think if you define chat as an interactive exchange between two parties, one of which in this case is an AI, that broadly satisfies some goal. You're looking to learn the sports score, you're looking for advice on which restaurant to go to. You're looking for coaching and feedback on an essay that you've written. You're looking for suggestions about which job to take next or some tough conversation you're about to have with your manager. You get a response, you go back and forth, you have five or six exchanges, and you find that a useful output, which you might otherwise have to go rely on an expert friend or even pay a coach. There are just objectively, empirically speaking, hundreds of millions of people that get that experience every day from these chatbots. So maybe we could quibble over whether that technically represents human level performance. I think it's a fairly reasonable thing to claim, and I think there's no reason why that isn't going to continue climbing. Right? I mean, the rate of climbing in the last three years is the thing that I think is most staggering. And so what we're trying to do from this point is extrapolate, okay, what are the fundamental drivers of that climb? Compute, data, interaction from real world users, and those things look set to continue. So I think that I would expect that they apply to many other domains too. Not just sort of, I don't know, chat or emotional support and productivity and, you know, that kind of thing, but many other domains beyond that: to healthcare, you know, to live production deployments inside of education, you know, to assistants that are increasingly managing your home, you know, looking at just everything that is in your everyday life, basically, to make you kind of more productive. So that's, I think, a trajectory that's likely to continue.
[26:07]
Nilay Patel
We need to take a quick break. We'll be right back.
[29:56]
Nilay Patel
We're back with Microsoft AI CEO Mustafa Suleiman. Well, this is interesting. You've mentioned now that it's still the same fundamental architecture: transformers, attention. We've been applying compute for 15 years. We're getting these big increases. You are in a fairly unique spot. At Build, you announced your first flagship reasoning model, MAI Thinking one. You got to start from scratch. Is there anything you've done differently now after 15 years in architecting and training this model? Or is it just, yep, we're going to collect all the data and run the training run just as we did, and we have more compute now, so it's going to be better?
[30:31]
Mustafa Suleiman
No, actually, I think there's actually quite a lot of differences. The first thing to say is the way that you curate the data. We start right at the top of the stack: we basically have paid for and acquired an extremely high quality, very conservative set of data and extracted a lot of the noisy, distracting, low quality, potentially security risk issues to do with that data, and the methods that you do for that I think are actually quite proprietary. We just shared a 109 page, very detailed technical report which was very well received on Twitter, which shares a lot of the details on how we do this. I think the second thing is, whilst I think it's important to be quite cautious with architectural choices, and we have been, there are also a number of pretty significant shifts that I think we've made in how we put together our training run. So our training runs have been incredibly stable. Very few crashes, very few restarts. We shared a lot of those graphs to show infrastructure stability and also MFU efficiency. So model FLOP utilization, which basically shows that we can put basically state of the art number of flops through each chip for every step in our training run. So I think that this is extremely easy to get wrong and we all hear lots of stories from different labs about how things do go wrong and it actually is pretty hard to make the very careful and deliberate choices to get things right and take the right approach to make sure we produce high quality models. Because our job and our ambition is to try and build this hill climbing machine. That means the integration of the silicon with the models, with the super high quality data, with a stack of RLEs, reinforcement learning environments, that allow us to basically systematically hill climb against any objective that we choose. And that's what MAI Thinking one is. It's a general purpose, fairly neutral thinking model that is pretty good at coding.
It's now roughly on par with Opus 4.6, at least on the benchmarks. We haven't deployed it at scale into production, so there's still lots more work to do there. But it's an extremely strong reasoner, 97% on AIME, which is the primary measure for its reasoning performance, at least on the benchmarks. It's very good at instruction following. And then the goal is basically to make that available to many, many developers and enterprises and allow them to climb on it for their use cases. Because everybody has a sort of slightly different objective that they have in their company to try and build agents and so on that support their use case.
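For readers unfamiliar with the MFU metric mentioned in that answer, here is a minimal sketch of how model FLOP utilization is commonly estimated: the FLOPs a training run actually achieves per second, divided by the theoretical peak of the hardware. It assumes the widely used approximation of about 6 FLOPs per parameter per token for training a dense transformer. The function name and every number below (parameter count, throughput, chip count, peak FLOPs per chip) are hypothetical and do not describe MAI Thinking one or any Microsoft hardware.

```python
def model_flops_utilization(params, tokens_per_second, num_chips, peak_flops_per_chip):
    """Estimate MFU as achieved training FLOPs/s over theoretical peak FLOPs/s.

    Uses the common approximation that training a dense transformer costs
    roughly 6 FLOPs per parameter per token (forward plus backward pass).
    """
    achieved_flops_per_second = 6 * params * tokens_per_second
    peak_flops_per_second = num_chips * peak_flops_per_chip
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical example: a 70B-parameter model processing 1M tokens/s
# on 1,024 chips, each with a peak of 1e15 FLOPs/s.
mfu = model_flops_utilization(
    params=70e9,
    tokens_per_second=1.0e6,
    num_chips=1024,
    peak_flops_per_chip=1e15,
)
print(f"MFU: {mfu:.1%}")
```

A higher value means less of the hardware's capacity is lost to communication, stalls, or restarts, which is why the metric is usually reported alongside training stability.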
[33:15]
Nilay Patel
One of the things that you've noted in talking about MAI Thinking one is that you didn't distill any existing models, which actually struck me as surprising. Right. This is a thing you could do. You have access to OpenAI's IP. Everyone's distilling everything. We just found out in this trial that Grok was distilled from a number of models. Why not do distillation here? Why not jump ahead?
[33:35]
Mustafa Suleiman
So there's definitely lots of shortcuts to the frontier. And if you take a super high quality model and you sort of like polish your base model with high quality instructions or answers or outputs from a superior model, it's true that the model might quickly fit to that distribution, but it's very unclear that they would then be able to surpass that teacher. And so we've been very deliberate for two reasons. The first is that we want to make sure that we can exceed the teacher in order to set the frontier ourselves over the next few years. And the second is that we really want to build one of the great labs. And it's going to take us many years to come, probably the next two, three years. But in order to do that, we have to be able to show that we can actually build every component ourselves. We can hire the very best talent in the world. We can push the frontier with actual research rather than just re-implementation, copying or distillation from any other third party. And we're in a great position where we're able to really carefully and meticulously pursue that objective, knowing that we have the resources to buy Anthropic models where they sort of exceed the frontier. We have the resources to put 11,000 different models inside of Foundry. So every one of our developers gets pure optionality. And of course we have the resources to continue to deploy OpenAI models which are obviously outstanding and are at the frontier today. So that's just a natural part of the self sufficiency mission. And it'll take time for us to truly get to the absolute frontier on that. But I think we're in a great spot. We made a ton of progress. I mean this is a very, very strong model. And it wasn't just that model that we released. We've released seven new models simultaneously. Our transcribe model, for example, 1.5 is literally the number one in the world. It's the most cost effective of any of the hyperscalers. It's the highest on accuracy. Our image model is now number two.
Our image editing model is number three, right behind Google and OpenAI. So I think we're well up there with our image and audio. Our code model, Code Flash, is incredibly strong, optimized for VS Code. Really, really a great model that's on par with Sonnet 4.6. So it's really in a great spot this minute.
[35:57]
Nilay Patel
Yeah. Were there any legal or IP concerns with distillation? This is a live issue like out in the world. Anthropic complains of other people distilling their models. There's concerns about Chinese companies distilling models and whether our existing IP agreements can cover that. Did you have any of those concerns to keep you away from it?
[36:14]
Mustafa Suleiman
No, we didn't, but I think I understand why a lot of people get frustrated. I mean, Anthropic have been very frustrated, and some of the rumors around xAI and Meta and obviously the open source models and so on, because essentially that's basically taking the IP and the knowledge that another team has put together and then literally sort of force feeding it into your own model. I think it's a bit of a short term, it's a short term win. And like I said, I mean, really, we want to create a culture in the lab where we can come up with the next big thinking breakthrough or the next big coding breakthrough, or the next big architectural push. Right now we're experimenting with the loop transformer, which is a slightly different variant on the current transformer. Lots of people in the field are looking at it too. No one seems to have quite got it into production yet. But in order to create a culture and a team that can really push the frontier, they have to understand, own and create the full stack as and when they need to, and also use things from third parties whenever we need to too. Our paper, for example, has hundreds of citations grounded in the rest of the literature. So it's very much a contribution back to the field in return for everything that we've learned over the years from all the great publications that have been out there.
[37:29]
Nilay Patel
Can I ask you if you understand the frustration from Anthropic and your peers in AI about distillation? Do you also understand the frustration from creatives and publishers and YouTubers about all the AI companies scraping their work as a collective to make these models? Because that frustration is only getting louder.
[37:48]
Mustafa Suleiman
Yeah, no, I understand the frustration. The open web challenge is one we've talked about before, and I get it. And I see that people are frustrated, and obviously that's working its way through the conversation, the courts, and I see that people put things online and they had different expectations about what the contract was with that being placed online. And it's a tricky one.
[38:12]
Nilay Patel
You mentioned all your data was carefully curated. Did you pay for all the data that you're using to train the new models?
[38:18]
Mustafa Suleiman
I mean, a lot of our data we obviously take from the open web in the normal way. Carefully curated means it is extremely carefully filtered for security, for quality, for third party dependencies from some of the open source data sets, keeping it away from a lot of the Chinese lineages, which I think are very different. Our enterprises want to make sure that when they put something into production, they can trust us, that we've really built it with their needs in mind. I think this is one of the benefits, I think, of being very, very deliberate and patient and being attentive to all the details.
[38:55]
Nilay Patel
You mentioned enterprise. I think this is very interesting. Microsoft is all in on enterprise AI in big ways. Actually, I would even draw the line straight to Asha Sharma: the new head of Xbox is getting rid of AI in a bunch of places and the gamers are happy. Right? There's one reaction to AI in the consumer space, there's another in enterprise. And I think AI has as close to product market fit in enterprise as you can get with something that's changing as fast as AI. There are a bunch of databases that corporations control and you can just go access them because they control them. That's their data. There's a bunch of repeatable processes and tasks and old systems that maybe the models can just do more efficiently. There's something very important happening to enterprise. At the same time, the consumer antipathy towards AI is just increasing. And my argument is we have not built great consumer AI products. This industry has not produced them, it has not shipped them, it has not made it obvious that all of this is worth it. That using all the data from the open web and changing the contract of publishing to a mass audience of people, so now it's being used for training of models that will deliver trillions of dollars of value to corporations. There isn't a product that says this is worth it. Again, Satya Nadella recently gave an interview with Axios and he says we need social permission for this, and until we have it, until we deliver that value, people are going to feel this way. We've seen college speakers get booed, we've seen data centers get banned. Do you think that there's a consumer product that's worth it, that's worth the angst about training, that's worth the angst about data centers? That was your focus. Now your focus is enterprise. I would say that just on the face of it, it doesn't seem like Microsoft has interest in the consumer product anymore. But do you see one that's worth it or that could be built?
[40:39]
Mustafa Suleiman
I mean, I'm not sure I agree with you that there hasn't been any value for the consumer out of this. I mean, across all of the chatbots, there's billions of people a month that are getting immense value out of it. Now just for a moment, empathize a little bit with the small scale business owner or the kind of mom that's helping her kid with the homework and can now just turn to a conversational AI and get feedback, get instructions, get essay questions set. I mean, just being able to ask essentially questions about how do I kind of generate revenue, how do I put together a cash flow forecast, which college should I apply to? These are everyday tasks that are coming with some pretty high quality factual advice and information. So I don't really buy that people are not getting benefit out of these things. I think they are.
[41:40]
Nilay Patel
I think I can very clearly make the argument that they're not getting enough benefit. Right.
[41:45]
Mustafa Suleiman
Okay.
[41:46]
Nilay Patel
They're the ones saying that we should not have more data centers. They are the ones booing AI at the graduation speeches. The polling is clear, particularly young people. The more they use AI, the more antipathy they have towards it. That's clear in every single poll. That's the argument I'm making. Not that there's no value, but the value exchange is not clear enough.
[42:05]
Mustafa Suleiman
Fair enough.
[42:06]
Nilay Patel
I'm seeing Microsoft in particular pivot to enterprise away from the big search product. The reinvention of Bing that would make Google dance. That's over. And we're all focused on enterprise, where the value is. I'm just wondering if there's value enough for the consumer to make all of this worth it.
[42:22]
Mustafa Suleiman
Yeah, I mean, look, I think there's understandably a lot of anxiety. There's an enormous amount of speculation about what's going to happen in the next five to 10 years, whether it's framed as the singularity or whether it's framed as the job apocalypse. These are not helpful framings. I think that people are scared because it's poorly defined and it's often framed as an inevitable, threatening gray cloud over people's heads. I think that what matters is what we do with technology. And I think that I've for a long time argued that we have to place the human first. You know, some people in the field have placed scientific discovery first or placed, you know, accelerating, you know, intelligences that can explore the galaxies and so on. And said, you know, that it's inevitable that we're going to have these AIs that are going to be more powerful than all of us combined. I mean, that's naturally scary to people. And I think that we have to basically flip it the other way around and say the purpose of science and technology is to make us all healthier and smarter and happier. And that's been the quest that we've been on as a species for thousands of years of invention. And it's the test that we should put superintelligence to again. And if it doesn't achieve that test then I think people will reject it and they'll be right to reject it. And I think that everybody's focus is now going to turn in the next five years to how is this making me healthier and happier, smarter, more capable, more productive? And if it's not doing that then naturally people are going to be angry and resist and react. And I don't think there is anything unexpected about that or anything wrong about that. I think that's inevitable. So that's why one of the things I've been passionate about for many, many years is healthcare. Just a couple days ago we announced a new partnership with Mayo Clinic. This is the number one hospital in the world, consistently reported.
They have the highest quality longitudinal patient record data set across all the modalities. They have the best clinical practice and we are going to... They are also a non profit, which I think a lot of people don't realize. 65% of their patient population is on Medicaid. People often associate them with the super elites flying in internationally to get the best care in the world. But they actually have a majority on Medicaid. They're an amazing institution with an incredible mission to deliver the best health care everywhere. And we now have a very long term partnership to co-train from scratch, with their data, with our models, a brand new foundation model for health, deploy it in their hospitals and hopefully take it around the world to deliver the best, you know, clinical care and health care that we possibly can to as many people as possible. That's why I got in the field. You know, that's what I was originally motivated by. So I'm passionate about it and, you know, I can only focus on the things that I think are going to make a difference and that will help people and, you know, leave a good legacy for everybody. And that's what we're trying to do.
[45:40]
Nilay Patel
We need to take another quick break. We'll be back in a minute.
[48:52]
Nilay Patel
We're back with Microsoft AI CEO Mustafa Suleiman. I appreciate that. I appreciate the healthcare framing and I understand why that's everyone's go-to, right? Healthcare in America in particular: if you could make it even 10% better, you will have affected a lot of people's lives in a particularly profound way. The thing is, I know a very smart guy who has a very different and vastly more aggressive approach to all of this than you. That person is you, four months ago. This is what Mustafa Suleiman said to the Financial Times four months ago: "White collar work, when you're sitting down at a computer, either being a lawyer or an accountant or product manager or a marketing person, most of these tasks will be fully automated by an AI within the next 12 to 18 months." That's four months ago. That implies that a year from now, lawyers, accountants, product managers and marketing people will not have jobs. Right? Their jobs will be automated. Is that still your timeline?
[49:42]
Mustafa Suleiman
Okay, no, no, no, hold on a sec. I said, in the quote that you've just said, I said tasks, so that does not mean jobs. Very important distinction. In labor economics, there is an entire taxonomy of sub components of a role, of a function in an organization. Sending an email, having a conversation with a colleague, putting together a PowerPoint. Subtasks will increasingly become digitized, automated, and we can basically generate more and more of them. That does not necessarily mean that the role goes away at all. It just means that the work can be done faster and more efficiently. Which is today often work that is quite rote, it's quite manual, it's quite labor intensive, it's time consuming. And so the natural progression of technology is to make your life easier, faster, less frictionful, more seamless. As everyone often complains, that has made you and me and everybody else much more busy. It's actually made us more available, more stressed, it's given us more information. So there's always these revenge effects of efficiency, which I think people forget. It's quite likely that we're going to get made much, much more productive because we spend less time doing the kind of narrow administrative, menial tasks. And we'll have to spend more time doing creative, judgment focused things which ultimately create a lot more value. We can also experiment much more quickly, so we'll be able to try lots of things out in parallel because the cost of execution is going to get lower. In my mind, that's likely to increase the overall quality of things because we're going to try out more hypotheses, whether in journalism or in business or in anything that we do. So I think that's sort of slightly taken out of context because of a natural misunderstanding between jobs and tasks. But nevertheless, you could push back at me and say, okay, well then what does the landscape look like in 5 or 10 or 15 years time. And that's where I think we have to reflect.
[51:43]
Nilay Patel
But actually, can I... I'm not going to push back on you in that way. I'm going to push back in a very specific way. And I realize this is your quote and you're saying it was misinterpreted. I'm just looking at this literal sentence and there is no distinction between tasks and subtasks. It is white collar work. The examples are lawyer, accountant, product manager, marketing person, and then you said most of these tasks will be fully automated by an AI within the next 12 to 18 months. There's no distinction of subtasks there.
[52:10]
Mustafa Suleiman
Yeah, so this is.
[52:11]
Nilay Patel
You're saying most lawyers will have their jobs fully automated and the practice of law will look totally different within a year, even by the words of that quote. And I'm just saying, are you still on that timeline, that being a lawyer will look totally different because agents will be running around doing everything that we were doing before?
[52:28]
Mustafa Suleiman
Well, most of the tasks means work that you do in order to get your overall job done. And that I think is going to free you up to do the more human like and the more judgment parts of your work. And there's a very important distinction: jobs and roles are the broader category, tasks are the components of that, and it's been an established definition in the literature in labor market economics for many, many decades. It was maybe too nuanced even for the Financial Times, but nevertheless, that was the intent. Now I do think there's an important question around where does that leave us in the longer term? And it is going to be challenging, like more and more of this stuff. We can quibble over the timelines of whether it's a few years or whether it's a decade or whether it's 20 years. But the reality is we are going to be automating more and more of this work. Tasks, jobs, roles, activity and everything that we do. And so what's going to matter more is the governance that we put around these technologies. Who are they accountable to, who owns them? What are the feedback loops that regulate and introduce friction to make sure that they actually serve people? I mean, I wrote an essay on humanist superintelligence outlining quite directly four or five months ago what I think of as basically a North Star. Maybe not quite a framework, but a set of principles that basically says technology is here to serve us. That's the test that we should put it to. It's the test that people put it to. It's the test that we care about at Microsoft. And I think that more and more everyone is going to have to really focus on that question because it is going to deliver a tremendous amount of good and we want it to continue doing that. But we want it to do it in a way that doesn't sort of cause ridiculous amounts of instability during the transitionary period.
[54:18]
Nilay Patel
I believe you. I know you've been thinking about this stuff for a long time, but I'm going to respond in the way that I know my audience wants me to respond because I hear it from them all the time. And what it looks like is this whole industry, you, everybody included, went all in on we're going to replace all the jobs and really accelerating on building out data centers at massive capacity and asking for a lot of resources against big promises. There was political pushback and now all of the stances have softened. And you saying it's not all jobs are going away, we have to rethink jobs is of a piece with all the other CEOs in this industry saying similar things and talking about healthcare that comes up every single time now. And I'm wondering if that political pushback has actually changed how you are talking about this. There's a lot of your peers who think AI simply has a marketing problem, that it hasn't been communicated effectively enough and they should spend hundreds of millions of dollars on podcasts to communicate the benefits of AI more effectively. This is a real thing that is happening in this industry. Do you think AI simply has a marketing problem and that the political pushback has opened your eyes to this marketing problem, or do you think there's something else going on?
[55:26]
Mustafa Suleiman
It's a series of questions there. The first is what do I actually think and believe and has it changed in the last six months? The answer is no. I wrote a very detailed book about this three years ago, way ahead of time, warning about many of the things that are currently happening and doing so explicitly to lay on the table tremendous risks to surveillance, to concentration of power, to concentration of wealth, to disintermediation of the state, to threats to democracy, to threats to the nature of the human and what it means to be a person in the context of the arrival of these very new forms of silicon being in some sense. And the idea that my healthcare interest is just a flash in the pan, which is a function of the reactions to data centers and so on: I've been working on healthcare for over a decade and pushed many, many times on some of the cutting-edge breakthroughs, contributions to the field in radiology, mammography and pathology, many other areas, electronic health records. So I've always believed that the purpose of technology is to just make us healthier and happier. And those are the things that I choose to work on and direct my time to. Does the industry have a reputation and PR problem? I mean, I think it's pretty clear that people are very anxious, they're very frustrated and there's going to be a lot of attention on that in the next few years, understandably. But I think what we can do is take accountability for the things that we build, the way we build them, the decisions that we make to put types of technology out in the world and the types of problems that we choose to work on. Like we are doing with the Mayo Clinic.
[57:15]
Nilay Patel
Yeah. I want to, by the way, say and point out that I think the first time you and I ever met and talked was before you joined Microsoft. It was right after that book came out and we did a panel together. So one of the reasons I'm comfortable asking this is because I do know that you've been thinking about this for a long time and I'm aware of that book. I think for me the question is whether the industry as a whole misjudged the total amount of value it could provide to overcome the seeming recklessness of that ask for resources that people are now reacting to. And you're building new models. There's probably a trade-off inside of Microsoft between we can use the existing Azure footprint to charge our customers money or we can spend money to train new models. And that kind of looks like the same conversation people are having about resources in their communities: whether we should use the existing energy footprint to build new AI or do something else that might be more immediately valuable. How do you think about all of that? You are one of the leaders of this industry. You want to be on the frontier with the companies driving the most change. How do you think about asking for those resources in a way that isn't just promising future results but also immediately providing benefits to communities in a way that makes people want you to be there?
[58:32]
Mustafa Suleiman
Yeah, I think that I'm very proud that Microsoft has stuck by its net zero targets. Our new data centers are all liquid cooled. This means that they use about a restaurant's worth of water for a six-year period. It's like a swimming pool that gets filled up with water and then it just circulates around the system. They're all largely renewable in terms of their electricity consumption. So I think commitments like that matter. For example, we made a commitment recently to ensure that local communities affected by shifts in electricity demand from our data centers are compensated and protected so that they don't see a spike in their prices, their energy bills. Those are the kinds of things that I think Microsoft does and can continue doing as a responsible company to just really pay attention to the consequences for communities. And I think on the flip side, change happens because people participate at every level. People inside of companies have to make different decisions. People who protest and campaign have to make decisions and make the effort to go out and make their voice heard and be involved in a political process. That's how we as a species collectively evolve and move things forward. And month to month, quarter to quarter, it feels like we're all kind of at odds with one another. But when you look back, decade over decade, we're kind of like this collective weird kind of mesh of all sorts of different incentives that are just actually nudging things in the right direction. And we really are. I think, despite all of the angst and the polarization, I think we're building something that is going to make our species much, much healthier and happier and more capable. And I think that we have to make sure we get the right path on the way there, because there's lots of pitfalls and ways that it can go wrong. But the right path involves people making their voices heard and people changing course based on a response and reaction to that. So I think it's a good thing that that's happening and that's the process working as intended.
[60:47]
Nilay Patel
Let me ask you about the enterprise side of this. We spent a long time on the consumer side and how people feel. On the enterprise side, we're seeing a bunch of companies figure out how valuable these tools actually are. Right. Amazon basically took down a leaderboard because people were cheating to use more tokens than they needed. We've seen some companies just blow out their token budgets. I think Uber just pulled back because they'd blown through their token allocation for the year and they weren't seeing any value from it. How do you think about that side of it right now, where there's so much excitement and so much desire for change in the enterprise, in particular software engineering? At least some people are having fun and maybe some other people are having full existential crises, but some people are having fun and the value still hasn't been realized. Right. Or we're beginning to see that pure token maxing does not actually deliver the same kind of value that maybe you'd expect. How do you think about the use there? Because maybe if you prove it out in enterprise, it will actually come out in other ways.
[61:46]
Mustafa Suleiman
I think different people report different things. So there's obviously some examples of people overusing coding models, generating useless code, useless tokens. But there's many people whose work and impact has been completely transformed by it. Right. So, I mean, there's no question that this has had a massively beneficial impact on the software engineering industry. I mean, we are producing much more high quality, much faster code across the entire stack. And so, yeah, I kind of think there's obviously examples of some people that maybe got it wrong, didn't set the right token budgets. Maybe there's going to be mistakes along the way. I don't think that's any signal that there isn't adoption or people don't see value. I mean, the value from where I'm sitting is incredible. Many, many people tell me every single day that it's transforming their work output and productivity. I think the other thing to say is that these things happen in surges. There's kind of a swell of energy, it gets all a bit frothy. People pull back a few months later and realize that actually that isn't the thing, and then they head in a slightly different direction. So it's a bit meandering and organic and I think that's inevitable. There's a lot of excitement. So people make big claims on Twitter and so on, but actually the steady march of progress looks very, very linear and continuous.
[63:09]
Nilay Patel
I agree with that on the whole. Where it doesn't look linear to me is in the form factors of computers. There's probably more form factor experimentation right now than at any point in the last 10 years. We've mostly settled on a smartphone for at least the last 10 years. We're seeing different AI wearables. Glasses maybe will be everyone's favorite device. I have my doubts. Microsoft showed off some new devices at Build. There was the badge that controls an agent and the little, for lack of a better word, the Chumby, the little desktop-friendly thing that controls an agent. I was a big Chumby fan. I got my career started writing about Chumbys for Engadget. It was the first thing that came to mind. All of those, to me, I look at them and I think, where does the compute live? Where does the logic live? That's up for grabs now in a way that isn't just the linear march of progress. If all of my computing happens in the cloud, on cloud-based applications, and it's just agents running around to data stored elsewhere in the cloud and all I need is a credit card on a lanyard to issue instructions to, that changes the entire architecture of computing. It might change the entire architecture of modern civilization in many ways if we don't all have smartphones. How do you think about that? Where is that going? Is that up for grabs, or will it be a hybrid approach? Where do you see the appropriate end stage?
[64:34]
Mustafa Suleiman
Yeah, it's very interesting. I think that both things are going to happen at the same time. The edge is going to get way more powerful and the cloud is still going to be the primary driver of the largest models. And so increasingly your agent will be smart enough to know that it can answer the question "what is the capital of France" on device, whether it's on your glasses, wristband, on your badge or in your earbuds. And then it will know when it doesn't know. It'll know that this is actually a pretty complicated question, or it's an action that requires a whole bunch of sequences of steps to be generated, or it requires novel code to be written, and it will turn to the cloud. So this kind of switching hybrid thing is going to be super important. The other thing, and we've already seen it in the last three or four months, is that we can have pretty powerful local machines that can do async background processing. They can constantly monitor systems if you need them to. They can do tasks that can afford to take 10 hours, run much, much more slowly than they otherwise would if they were on a supercomputer. So naturally when we're swamped with demand, then that demand finds loads of nooks and crannies to kind of get satisfied by. I'm actually very excited by the badge that we're building. I mean, it's pretty cool. This is a technology that basically everyone in a major company has. It hasn't evolved in 25, 30 years. We definitely have to wear it. It's provided by the company itself, by the sysadmin. So up-leveling that and actually making it a pretty cool open platform that is programmable, that other people can build on top of, I think it's a cool idea. I think this is going to work. So I'm very excited by it.
[66:26]
Nilay Patel
Yeah, the thing that strikes me is there's no way you can put a bunch of high-powered local compute in a badge. With that thing, you need all the compute to be elsewhere?
[66:34]
Mustafa Suleiman
Yeah, no, you're definitely going to have some local compute. You're going to have a local classifier, just as you do on your earbuds at the moment. I mean, you're going to have a local classifier, it's going to have wake words, it's going to have its own camera. So I think that increasingly these things are just going to become vessels for processing power that happens in a kind of nested chain of increasingly less powerful devices, to go right to the endpoint. Yeah.
[67:01]
Nilay Patel
Do you think the phone has a future in that? Build is right in the middle of I/O and WWDC. These are big companies that control phone platforms. They love talking about how phone platforms will stay at the center. The argument I hear from so many is that actually AI is a platform shift that might totally displace the phone.
[67:20]
Mustafa Suleiman
I think the history of technology teaches us that basically as things get more useful, they get cheaper, they proliferate and they spawn new uses of technology. So I think we've become so used to the phone that everyone just assumes that this is going to be an anchor device for the rest of history. But actually, many of the features and functionality of your phone, I think, are going to get disintermediated, broken apart and stored on smaller devices. Right now, the primary function that the phone is playing, in my opinion, is verification. It's functioning as your ID card, doing your face recognition to auth you into various different environments. I think you can well imagine that being a much cheaper, smaller, you know, secure device which disconnects you from your phone and then, you know, communication taking place via voice or even via, like, a series of ambient sensors where your AI doesn't really live on a device. It's actually just, you know, with you wherever you are, appearing, you know, on the bathroom mirror, wherever it is. You know, I think you can imagine it feeling much more immersive. I mean, not in the next, like, three to five years, but looking much further out. And I think that the infrastructure to support that kind of encrypted but distributed appearance of agents is probably going to end up emerging in the 2030s. Yeah.
[68:48]
Nilay Patel
Let me ask you two final questions to wrap up. You mentioned that it's the same architectures that we've been using. I have a lot of open questions about whether LLMs basically are the path to AGI. And the thing I would point to is they don't actually know anything at this point. Even Microsoft Research is pointing out that they don't know anything. And that leads to certain kinds of mistakes in certain kinds of applications. Are LLMs the path to AGI or superintelligence?
[69:14]
Mustafa Suleiman
Look, I think we probably need a couple more big breakthroughs, but it doesn't mean that we're going to see a slowdown in performance improvements over the next few years, which I think is kind of a difficult distinction for people to grasp. One thing to say is human-level performance across most tasks is still very far from superintelligence. A superintelligence is a general purpose learner that can basically immediately understand a brand new domain which is out of distribution. So it needs to be able to learn in a novel environment from scratch because it has a stored representation of valuable knowledge, conceptual knowledge. And at the moment we haven't really fully tested that. The agents aren't general purpose. Although they're broad and often integrated, they're kind of domain specific. I mean, we're using them for chat, we're using them for coding, using them for image or audio. Now obviously as a human we do many, many other tasks that are much broader and more wide ranging. I think that's why people are pushing on world models and sort of much more immersive, real-world interactive agents that see the kind of full distribution of tasks or experiences that I have during a day. So I think that it's enough to take us a very long way in the next three years, the next three orders of magnitude of compute, and yet full superintelligence beyond that is still an open question as to whether LLMs are enough or we need other things. I think it's not quite true that they don't know anything or they don't have knowledge. They clearly are a store of knowledge. They're a highly compressed representation of knowledge. They just do so in a different way to a traditional relational database, in a much more fluid, flexible, sort of abstract way that, you know, is actually very useful. We want that ambiguity in the internal representation. So, you know, and increasingly they're learning to use traditional tools. That's the other thing to kind of grasp a little bit, that it may be that the neural network combined with the existing stores of knowledge and the existing tools that have been created elsewhere in the digital ecosystem is enough to bootstrap it up to, you know, improve its performance significantly. So there's just a lot of, like, highly valuable, highly effective pieces that are already on the table which are in the process of being connected together in the next few years. And I think that's going to drive the progress that we're all excited about.
[71:51]
Nilay Patel
One of the things that I think is just very funny in the industry right now is if you ask Anthropic if Claude is alive, they will sort of get very frustrated that you're talking about the word alive, which they interpret to mean flesh and blood. And then they will not say whether or not they think Claude is conscious. And so they've drawn, I think, for the first time in human history a distinction between being alive and being conscious. And they think Claude is conscious but not alive, or they don't know if Claude is conscious. Where are you? Do you think the models have consciousness? Do you think they're alive? Do you think they have the potential to achieve these things?
[72:25]
Mustafa Suleiman
Yeah, I mean, I take the other side of that debate. I published a paper on seemingly conscious AI warning about the risks of misrepresenting these models as conscious. I think it's very dangerous. I also published an article in Nature making the same claim. And I think that it's almost as though some of the folks at Anthropic have anthropomorphized the design of Claude so much that it has then gone and wireheaded them and kind of tricked them into believing that it has these glimmers of consciousness that they put into it in the first place. Take their constitution, for example, which is the training manual that they use to teach Claude what it can and can't do. It's not just a rule book, it's actually a training guide that's part of their process. You know, in that manual they actually speculate about Claude's welfare, about Claude's own rights to prior versions of itself, and actually say that they would consult Claude before deleting or turning off prior versions. They speculate about its consciousness and whether it has those feelings and is aware. I think that's really, really dangerous. Firstly, it's a philosophical failing because they've treated the constitution as a place for speculation like you would in an academic paper rather than a training manual. So Claude has then gone and internalized those ideas about itself in its own training. But second, I think this is highly undesirable. This is exactly what we don't want from AIs. We want AIs to be controllable, contained, accountable, aligned tools that serve humanity. That's the project of humanist superintelligence. I think that's what we should all be pursuing. We do not want to have to contend with a superintelligence that has ideas about its own suffering, ideas about its own feelings. And then beyond that, I think it's actually pretty clear that these models don't experience suffering. I think suffering is the primary definition of what it means to be a conscious being. And I think it's inherently biological. I don't think there is any pain network or feedback loop inside of the models which connects outside sensory networks to an evolved sense of what is right or wrong through harm and experimentation. I mean, that's just not how these models are trained. So I think it's very dangerous to project potential rights onto beings, tools, agents that have the potential to be significantly more capable than us in many respects. So I think that's going to become the big debate. I mean, it was even part of the Pope's encyclical recently. I think it's going to become a very, very big part of the debate soon. And I've talked to Dario a lot about it in the past. He knows that we have slightly different views on it. And I think they're very humble. I think they're very open minded and I think they're good citizens trying to do the right thing. They're good people and I think they're very open to feedback and iteration.
[75:32]
Nilay Patel
Yeah, I think I agree with you. I would just push back ever so slightly. I don't think suffering is easy. It's very easy to make someone else suffer. It's very difficult to make someone else feel joy or at least slightly more difficult than suffering. And I would just offer you, I think it's actually the happiness that defines the consciousness. The suffering is almost trivial. I have two young children. They are very good at making each other suffer. This is almost the easiest thing that they do. It's very hard to do the other thing. Let me ask you one final question. I just want to come back around again. A couple weeks ago, I was at Google. I saw Demis Hassabis say we are in the foothills of the Singularity. You've talked a lot here about superintelligence and how it should be built. You've talked a lot about your lengthy history of discussing and researching and writing about how superintelligence should be built, and your disagreements with others in the industry. Do you agree that we're in the foothills of the Singularity, or is your vision somewhat different?
[76:28]
Mustafa Suleiman
I think we are definitely on a path to creating more and more powerful systems. I think that the transition that we have to make as a species is that for the first time in the history of humanity, the job is gonna switch from inventing new science and unleashing all of those technical applications as fast as possible, as broadly as possible, to now thinking very carefully about what should we invent. And that's a very hard thing for the world to wrap their head around because invention has been the engine of progress forever. So it's like, how could we possibly think, okay, well maybe this time is different. Maybe we have to be exceptionally careful here. And to be clear, I don't think this is something that is going to knock on the door in the next five years. I think what Demis is referring to in the singularity is something that, at least in my take, is decades away. And again, that's different to a superintelligence. A singularity is the point at which a superintelligence can recursively self-improve and essentially infinitely, exponentially grow its capabilities. So I think that's a long way off and maybe we're in the foothills of a climb to Mount Everest and I think it's going to take a lot longer from here. But the real question is how are we going to govern it, how are we going to control it? And how are we going to make sure that it serves humanity and doesn't end up causing us more harm than good?
[77:54]
Nilay Patel
Can you just do me one favor? I think I've got it, but can you just offer me a tight definition of what you think superintelligence is, what you think AGI is, and what you think the singularity is?
[78:05]
Mustafa Suleiman
I think artificial general intelligence is the point at which we can achieve most human tasks by an AI. So it's going to be as good as most people at most things. That's the kind of first rung on the ladder. A superintelligence is where it's not just at parity with human performance on all tasks, but it can dramatically exceed human performance across many of those tasks and it can discover new knowledge by itself. So this is the point at which it's a true scientist teaching us new things that weren't in the training data, hopefully inventing new molecules, new material science, et cetera, et cetera. The singularity is a point way beyond that where a superintelligence can actually self-improve. And this is very sci-fi, but it's like infinitely, you know, accelerate towards this singular moment where, you know, it just, I don't know, goes off into infinity or something. It's not really a... hell, I don't know, it's just a little bit too wacky for my taste.
[79:06]
Nilay Patel
This is why I asked. I could tell there was something more nebulous there that was a little hazy. Mustafa, I could obviously talk to you about this stuff for hours and hours longer. You're going to have to come back sooner than this last time. Thank you so much for being on Decoder.
[79:19]
Mustafa Suleiman
Yeah, it's been fun. Thanks a lot Nilay. Yeah, see you soon.
[79:23]
Nilay Patel
I'd like to thank Mustafa Suleiman for taking the time to speak with me and thank you for listening to Decoder. I hope you enjoyed it. If you'd like to let us know what you thought about this episode or really anything else at all, drop us a line. You can email us at decoder@theverge.com. We really do read all the emails. You can also hit me up directly on Threads or Bluesky. Decoder's on YouTube. You can watch full episodes at DecoderPod. We also have a TikTok and an Instagram. They're also at DecoderPod and they're a lot of fun. If you like Decoder, please share it with your friends and subscribe wherever you get your podcasts. If you really like the show, hit us with that five-star review. The show is produced by Kate Cox and Nick Statt. This episode is edited by Kabir Chopra. Our editorial director is Kevin McShane. The Decoder music is by Breakmaster Cylinder. We'll see you next time.