
A
Anthropic product head Mike Krieger joins us to talk about how AI model development is accelerating and what we should look out for as things continue to move faster. That's coming up right after this. Capital One's tech team isn't just talking about multiagentic AI, they already deployed one. It's called Chat Concierge and it's simplifying car shopping using self-reflection and layered reasoning with live API checks. It doesn't just help buyers find a car they love, it helps schedule a test drive, get pre-approved for financing, and estimate trade-in value. Advanced, intuitive and deployed. That's how they stack. That's technology at Capital One. Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the tech world and beyond. Well, Anthropic has a new model out, Sonnet 4.5, just months after the series of Claude 4 models came out. So things are moving fast and we're going to figure out why they're moving much faster and what the implications are for the AI industry and businesses as a whole. And we're joined today by the perfect guest to do it. Anthropic product head Mike Krieger is here with us. Mike, it's good to see you again. Welcome to the show.
B
It's good to be here. Thanks, Alex.
A
So I remember sitting in the audience for Anthropic's first developer day, and it's funny because in the AI world you sort of go in, what is it, cat years or dog years? I don't even know. Every month feels like a year. And this was in May, May 2025. And I remember you and Dario were on stage saying, yes, we're releasing Claude 4, but we're going to release the next iterations much faster than we ever have previously. And we're already at 4.5. How is it happening?
B
I think there's a couple of things that we're seeing. Even just thinking about it, May again feels like a year ago. I think dog years is about right. I think there's a couple of things. One is we've been working much more with end users, customers of our platform, for example. And with that we get a much faster feedback loop of, hey, Sonnet 4 is great in these ways, we wish it was better in these ways. And you're starting to get customers that really push the models in really interesting ways. And that ends up being very helpful for us on the research side because then we can say, all right, these are problems to be tackled in the next version of Claude. So for example, one of them was that Claude Sonnet 4 and even Opus 4 (Opus is our biggest model) is good at writing code, but tends to get sidetracked or lost if it's working over longer time horizons. That was a real emphasis of Sonnet 4.5. Or we've put a lot of data into the context, basically what the model is thinking about at any given point. But at some point that gets filled up, and how do you then manage to keep working on those things? So having that feedback loop really helps. And it also gives us a lot of urgency because it means that there's almost like bugs in some ways out there that you want to go fix, or at least feature requests that you want to go address. So that's one piece. The other one is we've just streamlined a lot more of our model release story. I joined shortly before Sonnet 3.5, which was back in like May of last year, so a really long time ago in AI years. From then to now, there's just the operational up-leveling that I think we've seen in terms of how do we get early access feedback from customers, how do we give the remainder of customers a good heads-up so they can co-launch on launch day? What does even that morning look like on rollout?
I was talking to a customer, and he's like, I've seen a lot of lab rollouts of models and this was the smoothest I've seen, which I took as a big endorsement of how much we've streamlined that model release process. That just makes it so that every release doesn't feel like this very bespoke, very difficult process. It can be much more of a, great, we know what we're doing, here's the date. To the extent that research can be predictable, which it can't be, but within that domain, how do we actually make that as smooth as possible?
A
Right. And maybe I'm looking at this from a dumb outsider's perspective, but the one thing that I didn't hear you mention was scale. And hearing so much about the scaling laws, especially from Anthropic, part of me believes that, okay, Claude 4 is X number of GPUs and 4.5 is Y number of GPUs and 5 will be Z number of GPUs. So do the numbers in your model release rubric correlate at all with the scale of the data centers that you're training on and the scale of the data?
B
I mean, I think what's been interesting is, at different points, and if you talk to Jared Kaplan, our chief scientist, he'll I think tell you much the same, the scaling laws paint a picture of what is possible, but it's not predetermined that you actually get there. There's a lot of actually really difficult machine learning and engineering work. So I think one thing that's been notable, to your question about scale, over the last six months is how much of it has really been engineering. If you're going to do both pre-training and post-training on an increasingly large number of accelerators, how do you make that reliable? How do you keep that run, as we call it, going even if some portion of it has an issue? So, to your question, a lot of the improvement in our ability to deliver these models really has come from our ability to run these large training runs at scale, which again is fundamentally an engineering and machine learning problem. I think both have improved. I think if I pointed at something between Sonnet 4 and 4.5, a lot of it really has been on the engineering side, to just be able to scale up, especially a lot of the post-training work.
A
If I'm reading you right, it's not necessarily gains that Anthropic is seeing from scaling up the data centers. It is algorithmic work that is being done by your teams to make the models better.
B
They really come together. I think it's the algorithmic work and then the ability to maximize the amount of compute that we can use on those algorithmic improvements. So they really kind of go hand in hand, sometimes directly hand in hand, in that an idea that works at small scale sometimes doesn't work as well when you scale it up. And then other times an idea only works when you get enough data and scale in there as well. So when I think about it, our team actually just brought in a new CTO, and a lot of his remit, I think, will be how do you really partner research and our core engineering teams together to achieve that kind of scale?
A
Okay. And another thing I was expecting you to say, which I'm not sure if I've heard yet, is that teams within Anthropic have used the coding capabilities of your AI models to be able to ship faster. Is that a sort of supporting character here or is it, is it the star?
B
I think it's a good question. I'll have to think about that for a sec. I think it's a little bit of both. I would actually say there's a thing that is emergent even beyond the coding capabilities, which is the ability of Claude to be a really active participant in the process. And here's what I mean by that. I think about the way Claude was being used around even Sonnet 4: help write code to launch these models, help write the product code, for sure. Contribute really strongly to Claude Code. You can imagine Claude Code itself: we use Claude Code to develop Claude Code, very much in a loop. I think that the biggest delta between 4 and 4.5 is that now we have much more of Claude as an agent, or almost like a coworker, in, for example, our Slack channels. So, for example, we have something we built that's Claude on call. If you've been an engineer, one of the things you have to do is take the metaphorical pager, which is basically you're on call for a week or two to manage a system, and if you get paged, you'll show up and say, all right, there's a certain number of things that could be wrong. I've got to go check these graphs. I've got to maybe try this out. And one thing that we've built using the Claude Agent SDK, which we also released publicly alongside Sonnet 4.5 but have been using internally for a while, is the ability for Claude to basically show up first in those incident channels and already have a sense of what might be going on, and be able to answer really quickly when you ask, hey, can you do some data diving while I work on something else? And so we've increasingly had Claude play these really collaborative roles within our company, even beyond the ability to code. And it's again using the same technology as Claude Code under the hood, but it's accelerating the company in being more efficient, or better able to scale up, or better able to understand it. So I think the answer to your question is it's a supporting role on the building side, but it's playing a much more fundamental role in terms of the actual operational side.
A
So let me see if I can zero into it. So instead of basically being autocomplete for coding, this is actually going out and being proactive, examining things and then coming back with insights.
B
Exactly. And we have similar sort of, you know, agents is the. I guess the industry term of art now. But I feel like agents can mean so many things to different people right now.
A
What does agents mean to you? If you're going to, if we're going to start talking about agents, I, I need a definition of this word because I'm struggling to figure it out.
B
I think the purest definition, and this is not so pure because I'm probably going to use like 20 words to do it. So maybe we can edit it down together. But it's, no, this is going in full. Yeah. AI systems that can plan and run actions over long time horizons using a variety of tools, where the steps are not predetermined. They're able to solve problems dynamically based on what information emerges. So I end up having this sort of agent scorecard that I've been using internally as we think about our own products. And there's a bunch of characteristics that I look at. This is way more than 20 words, Alex. So attributes I look at are things like autonomy: how long can the agent run unconstrained? Sonnet 4.5 was a big leap there. Proactivity: is the agent able to not just react to questions, but actually suggest ideas or interject? Ability to use tools, and often a variety of tools. Some of them might be research tools, some of them might be being able to write to a database. Memory: can the agent learn over time and improve its ability to perform a task? I always say the hundredth task with an agent should be much better than the first, because that should be the case for human employees as well. And then communication: is it showing up in all the right places? And so for us, we think these entities, these agents, are going to start showing up in all the places where you do work, whether that's your Slack or your Teams. For example, we launched a research preview of Claude in Chrome. We think of Claude needing to be in all of these places where you're doing work so that you can actually bring it to work rather than having to bring work to it. So I even have this spider chart of attributes.
So for any given agent that we're building internally, we sort of like grade it on all these different attributes and we can say, all right, great, for the next quarter our investment is going to be on autonomy or it's going to be on memory. And we can kind of, kind of pick our, pick our attributes that we're working on now.
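As an illustration of the scorecard described above, here is a minimal Python sketch. The class name, the 0 to 5 scale, the sample scores, and the "lowest score first" rule for choosing a focus are all assumptions made for illustration; the conversation names only the five attributes and the spider chart, not how they are scored.

```python
from dataclasses import dataclass, fields


@dataclass
class AgentScorecard:
    """Hypothetical scorecard; the 0-5 scale is an assumption, not Anthropic's rubric."""

    autonomy: int       # how long the agent can run unconstrained
    proactivity: int    # suggests ideas or interjects, not only reacts
    tool_use: int       # can use a variety of tools
    memory: int         # learns over time and improves at a task
    communication: int  # shows up in the places where work happens

    def as_axes(self) -> dict[str, int]:
        """Return attribute -> score, i.e. the axes of a spider chart."""
        return {f.name: getattr(self, f.name) for f in fields(self)}

    def focus_areas(self, count: int = 1) -> list[str]:
        """Lowest-scoring attributes, as candidates for next quarter's investment."""
        axes = self.as_axes()
        return sorted(axes, key=axes.get)[:count]


# Sample scores are made up for the example.
on_call_agent = AgentScorecard(
    autonomy=4, proactivity=3, tool_use=4, memory=2, communication=3
)
print(on_call_agent.focus_areas(2))
```

With these made-up scores, memory and proactivity come out as the two weakest axes, which is the kind of signal the grading exercise is meant to surface.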
A
That was a good definition of agents, actually. I think that's the most complete definition I've heard. So here's an overriding question that's coming up as we talk. Is most of the improvement that we're going to see in AI, at least in the near term, just going to come on the back of the orchestration of these models, getting them to be able to take multiple steps, as opposed to what I think was the defining characteristic of the earlier days of LLMs, which was basically just make it bigger, make it generally smarter, maybe get some PhDs to feed some information to it in post-training, and then you'll just see what happens as you go?
B
I think that there's going to be some fields or disciplines where that sort of extremely precise depth in a particular task or domain will continue to be important. But I think I'm much more excited, and overall I think we're spending a lot more of our time, even from the product side, around that. I think it's actually two pieces. One is that orchestration, and then two is how do you take the work that Claude is doing from pretty good to great? And so, you know, we launched the ability for Claude to create Word and PowerPoint and Excel files that you can then download and bring into those apps. And if you get to like 50% as good as you would have done yourself, I don't think that's good enough, and it won't speed you up. In fact it's like, I don't know, I could have just done this myself, and at least I would have known what it's done. When you start clearing this sort of 75 to 80% threshold, which of course is not scientific, it's a little bit of a vibes-based thing, then it starts actually being able to really accelerate work. And so that's the other emphasis too, and that's interesting. Some of that is post-training, some of that's actually also giving a lot of really good examples to Claude and really working closely on how the model is producing outputs that are what we think of as professional quality.
A
Right. And look, I know we're 15 minutes in, so I think we should probably take a minute to talk about the concrete things that you've improved in Claude between 4 and 4.5. Do you want to just give us briefly a little bit of a list of the things that get better with the new model?
B
Yeah, the ones that I think are highlights, maybe I'll bucket into three. One is from a price-performance perspective. So Sonnet 4.5 basically outdoes Opus, our largest model, in effectively every category, but does so while running faster and at a fifth of the cost. So if you think about where we were in May at Code with Claude, we were announcing Opus 4. We now have a model that is better than that, and even its successor, Opus 4.1, but does so at a fifth of the cost, which just opens up a whole new set of use cases for that kind of intelligence. That's one, on the price-performance piece. The second one is on its ability not just to code for longer, but to execute agentically for longer. We talked a little bit about agents, but what we saw was, and I actually put a fun video of this on my X account, we asked every Claude from Claude 1 to Claude 4.5 to recreate Claude.ai, our flagship AI product. And 4.5 was really the first one that was able to do it end to end and actually produce something of quality. It actually works. You can log in, you can use an API key, all of those things as well. And so that ability to execute agentic work over long time horizons is another big improvement. We had one customer who had it work for 30 hours. Of course that's not going to be every task, but that's the kind of upper bound that we're starting to see. And then the third one is moving some of those post-training wins beyond just code to other domains we think are really important. So for example, financial analysis is an area that we've been really interested in. We launched Claude for Financial Services a couple of months ago and we incorporated that into the model training in Sonnet 4.5 as well. So when you look at benchmarks like Finance Agent, or different domains like the legal domain as well, the model is improving not just on code, which is obviously important, but also on these other domains. You might actually use code to solve these challenges, but the point is not to write code. The point is to solve a financial analysis, for example.
A
Okay. And I definitely want to get into these various agents in a moment, but let me ask you this. You mentioned that the new Sonnet 4.5 model is more performant than Opus, the big model in the last release or the 4 release. And, and it's cheaper. So how, how do you do that?
B
I mean, we talked a little bit about scale. That's one piece, which is really just training Sonnet 4 on significant scale. Another one is improvements in the post-training work that we've done as well. And the third one is really closing the loop on what we hear from customers around what are the things that they wished either Opus or Sonnet were better at, and then getting that right. So one we hear all the time is instruction following. Like, if I tell Sonnet to do this thing, I need it to do the thing very reliably, even if it's AI, even if it tries to be creative. There are times where you really want it to be more prescriptive. And we've put a bunch of work into instruction following for this Sonnet too.
A
So I want to talk about these agents. I've got a list of four different types that you highlighted upon release: finance, personal assistant agents, customer support, and deep research. And I just want to talk about who they're for. So the finance agents are interesting. You say you could build agents that can understand your portfolio and goals as well as help you evaluate investments by accessing external APIs. Personal assistant agents: build agents that can help you book travel and manage your calendar, as well as schedule appointments, put together briefs and more by connecting your internal data sources and tracking context across applications. I think to set these up, it looks like it's a decent amount of work. Like you'd have to, for instance, with the finance agent, understand what an API is. So it's not going to be something that I think most people would take off the shelf. So who is this set of agents for? And do you have plans to make this technology more accessible? So let's say I'm not even a finance professional. Let's just say I'm someone that wants to have AI run through my portfolio. Will I eventually be able to easily set that up and run it without having to know any of this fancy tech stuff?
B
Yeah, I mean, that's absolutely the goal. So there's agents that we'll build ourselves and kind of deploy end to end. I'll talk a little bit about the personal assistant side next, but I think by and large these will be agents that we can help power for companies that have that particular domain expertise that they're bringing to bear. One of the first companies I ever worked with at Anthropic was Intuit. We were powering their tax advisory service. And at Anthropic, we're never going to build a tax product, but Intuit has the largest one. And so being able to power their tax Q&A was really powerful. Now you can imagine all these other places too. We've been working more closely with Microsoft on some agents, even within their Office suite. So being able to take the financial analysis capability and the financial planning capability and bring it closer to an Excel user, for example, I think that's the way you unlock the maximal value of some of these as well. And I think you'll see us demonstrate these capabilities. But in terms of the first-party products we build, we're pretty thoughtful about which ones we end up going deep on because, to your point, to reach the scale that I think these products deserve to reach, you want somebody who's really thinking through the whole end-to-end user experience and probably has some of the pre-existing connectors already set up as well. But I think it's important also to build some of these ourselves. So we talked about the personal assistant case. One of the things that we've had a lot of fun with on our mobile apps is using on-device capabilities as well. And so I actually just saw that Apple featured us today for our new features for Sonnet 4.5, and one of the things that they were featuring was that on iOS and Android, Claude can now read your calendar, read your reminders, and compose text messages without really any setup at all. So that's ideal, right?
Which is, you've got those pre-existing connectors, you're not spending a lot of time just getting it set up to even get any work done. But I guess to be more succinct, to answer your question, there's some that we'll build ourselves, and in those we'll try to simplify the setup process. But I'm also very excited about embedding these agents in existing products that are out there that then have all that data built in.
A
And so as I read through your blog post, I also started to think a little bit about Dario's prediction about the white collar bloodbath. It's like impossible not to where he says, you know, within a few years you might see 50% of white collar work automated by these AI bots. Looking at it being able to do these finance tasks or customer support tasks or even be a personal assistant. And I'm just curious from your perspective as the person running product here, is this something that you're like merrily running towards, trying to automate human work or like, how do you think about it in your role?
B
We have product principles we try to work towards. And it's actually interesting: I think we had a different, not entirely different, but kind of a different set of product principles even at Instagram. I think it's important to figure out who you're building for and how you go about it. And one of the principles that we operate with is, if you can build things that are complementary or augmentative, bias towards those first. And it's not to say that in the long run, overall, these products might not, or probably will, be doing more automation or even replacement of work. But we think that two things happen if you can build more augmentative products. So it's not a finance agent that takes all the work and does it all for you, but it becomes more of a back and forth. One is I think it helps people develop an intuition of what the AI is good at today and not good at. So that helps people position even their own skills against that. So I think there's the intuition building. And then the second part is it, I think, extends the timeline over which people are making that adaptation. So I think if you see Dario out there talking about the likely labor impacts, it's not to try to accelerate towards those, but more around, hey, we think this is coming, let's start this conversation now. And in the products that we build, can we show that this is likely to come, but still build a bridge between here and there by building more augmentative products? There's definitely art and science here. So I think we debate a lot within the product team as well. Like, I had a great conversation with our head of design where he's like, if we had a product where you hit a button and it did all your work for you that day, would that be a good product? And would that be an Anthropic-y product? And we both came to a conclusion, like, no.
One of our core brand tenets that we've come up with is keep thinking. And we want it to be much more of this collaborative accelerator of human thought rather than a replacement for human thought, and would like to keep that the case for as long as possible.
A
Yeah, I'm still trying to figure out how I feel about this stuff, but I do think that the conversation around augmentation versus automation is still so elementary, and honestly it's a fairly dumb way to look at it. I'm not saying what you're saying is, I'm just saying the industry's perspective on this, like, are you automating or augmenting tasks? Because let me give you an example. If you automate a job within your company, you've automated a job. The question is what happens next. And if you put that person who was doing that job on something like leading a new project, for instance, or something higher value, you've now augmented it in a way that the word augment doesn't even come close to describing. So it's really tough, I think, to measure this stuff. And I don't know, I just sort of feel conflicted about the way that the conversation has gone so far. What do you think?
B
I think that there's a lot to what you're saying, which is there's the point-in-time task, right, like managing my calendar or doing some research about something that I'm talking about. And then there's the broader context of what is the role that that person even has within the company. And a lot of what we think about is that people will end up feeling more like managers of AI than just users of AI. We think a lot about this, and it's even happening in engineering, right, where our best engineers are managing three or four Claude Code instances running at once. And all of a sudden you have to think at a higher level, like, all right, what is the unit of tasks that I want each of these sub-Claude Codes to be doing? I think the same will be the case for how we interact with AI systems, and there's going to be some blend of automation and augmentation there as well. The way I think about the bull case for this is twofold. One, can you bring to bear world-expert-level thinking in a particular discipline into companies that might not have had that before, right? Either because that talent isn't present in that local market or because the company's just getting off the ground and they can't afford a world-class CPO or CTO. Can you elevate the baseline there? So I think that's one piece. And the second one is having companies that will, I think, emerge and be able to scale and maintain that small-team cohesion. I think we did this really well with Instagram, without having to build a huge workforce from day one. And I think the kinds of companies that get built will change. But I still think there's a tremendous amount of economic opportunity throughout. It just might be more smaller companies rather than fewer, bigger monolithic companies.
A
Interesting. I mean, coming from a guy, well, Instagram was what, 16 people when you sold it for a billion dollars? And people said that was crazy.
B
Exactly. So I got a question once that was like, when do you think the first single-person, billion-dollar company will emerge? I was like, well, we had 13. It was like, you know, we were getting close, people. We were 13 at sale and 16 at close. So basically it was, yeah, just around then. So yeah, I mean, we got a lot done with a little. And I think a lot of that came from focus, and there's probably work that we could have done even more efficiently.
A
Right. And I mean, I think if you had waited a couple of years to sell, it might have been worth double or triple that. So folks, by the way, if you don't know about Mike's previous work, he's the co-founder of Instagram. So we are going to get to some of the social media elements of this, or the comparisons to social media building, in the second half. But two more questions for you as we round out our first half. You mentioned memory. I think it's one of the most interesting parts of this work that's underrated and underappreciated in the common conversation. Can you talk about how important building better memory within these bots is, and how that's actually happening?
B
I think the biggest sort of breakthrough or really key piece of what we've done on memory is rather than treat it as a sort of substitute for how the model might otherwise access information or sort of a system built on top of the model, we actually have trained it deeply into the model. And so the model knows about the concept of memory, which I know sounds kind of funny, but you can really see it as you talk to it and you can even see.
A
Wait, wait, what does that mean? You have to talk about what that means.
B
The model knows. Yeah. So basically in training, we give the model effectively a series of tools to let it read from, update, and write memory. And what that means is it understands the concept that it is capable of managing its own memory. And then in our platform, we actually now have that as a basic building block that you can use. And what that means is, as you're talking to Claude with access to the memory tool, you can say, hey Claude, can you update your memory about this? And it knows what that means. It'll say, great, I'm going to update the memory. Or when it's performing an action, if it thinks there's a good chance that it has some memory related to that action, it will retrieve that memory before doing the action. With previous systems, you would have to either build that yourself on top of the model, or Claude or any of these systems wouldn't be as good at using it. And so effectively, in the same way that we might have the thought, hey, I think I did this before, or I think this happened before, I'm going to go either think about it for a sec or maybe even search my email, we've basically given Claude that same ability. And that can be memory that's very fact based, like who are you interacting with, what should you do? But it can also be more task based, like whenever I'm doing X, make sure I remember to do Y.
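The read, update, and write tools described above can be pictured with a small Python sketch of the application side: a store plus a dispatcher that routes a model-issued tool call to it. Everything named here (`MemoryStore`, `handle_tool_call`, the `memory_read` / `memory_write` / `memory_update` tool names, the in-process dictionary) is hypothetical and chosen for illustration; it is not Anthropic's memory tool API, and a real system would persist the notes somewhere durable.

```python
from typing import Callable


class MemoryStore:
    """Hypothetical in-process memory; a real system would persist this."""

    def __init__(self) -> None:
        self._notes: dict[str, str] = {}

    def write(self, key: str, text: str) -> str:
        # Create or overwrite a note.
        self._notes[key] = text
        return f"saved {key}"

    def update(self, key: str, text: str) -> str:
        # Append to an existing note rather than overwrite it.
        previous = self._notes.get(key, "")
        self._notes[key] = f"{previous}\n{text}".strip()
        return f"updated {key}"

    def read(self, key: str) -> str:
        return self._notes.get(key, "(no memory found)")


def handle_tool_call(store: MemoryStore, name: str, args: dict[str, str]) -> str:
    """Route a model-issued tool call (tool name plus arguments) to the store."""
    handlers: dict[str, Callable[..., str]] = {
        "memory_write": store.write,
        "memory_update": store.update,
        "memory_read": store.read,
    }
    if name not in handlers:
        return f"unknown tool: {name}"
    return handlers[name](**args)


store = MemoryStore()
# Task-based memory: "whenever I'm doing X, make sure I remember to do Y".
handle_tool_call(store, "memory_write", {"key": "task:X", "text": "Remember to do Y."})
handle_tool_call(store, "memory_update", {"key": "task:X", "text": "Double-check the result."})
recalled = handle_tool_call(store, "memory_read", {"key": "task:X"})
print(recalled)
```

The point of the sketch is the division of labor: the model decides when to read or write, and the surrounding application only has to execute those calls and return the result.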
A
That's pretty amazing. And so what will the memory get you when you're using this better memory? Will it start to remember many more aspects? So I'll give you one example. And this is so rudimentary. But if I ever use Claude to do a podcast description, I have a format prompt that I drop in. First sentence should be this, second sentence should be this. And every single time I ask for a description, I have to use that exact prompt or else it will do whatever it wants and freelance. When are we going to get to the point where these bots are going to be smart enough that when I tell it, remember, this is the way that we do things here, it knows? And I'm sure that my problem is something that people have all across the board when they're trying to get these bots to work on the same things for them.
B
Yeah, very soon. So we have a launch coming up in the next week or so. There's both memory and then also the idea of, what are the repeatable ways in which you want work to get done? And so we'll have something really exciting there very soon. But from the memory perspective, beyond the very basic fact-based things, like, I'm Alex, I run a podcast and a newsletter and a site, that's somewhat helpful, but I think not sufficient. It's getting to the point of, hey, have I interacted with this person before? What happened last time I chatted with Mike? Can I search over my memories there? Or it can be, hey, whenever you generate these summaries, make sure that you always cite this piece or lead with a punchier sort of thing, and it's able to update and learn over time. That's the goal: again, if Claude is like a very competent new hire, we want it to get to the point where as you use it over time, either on our platform or using our first-party products, it is improving and it just feels like a companion that you've actually helped train to your preferences.
A
Where on the list of priorities is that capability? Sounds like it's probably very high for you. I know that it's high for OpenAI.
B
Yeah, I think it's really high for us. It's high on the first party side, but it's also very high on the platform piece as well.
A
Okay. All right, let's go to break. I want to ask you afterwards about what the moment building AI has in common and differs from in building social media, which of course we just mentioned you were right at the center of. So let's do that right after this.
And we're back here on Big Technology Podcast with Mike Krieger. He is the head of product at Anthropic and the co-founder of Instagram. All right, let's talk about social media and AI. Very interesting. I mean, when we look around the AI industry we see so many folks who've come from places like Facebook and Twitter now running large parts of these AI companies. Of course yourself, co-founder of Instagram, head of product at Anthropic. Kevin Weil, former head of Instagram product as well, I think, is running product at OpenAI. Fidji Simo, who came from Facebook, is running consumer applications at OpenAI. I mean, I could go on. What does building these products have in common with building social media and how does it differ?
B
I think there's maybe the, you know, abstracted from the actual product itself, like what does it take to build good product? And I think it's less that there's a lot of social media oriented folks that have now moved into AI. It's more that I think a lot of the best product people were focused on that, you know, even four years ago, you know, pre-ChatGPT, pre the emergence of a lot of these LLMs. So I think it's sort of like the most recent place that concentrated. I find that that happens, like the concentration of talent among a particular discipline. And I think that was social media beforehand. So that's partly one of them. And there it's, you know, all of the pieces around understanding what your data is telling you, but also having the intuition around what bets you want to place in terms of where you want to move into next, how you assemble a great product team, how product engineering and design and marketing work well together, all these different aspects of that. So there's that one, and then I think there's the separate question of, you know, within social media, what are the similarities and differences with Claude? It feels quite different in that, you know, we have more of a business audience. Like plenty of people use it for their individual pieces, but it has less of that sort of, you know, social component. Right now it's definitely more word of mouth. Like the most social thing that we've experienced is how people got excited about all the merch and the pop-up we just did in New York, where that was like a real attractor moment where there was more of that, but in general less of the mechanics of, like, capital G growth. Right? The, you know, how many people did you bring in? Who do they invite? All of those different pieces. So maybe a little bit different there, at least for the pieces that we're tackling with Claude. 
But of course, as a lot of these other, non-Claude tools move into more of this generation of images and videos, there is much more of a strong overlap with what folks were doing on the social media front.
A
How important is engagement to you? I mean, I think the thing that really drove Facebook decisions was engagement and of course growth and maybe the two go hand in hand. We always wonder about AI products. Like of course you want people to use them, but you don't want engagement for engagement's sake because it's pretty expensive to serve these use cases. So where does engagement sit for you in terms of the metrics that you're optimizing toward?
B
We don't really look at engagement, at least not in the typical sense. Like at Instagram we spent a lot of time looking at things like time spent, right? We do look at things like daily visitors as a proxy for utility. So I think that's one piece that we look at as well. But it's interesting. Like I was talking to our mobile team yesterday. I think in the future people's interactions with something like Claude Code might be much more mobile oriented. And ideally, like, we're right by Salesforce Park, our office. I would love to be able to kick off some coding tasks, go for a walk in Salesforce Park, maybe with a coworker, maybe it pings me halfway through and has some clarification question, and I get back to my desk and it's done. It's a very different discipline than being hands on keyboard. I also love that. But that feels like a different discipline than what coding has evolved to primarily being nowadays. And now it's more about what are the creative ideas that I have that I want to see manifested. But in that world, the time spent was quite low. It was maybe kicking off the task and resolving some questions, but the value of what was produced was much higher. And so I think the interaction paradigm is just really, really different in terms of what we end up looking at. And so I think much more about the value of work done than the interaction and, yeah, the long sessions that you might see with social media.
A
I legitimately just had a founder that I interviewed tell me that her favorite use case is just using AI to get away from her computer, which is something you've never really heard of before in technology. So I got to ask you, what do you think Mark Zuckerberg is trying to do with his very unique AI strategy?
B
I think there's folks in there that I've known for a long time, like Nat, who I really respect. So I think what I suspect you'll see is sort of more experimentation around what like AI means for this kind of portfolio of companies. I think the sort of initial wave of, well, you know, we've got some chatbot type stuff in the search bars was like, not. Not particularly transformative. And I think the teams there likely know it. And so, yeah, I think, or maybe what I hope we'll see is more experimentation that can kind of live outside of those, of those surfaces. Like in the same way that with Instagram, you know, there were some ideas we had that didn't really belong in the app, like Hyperlapse or even. Nobody remembers Bolt, but Bolt was our like, very, very fast messenger. You know, I think that experimentation, once you get a service as widespread as Instagram or as widespread as Facebook or WhatsApp, it's hard to introduce a new behavior there. You know, we did it with stories. I think they've since done it with reels, but it's almost like one, you get one per generation and I think you want to have more of an experimentation kind of test bed beyond that. And I suspect just like, given what I know about how those folks think, that there'll be more of that sort of experimentation.
A
Interesting. So as the co-founder of Instagram, I'm sure you've watched with interest as AI generated images and videos have filled social media feeds and even propelled, like with Sora, the Sora app to the number one spot on the App Store. Do you think AI generated content and video, maybe in this cameo version where you can put yourself in the videos, threatens or makes a run for replacing the human generated content that we have today? Or do you think that the human stuff is going to stay on top and this is a flash in the pan?
B
Yeah, I mean, here's what I'm not sure of yet. We saw this with Instagram, that there were creative tools that would emerge. And of course, these were at a much more basic level than the kind of capabilities that you're seeing with Veo and with Sora and with even some of these other models. But you would see an emergence of a creative tool. And whether they were able to transcend that to being a network that you come back to was often not the case. And I think that was for a couple of reasons. One is, at least in that generation of products, the creative content or the created content started getting a little bit samey over time. Right? Especially if it was like a very highly stylized tool. And the second one, it's like the dynamics that make Instagram, Instagram, like the people you already know on there, the people that you follow, the creators that you know. And of course, this has shifted now into more of a pure algorithmic, reels oriented piece. Or maybe I'm talking more about the previous Instagram that still had a heavy kind of follow component, which ends up being a thing that feels like, oh, I know who I'm interacting with here. And of course, TikTok has a very different take on things. So to the extent that it's replacing, I think the things that would have to be true is one, that the content feels varied over time and not just sort of like, yeah, okay, I've kind of seen this before. It's really interesting, but I've seen it before. And then two, is there value to being in that network over time? And do you find yourself opening it because there's not just content that you're interested in, but maybe people that you care about, or there's sort of communities that form within it? 
Because I think that's actually what Instagram got right, is that you started seeing these emergent communities that maybe were just around taking photography, maybe they were oriented around living in a particular city and they were very self organizing. The only tool we gave people was those hashtags and that was enough to sort of spur these communities. So I think that's like the fundamental question to be, to be answered still.
A
That's a great point. And I think the cameo aspect where you can put yourself in the video may go some degree towards making that happen in these apps. But I'll also tell you, on Friday I couldn't put Sora down, and we're at Wednesday and I don't really feel compelled to open it right now. I think you're right that maybe AI content creation can have that level of sameness to it, where you watch one video and you feel like you've watched them all. And then maybe people come up with creative prompts and, you know, you see a new trend. But I think that's a spot-on point there. That's the challenge.
B
Yeah, well, I mean, I think it's all happening very quickly in terms of the experimentation. And so I think there's also this ability for these tools to adapt as well. And whether it'll open the door to a new Cambrian explosion of social products is going to be another thing that I'm tracking really well. It feels like it's been very quiet on the social front for the last couple of years. You know, we've sort of stabilized among a couple of really big players. Not a lot of new experimentation. And yeah, I miss the 2010s of what if social products were like this, and what if we took this differentiated take on things. And not all, or even most, of them are going to work, but at least there is that value of like, hell yeah, I want to try that. That is a different experience. Even if it's, again, things that feel like maybe novelties, like, oh, it's the photo app that takes the front and the back camera together. Is that a lasting network? No, but it pointed the way towards something.
A
Yeah, that was fun. And I miss that too, by the way. That happened, I think, in the 2020s. But I definitely miss the 2010s, when I was doing social media reporting at BuzzFeed and there would be a new app every week and it was like, all right, well, what's Peach? Let's try this out, and then it'd be gone. But there would be something new. Oh my God, all-time classic.
B
It was.
A
So I want to talk to you about community briefly. Where do you look to find, I guess, a community of users and get your feedback? And how important is Reddit to you? Because I've seen so much of the activity in the AI space move on to Reddit. And I'm curious, are you reading like r/singularity, or how deep into it are you?
B
That's a great question. So it's interesting. I would say that there's somewhat overlapping but distinct communities that we look at. One is, being a platform, we have a strong customer base that often has a very clear-eyed perspective about where the models could continue to improve or what we could be doing better as a platform. And so this is very different than my time at Instagram where, you know, there were people that we talked to a bunch about how they're using Instagram, but we didn't have this more permanent notion of an advisory board that we have here at Anthropic. And we just brought in a couple of months ago Paul Smith as our new Chief Commercial Officer, and he's also brought this community of more enterprise folks that we've been talking to. So that's one kind of big delta, which is a more stable set of people in the community that we're talking to. So that's one. We actually have a phenomenal user experience research team, and that's a place where we end up being able to stay connected to how more of the power users, that I think of as our core demographic for something like Claude.ai, are using the product. I love it. Like every month we do a product all-hands, and my favorite chunk of that product all-hands is basically the UXR team doing a voice of the user piece. And it's surprising sometimes because, you know, it's not necessarily who you might expect, like the software engineer archetype, who for sure are using our products. But there's also, hey, I'm, you know, a marketing manager. I need to produce 20 decks a week and now I finally found a tool that I'm like using to cut it out. But here's my feeling about it, here are my fears about AI, here's the promise of AI. So just a very humanizing kind of aspect of it. 
And then for sure, you know, I think still today, Twitter/X and Reddit have a strong pocket of that AI community, and, you know, I think we've gotten better at engaging in that community than before. I think there was a time period where we were like, well, there's a lot of volume. How do we react? And then, like, you don't want to be showing up only when there's something that you want to correct or something, because then it feels very corpo and not authentic. And so I think we have found a better ability to participate in some of those communities. And, you know, it's good. They're often the power users, extreme users that are telling you something about the edge of what's possible. And then you can kind of try to generalize it more broadly too. But less like r/singularity and maybe more like r/ClaudeAI. Sorry, mundane. But it's where a lot of folks are hanging out.
A
All right, cool. We spoke this whole conversation and we haven't brought up the fact, well, I haven't, and it's an oversight on my end, that OpenAI basically has seen how well Anthropic has done on coding and said that this is, you know, a number one priority for the company. And I mean, every day you can look at OpenAI leaders on X, speaking of X, trumpeting their Codex product and talking about how advanced they are on coding skills. So can you talk about how you assess OpenAI's challenge and what it's going to be like to sort of go head to head with them on what has been Anthropic's bread and butter?
B
Yeah, I mean, there was maybe a window in the summer where it was surprising to me, I guess in general, how sort of alone we were in both paying attention and having a product out there. It's definitely gotten more interesting and competitive, which I love. I think that my favorite times at Instagram were also when we had interesting competitors. I think it pushes you forward in terms of, like, what is the product we want to build, what are the capabilities that we're going to need to have? So, you know, it's kind of like a game on and interesting moment as well for us. The coding piece, beyond just the fact that coding is a really high value economic activity, I really see the model's ability to plan, write code, solve problems as not just being useful for software engineering, but being really critical path to the kind of agentic behavior we want to build long term. So there's no way that would ever be anything but one of our, you know, top two or three priorities. And then it's a matter of how do we make sure that we're showing up with the right products that deeply solve the right problems for people. Right. Like maybe this ties all the way back to your question about good product design and how I think about products. It's one thing to score well on software engineering bench, that's important as like a benchmark, but it's way more important, I think, to get the feedback from people. Like, great, there's a really hard task that I was doing with Rust, and Sonnet 4 couldn't do it. Opus 4 could barely do it. Sonnet 4.5 can do it. Like, that I get very excited about, because it means we're actually having real world impact. So I think you'll see us, if we're doing our jobs right on coding, even in the presence of other players entering the space, try to stay really focused on listening to how people are using these products in the wild and then ensuring that future model versions are meeting people where they are in that high utility space.
A
All right, last one for you, Mike. Enterprises, they're all interested in generative AI. They're not great at implementing it. They'll admit it. The studies show it. Are they going to get it together?
B
I think they will. I'm actually, you know, after our conversation, having an offsite with our product team, and a lot of the focus for next year is to continue to go into the enterprise side of things. And I think there's a few things. And we could probably do another whole hour on this. I get very excited about this as well. But there's this whole range, from how do you take a product like Claude for Enterprise that, you know, enterprises are already adopting, but make it really, really useful. And we talked a little bit before about output quality and just how much it's actually helping you. And I think part of the valley of disillusionment, or the trough of disillusionment, that you might be seeing around enterprise AI adoption is that the promise of these tools, that they're going to save you time, they're going to make your work better, just wasn't fulfilled by the previous generation of products. And it needs to be if we're going to actually get sticky adoption in the enterprise. And so that's a lot of what we're pushing on. It's not AI produced document slop, it's AI produced quality stuff that you can then iterate on and use and feel proud that you created. Just like I think people can feel proud about, here, I built this thing using Claude on the coding side. And then there's all the way to, you know, beyond the Claude for Enterprise piece, deeper integrations, like internal transformation. And what we're learning there, to your question about how enterprises are thinking about and adopting, is that at least for the foreseeable future, we need to lean in much more in terms of helping enterprises get there. 
And so we're doing much more of a model now where, either with our own engineers embedded in enterprises or in partnership with Deloitte, which we just announced this week, can we actually take our technology, meet companies where their highest needs are, and then just co-develop and, you know, lock ourselves in the building until we've solved their problem, and then learn from that experience and move on to the next enterprise. But I think it's very different than the lean back sort of, we're just going to have enterprise products and hope that enterprises figure it out. I don't think that that's the reality. We just need to lean in way harder on both ends of that spectrum.
A
Claude.ai is the website. Mike Krieger is the Chief Product Officer at Anthropic. Mike, always great to speak with you. Thanks for coming on the show.
B
Thanks for having me, Alex.
A
All right, everybody, thank you so much for listening and watching. We will see you on Friday to break down the week's news, and we will see you then on Big Technology Podcast.
Guest: Mike Krieger, Chief Product Officer, Anthropic (Instagram co-founder)
Host: Alex Kantrowitz
Date: October 8, 2025
This episode explores the accelerating pace of Anthropic’s AI model development, the shift towards advanced agentic systems, how Anthropic leverages user and enterprise feedback, and how this AI progress compares to classic cycles of tech disruption like social media. Mike Krieger offers insights into Anthropic’s new Claude Sonnet 4.5, what’s enabling rapid iteration, and the future of AI’s impact on work.
On Product Philosophy:
“We want [Claude] to be much more of this collaborative accelerator of human thought rather than replacement for human thought, and would like to keep that the case for as long as possible.” — Mike [22:15]
On Agents:
“AI systems that can plan and… run actions over long time horizons using a variety of tools where the steps are not predetermined. They’re able to solve problems dynamically based on what information emerges.” — Mike [09:22]
On Model Launches:
“Every release doesn’t feel like this very bespoke, very difficult process.” — Mike [03:54]
On the Future of Work:
“People will end up feeling more like managers of AI than just users of AI.” — Mike [23:51]
Mike Krieger paints a picture of Anthropic as a fast-moving, customer-obsessed product company where advanced agentic AI, smooth operational processes, memory, and practical use cases—not just bigger models—are driving the next phase. AI’s future, as seen from Anthropic, is as an empowering, context-aware collaborator, not a replacement. The conversation also underscores how tech cycles repeat, but with new tools, new communities, and new philosophical debates at their core.