Transcript
A (0:00)
Anthropic product head Mike Krieger joins us to talk about how AI model development is accelerating and what we should look out for as things continue to move faster. That's coming up right after this.

Capital One's tech team isn't just talking about multi-agentic AI, they've already deployed one. It's called Chat Concierge, and it's simplifying car shopping using self-reflection and layered reasoning with live API checks. It doesn't just help buyers find a car they love, it helps them schedule a test drive, get pre-approved for financing, and estimate trade-in value. Advanced, intuitive, and deployed. That's how they stack. That's technology at Capital One.

Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the tech world and beyond. Well, Anthropic has a new model out, Sonnet 4.5, just months after the series of Claude 4 models came out. So things are moving fast, and we're going to figure out why they're moving much faster and what the implications are for the AI industry and businesses as a whole. And we're joined today by the perfect guest to do it. Anthropic product head Mike Krieger is here with us. Mike, it's good to see you again. Welcome to the show.
B (1:15)
It's good to be here. Thanks, Alex.
A (1:17)
So I remember sitting in the audience for Anthropic's first developer day, and it's funny because in the AI world, what is it, cat years or dog years? I don't even know. Every month feels like a year. And this was in May, May 2025. And I remember you and Dario were on stage saying, yes, we're releasing Claude 4, but we're going to release the next iterations much faster than we ever have previously. And we're already at 4.5. How is it happening?
B (1:52)
I think there's a couple of things that we're seeing. I mean, even just thinking about it, May again feels like a year ago, so I think dog years is about right. I think there's a couple of things. One is we've been working much more with end users, customers of our platform, for example. And with that we get a much faster feedback loop of, hey, Sonnet 4 is great in these ways, we wish it was better in these ways. And you're starting to get customers that really push the models in really interesting ways. And that ends up being very helpful for us on the research side, because then we can say, all right, these are problems to be tackled in the next version of Claude. So for example, one of them was that Claude Sonnet 4, and even Opus 4, Opus being our biggest model, is good at writing code but tends to get sidetracked or lost if it's working over longer time horizons. That was a real emphasis of Sonnet 4.5. Or, we've put a lot of data into the context, which is basically what the model is thinking about at any given point, but at some point that gets filled up, and how do you then keep working on those things? So having that feedback loop really helps. And it also gives us a lot of urgency, because it means there are almost like bugs out there that you want to go fix, or at least feature requests that you want to address. So that's one piece.

The other one is we've just streamlined a lot more of our model release story. I joined shortly before Sonnet 3.5, which was back in May of last year, so a really long time ago in AI years. From then to now, there's been a real operational up-leveling in terms of how do we get early access feedback from customers, how do we give the rest of our customers a good heads up so they can co-launch on launch day, what does that morning of the rollout even look like. I was talking to a customer, and he said, I've seen a lot of lab rollouts of models and this was the smoothest I've seen, which I took as a big endorsement of how much we've streamlined that model release process. That makes it so that every release doesn't feel like this very bespoke, very difficult process, and can instead be much more: we know what we're doing, here's the date. To the extent that research can be predictable, which it can't fully be, within that domain, how do we actually make that as smooth as possible?
