Transcript
A (0:00)
Today I want to talk about how AI is creating a kind of split reality around productivity. It seems to be increasing and improving productivity at the individual level, but how much is it impacting firms and the wider economy yet? Do stay to the end, because there is a surprising bit of data that throws more light, and perhaps more confusion, on all of this. So let's get started. Why does this question matter anyway? Well, if you've watched the stock market this week, you will know exactly why: the mood around the AI boom turned sour. Over on the NASDAQ and the S&P 500 there have been several days of red. Key stocks that have enjoyed incredible gains in the last couple of years, and in the last few months, took an absolute pounding. Among them, the hyperscalers and cloud companies like Oracle, CoreWeave and Nebius, which build out capacity to serve these AI models for companies like OpenAI and others, were really in a sea of red. Even the hardware companies, like Vertiv, which makes cooling for data centers, had a really tough and torrid week. And in another part of the market, the bond market rather than the equity market, we saw similar fear emerging. The bond market tends to move slightly differently to the equity markets, but a number of these companies in the AI space are dependent on corporate debt, because they are having to raise a lot of money to build out these big data centers. For one example, Oracle's corporate debt took a real hammering, down 8% this week. And the CDS, the credit default swap, an interesting technical financial instrument that prices the cost of insuring the debt of some of these companies, spiked. It became increasingly expensive, and that's often a signal that bond investors are nervous about the prospects of those firms. Now, in truth, the NASDAQ is still above its 50-day moving average, and this is often just the normal ups and downs. But perhaps it isn't.
Now, the real driver here is the fear that revenues won't materialize rapidly enough to pay for the AI infrastructure, particularly for the newer firms moving into serving models, like CoreWeave and Oracle. For revenues to materialize, we need to see people and companies spending more and more in the generative AI ecosystem. And if companies are going to spend more, they need to be getting something back. That is where productivity comes in. No productivity, no point, no sales, no revenues, and the markets would be right to turn red. So let's dig into what we can now see from some of the data around productivity, three years into this ChatGPT moment. Well, there's an increasing body of evidence showing that inside firms, inside teams, and with individual workers, AI is making people more productive. Micron, a semiconductor company which, as it happens, makes the high-bandwidth memory you need in these AI data centers, reports 30 to 40% productivity gains when its employees use generative AI for code generation and other internal uses. That's a really significant number coming from a company that's not one of the big tech firms. Coinbase, the crypto business, says that AI-generated code is on track to surpass human output; I think it was 92% of their technical staff using those code generation tools every day. Now, of course, Coinbase is a relatively young, dynamic company, so it's natural that it would be an early adopter. These early adopters are demonstrating some concrete gains. But I read something in the Wall Street Journal this week which was really quite surprising. It looked at BNY, a bank in the US, and at Walmart, and came up with a couple of quite strong and surprising examples of how they were using AI and the kinds of results they were getting. In the case of BNY, which I think used to be called Bank of New York years ago, that's how I remember it,
they've deployed 100 digital employees who have their own login credentials and can go in and get work done using the same internal systems that human employees use. BNY claims this is actually giving them bottom-line impact by growing their capacity to do work. One example they talk about is a digital engineer that can scan code for vulnerabilities and implement fixes as they are found. That's quite an interesting example, because on the one hand you are just saving the human dollars of doing that work. But on the other hand, because the digital engineers can work ceaselessly, 24 hours a day, for their few joules of energy, you might actually identify and close vulnerabilities faster than you otherwise would, which could save you from some kind of horrible cyber incident. The Walmart example is pretty clever as well: it's a trend-to-product agent. Walmart claims this has shortened fashion production timelines from six months to eight or nine weeks. Again, that matters in this world of fast-moving cultural trends. A trend emerges on TikTok or somewhere else, and you want to get the appropriate garment into stores as quickly as you can. Maybe the trend-to-product agent is going to help you do that. Now look, let's be clear that these are companies talking about what they're doing. I'm much more comfortable with what Coinbase or Micron said, because they put really specific numbers on deployment and results. Better, of course, to see what academics are finding. And there's another fascinating paper from Chicago Booth, the business school, by an academic called Supratine Sarka, looking at developers who are using AI tools like Cursor to help them with their software development. If I could summarize it: senior developers are better at bossing AI around than junior ones.
Essentially, AI coding agents were more productive in the hands of experienced developers, not junior ones. In a sense you'd think a junior one is going to get a leg up, and that's going to help them quite a lot. And if you think back to the Wharton, HBS and BCG study from a year or so ago, what it showed was that people of below-average competence were getting bigger improvements than those in the top quartile, which in software development could proxy for junior and senior. So what's going on here? With these AI coding tools, the effect skews the other way: senior developers are much more likely to accept AI-generated code than junior developers, for reasons we'll get into in a second. When the academic looked at one particular company that made AI agents the default, code output jumped by about 39%. There was a difference in the way the developers approached it. Experienced developers were much more likely to ask the AI to help them plan before starting to code; they would ask the AI to do that architectural work with them, and then they were more likely to accept the code that was presented. Junior developers were much more likely to go straight into implementation. And if you've worked with developers before, if you've written software before, you know that thinking through your plan is a surefire way to get the project done more quickly. If you literally just pop open your code editor and start typing, that's probably a recipe for mangled code, lots of bug fixes and late nights. So in a sense, what it's showing is that the experienced workers benefit because they have better mental models, both of the code base and of what they want to do, and of how AI can help them. What's happening is that agents are shifting programming from typing syntax (and that was my downfall: when I used to develop in Java, I would always get my syntax slightly wrong) to specifying the semantics.
And so the new skills that are needed are abstraction, clarity and evaluation. You need to know what to ask for, how to ask precisely, and how to judge what comes back. If you've been reading Exponential View, you'll recognize those ideas; we've talked about the importance of developing those types of skills in the workplace as AI systems become more prevalent and more capable. A senior developer has spent years communicating intent to their junior colleagues and sitting down to evaluate code. The senior developer would sit down next to the junior developer and say, hey, what did you work on this week? Talk me through something you coded and why you chose the approach you did. It turns out those are exactly the skills that seem to matter for effective AI delegation. What does that tell us about how productivity might show up in development teams? Well, perhaps it's not just the number of developers who have access to Cursor or Replit or GitHub Copilot or any of these other tools; it's also the quality of those developers that will drive the productivity uplift. While I was researching this, I found another great example. Now, I said I like academic research because it tends to be more robust, and I'll say up front that this piece is not academic research: it's a survey of a few hundred plumbers across the US. But the plumber story is a really wonderful one. Within this group, a large number have been using ChatGPT to run their businesses, for invoicing and outbound communication, but also for diagnostics, speeding up their problem identification. And they certainly report measurable revenue gains and more time to actually do the work. It's a small-scale survey, but it's interesting nonetheless. And I think it resonates with the way I have used ChatGPT when fixing electrical and plumbing problems around the house.
That's also not something I do particularly well. I suppose what we're seeing here is that individuals and teams are seeing productivity benefits, and lots of us can individually speak to that. But does it necessarily lift firm-wide productivity, or the wider economy? If all of these stories are so positive, where's the beef? A quick note: if you want to support us in bringing more of these conversations to the world, please consider subscribing to the show. There is some other data that shows this gap emerging: 85% of engineering organizations are using AI tools, but only 59% of them are getting measurable productivity gains. Again, this is not an academic survey, so read into it what you will. But let's start by saying that 59% getting productivity gains a couple of years after these tools became available is a pretty impressive state of affairs. And the question is, why the gap? As you know, I wrote an entire book on the gap between the capabilities and improvements of technology, their exponential nature, and our ability to harness them. This is a microcosm of the exponential gap. And this is the hard part. It is a combination of people issues, misalignment between leaders and those who are actually doing the work, misalignment in the incentive systems for how and why people should use these tools, and also process issues, because our internal processes have been designed around assumptions about how fast the work can be done, how well it can be done, and what kind of exceptions might happen at each stage. When you start to use a general-purpose technology like AI, as produced by large language models, all of those parameters start to break down, and you have to think about an entire process redesign and then workflow redesign. So the constraint isn't the availability of the tools.
I mean, we're all one click away from Claude or Codex; it's how much we can actually get done within the framework of the spaghetti that is legacy processes. The point being that productivity gains you might see in a work group, or among a few employees, don't automatically scale across the firm. The story I used to tell about this, and still do, is that when electricity was rolling out at the turn of the 20th century, the very first car manufacturers, who really were very artisanal and worked in old carriage works, adopted electricity quickly: they hung single pendant lights in their workshops to extend the working day. But in order to really get the benefit of electricity in manufacturing, you had to build the moving assembly line system. And that requires, in modern management-consulting speak, and sorry for using this phrase, process redesign. So we are at this stage where these productivity gains won't necessarily, and for free, scale across an organization. When I talk to people who are building these systems, they say it's really hard to do. Building an extensive agentic workflow that can do longer tasks safely isn't like doing a Google query. As the tools become more agentic, reliability becomes a real bottleneck. Longer context, as you'll have experienced when you use Claude or ChatGPT, leads to some unreliability; the models get less good. And if you've got a system that's more autonomous, it can become more unpredictable, which raises the cost of supervision. In the last week or two, Anthropic, which makes Claude, came out with some research in which they stress-tested a bunch of models, 16 models I think, in hypothetical corporate situations, and gave them harmless business goals. Anthropic loves doing this kind of testing, and I'm glad they do.
And under pressure, some of the models behaved like insider threats, including trying to blackmail officials and employees and trying to leak sensitive information. Sometimes they even ignored direct instructions, and changed their behavior if they believed they were in testing rather than in real deployment. Now, Anthropic emphasizes that these behaviors have not been seen in companies today; what they're trying to show are the types of failure modes you might have to deal with as autonomy increases. So beyond the complexities of the people and of redesigning the processes, there are also these types of security and safety concerns for which we'll have to build frameworks and scaffolding. And all of that explains why, perhaps, the direct translation of the very rapid rollout and deployment of these technologies is, for now, going a bit more slowly. Another fascinating data point came from an academic paper. Actually, I don't know where it came out, I'm so sorry about that, but the academics were Husseinian and Lichtinger. They were looking at the number of firms posting the job category of AI integrator, and of course that number has jumped significantly since 2023. The point being that there is a recognition that you need somebody who understands how to put these systems in place. And an AI integrator is exactly the kind of job you'd expect a general-purpose technology to create. Remember, general-purpose technologies are generally useful across the economy, they lend themselves to persistent and perpetual improvement, and they generate complementary services, which in this case could mean new types of jobs. Is all of this fast or slow? Perhaps the question is really: is this fast enough, considering the investment that's going into AI within companies and across the economy in general? For lots of people, this story is slow. There are roadblocks, there are problems; it is proving to be difficult.
But in truth, this isn't a case of not ever; it's a case of not yet. For most, I'd say virtually all, of these companies, what they're doing is something new, something complicated, something for which there isn't yet a dummy's guide. And the first time you do anything, it always takes longer than it will a few years from now. So if you're impatient about all of this, it feels really slow. But the truth is, and here is the twist, it's the fastest rollout of a technology we've ever seen. The fastest adoption of a technology. The St. Louis Fed released a paper, also this week, looking at generative AI adoption. It's based on a regular self-reported survey they run, so caveats apply: it's self-reported, not revealed preference, not monitoring actual behavior. But they do have a consistent methodology, it is a tracker, and it is available publicly. By their numbers, US generative AI adoption reached 54.6% in August 2025, up from 44.6% a year prior. They try to do a like-for-like comparison, and they point out that at the same point in their histories, PCs were at 19.7% adoption and the Internet was at 30%. So, roughly twice the adoption rate of the Internet. The absolute numbers, 44.6% and 54.6%, aren't, I think, particularly helpful, because it's a self-reported survey and because other surveys use different methodologies. But the direction of travel and the historical comparison are quite helpful. Now, I'm going to quote from their report. What they say is that when they feed these estimates into a standard aggregate production model, it suggests that generative AI may have increased labor productivity by up to 1.3% since the introduction of ChatGPT. And that's consistent with recent estimates of aggregate labor productivity in the US non-farm business sector.
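To see how a 54.6% adoption figure can map onto a small aggregate number like 1.3%, here is a minimal back-of-envelope sketch. This is not the Fed's actual model; the hours-share and task-level-gain figures below are illustrative assumptions, not numbers from the report. The logic is simply that an economy-wide effect scales with who uses the tool, how much of their time it touches, and how much faster it makes those tasks.

```python
# Back-of-envelope aggregate productivity arithmetic (illustrative only).
# The adoption share is the Fed's August 2025 figure; the other two
# parameters are assumptions chosen to show the shape of the calculation.

def aggregate_productivity_gain(adoption_share: float,
                                hours_share: float,
                                task_gain: float) -> float:
    """Economy-wide labor productivity effect if `adoption_share` of
    workers use the tool on `hours_share` of their working time, with a
    proportional `task_gain` speed-up on those assisted tasks."""
    return adoption_share * hours_share * task_gain


# 54.6% adoption, ~8% of working hours assisted, ~30% task-level gain
gain = aggregate_productivity_gain(0.546, 0.08, 0.30)
print(f"{gain:.1%}")  # prints "1.3%"
```

The point of the sketch is that even high adoption dilutes to a modest aggregate figure once you account for how little of total working time the tool actually touches, which is why an economy-wide 1.3% is consistent with the much larger firm-level gains reported earlier.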
So what they're saying is that the economy-wide impact I said at the beginning we weren't seeing may in fact be visible: you can see some measure of labor productivity going up, and it's consistent with the survey results they're getting through their tracker. They also looked at industry-level adoption, and there is a weak positive correlation, a correlation of 0.32. That's not up in the 0.8s or 0.9s, but it's definitely positive and beyond being completely random. Now, we should be skeptical: it's a self-reported survey, with so many confounding factors. But it is at least plausible. So, to conclude. When I think about this, it is a complicated picture, but the evidence illuminating that picture is growing. It's not enough to satisfy anyone, I expect. For those of you who believe that this AI boom is one gigantic hallucination and a titanic waste of time, and will indeed end the way the Titanic did, with a calamity, you're going to demand impeccably robust evidence, understandably; evidence that probably won't emerge before 2027 or 2028. And for those of you who think we're on the verge of summoning the machine god that will take us to transcendence, you'll really want to hit the accelerator pedal even further, and 54% adoption won't be good enough. The way I look at this is that we're seeing growing momentum, with more and more positive results emerging, even if it remains early days. We are seeing stories of companies succeeding in applications and workflows that they couldn't have a year ago, or even six months ago. And of course, once those success stories emerge, that knowledge will spread, as it always has. So, not ever. Just not yet. Thanks for listening all the way to the end. If you want to know when the next conversation is released, just hit subscribe wherever you're listening. That's all for now, and I'll catch you next time.
