Transcript
A (0:00)
Hey everyone. I'm super excited to be sitting down with Cassie Kozyrkov, former chief decision scientist at Google and a game-changing founder, AI advisor, and keynote speaker. What I love about Cassie is not just that she's incredibly smart and thought-provoking, but how fearless she is at calling out bullshit and just how good she is at parsing what's useful from all the noise and hype. She believes that companies talking about going AI-first are getting it fundamentally wrong and that we need to completely change the conversation about what AI is capable of. I want to ask her what AI can really do for us and our jobs, and what the real future of work is. Let's find out. I'm here with Cassie Kozyrkov, former chief decision scientist at Google. Really excited to connect today, Cassie. And maybe to kick things off: you've talked recently about a question that I think is kind of on everybody's mind, what you've called the generative AI value gap. What does that mean and what are you seeing in that space?
B (1:05)
Yeah. So I'm sure that anyone who's been watching the various surveys and numbers coming out about generative AI and generative AI deployments will have seen that 95% number. You know the one, right?
A (1:22)
Sure do. 95% not getting any value from AI.
B (1:25)
Exactly, exactly. Except the phrasing I like is "measurable ROI." Right? So some part of what's going on is that companies are really getting no ROI, and there are fantastically foolish ways to get there: you just try to keep up with the Joneses in AI, have no idea what you want it for, kind of send your people off to go sprinkle the magical AI on top of your business, hope better things happen, and then you join the no-ROI bucket. But some number of those 95 are going to be no measurable ROI. And this breaks up into two pieces. One is that generative AI is fundamentally just more difficult to measure, and I want to double-click on that in a moment because I know that's what you're asking me about. But the other piece is that sometimes what we're actually getting is the ability to innovate next time. And I think that not enough companies appreciate that innovation demands waste.
If you are doing something that you've done before, you know exactly how it's going to go, so of course you can have these KPIs that you know you're going to hit for sure, because you've already done it. Now you're trying a completely new technology with a completely new use case. You have no idea if it's going to work. You have to be willing to accept that that might be time and effort thrown away, you know, burned at the altar of innovation, so to speak. Right? That is just the nature of innovation. And I've had companies come and consult with me who really wanted to be innovators. But when I ask them, so what is your actual tolerance for getting no results back after you invest in innovation? Or how much bandwidth do you give your people beyond the very specific work product that you expect from them? Do you give them time and space to chase an idea? And quite often the answer is no. No, we don't. We have no tolerance for innovation, we have absolutely no slack for our people, and we need every project to be predictable. Okay. If you're dealing with that, you're just not going to be an innovator. Or you're going to be an accidental innovator, because you somehow accidentally hired somebody who is going to essentially work two jobs: the one you gave them, and then the other one where they'll spend nights in the office and maybe come up with something. But there won't be a lot of these folks, and yeah, that's not a great lottery ticket. So if you don't have that tolerance for no ROI when you're trying to innovate, you just have to be a follower. Just wait for everybody else to show how it's done and follow them.
But there is another piece: when you actually do this wasteful innovating, you learn how to innovate. And we have Solow's paradox coming up again in AI. Solow's paradox came up with computers and productivity: you could see the productivity everywhere except in the numbers. Right? That was the paradox. So how is it that we can all feel so much more productive? How is it that we can have individual numbers like 90% of software engineers using generative AI to help them code, and other numbers, I can't remember if it's 90 or 70 or some other big number, of workers in the surveyed population personally using these tools (a different study from that 95% one, I think), and yet only a tiny fraction of the employers, the companies, actually formally give access to these tools? So you've got this shadow AI thing going on where people are using AI, but it's not sanctioned, it's not provided by their employers. Right?
You've got this big disconnect. People really like it, they seem to be productive, I'm very much more productive with it personally, and yet we don't see it in the ROI, we don't see it in the productivity numbers. Sometimes what we're doing is we are just laying tracks to be able to innovate next time, to get the next project right. Sometimes this is the first pancake, and in some sense when you begin a batch of pancakes, the first one is an investment, and there is a return on that investment. It's just not measured the same way as the return on investment of your other pancakes. So I just want to caution people as they're in the innovation game, just getting started, there's a lot of uncertainty. Don't expect that there's some magic here, some guarantee that the rules are now suddenly different. They're not. It's the same innovation game as before.
But now I'm going to answer your actual question and land this plane, finally. Let's get back to the difficulty of measuring ROI, the difficulty of talking about value, and why there's a value gap. And here I'll say that if you look at how we thought about metrics before, with your classic machine learning 10, 20 years ago, we were thinking, when we built it and when we deployed it, in terms of minimizing loss or minimizing error. When you have that philosophy of error, what you also have is a philosophy of the correct answer, the right answer, right? Because if you don't have such a thing as a right answer, you can't have such a thing as a mistake, so you can't have such a thing as an error to minimize, so you can't have all the optimization math that we're very used to. So it'll be things like, you know, you'll have an image classifier and it's supposed to say cat, and instead it says dog. We can say that that's an error, right? We can measure that. Or you're supposed to predict the weather: the prediction was 72 degrees and we observe that it's 75 degrees. Right? A 3-degree error. All in terms of there being a single right answer that we are chasing. Now, of course, there are many wrong answers. If the weather is 72 degrees, all the other numbers are wrong, right? But there's only one right answer.
Now think about generative AI. We are essentially simulating from distributions here, and anything out of that distribution could potentially be, if it's from the right distribution, a goodish answer. Now think about this: a customer service interaction, an email, a post. If I ask for an email to set up this podcast with you on a Friday afternoon, I could write that email hundreds, thousands, an infinite number of different ways, and it would still be a good email. Of course, there's an infinite number of ways it would be a bad email. I could start cursing in the middle of it. I could send you not an email at all, but just a poem. And you would find that quite weird, though you'd probably be intrigued, be like, definitely invite her on the podcast. Or I could make the classic mistake and, instead of Jeff, I could name you something else. Lots of different ways to be wrong. Lots of different ways to be right. Though, what should my tone be? How would I know which email is better than which other email? I've got infinite, endless ways to get it right. It's not cat versus not-cat, or 72 degrees versus not 72 degrees. It's an infinity of ways that I could be solving the task. And now we get the big problem. So far, these are just mathematical tools, or software based on mathematical tools, based on data, doing what it was made for.
But what it can't do for a leader is tell them what good enough actually means. How do you make this cut, from completely awful emails all the way across to, I don't even know how you would think of it, the most perfect email you could get? Somewhere you're going to have to draw a line. You're going to have to create standards of some kind. You're going to have to talk about how you're going to measure this if you're going to have automated emailing, for example, if that's the system you're going to put in place. And if you're kind of squeamish about that, you could say, well, I will reduce it to a KPI I know about. I will maybe see how much time I can save my humans if I give them an emailing copilot. But now I get some measurement issues as well, as a manager, because do I force them to use it or not? Because if I don't force them to use it, am I tracking whether they chose to use it or not? How am I going to measure value if they're all ignoring it and continuing to write on their own? But maybe they are writing better for reasons unrelated to the AI. Maybe that'll look like results, maybe it won't. Right? We've got some potential issues here. Or maybe we forced them all to use it. They hate it. They haven't learned how to use it yet. Maybe what we're going to see is decreased productivity, and eventually that productivity comes up. But are we sure we're measuring the right thing? And how would we think about those strange edge cases where every now and then that email is a PR disaster, especially when we make systems where we take the human out of the loop and all those emails are going to be sent with no human oversight? Maybe a bunch of them save a lot of time, but there's that one that gets the media interested, and that tanks your company. So there are so many different ways you could think about setting up notions of what value is, how to measure it, and how to deal with this curse of endless right answers. And again, most MBA courses, most things that we think about when we think about metrics, are about targeting a right answer and asking how wrong we are. This is a different paradigm, and I think it's snuck into our workplaces without us even realizing how much of a different paradigm it is.
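
To make the measurement contrast Cassie describes a bit more concrete, here is a minimal sketch in Python. It is not from the conversation itself: with classic machine learning, each example has one right answer, so error is a simple comparison against that answer; with a generated email there is no single reference answer, so any score is a rubric someone had to choose. The score_email rubric below is invented purely for illustration, not a real evaluation method.

```python
# Hypothetical illustration of the two measurement paradigms discussed above.

# Classic ML: every example has exactly one right answer, so error is a comparison.
predictions = ["cat", "dog", "cat"]              # model outputs
labels      = ["cat", "cat", "cat"]              # the single correct answers
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print(f"classification accuracy: {accuracy:.2f}")                 # 0.67

forecast, observed = 72.0, 75.0                  # degrees
print(f"forecast error: {abs(observed - forecast):.1f} degrees")  # 3.0, a number to minimize

# Generative AI: there is no single correct email, so "error" is undefined.
# Any score encodes a judgment call about "good enough"; this rubric is made up.
def score_email(text: str) -> float:
    score = 1.0
    if "Jeff" not in text:            # addressed to the wrong person
        score -= 0.5
    if len(text.split()) > 200:       # rambling
        score -= 0.2
    return max(score, 0.0)

draft_a = "Hi Jeff, would Friday afternoon work for recording the podcast?"
draft_b = "Dear Geoff, here is a poem about Fridays instead of an email..."
print(score_email(draft_a), score_email(draft_b))  # both plausible drafts, very different scores
```

The point of the sketch is that the first two numbers fall straight out of the data, while the rubric only exists because someone decided where to draw the line on "good enough."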
