
A
So picture this. This company is spending around $200 billion this year, most of it on AI and data center infrastructure. They're betting the whole farm on AI. That same company just told its own staff to use AI less. A senior VP actually told staff, and the FT reported this in May, don't use AI just for the sake of using AI. It's a huge company, and the reason why it's a lesson about motivation has absolutely nothing to do with AI. Enter Goodhart's law: when a measure becomes a target, it stops being a good measure. You get what you measure, not what you want. Nowadays, everybody and their uncle is measuring AI usage, these raw usage metrics, and getting gamed all along.
So, hi, I'm Rob. I've been driving adoption and behavior change for digital products and programs using behavioral science for over a decade, and now I'm also using the Octalysis framework at the Octalysis Group, the premium consultancy on gamification and behavioral design. And I'm also leading the number one podcast on gamification, Professor Game. This isn't the typical AI backlash story that everyone is reading in the newspapers. It's a core drives failure, and you can build the same trap entirely by accident.
So during this video, we're going to look at the motivation machinery behind it, why gaming was, as Thanos likes to say, inevitable, and the one question to ask before you measure absolutely anything. And if looking at these failures, how to fix them, and how they are represented in the core drives is something that you are looking into because they will drive your business forward, we have something for you. We have the free Core Drives in the Wild guide. All you have to do is click on the link below in the description and you'll get direct access to that, as well as to our email list.
Now let's get started. Because this massive company strapped on two very powerful core drives. All the core drives are powerful, but this combination is also a very good one.
They put together Core Drive 2, Development and Accomplishment, and Core Drive 5, Social Influence and Relatedness. Core Drive 2 because they put up a leaderboard, literally a leaderboard, to see how much AI you were using, or as you could say nowadays, how many tokens you were spending on AI. And you could see your progress. It was very clear. You had a progress bar, and you had a leaderboard showing where you were positioned within that table. And of course that brings in Core Drive 5, Social Influence and Relatedness, because you were there next to other people. So you were gaining status, or losing status in some cases, depending on where you positioned yourself on that leaderboard. This was generating some, perhaps even healthy, competition. Some colleagues saying, ah, yeah, I'm beating you now on this leaderboard. And this healthy competition got them to maximize for that metric.
Actually, the system got the behavior it was rewarding: flawless compliance. The problem was it was the wrong thing. It was raw usage. They were not really aiming for how productive they were becoming with AI, or how the benefits of AI were landing. That was literally not being measured. And remember, perhaps you might look into a past episode where we were discussing how the way the test was set up when ChatGPT was coming out was setting up the students to actually try to find a way to get around the cybersecurity and be able to use ChatGPT to get the right answers under very high pressure. This is a similar application of that same principle.
So now let's dive into who actually made this mistake. And it's none other than Amazon. Their leaderboard was called Kirorank, scoring their staff on their AI activity on the Kiro developer platform. Again, this was quoted in the Financial Times article. You can find the source in the show notes. But the main thing is that the workers were even setting autonomous AI agents loose on needless tasks just to climb the ranks.
So they were optimizing for increasing their AI usage, not for making this AI usage useful. Of course, the compute costs were all over the place. What did they end up doing? They had to kill the initiative itself. And this is not an "oh, dumb Amazon" story. This is a measurement trap that is easy to fall into whenever you are setting a target. As I mentioned before, a teacher sets the scores. The teacher, or the professor in our case, prepares you for the test, and it produces the exact same outcome that we were trying to avoid, which was cheating. The more powerful the tool that you are actually measuring, the more expensive it's going to become for you when people start gaming the metric. Amazon paid in real compute money. This was not just a small mistake. This was not just a small blip. It became so big that it's been quoted in major media.
So the question becomes, what can I actually do about this? Because by no means am I advocating for not measuring things. Quite the opposite. I do think measurements are important. There are many other principles as well, including the very famous one: what gets measured gets improved. And this actually sticks to that. If you're measuring how much you use, you're going to improve how much people are using it. What are they using it for? If you're not measuring that, it does not get improved.
So one of the things that we do in the Octalysis framework, in our five-step process, is that within the Strategy Dashboard we define what we call the business metrics. And this is not, oh, what do we want to measure? Get on with it. We really deep dive into what it is that you actually want as an outcome. What is the result that you are looking for? In the case of Amazon, what was the result that they wanted from AI adoption? They didn't really want AI adoption for the sake of it. What was the reason behind that adoption? That is what you're really targeting. And here is what I oftentimes do with my students; this is a completely separate subject.
When we talk about lean operations, this came from back in the 80s, I think, or maybe even further. The Japanese and Toyota have a principle called the five whys. You ask, why do you want this? It's like, oh, we want to increase AI usage. But you stop for a second and say, well, really, why do you want to increase AI usage? And you keep asking this at least five times, maybe even fewer or sometimes a few more, until you really arrive at something where you say, if I achieve this, this will be a worthwhile objective. Because if you say, oh, we increased AI usage, okay, so what? If you say we managed to increase, I don't know, productivity per employee in this section by 2x, or whatever the number is, that is, or probably is, I would say, a lot more meaningful than "we managed to increase AI consumption by 500x and now we're paying for it." So getting that real nice round metric of what it is that you want to achieve actually makes Goodhart's law, which we mentioned at the start, work in your favor.
So yes, maybe you've built a gameable proxy metric, and don't get me wrong, I understand this is sometimes what you get. Going back to the teaching example that I've mentioned a couple of times from the previous episode, it's really hard to measure learning. So we use proxies like tests, oral exams, presentations and whatnot, because we cannot drill into brains and see how much people have really learned. It's really, really hard. Sometimes you do need proxy metrics, but you also have to look at what the consequences are. What if people only aim for that? What's a way to game the metric? You have to look at it really, really hard.
In the end, what you want to measure is the outcome, not the activity. Most dashboards are measuring the activity and progress: Core Drive 2, Development and Accomplishment. People see how they can progress through whatever it is that we're measuring. Engagement is not really how many times people are clicking; that is irrelevant.
It's the value that you're creating for your users, of course, and for whomever the client is. It could be an internal client, like in the case of Amazon, or an external client, whatever that might be. You want to consider the user, but also yourself as a company, as a designer. What is the metric that you're optimizing for?
If this is something that has been useful for you and you want to look at how you can optimize the use of the core drives from the Octalysis framework in your own cases, we have a free resource for you. Just go to the link in the description, click on the Core Drives in the Wild guide and you will get access to that. You'll get an email every single day with one of the core drives explained. You'll look at a case out there in the wild from one of our past episodes, with my take as a consultant now from the Octalysis Group, and you will have that right in your inbox for absolutely free. And as a bonus, you'll also be added to our email list, which I'm now regularly updating with the latest things I'm observing almost every single day. Just go to the link; I think it's right below. Click on that and get access to your Core Drives in the Wild guide. And as always, at least for now and for today, it is time to say that it's game over.
Title: Spent $200 Billion, Then Said: Use AI Less
Host: Rob Alvarez
Date: June 22, 2026
This episode of Professor Game unpacks a striking real-world case: a massive tech company (Amazon) invests $200 billion in AI, yet subsequently tells its staff to dial back on AI usage. Rob Alvarez uses this scenario to explore pitfalls in motivation, measurement, and gamification, stressing how easily organizations slip into "measurement traps"—rewarding the wrong behaviors and falling victim to Goodhart’s Law. The episode leverages the Octalysis framework of core drives to analyze what went wrong and how similar missteps occur across industries.