Everyday AI Podcast Episode 725: Measuring AI ROI: Why You’re Doing It Wrong and the 7 Steps to Fix It (Start Here Series Vol 11)
Host: Jordan Wilson
Date: March 3, 2026
Main Theme:
How to accurately measure the return on investment (ROI) of AI in your business—why most companies get it wrong, and a practical 7-step blueprint to fix it.
Episode Overview
Jordan Wilson confronts one of the most debated questions in business AI adoption: "Is there really ROI in AI—and if so, how do you prove it?" Drawing on benchmarks, debunked studies, and industry-leading surveys, Jordan dismantles common misconceptions and provides listeners with a structured, actionable approach to truly measure AI's impact. He urges businesses to move beyond intuition or "vibes" and get rigorous about measurement, emphasizing that "AI is gravity at this point."
Key Discussion Points & Insights
1. The Myth of Elusive AI ROI (00:17–08:30)
- Current Situation: Most businesses say AI "feels" like it’s working—faster output, more work done—but struggle to prove real ROI with data.
- Outdated Playbooks: Companies are using obsolete digital transformation strategies that don’t suit the speed or nature of AI in 2026.
"Chances are you can't answer that and the reasons are actually very simple. It's because businesses are using the same digital transformation playbook they've always used when it comes to AI. And, well, that playbook is useless in 2026." (00:37)
2. The GDPval Benchmark: Real-World AI Performance (03:36–06:50)
- What Is GDPval? A benchmark from OpenAI that pits AI models against expert humans (averaging 14 years of experience) across 44 occupations in 9 sectors.
- Findings: Top AI models tie or win 70% of the time in blind evaluations—and do tasks up to 100x faster.
- Implication: The data shows dramatic ROI in productivity and quality, yet businesses are still debating the basics.
"If the AI model is the same or better 70% of the time and it's a hundred times faster, that's the math. So why are we still even debating this concept of is there return on investment?" (06:40)
3. Debunking the "No ROI from AI" Viral Study (06:50–13:20)
- MIT "Study": Widely reported that "95% of enterprise AI pilots delivered zero ROI" (August 2025). Led to stock market drops and panic.
- Reality: The study was based on just 52 qualitative interviews, not quantitative data, and functioned largely as a marketing pitch for MIT’s own AI product.
- Contrast with Real Data:
- IDC: $3.70 return for every $1 invested in AI.
- Wharton: 74% of enterprises report positive ROI.
- Google Cloud: 74% see ROI in GenAI within 1 year.
- Deloitte: 84% getting ROI from AI investments.
"This quote unquote study was based on 52 qualitative interviews. There's no quantitative piece to that 95%... It was a vibe study." (10:09)
4. Where Does AI ROI Go? The Invisible Productivity Problem (13:20–19:40)
- Hidden Productivity: Much AI ROI is "pocketed" by workers in remote/hybrid settings—time savings turned into free time, not visible company gains.
- Job Roles Are Outdated: 89% of organizations haven’t updated roles for AI. Outputs are still measured using pre-AI standards.
"Workers are pocketing it. That's where the ROI is going, right? Because true productivity and true ROI, well, it requires results-driven metrics, not time-based management." (15:23)
5. The True Measure of AI ROI: Metrics That Matter (19:40–23:25)
- It's Not 'Prompts Sent' or 'Utilization Rates': Measuring AI ROI isn't about usage dashboards, but about actual outcomes—time saved, costs reduced, revenue increased, and risk avoided.
- Simple Formula:
- (Time saved vs. pre-AI baseline) × hourly rate, minus AI/tool costs.
- Focus on cost per task, throughput, and error rates.
“Utilization is not the metric that pushes it. It's: are you saving time? Are you increasing revenue and avoiding risk?” (20:50)
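The episode’s “simple formula” can be sketched in a few lines of Python. All of the numbers below are hypothetical placeholders for illustration, not figures from the episode:

```python
def ai_roi(hours_saved_per_task, tasks_per_month, hourly_rate, monthly_ai_cost):
    """Net monthly return: labor value of time saved minus AI/tool spend."""
    gross_savings = hours_saved_per_task * tasks_per_month * hourly_rate
    return gross_savings - monthly_ai_cost

# Hypothetical example: 2 hours saved per task, 50 tasks/month,
# a $60/hour loaded labor rate, and $500/month in AI tooling.
print(ai_roi(2, 50, 60, 500))  # 2 * 50 * 60 - 500 = 5500
```

The point of the formula is that every input requires a measured pre-AI baseline; without one, “hours saved” is a guess and the result is a vibe, not a return.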
6. Five Reasons AI ROI Measurement Usually Fails (23:25–25:50)
- No pre-AI baseline exists.
- Slow, year-long pilots kill momentum.
- Overreacting to a single lucky success.
- Vanity metrics (like prompt count) dominate.
- Shiny object syndrome—chasing new models before full implementation.
- Root Issue: These are not technology problems; they are human and organizational failures to rethink measurement and process.
7. The Prerequisite: The BASE Approach (25:50–28:55)
- BASE = Baseline Assessment of Standard Execution (pre-AI):
- Meticulously time and document the process before AI.
- Track error rates, rework cycles, costs per completed task.
- You can’t retroactively create a clean baseline once AI is in play.
“You need to do it. Before you implement AI, you need to measure how long it takes humans to go through and do these certain projects or do these certain tasks.” (27:45)
The 7-Step Blueprint for Measuring AI ROI
(Detailed at 29:00–38:55)
Step 1: Define
- Rigidly define the outcome, rubric, and KPIs before testing.
Step 2: Measure Human Baseline
- Document multiple employees performing the task without AI; average their time, errors, and costs.
Step 3: Build Real-World Cases
- Create 20–40 challenging, messy test cases, including edge cases.
Step 4: Configure the Production Workspace
- Standardize models, accounts, permissions—ensure a controlled and repeatable environment.
Step 5: Run Tests Three Times
- Run each use case three times with model memory and personalization disabled; require 'proof artifacts' (e.g., logs, outputs).
Step 6: Grade Blind and Standardize
- Human judges grade AI and human outputs blind, using the agreed rubric.
Step 7: Retest Monthly
- After every major model update, repeat the process and use a rolling 3-month average for tracking.
“You have the input and the output, right... You multiply the time that it takes the humans to do it on the AI side...minus the cost of whatever AI tools that you're using. All right, there's your augmented cost and then you compare it to your human only cost.” (36:20)
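The comparison Jordan describes in that quote—augmented cost versus human-only cost—can be sketched like this. The task durations, rates, and tool costs below are illustrative assumptions, not numbers from the episode:

```python
def cost_per_task(hours_per_task, hourly_rate, tool_cost_per_task=0.0):
    """Fully loaded cost to complete one task."""
    return hours_per_task * hourly_rate + tool_cost_per_task

# Hypothetical human-only baseline: 3.0 hours at $60/hour.
human_only = cost_per_task(3.0, 60)
# Hypothetical AI-augmented run: 0.5 hours at $60/hour plus $2 of AI usage.
augmented = cost_per_task(0.5, 60, 2.0)

savings_per_task = human_only - augmented
print(human_only, augmented, savings_per_task)  # 180.0 32.0 148.0
```

Both sides of the comparison come straight out of the blueprint: the human-only figure from Step 2’s baseline, the augmented figure from the timed runs in Step 5.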
Quick Recap (timestamped):
- 37:58 — “Step one, define the rubric and success criteria. Step two, measure the human baseline. Step three, build the 20 to 40 real messy work cases. Step four, configure the exact production workspace. Step five, run three times with AI models and three sets of humans. Step six is grade blindly and standardize the output format and criteria. Calculate your ROI, and then step seven is retest monthly.”
Notable Quotes & Memorable Moments
- On the ROI Debate: "This whole discussion on does artificial intelligence give you a return on your investment? It's...a very dumb question, if I'm being honest." (05:58)
- On Corporate Reluctance: "Humans are innately lazy... the reason why we don't want to actually sit and measure." (13:50)
- On Old Ways of Working: "You're using AI, okay. I've been saying since 2023, it's the same thing as, 'oh, our company's using the Internet.'" (15:00)
- Why Most Companies Fail: "These aren't technology problems... These are we as working human beings, as knowledge workers, we don't have a playbook to follow." (25:09)
- Why the 7 Steps Matter: "You need to establish those baselines and then redo the entire process, right? Blow it up. Measure it first. That's your base, right?" (28:55)
Final Words & Challenge to Listeners (38:55–end)
Jordan insists the debate on whether AI delivers ROI is "over." Quantitative research, rigorous benchmarks, and lived experience all say yes—emphatically.
“The new ROI question is not did AI work? It's how much did we lose by not educating and measuring sooner?” (39:40)
Timestamps of Key Segments
- 00:17 – Introduction: Why most companies can’t prove AI ROI
- 03:36 – OpenAI’s GDPval benchmark explained
- 06:50 – MIT’s “zero AI ROI” study debunked
- 13:20 – Where AI ROI is “lost” inside organizations
- 19:40 – How not to measure AI ROI: common mistakes
- 23:25 – Five reasons companies fail to measure
- 25:50 – The “BASE” prerequisite for honest measurement
- 29:00 – The 7-step blueprint for measuring AI ROI
- 37:58 – Rapid recap of the 7 steps
- 38:55 – Big takeaway: The only ROI question that matters moving forward
Summary Takeaway
To measure AI ROI in 2026, you must put aside both hype and outdated transformation strategies. Instead, rigorously document your pre-AI baseline, run real-world controlled tests, and continuously re-evaluate after deployments or model updates. Meticulous, repeatable measurements—using the 7-step framework—are the difference between "vibes-based" and results-based AI management. The only real mistake left is to delay adopting this approach.
For more practical AI advice, access the full Start Here Series and connect with the Everyday AI community at StartHereSeries.com.
