
A
Hello and welcome back to Retail Media Breakfast Club. I'm Kiri Masters, and today I'm going to be reading to you an article that I published to my column at The Drum on March 23 about some new research that quantifies what many of us in retail media suspected: that methodological choices alone can swing incremental ROAS by a significant margin. Let's jump in.
Liz Roche has sat in rooms where two people from the same brand have completely different views on what their campaign metrics should look like. It's one reason that she thinks transparency matters more than standardization. If the buyers themselves can't agree on what they want measured, the least a retailer can do is show its working.
Liz Roche is vice president of media and measurement at Albertsons Media Collective, and last week her team was among the authors of new research that shows exactly why that transparency is overdue. Across 42 real ad campaigns, the same media, the same audience, the same creative and the same spend produced incremental ROAS, or iROAS, numbers that varied by an average of 6.5x and a median of 2.5x. That is a huge swing that depended solely on how the measurement was calculated. In 83% of those campaigns, the results could actually flip from positive to negative just based on measurement methodology.
This white paper, titled iROAS Demystified, was a collaboration between Albertsons Media Collective, Ovative Group and professors from Northwestern University's Kellogg School of Management. It follows a paper from last year called ROAS Demystified, which unpacked the limitations of standard ROAS. This time the team turned their attention to iROAS, the metric that the industry has been championing as the more rigorous alternative, but they also found that it carries its own hidden variability. The paper isn't arguing that iROAS is broken.
It's arguing that the same label gets slapped on meaningfully different calculations, and that most brands don't know enough about what's happening under the hood to interpret results with confidence. We wanted to explore how different methodologies produce different results. We know that there's a new retail media network cropping up every day, and understanding what's working has become more and more challenging.
Let's dig into what the research tested. The study took 42 on-site display campaigns across Albertsons web properties and varied four methodological choices that any retail media network faces when calculating iROAS. Number one, how the test and control groups are filtered before matching. Number two, which matching approach is used; that could be clustering versus propensity score matching. Number three, what data features inform the match, for example, whether past brand sales are included. And number four, how incremental revenue is calculated between the two groups, for example, observed sales versus a Bayesian time series model. Now these choices, when combined and mixed, produced 54 different iROAS values per campaign without changing anything about how the campaign actually ran.
Some specific findings stand out. Propensity score matching produced roughly 12 times better match quality than clustering, but it also tended to produce lower iROAS estimates. Whether or not historical brand sales were included as a matching feature could also swing the iROAS from $1.23 to negative $0.14. And the two approaches to calculating incremental revenue diverged by an average of 90%.
Derek Nelson, senior director of retail media consulting at Ovative Group, said that customers are doing nothing different, media is doing nothing different, just putting everything together differently. You end up with wildly different results.
Friend of the show Jordan Whitmer, managing director of retail media at agency SALT xc, sees this play out across his brand clients.
He says results still reflect how audiences are selected and campaigns are delivered, especially when these systems are designed to show ads to shoppers already likely to buy. And that dynamic, where measurement captures correlation with purchase intent rather than a genuine lift, is exactly what this research is trying to untangle.
But don't throw the baby out with the bathwater. The natural question, given this finding that 83% of campaigns can flip from positive to negative, is whether the industry should be using iROAS as a headline KPI at all. So this is what I asked Liz and Derek, but they pushed back for some pretty good reasons.
Derek's argument is that the methodologies are testing different things: new to SKU, new to brand, new to category. Those three things, for example, are pretty nuanced. And he says iROAS is such a broad bucket term that it kind of catches a lot of things. The need to simplify has lost some of the nuance that's overall a good thing, but it gets used as shorthand. And his view is that it's on brands to ask the questions and decide what to do with the answers.
Liz added that the responsibility doesn't sit with brands alone. She said, I think it's also our responsibility to educate in this space. Not every one of those brands has a powerhouse data science team who can really make sense of all of this. And I think that's another reason why partnering with Kellogg and partnering with Ovative on research like this, from her perspective, helps to make it conceptually more accessible. Liz framed Albertsons as having a stewardship role here. She said, let's get to the actual truth of what's happening, because everyone wins when we sell more units.
Mirakl Ads is the only retail media solution designed for both 1P and 3P marketplace brands. Why does that matter?
Marketplace sellers demand a seamless advertiser experience that still offers full-funnel ad formats, and retailers need a flexible solution that allows you to scale your media business. Learn more at mirakl.com. That's M I R A K L.com.
The research's implications land differently depending on where you sit. For the top-tier CPGs with dedicated data science teams, this is confirmation of something that they've been wrestling with for a long time, and it could be a useful framework for structuring conversations with other retail media partners. Many of the largest brands already rank and stack their retail media network partners using internal models. Liz acknowledged that she's seen those models up close, and her focus is making sure Albertsons provides the transparency that those models need to be properly calibrated.
But a mid-market brand selling through Albertsons probably doesn't have a data scientist on staff, let alone one who can evaluate whether their iROAS report used propensity score matching or clustering, or whether brand sales were included as a matching feature. As Liz noted, many of these brands receive a report and have no choice but to take that number at face value. To that end, this white paper's appendix gives brands that don't have deep analytics capabilities a starting point for asking the right questions: questions about methodology. Who was included in the analysis? How was the control group built? So these are all things that mid-market brands can use to inform their understanding.
I've been reading Brené Brown's book Atlas of the Heart recently, and a concept from the book came to me as I reflected on this conversation. Brené Brown defines curiosity as recognizing a gap in your knowledge and becoming emotionally invested in closing it. She calls it a vulnerable act, a choice to embrace uncertainty over the safety of knowing.
The retail media industry has been operating in a comfort zone where familiar metrics like ROAS and iROAS are treated as a settled number rather than a question to investigate. A lot of brands receive a report, they see a positive figure and they move on. Retailers deliver results without always explaining the methodology underneath. Both sides are protecting themselves from the discomfort of admitting the numbers might not mean what they think, or that they might not even really understand what they mean.
Liz Roche described what genuine curiosity looks like in practice: getting data scientists from both sides of the table to sit down and, as she put it, seemingly unironically, duke it out for about six hours until they reach agreement on methodology. Now that requires vulnerability from the retailer, whose methods are being scrutinized, and from the brand, which might learn that their first conclusions were misplaced.
Brené Brown also describes curiosity as a tool that combats perfectionism and the need for self-protection. In retail media, the self-protective instinct is strong. Retailers don't want to report lower numbers, brands don't want to discover wasted spend, and agencies don't want to question the metrics by which their performance is judged. I wrote about this dynamic in detail in a recent piece on why ROAS refuses to die. It is a collective action problem where everyone behaves rationally and the system stays broken.
The antidote that this new research suggests is the willingness to sit in the gap for a bit. Not to abandon iROAS, but to ask what's behind it.
Podcast: Retail Media Breakfast Club
Host: Kiri Masters
Air Date: April 7, 2026
Episode Length: 10 minutes
In this concise but illuminating episode, host Kiri Masters reads and expands upon her recent article published in The Drum, delving into groundbreaking research on how methodological choices drastically influence Incremental Return on Ad Spend (iROAS). Drawing on a new white paper co-authored by Albertsons Media Collective, Ovative Group, and professors from Northwestern University’s Kellogg School of Management, Kiri explores why transparency—not simple standardization—is urgently needed in retail media measurement. The episode demystifies iROAS by showing how, in 83% of cases, the result can flip from positive to negative depending solely on the calculation method—spotlighting the danger of taking these metrics at face value.
The study varied four methodological choices:
Filtering test and control groups before matching
Matching approach: Clustering vs. propensity score matching (PSM)
Data features for matching: E.g., including past brand sales
Incremental revenue calculation: Observed sales vs. Bayesian time series model
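To make the four choices above concrete, here is a minimal Python sketch of how one campaign can yield different iROAS values depending only on the measurement setup. It is an illustration under stated assumptions, not the paper's method: the function names, the `filter_a`/`filter_b` and `method_a`/`method_b` labels, and all dollar inputs are hypothetical, and the inputs were picked only so the outputs land on the $1.23 and -$0.14 figures quoted in the episode. The episode does not list every option per choice, so this grid is not expected to total 54.

```python
from itertools import product

# The four choice categories come from the episode. The option labels are
# partly hypothetical: the episode names only some options, so this grid
# is illustrative and is not expected to total the paper's 54 variants.
CHOICES = {
    "pre_match_filter": ["filter_a", "filter_b"],  # hypothetical labels
    "matching": ["clustering", "propensity_score_matching"],
    "match_features": ["with_past_brand_sales", "without_past_brand_sales"],
    "revenue_method": ["observed_sales", "bayesian_time_series"],
}


def configurations(choices):
    """Return every combination of one option per methodological choice."""
    keys = list(choices)
    return [dict(zip(keys, combo)) for combo in product(*choices.values())]


def iroas(test_revenue, control_revenue, ad_spend):
    """Incremental revenue (test minus control) per dollar of ad spend."""
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    return (test_revenue - control_revenue) / ad_spend


def sign_flips(values):
    """True when the estimates include both a negative and a positive value."""
    return min(values) < 0 < max(values)


# Hypothetical inputs, chosen only so the outputs match the $1.23 and -$0.14
# figures quoted in the episode. The campaign (spend, exposed-group revenue)
# is identical; only the control baseline implied by the method changes.
AD_SPEND = 10_000
TEST_REVENUE = 50_000
CONTROL_BASELINES = {"method_a": 37_700, "method_b": 51_400}

estimates = {
    name: round(iroas(TEST_REVENUE, baseline, AD_SPEND), 2)
    for name, baseline in CONTROL_BASELINES.items()
}

print(len(configurations(CHOICES)), "configurations in this illustrative grid")
print(estimates)
print("sign flip:", sign_flips(list(estimates.values())))
```

The sketch assumes the control revenue has already been scaled to the size of the exposed group. The spend and the exposed-group revenue never change here; only the control baseline does, which is the sense in which the result can flip sign on methodology alone.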
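Key findings:
Across 42 campaigns, the four choices combined to produce 54 different iROAS values per campaign, varying by an average of 6.5x and a median of 2.5x
Propensity score matching produced roughly 12 times better match quality than clustering, but tended to produce lower iROAS estimates
Including or excluding historical brand sales as a matching feature could swing iROAS from $1.23 to negative $0.14
The two approaches to calculating incremental revenue diverged by an average of 90%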
Quote:
“Customers are doing nothing different, media is doing nothing different, just putting everything together differently. You end up with wildly different results.”
— Derek Nelson, Senior Director, Ovative Group [03:32]
Liz Roche describes “what genuine curiosity looks like in practice,” as relayed by Kiri:
“Getting data scientists from both sides of the table to sit down and, as she put it, seemingly unironically, duke it out for about six hours until they reach agreement on methodology.” [08:54]
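This process requires:
Vulnerability from the retailer, whose methods are being scrutinized
Vulnerability from the brand, which might learn that its first conclusions were misplaced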
Kiri likens this to a “collective action problem” where no one wants to challenge the numbers, so the flawed status quo persists ([09:30]).
Derek Nelson:
“Customers are doing nothing different, media is doing nothing different, just putting everything together differently. You end up with wildly different results.” [03:32]
Liz Roche:
“Not every one of those brands has a powerhouse data science team who can really make sense of all of this... that's another reason why partnering... helps to make it conceptually more accessible.” [05:37]
Kiri Masters:
“The retail media industry has been operating in a comfort zone where familiar metrics like ROAS and iROAS are treated as a settled number rather than a question to investigate.” [08:08]
Liz Roche:
“Getting data scientists from both sides of the table to sit down and... duke it out for about six hours until they reach agreement on methodology.” [08:54]
Relevance:
For retail media practitioners, this episode is a must-listen wake-up call: question, dig, compare, and always ask “what’s behind the number?” Transparency is the only safe path forward.