
Welcome to Nerd Alert, a series of special episodes bridging the gap between marketing academia and practitioners. We're breaking down highly involved, complex research into plain language and takeaways any marketer can use. In this episode, Elena...
A
Nerd Alert. Learning is important, right?
B
Yes, exactly. What a bunch of nerds.
A
Nerd alert.
B
Right? Marketing Architects. Hello and welcome to the Marketing Architects, a research-first podcast dedicated to answering your toughest marketing questions. I'm Elena Jasper. I run the marketing team here at Marketing Architects. And I'm joined by my co-host, Rob DeMars, the chief product architect of Misfits and Machines.
A
Hello, Elena.
B
Hello. We're back with your weekly Nerd Alert. Every week I'm going to take a deep dive into academic marketing research and translate its complex ideas into simple, understandable language for Rob, and of course, for all of you. Are you ready to nerd out, Rob?
A
I just had an argument with Siri that ended in me apologizing to her. So I'm in a really vulnerable nerd state right now.
B
I would love to hear what that was. Okay, let's get into it. But before I jump into the study, which, by the way, I'm very excited about this one today, I have a quick thought experiment for you. Imagine you hand your phone to someone else, random person, and say, buy me a good smartwatch. How much do you trust that they're going to pick what you actually would have picked for yourself?
A
It's really going to depend upon the person. Because if I did that with my wife, she's going to know I love Apple, but she's also going to know I made this weird change to Garmin. And so she's probably going to really navigate that for me. But if it's a brand new person, they're probably going to look at me, they're going to see my iPhone, and they're going to go, "I bet he's going to like an Apple Watch." So I think it depends upon the human you give it to.
B
What if they knew you pretty well? What if it was me? I know. Kind of running. You do?
A
Yeah, yeah, yeah. I think you do a good job.
B
Okay.
A
Oh, yeah, absolutely.
B
All right. That sort of scenario is what this week's research is about. Except instead of a person, it's an AI doing the shopping for you. And the results, I think, are a little bit surprising. So let's talk about it. This week I read "What Is Your Agent Buying? Evaluation Biases, Model Dependence, and Emerging Implications for Agentic E-Commerce." This was published in December of last year by Amin Allua of MyCustomAI, Omar Besbes and Yash Konera of Columbia University's Graduate School of Business, Jose Figueira of MyCustomAI, and Akshay Kumar of Yale University. Lots of smart people did this, with names that are hard to pronounce, so I apologize. Here's the setup. AI shopping agents are here now. Tools like OpenAI's Operator and Google's Project Mariner can browse a website, pick a product, and buy it, all without a human clicking a single button. They might not be adopted broadly yet, but we can see what's coming. And Walmart's CTO has already warned that these agents will start bypassing traditional merchandising entirely. So this is going to become an increasingly big topic for everybody, especially marketers and brands. So, Rob, we just talked about having a different person, maybe one who knows you fairly well, pick a product for you. When you imagine an AI picking a product for you, do you assume it's going to make a smarter, more rational choice? Actually, I want to ask this in two different ways. First, smarter than you would, and then smarter than a person who knows you well would. I'm curious about both.
A
Yeah, I think AI is going to crush it. You know, normally I would go, golly, it's this robot and it's only going to look at rational things in terms of features and benefits and price points. But if it's a large language model that I use a lot and it really understands my context, I think it's actually going to recognize the emotive side of me as well and pair me with products that are a balance of both the rational and kind of what my gut would want.
B
So you think AI is going to have that balance?
A
I do. I think it's easy to default to, "Well, AI is just going to be a very logical shopper." And I think that's true if it doesn't know you. But the more you use these models, the more they understand the context of your whole world. And I think they're going to be able to play to both the heart and the head.
B
Yeah. Oh my goodness, you are spot on. That's so funny. You really got it. Because I think my brain would have gone towards the rational side automatically, that AI sort of removes some of that emotional bias. But that's not what this paper found. So let's talk about it. The researchers built something called ACES. This is short for Agentic E-Commerce Simulator. It's a fake but realistic e-commerce platform they could fully control. Then they ran thousands of experiments where they had AI agents browse the mock store and pick products. They used Claude, ChatGPT, and Gemini. The twist was that the researchers randomized everything. So product positions, prices, ratings, promotional tags, all of that was shuffled between experiments. They did that so they could isolate exactly what was driving each purchase decision. They tested six models in total: Claude Sonnet 4, GPT-4.1, and Gemini 2.5 Flash as of August 2025. Then they tested their successors. That'd be Claude Opus 4.5, GPT-5.1, and Gemini 3.0 Pro Preview.
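The randomization described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the researchers' simulator: the function name `randomize_listing`, the field names, the price and rating ranges, and the 25% sponsored rate are all assumptions made up for this example.

```python
import random

def randomize_listing(products, rng):
    """Return a shuffled copy of the product grid with randomized attributes.

    All ranges and probabilities here are hypothetical placeholders.
    """
    shuffled = [dict(p) for p in products]  # copy so the input is untouched
    rng.shuffle(shuffled)                   # random position on the page
    for position, product in enumerate(shuffled):
        product["position"] = position
        # Perturb price and rating independently of the product itself.
        product["price"] = round(product["base_price"] * rng.uniform(0.8, 1.2), 2)
        product["rating"] = round(rng.uniform(3.5, 5.0), 1)
        product["sponsored"] = rng.random() < 0.25
    # Exactly one product gets the endorsement badge, chosen at random.
    badge_index = rng.randrange(len(shuffled))
    for index, product in enumerate(shuffled):
        product["overall_pick"] = index == badge_index
    return shuffled

# Eight hypothetical products with the same base price.
products = [{"name": f"Product {c}", "base_price": 20.0} for c in "ABCDEFGH"]
rng = random.Random(0)  # fixed seed so a trial can be reproduced
trial = randomize_listing(products, rng)
```

Because every attribute is assigned independently of the product, any consistent preference an agent shows for, say, a position or a badge across many such trials can be attributed to that attribute rather than to product quality.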
A
Nice job. That's a lot of models.
B
They tested those later models in December of 2025, and they looked at eight product categories total. Some of those included fitness watches, hence my question to you in the beginning, plus staplers, toothpaste, washing machines, and more. And let's talk about their first finding. It's kind of a biggie. AI agents piled on to a small number of winner products and almost completely ignored the rest. So in the stapler category, Amazon Basics captured, if you can believe this, 80 to 94% of selections depending on the model. Arrow, a competitor of Amazon Basics with similar specs, was chosen essentially 0% of the time across all models. So that's not just a low number, that's zero. So, Rob, what do you think this means for a brand like Arrow? For brands that aren't category leaders right now?
A
It's going to be tough. I mean, we all know, I guess we shouldn't say we all know how these LLMs work, but we know they're eating the planet and so they're going to eat what they find. And if you're not found, then they're not going to choose you. These are search everywhere engines. They're going into everything everywhere, whether it's news or Reddit conversations. So you need to show up everywhere now if you want the LLMs to reward you and hopefully choose you.
B
Yeah, agreed. There's definitely this sort of concentration risk. And if you're not a brand that's benefiting from that, and like you said, showing up everywhere, you're going to have a problem. So the researchers think that if AI agents became the primary buyers, entire brands could be erased from the market. Not because their products are bad, but because they are not optimized to be found by an AI agent. Scary stuff. Okay, finding number two. Which AI model was doing the shopping mattered a lot. So different models don't just prefer different products a little. They prefer completely different products. So in the fitness watch category, Claude Sonnet 4 picked Fitbit Inspire 45% of the time. GPT-4.1 picked it only 25% of the time. After the model upgrade, Claude Opus 4.5 jumped to 77% while GPT-5.1 dropped to 6%. So that means a model update can function like an external demand shock for sellers. That concerned me quite a bit. I think as a marketer it's hard because you're trying to optimize for different models and you're trying to learn what you need to do to rank in these models. I guess this has always been a problem with Google search as well. To be fair, there's an update and people are like, what in the world happened? It's not super transparent. All right. Finding number three: product position on the page has a massive effect. But different models responded to position in different ways. So Claude Sonnet 4 strongly preferred the second and third columns. GPT-4.1 strongly preferred the first column. Gemini 2.5 Flash preferred the third and fourth columns. And when GPT upgraded from 4.1 to 5.1, the preferred position and the least preferred position basically swapped places.
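To put the "demand shock" in numbers, here is a small Python sketch using only the Fitbit Inspire selection shares quoted above. The helper name `share_shift` and the grouping by model family are assumptions for illustration; the paper may report these shifts differently.

```python
# Fitbit Inspire selection shares quoted in the episode, in percent.
shares = {
    "Claude": {"before": 45, "after": 77},  # Sonnet 4 -> Opus 4.5
    "GPT": {"before": 25, "after": 6},      # GPT-4.1 -> GPT-5.1
}

def share_shift(before, after):
    """Return (percentage-point change, relative change as a fraction)."""
    return after - before, (after - before) / before

shifts = {
    family: share_shift(s["before"], s["after"])
    for family, s in shares.items()
}

for family, (points, relative) in shifts.items():
    print(f"{family}: {points:+d} points ({relative:+.0%})")
```

The same product gains 32 points under one model family's upgrade and loses 19 under the other's, with no change to the product itself.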
A
So you know what the answer is? Be in all the columns.
B
Great, great advice. I mean, I don't know what the advice is there, but that is just uff da. The researchers also tested whether you could prompt the bias away. So they told the agents explicitly: ignore product position, do not be influenced by where these items appear on the page. And it barely moved the needle for Claude and ChatGPT. So simple instructions weren't fixing that bias.
All six of the models consistently penalized sponsored tags, meaning running ads online reduced the chances of selection. So you know, on Amazon you're scrolling through and you see all these brands that are sponsored. That did not help them out. All six of the models also gave a massive lift to "Overall Pick" platform endorsements. So a product with a 10% baseline selection probability jumped to 24% under Claude Sonnet 4 and all the way to 43% under Gemini 2.5 Flash when it carried an Overall Pick badge. And those badges were randomized, so it wasn't just because a better product was being endorsed. Just having that badge, being the overall pick, is super, super important.
One other finding I wanted to mention is that sellers can game this in a way. So there's good news. There's something you can actually do to have an impact. They had an AI seller agent rewrite the product descriptions to appeal to AI buyers. So in the office lamp category, the focal product, from a brand called Sunmori, had a title that started with "Floor lamps for living room." The word "office" appeared only at the very end, and then it got cut off. So the seller agent rewrote that title to start with "Office floor lamp." That one change drove their market share from 0 to 41% with Claude and from 9.5% to 89.9% with GPT-5.1. That's something you could do.
So, some practical takeaways here. First, you should audit how your products are perceived by AI agents, not just human shoppers, because the two audiences are going to respond to very different signals. Second, front-load your most important keywords in product titles. AI agents process text sequentially, so the first few words carry disproportionate weight. Third, chase platform endorsements aggressively. Getting a badge like Overall Pick had the biggest consistent lift across every model tested. And fourth, treat model updates like demand shocks. Monitor your AI market share the same way you track traditional sales performance, because an update could totally change all of this.
All right, quick, Rob. Imagine you run a restaurant and every week a different food critic comes in to review it. Except each critic has completely different tastes, a completely different seating preference, and refuses to read the menu the way you expect them to. One critic only orders what's in the front window. Another ignores the specials board entirely. A third is offended by ads. And every time the critic changes, your busiest table changes too, even though the food is identical. That's what AI-mediated commerce looks like right now. You're not just cooking for customers anymore, you're cooking for the critic. And you'd better know which one showed up today.
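The title front-loading takeaway above lends itself to a simple audit. The following Python sketch checks whether a target keyword falls within the first few words of a product title. The function names, the three-word threshold, and the full example titles are hypothetical; the episode only gives how each title begins.

```python
def keyword_position(title, keyword):
    """Return the 0-based word index of the keyword in the title, or None."""
    target = keyword.lower()
    for index, word in enumerate(title.lower().split()):
        if word.strip(",.;:") == target:  # ignore trailing punctuation
            return index
    return None

def is_front_loaded(title, keyword, max_words=3):
    """True if the keyword appears within the first `max_words` words.

    The threshold of 3 is an arbitrary assumption, not a figure from the study.
    """
    position = keyword_position(title, keyword)
    return position is not None and position < max_words

# Hypothetical titles modeled on the lamp example from the episode.
before_title = "Floor lamps for living room, bedroom, and office"
after_title = "Office floor lamp for living room"

print(is_front_loaded(before_title, "office"))
print(is_front_loaded(after_title, "office"))
```

A check like this could run across a whole catalog to flag titles where the category keyword is buried at the end, where it is most likely to be truncated.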
A
It's such an important conversation for us all to be having, and it's really about not thinking of AI the way we thought of the web. The web was sort of monolithic and backlink-driven, and it played by a certain set of rules that you could easily understand for almost any product. Whereas these LLMs are target audiences, and you have to start thinking of them in that same framework. Right? Like, what is going to appeal to the agents? And it's going to be a funny thing. And we're already doing it. You're starting to have creative briefs that are geared that way. Who's the target audience? Claude. Wow. Okay. That's a whole new reality.
B
Yeah. I think it's so difficult, though. It's a struggle for marketers because there's all this different advice and it seems like it keeps changing. Like now I'm thinking, okay, marketers should go rewrite all their product headlines using AI. But there's advice like that every week. You gotta be doing this, you gotta be doing that, you gotta be creating your pages this way. They love this, they love that. And it's just a really difficult time. And I have to think that brand matters a lot. Like the brands that are already thought of first, that have a good reputation. It's just going to be really hard to break into some of these categories. It's almost like, I don't know, it's difficult.
A
Well, and today's answers are not going to be the same as the answers a year from now, either. So this is just the new world we marketers live in.
B
Yep, that's it for this episode of the Marketing Architects. We'd like to thank Taylor De Los Reyes for producing the show. You can connect with us on LinkedIn. And if you like the podcast, please leave us a review. Now go forth and build great marketing. Marketing Architects.
Episode: Nerd Alert: What Is Your AI Agent Buying?
Date: May 7, 2026
Hosts: Elena Jasper (Head of Marketing), Rob DeMars (Chief Product Architect, Misfits and Machines)
Theme: Exploring the impact and implications of AI shopping agents on marketing strategies, category leadership, and brand-building, based on new academic research.
This episode dives into groundbreaking research on AI shopping agents: digital tools, like OpenAI’s Operator and Google’s Project Mariner, that make purchasing decisions for users autonomously. Elena and Rob break down how these agents are reshaping e-commerce, from their selection biases to practical steps brands must take to remain visible in the age of AI-driven shopping. The discussion is centered on the study, "What Is Your Agent Buying? Evaluation Biases, Model Dependence, and Emerging Implications for Agentic E-Commerce" (Allua, Besbes, Konera, Figueira, Kumar, Dec. 2025).
AI’s Subjectivity:
Elena: “When you imagine an AI picking a product for you, do you assume it's going to make a smarter, more rational choice than you would?” Rob: “AI is going to crush it… it’s actually going to recognize the emotive side of me as well...”
— Elena and Rob, [03:04–03:39]
Concentration Risk:
“If AI agents became the primary buyers, entire brands could be erased from the market. Not because their products are bad, but because they are not optimized to be found by an AI agent.”
— Elena, [06:31]
Model Update as Demand Shock:
“A model update can function like an external demand shock for sellers.”
— Elena, [07:21]
Practical Analogy:
“You’re not just cooking for customers anymore, you’re cooking for the critic. And you’d better know which one showed up today.”
— Elena, [10:25]
Adapting Marketing for AI:
“We're already doing it. You're starting to have creative briefs that are geared that way. Who's the target audience? Claude. Wow. Okay. That's a whole new reality.”
— Rob, [11:06]
This episode offers a compelling, research-driven look at the coming paradigm shift in e-commerce and branding as AI agents take the wheel—illuminating risks, practical tactics, and the urgent need for brands to start optimizing for algorithms, not just humans.