
Alan Chappelle
This podcast is brought to you by
Tony Katzer
Audiohook, the leading independent audio DSP. Audiohook has direct publisher integrations into all major podcast and streaming radio platforms, providing 40% more inventory than what could be accessed in omnichannel DSPs. What's more, Audiohook has full transcripts on more than 90% of all podcast inventory, enabling advanced contextual targeting and brand suitability. Audiohook is so confident that in addition to CPM buys, they offer the industry's only pay-for-performance option, where brands can scale audio and podcasting with peace of mind, knowing they are only paying for outcomes. Visit audiohook.com to learn more.
Alan Chappelle
That's audiohook.com
Tony Katzer
This episode is brought to you by Indeed. Stop waiting around for the perfect candidate. Instead, use Indeed Sponsored Jobs to find the right people with the right skills fast. It's a simple way to make sure your listing is the first candidates see. According to Indeed data, sponsored jobs have four times more applicants than non-sponsored jobs. So go build your dream team today with Indeed. Get a $75 sponsored job credit at Indeed.com/podcast. Terms and conditions apply.
Alan Chappelle
Welcome to the Monopoly Report. The Monopoly Report is dedicated to chronicling and analyzing the impact of privacy, antitrust and other regulations on the advertising economy. I'm Alan Chappelle. I'm a privacy and regulatory attorney and have worked with hundreds of digital media and ad tech companies over the years and have taken a bunch of those companies to successful exits. I also publish a monthly regulatory outlook for digital media worldwide called the Chappelle Regulatory Insider. You can find a link to a sample copy of the Chappelle Regulatory Insider in the show notes. I'm recording this episode just a few days after my fireside chat with FTC Commissioner Mark Meador as part of Marketecture Live. It was a fantastic show, certainly the content, the food. And by the way, where did you ever hear someone rave about the food at a conference? But it was also great catching up with a bunch of old friends and making some new ones. To everyone who came up to me with a kind word about this podcast or a suggestion for future guests, I just want to say thank you. Also, next week I'll be doing the morning keynote at the Prebid Summit in London. That's on Wednesday, March 25th. And I just got word that I'm going to be doing a fireside chat at the end of the Prebid Summit with Mike Racik and Garrett McGrath. That's going to be a fantastic event. I hear it's sold out, so if you're in London and looking to meet up to talk about the regulatory environment or March Madness or whatever, please hit me up. Speaking of March Madness, we've got bracket predictions, first-round upsets, and the annual reminder that you need confidence and preparation, and also just a bit of luck. One of the best things about March Madness is that everyone has a prediction for how things will play out, but there are always a bunch of surprises. I think the same goes for the digital media world when it comes to agentic AI.
So I've been pretty vocal about the fact that the industry's current rush towards agentic AI is moving faster than it's thinking. There's a lot of excitement, and I get it. The technology is genuinely compelling, but the privacy infrastructure to support the technology is currently an afterthought. I am not alone in that view. Rowena Lam from the IAB Tech Lab made a similar argument recently. Before we get too far down the path of agentic, we need to start creating frameworks to ensure that we're not repeating all of the sins of the past 15 years. Today's guest is someone I've known for a long time, going all the way back to our days at DoubleClick. Tony Katzer is the CEO of the IAB Tech Lab. And if you work anywhere in the digital advertising world, the Tech Lab's fingerprints are on the standards and specs that make the ad business run. Tony and I get into agentic AI and privacy, vector embeddings and what they actually mean for data governance, the agent registry that Tech Lab just launched, and the content monetization protocol, which is Tech Lab's answer to the question of how publishers will get fairly compensated when AI companies trade on their content. There is a lot of ground to cover here, so let's get to it. Hey, Tony, thanks for coming on my pod. How are you, man?
Tony Katzer
I'm good. How are you, Alan? Thanks for having me.
Alan Chappelle
Well, it's great to connect again. This is my favorite time of the year, by the way. I'm just going to do a little plug for March Madness.
Tony Katzer
Oh yeah, March Madness, the beginning of the baseball season. Yeah, definitely one of my favorites. And, you know, fly fishing season's commencing, so definitely my favorite time of the year.
Alan Chappelle
Yeah, one of my favorite things is really, really following closely the, the, the mechanics of the fly fishing competition. I don't know where I'm going with that.
Tony Katzer
You can follow me. I don't know if my mechanics are great, but nothing beats getting out on a river and standing. I'll stand in a river with a stick all day.
Alan Chappelle
Ah, fantastic. So let's jump in. I've been pretty critical of the current approach to agentic, in that there's, you know, sort of this assumption that agentic is going to, like, magically fix a whole bunch of stuff, including the endemic privacy issues. And to date there hasn't really been a lot of thought on how to get there. I think Rowena recently wrote something basically saying the same thing a week or so ago. And even the PubMatic demo that I saw at Marketecture Live sort of works under the assumption that there's a pre-selected set of data and other vendors. But if we're ultimately hoping that agentic does anything more than eliminate the need for media buyers, we're going to need a much more effective and, you know, real-time vetting of partners. And I'm curious, do you agree? And if so, I would love to hear your high-level approach on how we get there.
Tony Katzer
Yeah, absolutely. As exciting as agentic workflows are, and I don't deny that they unlock interesting opportunities for the ecosystem in the form of greater workflow automation and activating legacy forms of media with agentic workflows, there are tremendous ramifications of that. But what seems to be lost out of the entire conversation until recently, until Rowena Lam wrote her article, is privacy. And there seems to be this sense that, well, the principals will deal with privacy and then the agents can transact on that. Well, I can tell you, based on conversations with various regulatory bodies that we've fielded in the past several months, that's not how they're thinking about it. They have explicit questions around how agents are going to deal with consumer privacy. And our approach, in the agentic frameworks that we've developed as part of the Agentic Ad Management Protocols initiative, or AAMP for short, is that we've built Global Privacy Protocol directly into it. The TCF has been built directly into the agentic frameworks, as is GPC. So that's step one. Some of the other concerns we have, and we can get into this: when people have asked the question of, like, well, why did the agent do that? I mean, everyone from Anthropic to OpenAI and others has been like, I don't know. That's not going to fly from a privacy perspective. So what happens when a consumer issues a data deletion request? Like, how do you delete the consumer's information out of the agent that may be holding it? How do you remove that from a vector embedding? I know we're going to get into vector embeddings in a moment. So these are questions we need to address now, because if we don't address them now, and we build on top of a foundation that does not put privacy first, we're going to find ourselves in a lot of hot water again.
Alan Chappelle
So, just a quick aside. I accidentally made a request via one of those authorized agents. It was like a free one, and it sent out like 700 requests. The responses I've gotten back from the community, and ad tech is only a part of that, have been horrifying. The industry is almost treating these authorized agent requests like a DSAR being run by the Department of Motor Vehicles. I mean, they're just not making it easy. And so you've kind of hit on a huge problem that I think is only going to get more challenging as we move into agentic.
Tony Katzer
100%. I mean, privacy in the existing programmatic ecosystem, we've certainly gotten much better at it. And I'm not being a homer for the industry. Privacy has been at the forefront of almost every conversation now for the past, I'd say, 18 months, almost to the point where it's so much a part of the conversation that I feel like the industry has forgotten about it to some extent. But it's woven into almost every conversation we have around a technical specification or new standard. And again, I'm just not hearing that in agentic, again, until recently. Rowena Lam wrote the article last week on, like, we need to put privacy first as we stand up an agentic workflow.
Alan Chappelle
Yeah, no, I completely agree with Rowena on that. And I've written a bit and taken the position that we need a series of like inputs to emerge or to be refined, which an agent can like, evaluate options, including privacy. And I'm curious, do you see ultimately that as being part of the solution?
Tony Katzer
Yeah, absolutely. You know, the joke we have at Tech Lab, and we actually have T-shirts now, happy to send you one, is "there's a spec for that." And we already have a robust privacy framework. Again, it started with the TCF in partnership with IAB Europe, then the Global Privacy Protocol, which is intended for US and worldwide consent signaling. We have the Data Deletion Request Framework, which effectively propagates a data deletion request from a consumer throughout a rather complex digital advertising ecosystem. We have the accountability platform to audit privacy compliance and make sure that consent strings are being inspected and honored, that they're being passed through the actual supply chain, through the myriad partners that exist. Specifications already exist that can be incorporated into agentic workflows and into agentic systems. So we have most of the foundational work here, actually all of the foundational work here, to comply with privacy regulations globally if those specifications are used.
Alan Chappelle
Yeah, but I think that a lot of those specs need to go a little bit further. Just a quick example: at some point regulators, in the US anyway, are going to really start enforcing purposes. You know, just the idea of what did the privacy policy ultimately say, and then what are you doing in comparison to what that privacy policy said? At this point, a lot of the signaling is a bit binary, no?
Tony Katzer
Well, that's where the privacy taxonomy comes into play. Is the privacy taxonomy the end-all, be-all, ultimate solution? No, but it's a good starting point that we can evolve from. The privacy taxonomy, which was donated to us by a company called Ethyca (thank you, Ethyca, if you're listening), at its core gives the industry a shared way to describe three things around data: the data element that is involved, the use or purpose of that data, and the data subject. Data element is, what kind of data is this? Data use is, what does the system want to do with it? What's the purpose of having that data? And then the data subject: who does that data refer to? And then we kind of have a sensitivity score in terms of how risky or restricted that data is. So if you start with that taxonomy, which exists today, and you build it into the existing programmatic workflows and agentic workflows, it does give a foundation to then build on top of, to your point.
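To make the three dimensions and the sensitivity score described above a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `DataDeclaration` class, the label strings, the 1-to-5 sensitivity scale, and the `is_permitted` check are hypothetical and are not taken from the actual Tech Lab privacy taxonomy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataDeclaration:
    """One hypothetical taxonomy entry describing a piece of data."""
    data_element: str   # what kind of data this is
    data_use: str       # what the system wants to do with it
    data_subject: str   # who the data refers to
    sensitivity: int    # assumed scale: 1 (low risk) to 5 (highly restricted)


def is_permitted(decl, allowed_uses, max_sensitivity):
    """Return True if the declared use is allowed and the data is not too sensitive."""
    return decl.data_use in allowed_uses and decl.sensitivity <= max_sensitivity


# Hypothetical labels, invented for this example only.
email_for_ads = DataDeclaration(
    data_element="contact.email",
    data_use="advertising.personalization",
    data_subject="customer",
    sensitivity=3,
)

# A policy that only allows measurement rejects the personalization use.
print(is_permitted(email_for_ads, {"advertising.measurement"}, max_sensitivity=4))
```

A declaration like this is still self-reported, so it describes what a company says it does with the data, not what it has been verified to do.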
Alan Chappelle
No, fair enough. And I guess my question to you is how do we go from, you know, I don't want to say it's aspirational now because there is a real thing, but how do we get to the point where like, you know, we now have a mechanism that prevents a DSP from self declaring a favorable taxonomy classification that doesn't reflect their actual practices?
Tony Katzer
That's a really good question. I don't know if I have a well-thought-out answer to it. I think it's going to have to come through audit. I think it's the only way to do that. So I think there's a combination here of attestation, verification, and technical audit through things like the accountability platform. I think it's going to have to be a blend of technology tools and specifications, and then manual audit of what you're declaring you're doing, to confirm you're doing what you're saying you're doing.
Alan Chappelle
And I don't want to entirely take our cues from the good folks at the European Union, but, you know, just over the last month we've had two very similar decisions, one coming out of France and one coming out of Germany, where they're basically saying, look, you can't even necessarily trust that a CMP is getting a valid consent. And we can have a debate about it. I think Europe has a bit of a consent fetish, and fair enough, but the reality is that the regulatory bar over there has moved to the point where it's not just did you get a signal, it's what did you do on the back end to verify that the signals that you're receiving are, you know, somewhere in the ballpark of accuracy.
Tony Katzer
Yeah, well, I mean, the consent string in the TCF is signed. So that's certainly one layer of security or certification of the actual signal itself. I'm not saying that's the complete answer. The accountability platform does have audit capabilities against the signal itself as well. Like, did the signal change? Did the signal disappear in the supply chain? So, again, I think that gives the foundation for expanding on a wider framework. But as it relates to the EU, you know, we really just need to speak to them to see what is a solution to further lock down the validity of a consent signal. I think that's very much an open question and something we're talking to IAB Europe about. They really are the face of everything we do as an industry in the European markets. But I think TBD is the answer.
Alan Chappelle
Yeah, no, I think that's right. And I don't want to go too far down the rat hole of the digital omnibus, but there's a lot of stuff that's changing within Europe, and so query, you know, where they're going to be in a year or two.
Tony Katzer
I don't know. I still, I still think that is very much, that is very much a jump ball, that ball is very much in play. Because it's interesting, there's some of the same detractors of the omnibus legislation are, you know, also some would say are counter to the digital advertising ecosystem's approach to privacy. And now many of them are on, you know, many of us are all on the same side saying, I don't know if this is a good thing. For different reasons.
Alan Chappelle
Right.
Tony Katzer
Coming at it from different vectors. You know, the privacy advocates and even the digital advertising ecosystem are aligned around, like, hey, maybe this omnibus legislation isn't a great idea, but for very different reasons.
Alan Chappelle
Yeah, well, because, you know, what it is, is like you'd almost rather have a broken system that you kind of understand than one where they completely upend it and then, you know, everything is up, up again for contention.
Tony Katzer
And I think that's, I think that's the issue. It's, you know, everything's kind of settled out. And I would agree, like, is it perfect? No. But, you know, why don't we, why don't we build on what we have versus, you know, it's the Monty Python and the Holy Grail joke. You know, built a castle that sank into the swamp. Built a castle on top of that. That's the third castle and that stood right up. I think we're kind of in that castle swamp moment. I don't know if we should reinvent the wheel at this point because there is so much compliance to the existing regulations. And I think to upend it, I, I think everything becomes a jump ball again.
Alan Chappelle
Completely agree. So I want to shift gears a little bit, and I would love it if you'd walk my audience through the embeddings concept. You know, how does UCP, I guess it's now Agentic Audiences, use embeddings, these vector representations, to exchange audience signals without exposing raw, identifiable data?
Tony Katzer
So, you know, I know you have a very blended audience. You have hyper-technical folks and also non-technical folks. So I'm going to try and split the difference here and explain it from a hundred-thousand-foot level. The way to think about a vector embedding: it's a way for systems, machines, agents to turn things like words, sentences, images, and data into numbers that capture meaning. You apply math to those objects, again, words, sentences, pictures, data, et cetera. The most common form of calculating a vector embedding is what's known as cosine similarity. So you apply a cosine similarity function to a set of data, in our industry's case. So instead of actually storing the data itself, the math basically places each data element into what you would call a map of meaning, where things with similar ideas or similar inputs end up closer together. For example, if you were to take car or truck as some form of data, those would be placed nearer to each other than, like, car versus octopus, right? It helps computers go beyond exact keyword matching so they recognize relationships, the context of the data inputs, and the similarity of those. I think a way for the audience to think about this is it would be like organizing a library without relying only on the titles, right? Two books could use different words but still be about the same or similar topics. The embeddings help systems pick up on that. Or another example, and I've used this example now several times on stage, would be if you wanted to take a chair versus a stool and apply a cosine similarity computation against that. It's going to find that there are a lot of similarities between a chair and a stool, and it's going to calculate, effectively, the distance between those. It'll break the chair down into, like, does the stool have four legs? Does a chair have four legs? So once it does that analysis, it'll determine the closeness of that.
I mean, vectors are about determining points in space, right? So it's going to do the same thing, but against data. But it's also not just about the distance calculation. You know, it creates a representation of those objects, or in this case data signals, so that those relationships are preserved in that vector. Then the distance becomes the tool to compare them. So I think the simplest way to describe it is that embeddings are a way of translating very different kinds of inputs into a common mathematical form where closeness usually means similarity. So going back to the chair versus stool example, you would probably get pretty close. You know, you wouldn't have tremendous distance. And at that point you could look at the calculation of the distance between those two objects, or that set of audience data, and be like, oh, that's pretty close. So, you know, I think I might have a match here, and that may be something I want to trade against. Or, that's really far apart. The distance is so far apart that I don't think I have meaningful resolution to the other data set. And what's interesting about this is that I don't think vector embeddings are really classified formally as a privacy-enhancing technology. Someone check me on that. I could be wrong, but I don't think they're formally classified as that. But there is a privacy-enhancing approach here where you're not explicitly matching identifiers or even data signal, because that goes into the vector. What's happening is those data signals go into it. Like, let's just say you're putting data in. A buyer is defining a new homeowner, right? And they've got explicit signal to find that new homeowner. That would go into a vector, and you do a cosine similarity calculation on that. And then a publisher or seller would then also be like, oh, you're looking for new homeowners? The buyer agent can be like, yeah, I'm looking for new homeowners.
The seller is gonna be like, okay, well, here's my definition of a new homeowner. Then you'll apply a cosine similarity calculation to that. And then the vector embeddings will determine, well, your signal that went into your definition of a new homeowner versus my signal that went into my definition of a new homeowner, what's the distance between those? And if they're close, let's trade on that vector. And at that point, no data has been explicitly shared. You're just calculating the distance between those two vectors. So you can apply an identity to that vector and be like, okay, I'm going to trade vector. You know, Vector 1 is going to map to Vector 2. But no signal has been explicitly traded between the two. And by the way, this doesn't just need to be an agentic development. We could start leveraging vector embeddings today in existing programmatic ecosystems.
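The buyer and seller comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any Tech Lab specification works: the four-dimensional vectors are made up, the 0.9 match threshold is arbitrary, and in practice the embeddings would come from a model rather than being typed in by hand. Cosine similarity is used here as the step that compares two embeddings once they already exist.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up embeddings standing in for each party's audience definition.
buyer_new_homeowner = [0.9, 0.8, 0.1, 0.0]
seller_new_homeowner = [0.8, 0.9, 0.2, 0.1]
seller_sports_fan = [0.0, 0.1, 0.9, 0.8]

MATCH_THRESHOLD = 0.9  # arbitrary cutoff chosen for this example


def is_match(a, b, threshold=MATCH_THRESHOLD):
    """Treat two audience vectors as a match when they point in nearly the same direction."""
    return cosine_similarity(a, b) >= threshold


print(is_match(buyer_new_homeowner, seller_new_homeowner))  # close definitions
print(is_match(buyer_new_homeowner, seller_sports_fan))     # unrelated definitions
```

Only the vectors are compared here; neither side hands over the underlying signals. Whether the vectors themselves count as personal data is a separate legal question that this sketch does not answer.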
Alan Chappelle
Okay, so that's really helpful. And I get it from a data utility standpoint, because one of the challenges is, like, you know, comparing the BlueKai travel segment to the eXelate travel segment. Not to pick on anybody. They're both long gone. So I get it from that perspective, and I even get it from an academic standpoint, like, yeah, I can see how that's better for privacy. But from a practical compliance standpoint, where, you know, every inference is personal data, and they've managed to make the definition of personal data almost comically broad in the way the laws are being written, what's the advantage to leveraging those? Like, it doesn't pull you out of the rule set?
Tony Katzer
No, no, it doesn't pull you out of the rule set. Look, we've had a program up now for several years called the Data Transparency Standard, and it's effectively a nutrition label for the data that goes into a definition of a specific audience. I think one of the key considerations here is that that signal needs to be explicitly disclosed when you're calculating the vector, because, again, depending on the data that went into the definition of a specific audience, your vector distance can be way off. So I think this puts data provenance really at the forefront, to ensure you can actually properly calculate the vector to get a match. The other challenge around vector embeddings is that there are other computational approaches to this that are not cosine similarity. So getting the industry to agree on, like, okay, what math are we going to use to calculate vectors? I don't know that that's ever going to happen. So there are interoperability challenges here. But in terms of the governance, I think it's got to start with data transparency, data provenance. I don't need to know who, I just need to know what went into that. You don't have to explicitly share it with me, but I need to know what went into your calculation of the vector, because if those are radically off, you're never going to have any sort of closeness between the vectors for it to be useful. So that's something we're working on. And I think that somewhat goes to your point around compliance. Like, what's the data that went into these vectors? And also, I think there has to be a consent flag associated with this as well. Like, you know, is the consent flag passed in? But then if we take it to the next step, Alan, it's again, what if a data deletion request is issued? Well, you, the consumer, have been calculated into this vector. How do I rip you out of that vector? You're in the math now.
Like, you don't exist as an explicit object within that math. So if a data deletion request comes in, I think one of the things that has to happen is companies need to track which consumers went into that vector calculation. If someone issues a data deletion request, do you just blow up the entire vector? It's an open question.
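One way to picture the tracking idea raised above is a ledger of which consumers contributed to a vector, so the vector can be rebuilt without someone after a deletion request. The sketch below is a hypothetical illustration only: the `SegmentVector` class is invented, it assumes the audience vector is a simple average of per-consumer feature vectors, and it assumes the operator is allowed to retain those per-consumer inputs, which is itself a privacy trade-off. It is not a description of any existing framework.

```python
class SegmentVector:
    """Hypothetical audience vector that remembers who contributed to it."""

    def __init__(self):
        # consumer_id -> that consumer's feature vector (retained so we can rebuild)
        self._contributions = {}

    def add(self, consumer_id, features):
        """Record one consumer's contribution."""
        self._contributions[consumer_id] = list(features)

    def delete(self, consumer_id):
        """Honor a deletion request; return True if the consumer was present."""
        return self._contributions.pop(consumer_id, None) is not None

    def vector(self):
        """Recompute the audience vector as the mean of the remaining contributions."""
        rows = list(self._contributions.values())
        if not rows:
            return []
        count = len(rows)
        return [sum(column) / count for column in zip(*rows)]


segment = SegmentVector()
segment.add("consumer-a", [1.0, 0.0])
segment.add("consumer-b", [0.0, 1.0])
segment.add("consumer-c", [1.0, 1.0])

print(segment.vector())        # built from all three consumers
segment.delete("consumer-c")   # deletion request arrives
print(segment.vector())        # rebuilt without consumer-c
```

This only works for vectors that can be recomputed from retained inputs. A vector produced by a trained model, or one already shared with a trading partner, is a harder case, which is the open question raised in the conversation.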
Alan Chappelle
Yeah. And yeah, I'm not sure I have an answer for that, but like I said...
Tony Katzer
No one does.
Alan Chappelle
Yeah. And like I said, we're still trying to deal with basic, you know, you have my email address, I'd like you to delete it type of requests. And so I'm not even sure where we go with that.
Tony Katzer
I mean, again, I don't know if vectors are classified as a PET, but it is similar to some of the challenges that PETs, privacy-enhancing technologies, bring to the ecosystem. I think PETs are a fascinating way to preserve consumer privacy. But say you've added noise to a data set, so you're using something like differential privacy, and that results in a data set that has effectively been scrambled. What happens when a consumer issues a DDR, a data deletion request, to someone who's taken a data set and applied DP to it? Like, do you blow up the output of that differentially private data set? So these are not new questions, they're ongoing questions. And again, Tech Lab has some of the answers in some of our frameworks, not all the answers, but we can augment those existing frameworks to arrive at those answers.
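For readers unfamiliar with the noise-injection idea mentioned above, here is a minimal sketch of one textbook form of differential privacy, the Laplace mechanism applied to a count. The function name, the epsilon value, and the example count are assumptions chosen for illustration; this is not drawn from any Tech Lab framework or any particular vendor's implementation.

```python
import random


def noisy_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Return the count plus Laplace noise with scale = sensitivity / epsilon.

    A Laplace(0, scale) sample is drawn as the difference of two independent
    exponential samples that each have mean `scale`.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(7)  # seeded only so the example is repeatable
released = noisy_count(true_count=100, epsilon=0.5, rng=rng)
print(released)  # a value near 100, but not exactly 100
```

Once a noisy figure like this has been released, there is no clean way to subtract one person's contribution from it, which is the deletion problem described in the conversation. Smaller epsilon means more noise and stronger protection, at the cost of less accurate reporting.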
Alan Chappelle
So one of my challenges with PETs is that I don't know that we've fully accepted that there are some trade-offs with those. Even a noise injection system makes it more difficult for somebody to understand, like, how did you arrive at that calculation where you're saying, I got X from this campaign? It makes it more difficult to explain that. And at times I think it's used to even obfuscate those exact types of issues.
Tony Katzer
Yeah, I would agree. I think, again, there is no silver bullet. I think it's going to require a combination of PETs and other audit mechanisms, as well as commercial agreements, to help govern these things. But it still comes down to how do you support the consumer's right to be forgotten in our ecosystem? The disclosure elements, I think, are absolutely important.
Alan Chappelle
And maybe we can point to, you know, clean rooms as they're going through a lot of these similar challenges around, like, you know, how do you do it? And they also have a similar on the ground compliance challenge because it's like, I don't know, is that a sale? Is that a share? You know, I'm not even sure it's really all that clear.
Tony Katzer
It's not. No. You mean from the legal perspective? Yeah.
Alan Chappelle
Correct.
Tony Katzer
Yeah, it's not clear. I mean, we spend hours and hours deliberating this in the Tech Lab, you know, because when laws get passed or regulations get passed, we turn to our IAB partners around the globe, IAB Europe, and of course IAB here in the US and elsewhere, and we're like, okay, what's the public policy that we derive out of that new law that we now need to embed into a technical standard to support regulatory compliance? So there are elements of the law here that are not clear. And there's also very little precedent. There's little to no case law, so we have nothing to go on until there's some litigation. Which, you know, once there's been litigation, you're maybe a little too late. But if we had that, it would give us greater clarity in terms of how we need to interpret the regulations.
Alan Chappelle
Yeah, Yep, absolutely. So the, the agent registry is launching soon, right?
Tony Katzer
It launched two weeks ago. We've got.
Alan Chappelle
Launched two weeks ago. I'm sorry.
Tony Katzer
So like 12 or 15 companies now registered in there. Yeah.
Alan Chappelle
Ah, fantastic. Okay, so what does the agent identity verification actually look like in practice? So, like, if a bad actor deploys an agent that misrepresents its data access permissions, you know, what's the mechanism in place for catching that kind of stuff?
Tony Katzer
One thing we look for to start to verify an agent is we do a lookup for its TCF identifier in the IAB Europe Global Vendor List. So when a company registers an agent, we look for that TCF or GPP, Global Privacy Protocol, identifier. So we start with that as kind of a first step of verification of the agent itself. Regarding the question of a bad actor deploying an agent that misrepresents data access permissions, we don't have a mechanism for catching that right now, if I'm being totally candid. I don't think anyone has a mechanism for catching that right now. And I think that's where governance and compliance programs need to come in. And this isn't new. We've had a history of bad actors in the digital ad ecosystem that misrepresent their data access permissions. So, you know, I wonder if there's something there, candidly, to do with Richie Glassberg, to see if, you know, maybe we integrate some of that attestation into the registry. I gotta catch up with Richie. If you're listening, hey, Richie.
Alan Chappelle
So maybe he's everywhere and I know he listens to this podcast, so I bet you the minute this. So Wednesday morning you're getting a call.
Tony Katzer
No, no, we're trying to get something on the books right now. Sorry if I've been delayed, Richie. So, you know, I think that may be a first step of, you know, attestation again, the mechanism for catching it. I don't think the challenges in an agentic workflow are very different. From the challenges we already face in a programmatic workflow. Like it's hard to catch. Now, I don't know if agentic makes it easier to catch. And it's. I'm not saying that there's not a way. I just, I just don't know. This is also new to us.
Alan Chappelle
I would frame it a little differently. I don't know that it's about whether it's easier or less easy to catch. I'm saying that the speed at which agentic is going to operate necessitates a better approach than we as an industry currently have. And I think attestation gets you part of the way. Maybe there's some fancy scoring mechanism that can be a lookup to say that vendor X has a score of 70 on GDPR and vendor Y has a 60, and that doesn't meet your risk threshold, or something like that. And then maybe an audit component afterwards. And certainly the Cal Privacy team are going to be part of this equation whether we like it or not.
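The scoring lookup floated above could look something like the following sketch. No such scoring service is described in the conversation as existing today; the `VENDOR_SCORES` table, the vendor names, the scores of 70 and 60 (taken from the example as spoken), and the threshold of 65 are all hypothetical.

```python
# Hypothetical compliance scores per vendor and regulation, invented for illustration.
VENDOR_SCORES = {
    "vendor_x": {"gdpr": 70},
    "vendor_y": {"gdpr": 60},
}


def meets_threshold(vendor, regulation, threshold, scores=VENDOR_SCORES):
    """Return True only if the vendor has a known score at or above the threshold.

    Unknown vendors or regulations fail closed: no score means no approval.
    """
    score = scores.get(vendor, {}).get(regulation)
    return score is not None and score >= threshold


RISK_THRESHOLD = 65  # an agent's configured minimum, assumed for this example

for vendor in ("vendor_x", "vendor_y", "vendor_z"):
    print(vendor, meets_threshold(vendor, "gdpr", RISK_THRESHOLD))
```

A lookup like this is fast enough for an agent to run before each partner selection, but it is only as trustworthy as whatever attestation and audit process produces the scores.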
Tony Katzer
Oh for sure.
Alan Chappelle
But from my perspective this is imperative because of the speed.
Tony Katzer
I don't think agentic workflows work at the speed of existing programmatic. It doesn't work at the speed of OpenRTB. So again, I think the solutions that we would explore for agentic workflows would also apply to existing programmatic workflows. I don't see those as explicitly different problems. So if there's a mechanism for catching it in agentic, then that mechanism would very likely apply to programmatic workflows.
Alan Chappelle
Oh no. It should. Yeah. I'm saying that I think that as we shift to agentic it becomes even more important.
Tony Katzer
Yeah, absolutely. When we shift to agentic, I think it depends. I would say from an agentic perspective, we're still at cruise control. If you think of where we've gone from cruise control to autonomous vehicles, I think with agentic we're at the cruise control level. You know, nothing is even semi-autonomous at this point. We're just very early days. That's not a critique of agentic. It's just we're very early days. So I think in a world where you actually have full-on autonomous agents that are making decisions around consumer data and actively trading, that's where this needs strict guardrails.
Alan Chappelle
Completely agree. Which is, which is. And then I'll get off this dead horse that I'm beating. But that's exactly why we need to be doing this now because this stuff is starting to move very quickly.
Tony Katzer
Yeah, and that's, that's a bit of my concern. Like, this is definitely a Gold Rush moment. I get it. Like, there's a lot of promise here, but it would be great to just like, can we just take a pause and just discuss as an industry, like, okay, what's the end goal here and where do we start foundationally and then build from there? Like it's. But I understand it's also an exciting time. Like it's, it's, it's incredibly innovative. It's really cool stuff to work on and I think there's real promise here. But like, it would be great to just take a beat and be like, wait a minute, wait, let's just do this. Let's not, let's not repeat the sins of the past. Like, can we, can we do this right?
Alan Chappelle
Well, maybe what we need is to take everybody on. You know, instead of doing IAB leadership one year, we just take everybody to a big fly fishing retreat. You can take a minute, you can catch your breath. You know, we can really think about these, these, these very important issues. And you know what? Actually I, I say that jokingly, but there's probably something to that.
Tony Katzer
Yeah, I don't disagree. Look, this, this is why I like to stand in remote parts of the world with a stick in my hand for nine, ten hours a day and be disconnected because it allows me to kind of focus my mind. So yeah, I don't think it's a bad idea if the entire industry just went away for a fly fishing trip for a weekend.
Alan Chappelle
So I want to switch gears a bit and talk about AI content marketplaces. And you guys have done a lot of work there as well. And so let's start with COMP, the Content Monetization Protocols. It's now in public comment, right? And can you give us the plain language version? What problem does that solve that robots.txt and other existing publisher controls don't?
Tony Katzer
It's a mechanism for once there is a business agreement between an LLM and a media company: permission to use their data, compensation has been arranged to do that. That's where COMP comes into play. I think long term, I don't know if the traditional crawling mechanisms are the best approach for the seller, but also for the LLMs, right? I mean, there seems to be this notion that, okay, well, LLMs work really well across unstructured data. You know, we don't necessarily need structured data. But, you know, we also deal with LLM hallucination all the time. I mean, go and ask ChatGPT the same question twice and you'll get either subtly or radically different answers. So, better data structure: feeding it more of a structured input, tiered content access.
Alan Chappelle
Right.
Tony Katzer
I think there's an opportunity for publishers to have different price points for what LLMs can access. You know, is it archival content from 12 months ago? Probably a lower price point. Is it some late-breaking investigative journalism story? You had an interview with Taylor Swift, you know, something that's really valuable or interesting? That's probably a different pricing tier. So COMP gives both LLMs and publishers greater structure that represents a business deal that they've struck for LLMs to access their content. It's not about blocking or not blocking. That's at the publisher's discretion. They can work with their partners to do that. It's not about an economic model, but it underpins what could be an economic model. You know, one thing we have in our vision for COMP v2 is the ability for publishers to track how their content's used. How is it used in RAG queries? How many of their results are showing up in actual consumer query results? Because that also then informs the economic discussion when those content deals get renegotiated down the line. How it's different from robots.txt is that robots.txt is solving for the crawl. COMP comes after robots.txt. Like, once you're blocked, and if you're an LLM obeying robots.txt (not every LLM does), you pick up the phone or send the email and say, okay, look, I want to pay for the content. That's where COMP comes into play.
Alan Chappelle
Got it. That makes sense. And by the way, just an editorial note. I love that what counts as breaking news is Taylor Swift related in the Katzer household.
Tony Katzer
Hey, for the record, I got nothing but love for Tay Tay and I wish her and Travis well.
Alan Chappelle
Fantastic. I love it. So the workflow that you guys are creating describes a bot requesting permission, being denied without an agreement, and being directed to a licensing URL. I mean, that's a very clean model on paper, but right now, AI crawlers are routinely ignoring robots.txt. You just said that. So what gives Tech Lab confidence that COMP's permission layer is going to be respected when the existing do-not-scrape signals are often being ignored?
Tony Katzer
If there's a crawler that's ignoring robots.txt, well, then what we find is media companies typically move to the next phase, which is working with their edge compute providers to just do a hard block against those IP addresses or stated crawlers. And that is ultimately what would force an LLM into the COMP architecture. Like, let's do something more structured. Or if you're not going to pay for the content, then the media company is like, well then I'm not going to give you access to it. So again, COMP doesn't do the blocking. It's at the publisher's discretion who, when and how to block. But once that's happened, that's where LLMs could get directed to COMP.
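The flow discussed in this exchange (a bot asks for access, is denied without an agreement, and is pointed to a licensing URL) might look roughly like the Python sketch below. It is not taken from the COMP specification: the crawler names, the licensing URL, and the `Decision` and `decide` names are placeholders invented for illustration, and the blocking itself would sit with the publisher or its edge provider rather than with COMP.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical data: crawler names and the licensing URL are placeholders.
LICENSED_CRAWLERS = {"examplebot"}
LICENSING_URL = "https://publisher.example/licensing"


@dataclass(frozen=True)
class Decision:
    allowed: bool
    licensing_url: Optional[str] = None


def decide(crawler: str) -> Decision:
    """Allow crawlers with an agreement; point everyone else at the licensing page."""
    if crawler.lower() in LICENSED_CRAWLERS:
        return Decision(allowed=True)
    return Decision(allowed=False, licensing_url=LICENSING_URL)


if __name__ == "__main__":
    print(decide("ExampleBot"))
    print(decide("UnknownBot"))
```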
Alan Chappelle
Got it, got it. Okay, so I want to shift to the RSL Collective because those guys are out there and, you know, shout out to Doug Leeds. It's funny, in this industry, sooner or later you run into the same cast of characters. For me it took about 20 years. But the collective is out there building a parallel licensing framework, and Fastly is involved. And so you said that Tech Lab and RSL are not competing, but from a publisher's perspective, you know, what's the value in backing those two frameworks simultaneously? What is one doing that the other isn't?
Tony Katzer
Oh, great question. So COMP creates the structure for, again, how the LLM accesses the publisher's content. COMP creates the structure for how a publisher could tier their content and price out different lanes, different content verticals, or different content cohorts differently. So I'll give the simplest example. Hey, you want access to my recipes, that's at price point X. You want access to my travel information, that's at price point Y. You want access to my political commentary, that's at price point Z. COMP gives it that structure. Where RSL comes in is they become one of several potential licensing frameworks. So COMP can integrate into the RSL licensing framework once you've created the structure and determined how you actually want to license the content. RSL is one of a couple of open collectives out there that could then handle the licensing component itself. That's where we're distinct.
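The recipes, travel and political commentary example amounts to a price list keyed by content vertical. Here is a small Python sketch of that idea. The prices, the per-item billing unit, and the `quote` helper are assumptions made up for illustration, since the episode names the tiers only as X, Y and Z.

```python
from decimal import Decimal

# Hypothetical price list; the episode only names the tiers as X, Y and Z.
PRICE_PER_ITEM = {
    "recipes": Decimal("0.01"),
    "travel": Decimal("0.02"),
    "political_commentary": Decimal("0.05"),
}


def quote(requests: dict[str, int]) -> Decimal:
    """Total price for a mapping of content vertical -> number of items requested."""
    total = Decimal("0")
    for vertical, count in requests.items():
        if vertical not in PRICE_PER_ITEM:
            raise KeyError(f"no price tier defined for {vertical!r}")
        total += PRICE_PER_ITEM[vertical] * count
    return total


if __name__ == "__main__":
    print(quote({"recipes": 100, "travel": 10}))
```

`Decimal` is used instead of floats so that small per-item prices add up without rounding drift.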
Alan Chappelle
Ah, okay, that does make sense. So I think you made, in some article, an ASCAP analogy, and I found that really interesting. You know, the collective rights organization that handles licensing at scale so no individual publisher has to negotiate one by one with every AI company. So, great. But ASCAP has decades of legal infrastructure and regulatory backing behind it. So what's the enforcement mechanism for COMP when an LLM simply chooses to ignore the terms?
Tony Katzer
It's the leverage the media companies have. I would say that would be the enforcement mechanism. By the way, ASCAP is just one potential licensing model. This is also new. It could be something like ASCAP. It could be something else. Maybe there's some, you know, co-op ad model that comes with this. I think this is all so early. Again, number one, there's a lot of talk about economic approaches, but I would argue that while both parties, LLMs and media companies, are having those negotiations, there's not a lot of data behind those discussions. So I think the long-term licensing model is still very much up in the air. But in terms of the enforcement mechanism, it's really up to the publishers. Like, this is how I want you to access my content. And if that media company has the leverage and has the compelling content that the LLM is looking for, then the LLM would license it.
Alan Chappelle
Fair enough. So, so publishers have told me that one of their biggest frustrations with AI licensing is the lack of usage transparency. So they don't really know how their content is being used or how often. And so how do you see industry standards addressing that issue?
Tony Katzer
Well, that's what we're looking at for COMP v2: the ability for publishers to tokenize their content. Just a quick explanation, a 30,000-foot-level view of tokenization. Let's say you have an article that's eight paragraphs long. You can chunk that article up into, let's just say, eight different paragraphs and assign an identifier to each of those paragraphs. Or 16: you know, maybe each paragraph is four sentences, so you can chunk that up into 16 two-sentence segments and apply an identifier to each of those two-sentence segments. That's tokenization of content. COMP v2, which we're exploring now (I know COMP is currently in public comment, but we're already looking at v2), is that capability. So when the content is provided back to the LLM in a very well structured format, assigning an identifier to it and giving the ability to then track that content in those systems is one potential approach. So that is what we're looking at for COMP v2. By the way, one thing to call out here: this isn't just a publisher challenge. We've spoken to major global media brands that find that their products and services are misrepresented in LLMs a lot. And the concern there is that that can result in lost sales, and that could result in lost longtime loyal customers if they do a search on the particular product they're currently using and they get misinformation about it. The LLMs hallucinate. They don't always get it right. So this is also a brand issue, because I think the other issue around tokenization of content is about provenance. Right? Like, what was the source of truth here? So if you are Brand X and someone queries something about Brand X, you know, the LLMs will pull data from all over the web. Right? But there needs to be a place there where, okay, this is what we've heard from the web, but this is what Brand X says.
Because I think this is a real problem for brands. We've had conversations with probably about a dozen brands globally saying this is not just a publisher issue. Like, source of truth and record of provenance is absolutely critical for brands as well as publishers. Right? Like, if I'm reading something in the news today, in a world of disinformation and misinformation, again, source of truth is absolutely critical. So, you know, where did the latest on some event somewhere in the world come from? Did that come from the New York Times? Did it come from the Washington Post? Did it come from the Guardian, Axel Springer? Like, that matters. So that's the other reason we believe in tokenization of the content: to establish source of truth, so people can believe what they're reading.
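To make the chunking arithmetic in this answer concrete: eight paragraphs of four sentences each, split into two-sentence segments, gives 16 segments, each with its own identifier. The Python sketch below shows one way that could work. The truncated SHA-256 identifier scheme and the `chunk_article` helper are assumptions for illustration only and are not drawn from any COMP v2 draft.

```python
import hashlib


def chunk_article(paragraphs: list[list[str]], sentences_per_chunk: int = 2) -> list[dict]:
    """Split each paragraph (a list of sentences) into fixed-size segments with IDs."""
    chunks = []
    for p_index, sentences in enumerate(paragraphs):
        for start in range(0, len(sentences), sentences_per_chunk):
            text = " ".join(sentences[start:start + sentences_per_chunk])
            # Assumption: the ID is a truncated SHA-256 of position plus text.
            digest = hashlib.sha256(f"{p_index}:{start}:{text}".encode("utf-8")).hexdigest()
            chunks.append({"id": digest[:16], "paragraph": p_index, "text": text})
    return chunks


# Eight paragraphs of four sentences each, as in the spoken example.
article = [[f"Paragraph {p} sentence {s}." for s in range(4)] for p in range(8)]
segments = chunk_article(article)
print(len(segments))  # 16
```

Deriving the identifier from the content means any edit to a segment produces a new ID. That could help with the provenance question raised here, though a real scheme might instead prefer stable publisher-assigned IDs.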
Alan Chappelle
Fair enough. And there's sort of an additional consideration here. So you look at Amazon Ads, and a portion of Amazon Ads, probably the largest portion, is the search stuff, where you go on Amazon and you search for shoes and then they have sponsored listings up at the top. Well, if you're no longer going to the four corners of Amazon.com and some of that is being addressed via some AI agent, that has an impact. I'm not saying it's a good thing or a bad thing, but it is a thing where, you know, that revenue for Amazon is now at risk.
Tony Katzer
Yeah, I mean, and going back to the brand point, what the LLMs are not great at is parsing and interpreting JavaScript. So if you're going to a brand site, let's say an automotive manufacturer, and you want to do a car configuration, we have heard of instances where an LLM will manifest a car configuration that that auto manufacturer does not even remotely produce. And they'll walk onto the lot saying, you know, I'm looking for the pink automobile with the spinners and, you know, tinted windows. We don't even remotely make that. Like, you know, that's a potential lost sale. Right? So I think your analogy here for Amazon is true as well. Like, again, LLMs hallucinate. That's not me saying it, that's the LLMs saying it. Like, they don't get it right, and they do struggle with parsing more complex JavaScript-based web environments. And that's where you find mistakes are often made. So, you know, are they working on getting e-commerce right and e-commerce representation right? No question. But they're not there yet. And again, that comes back to this being a brand challenge as well as a media company challenge.
Alan Chappelle
So where do you see AI content marketplaces in three to five years? I mean, do you think we end up with a single dominant platform, or is it going to end up being more fragmented three to five years from now, or closer to what we see in music licensing? We ask the tough questions here on the Monopoly Report, Tony.
Tony Katzer
That's a good one, Alan. I'd say over the next five years I think we see a fragmented content marketplace ecosystem and that's not a bad thing. I do think eventually we do see a consolidation there, but I think that's a ways off. I'd say for the near term we're probably going to see several content marketplaces that are being stood up. Microsoft's announced theirs, Amazon's announced theirs. I think we'll see more of these in the future and then I think eventually we'll see some coalescing, but I think that's a ways off.
Alan Chappelle
Hey Tony, this has been a fantastic discussion. Thank you so much for coming on. Now you're out there a bunch. I feel like I have a rather unique audience here. So what message about Tech Lab do you want my audience to hear?
Tony Katzer
Tech Lab is only as powerful and as helpful to the industry as its members engage with it. Think of Tech Lab as like a gym membership. You don't just sign up for the gym and magically get in shape and lose weight and get fit. You've got to participate. And that means everybody. That means publishers, agencies, brands and ad tech. Candidly, I'd love to hear more from the principals. I'd love to hear more from publishers. I'd love to hear more from agencies and brands, because ultimately, this is your ecosystem. You know, you use the standards that we publish to transact, to support privacy, to power clean supply chains. So if there's anything I'd like your audience to hear, it's that we'd love more publishers, brands and agencies at the table on our working groups.
Alan Chappelle
Yeah, amen to that. And that's sort of been an endemic challenge. I mean, and I say this as somebody who, like, represents those constituencies to varying degrees, but I've never worked at a publisher. I did work at DoubleClick back in the day for about nine months.
Tony Katzer
Sure.
Alan Chappelle
I wasn't a very good salesperson.
Tony Katzer
Cross path. That's where you cross paths.
Alan Chappelle
I think that's where we first crossed paths. Yeah.
Tony Katzer
Yeah, it was at DoubleClick.
Alan Chappelle
Well, Tony Katzer, thank you so much for coming on.
Tony Katzer
Oh, my pleasure. Thank you, Alan. Anytime.
Alan Chappelle
Tony's great. Always fun to catch up with him. And I have a few thoughts about our discussion. But before I reflect, I'll share that we've got a bunch of other fantastic guests coming up on the Monopoly Report podcast over the next few weeks. I'll have Allison Schiff from AdExchanger on, which will give me an opportunity to thank Allison for her coverage of my discussion with Commissioner Meador at Marketecture Live. And we've got a few representatives from state AG offices on to talk about enforcement. A couple of moments from the conversation that I think are worth emphasizing. First, on agentic and privacy, Tony made a point that I think is easy to gloss over. Regulators are already asking specific questions about how agents handle consumer privacy. And more importantly, right now, we don't have a great answer as an industry. So it's great that the foundational work exists. The TCF, the Global Privacy Protocol, the Data Deletion Request Framework. The tools exist. Those are certainly a starting point, but we need to go further. More importantly, the adoption of the current frameworks hasn't always kept pace with regulatory expectations, and that gap is widening. You want evidence? Take a look at what the EU is starting to expect with respect to auditing publisher consents. Second, on vector embeddings. I liked Tony's explanation, and I get what he's saying regarding the business value of vector embeddings. But on a practical level, the value is murky given the broad definition of personal data that's in play. More importantly, vector embeddings raise their own set of complications. For example, if a consumer issues a data deletion request and their data has already been folded into a vector, pulling that data out is, to put it a bit crassly, akin to trying to remove a toddler's pee out of the pool. That alone presents a real compliance problem that nobody has fully solved yet. Not for vectors, not for differential privacy, not for clean rooms.
In other words, we're building all this cool agentic stuff on top of a bunch of open questions. Third, on the content monetization protocol. At its core, it's a pretty elegant idea. Create a structured permission and pricing layer that sits between publishers and AI crawlers so that once a business deal is struck, there's a technical framework to support it. The ASCAP analogy is useful, but Tony's right to hedge on it. ASCAP has decades of legal infrastructure behind it, COMP is brand new, and Tony and I really didn't get into the elephant in the room. Namely, that AI companies don't necessarily think they need to pay for the content, and as a result, there isn't a really clear mechanism to compel the AI companies to come to the negotiating table. And blocking those companies from scraping has its own consequences, particularly when it comes to Google. So the enforcement mechanism at this point boils down to the leverage that publishers hold over AI companies that want their content. That's not nothing. If you're the New York Times or Axel Springer, you have at least some negotiating power. But for mid-tier and independent publishers, that leverage is considerably thinner. The long-term answer probably involves some combination of edge blocking, collective licensing frameworks like RSL, and eventually some regulatory clarity around what AI companies are actually permitted to do with crawled content. And to state the obvious, we're just not there yet. Fourth, and this is the one I keep coming back to, the agent registry. Tech Lab launched it two weeks ago. 12 to 15 companies are in it. That's a start. But Tony was refreshingly candid about the fact that right now there is no mechanism for catching a bad actor who deploys an agent and misrepresents its data access permissions. That's not a criticism of Tech Lab, it's an honest assessment of where the industry is.
The harder truth is that we've never been great at catching that kind of behavior in programmatic, and agentic doesn't make the problem easier. What it does do is make the stakes higher, because agents operate at a speed and scale that leaves very little room for after-the-fact correction. Tony used an analogy I liked. We're at the cruise control stage of autonomous vehicles. Nothing is truly self-driving yet. But the decisions we make right now about identity verification, privacy signaling, data provenance, and consent are going to determine what the road looks like when we get to full autonomy. And if we wait until we're at full autonomy in order to figure it out, we will have repeated every mistake we made in the early days of programmatic. My fly fishing retreat idea was a joke. Well, mostly. There's something real underneath it. The industry moves fast because moving fast is how you win commercially. I understand that, but there are moments where the cost of slowing down for a week is considerably lower than the cost of getting it wrong at scale. This feels like one of those moments. I've been in the ad space long enough to have watched it make the same mistake at least twice. The early days of programmatic moved fast. Privacy infrastructure came later, and we're still dealing with the ramifications of many of those decisions. I'm not predicting that will happen again, but I'm also not ruling it out. In case you're wondering, we didn't talk much about AdCP today. I'm hoping to have Brian O'Kelley on the pod at some point to get his perspective, so stay tuned for that. If today's conversation sparked something for you, I'd encourage you to look closely at what Tech Lab is building. The working groups are open, the specs are public, and as Tony said, the gym membership only works if you show up. So thanks again for listening, and please subscribe to the show at monopolyreportpod.com or on Spotify, Apple, YouTube, or wherever you listen to your podcasts.
Host: Alan Chapell
Guest: Anthony Katsur (CEO, IAB Tech Lab)
Date: March 18, 2026
In this episode, host Alan Chapell is joined by Anthony Katsur, CEO of the IAB Tech Lab, to dissect the present and future of digital advertising as it relates to agentic AI, privacy infrastructure, and content monetization in an era of rapid AI growth. The duo discusses emerging technical standards and frameworks, regulatory pressures, the challenges posed by vector embeddings, the launch of Tech Lab’s agent registry, and the ongoing push for fair compensation for publisher content in AI marketplaces. Throughout, they underscore the urgency of building robust privacy structures before the industry repeats past mistakes.
Industry Rush and Consequences
"The privacy infrastructure to support the technology is currently an afterthought." – Alan Chapell
Regulators Are Watching
“Regulatory bodies...have explicit questions around how agents are going to deal with consumer privacy.” – Anthony Katsur ([06:03])
Integrated Frameworks
"We've built Global Privacy Protocol directly into it. The TCF has been built directly into the Agentic frameworks as is gpc. So that's step one." – Anthony Katsur ([06:20])
Need for Multi-layered Verification
The Role of Taxonomies
“At the core of the privacy taxonomy...gives the industry a shared way to describe three things around data...It does give a foundation to then build on top of.” – Anthony Katsur ([10:41])
Audit and Attestation as Next Steps
"I think there's a combination here of attestation, verification, technical audit through things like the accountability platform." – Anthony Katsur ([12:00])
What Are Embeddings?
“Embeddings are a way of translating very different kinds of inputs into a common mathematical form where the closeness of those usually means similarity.” – Anthony Katsur ([16:09])
Privacy Value?
“No, it doesn't pull you out of the rule set.” – Anthony Katsur ([21:35])
Compliance Headaches
“You consumer have been calculated into this vector. How do I rip you out of that vector? You’re in the math now.” – Anthony Katsur ([21:35]) Alan jokes: “It's akin to trying to remove a toddler’s pee out of the pool."
([46:02] Alan’s closing thoughts)
Launch & Function
“We look for TCF identifier...start with that...as verification...regarding the question of bad actors...we don't have a mechanism for catching that right now.” – Anthony Katsur ([27:17])
Attestation & Governance Needed
Early Days Analogy
“Nothing is even semi-autonomous at this point. We’re just very early days...we need strict guardrails.” – Anthony Katsur ([30:17])
Why COMP Exists
“It’s a mechanism for once...permission to use their data, compensation has been arranged...that’s where COMP comes into play.” – Anthony Katsur ([32:38])
Limitations vs. Robots.txt
“Once you're blocked and if you're, if you're an LLM, obeying robots.txt...you pick up the phone or send the email...that's where COMP comes into play.” – Anthony Katsur ([34:52])
Enforcement Unresolved
Tokenization and Transparency
“We’re looking at...the ability for publishers to tokenize their content...and track that content in those systems...that's what we're looking at for COMP v2.” – Anthony Katsur ([39:35])
Broader Brand Challenge
Fragmentation Inevitable
"I'd say over the next five years I think we see a fragmented content marketplace ecosystem and that's not a bad thing." – Anthony Katsur ([43:58])
Tech Lab Participation
"Tech Lab is only as powerful...as its members engage with it. Think of Tech Lab as like a gym membership. You don't just sign up...and magically get in shape...you've got to participate." – Anthony Katsur ([44:47])
On Privacy and Agentic AI
"The technology is genuinely compelling, but the privacy infrastructure to support the technology is currently an afterthought."
— Alan Chapell ([03:00])
On Regulatory Pressure
"That's not how they're thinking about it. They have explicit questions around how agents are going to deal with consumer privacy."
— Anthony Katsur ([06:03])
On Data Embeddings
"You don’t have to explicitly share it with me, but I need to know what went into your calculation of the vector because if those are radically off, you’re never going to have any sort of closeness."
— Anthony Katsur ([22:45])
On Compliance Dilemmas
"You have my email address, I'd like you to delete it type of requests. And so I'm not even sure where we go with that."
— Alan Chapell ([23:46])
On Industry's Tendency to Rush
"Can we just take a pause and just discuss as an industry, like, okay, what's the end goal here and where do we start foundationally and then build from there? ... Let's not repeat the sins of the past."
— Anthony Katsur ([31:05])
Alan’s Closing Reflection:
“We're building all this cool agentic stuff on top of a bunch of open questions… The decisions we make right now about identity verification, privacy signaling, data provenance, consent are going to determine what the road looks like when we get to full autonomy. And if we wait until we're at full autonomy in order to figure it out, we will have repeated every mistake we made in the early days of programmatic.”
([46:02])
For more, join Tech Lab working groups and subscribe to the Monopoly Report newsletter and podcast.