
Sam
How will the web fight back against the wave of generative AI that is ingesting all the content on the Internet but not paying for it? We're joined today by Matthew Prince. He's the CEO and co-founder of Cloudflare and has been on the warpath attempting to right the ship. Matthew, great to see you. Welcome to the show.
Matthew Prince
Thanks for having me.
Sam
Can you start by giving us a sense as to what is happening to the web with the rise of generative AI? We've talked about it a bit on the show already, but I want to hear it from you.
Matthew Prince
Yeah, absolutely. So the business model of the Internet over the last 30 years has fundamentally been driven by search. You search for something, that generates traffic, and it takes you to content that someone has created. And then that content owner, that content creator, can drive value really in one of three ways. They can sell the content itself, sell a subscription to it. We see plenty of that these days. They can put ads up against it, or they can just get the ego hit of knowing that somebody cares about and is reading their stuff. And that's really how the web has been built today. It's been built chasing that traffic. What we're seeing, though, is that for the first time in history, searches across the major search engines, Google in particular, are actually on the decline. And what's replacing them is more and more people turning to AI. And the difference with AI is that rather than giving you 10 blue links that you click through to find the answer, what AI does now is try to give you the answer itself. And that means that people aren't going to those original sources. And if they don't go to the original sources, then you can't sell a subscription anymore. You can't put ads up against it. You don't even know that people are actually getting value from your stuff. And so what we're really worried about at Cloudflare is, if the incentives for creating content go away, why is anyone going to create content in a new AI-driven future?
Sam
So talk a little bit about how many pages these AI bots or search engines have crawled in the past and how much traffic they've delivered for each crawl and where it's gone to today.
Matthew Prince
Yeah, you know, I think that the deal that Google made with the web starting 30 years ago, when Larry and Sergey started working on the project, was basically: let us copy your content and in exchange we'll send you traffic that, again, you can drive value from in one of those three ways. And we have very reliable data at Cloudflare going back 10 years. Looking just at Google, the metric that has stayed very consistent over time is how much Google crawls the web. They've actually crawled at a very consistent rate over the last 10 years. Over that same 10 years, we've actually added 2 billion Internet users. So we were at 4 billion Internet users about 10 years ago. Today we're at about 6 billion Internet users. So you'd imagine it's actually gotten easier to get traffic over that period of time. But that's not what's happened. What instead has happened is that, if you sort of take 10 years ago as the litmus test, today it's almost 10 times as hard to get a click, to get a visitor from Google to your site. What's changed? The answer is that Google has started providing more answers directly on the page. So if you search for something like "when was Cloudflare founded?", there will be an answer box at the top that will say September 27, 2010, which is the day that we launched. And you don't have to click on any link. In fact, about 75% of queries to Google now get answered on Google itself. And what's changed in even just the last six months that's accelerated this is that they've rolled out AI Overviews, and we've tracked this from region to region. What we see is that as AI is giving you the answer without you having to read the original content, the amount of traffic that Google is sending to these sites has gone down and down and down. And that's the good news for publishers.
If Google has gotten 10 times harder to get traffic from over the last 10 years, OpenAI is a whole different beast. In OpenAI's case, it's 750 times harder to get traffic than it was from Google just 10 years ago. In the case of something like Anthropic, it's 30,000 times more difficult to get that traffic. So why is that? The answer is, I think people are trusting the AIs. They're reading this derivative content and they're not going back to the original source. But the problem is, if you're not reading that original source, then the original sources have no way of generating value. They can't sell subscriptions, they can't sell ads, they can't get the ego hit. And that over time is strangling the very incentives for why content is being created. And that's the problem that we started to really focus on about 18 months ago. And then just today, on July 1st, we announced that we are hard blocking the AI crawlers unless they will actually compensate content creators for the content that they're creating.
Sam
Okay, and we're definitely going to get into your technological solution. So that's coming. But let's talk a little bit more about this problem. So I think the number that you shared recently was that Anthropic will crawl something like 60,000 pages.
Matthew Prince
That's correct.
Sam
For one click that's sent. And OpenAI was somewhere in the, like, do you remember? 10,000?
Matthew Prince
Yeah, it's 1,500 pages now for every one click that they send you.
Sam
And you know, I have to say I'm surprised that publishers are seeing a problem now, only because these AI products are really in their infancy. I mean, Anthropic's Claude isn't used by very many people at all when you think about the scope of the web. OpenAI has 500 million weekly active users. It's pretty good, but really nothing compared to the amount of traffic that you see on the web every day. And I guess Google must be the problem. So just explain why this is already showing up for publishers, because this is the infancy of generative AI.
Matthew Prince
Yeah, I think it is, but it's one of these sea changes that we can just see happening. So again, for the first time in history, searches to Google actually dropped in the last period over period.
Sam
This is your data.
Matthew Prince
This is. Google has actually reported this, and it actually came out in the Apple trial as well, where they're seeing more of this traffic actually going to other sources. And so I agree that it is a drop, but what we're seeing is that that is the trend, that is the direction that things are heading. And even Google itself is looking more like an AI chatbot and less like a traditional search engine. And so if that's the case, I think that the time for publishers to panic is now. If we wait while more and more traffic gets strangled and less and less is going to them, I think that's just going to mean that over time we'll have more consolidation in the media industry. We'll have less and less content, and we'll actually have more salacious headlines as people are chasing the content that is left out there. And we need to actually make a change to make sure that we can continue to support publishers, because I do believe the future of the web is going to be an AI-driven future, not a search-driven future. And that AI-driven future just doesn't have the same incentives and doesn't support the same business model that the old search-driven web did.
Sam
Okay, I'm going to poke at this a little more. You mentioned that you can now search Google: when was Cloudflare founded? And you'll get the answer. That's something that Google's been doing for a long time. You could ask, like, when was Martin Luther King Jr.'s birthday? Even before generative AI, they were giving you these answers. So is it that the magnitude has changed? And if so, from the standpoint of a consumer, could this be good? I mean, it's pretty annoying to type this question into Google, when was Cloudflare founded, and then have to click to Cloudflare's website to get the answer that Google could just surface for you. And so much of the web has sort of become effectively the service of Google queries, where websites don't really need to exist.
Matthew Prince
Well, you know, absolutely, this has been happening for a while. If you look at it, 10 years ago the ratio of crawls from Google to clicks was two crawls to one click. Six months ago it was up to six crawls to one click. And that's all because of the answer box. What the AI Overviews, which they've rolled out over that time, have done is they've taken it now to 18 crawls to one click. So yes, it is a situation of, you know, the frog boiling in water, but it has gotten progressively worse, and I think across the media industry it's gotten harder and harder to actually survive as a publisher. And so what I worry about is, you know, publishers were struggling at 6 to 1, they're struggling at 18 to 1. I think they're dead at 250 or 1,500 to 1 that we're seeing with OpenAI, and completely dead at 60,000 to 1 that we're seeing with something like Anthropic. And so that is the direction that things are going. And that's a challenge. I think you're exactly right on the other point as well, that the challenge here is that this is actually a better user experience. That's why more of the web is going to turn to AI. It is great that you can type something in and get back an actual response as opposed to having to hunt for it yourself. That's a better user interface. And so things absolutely are going in that direction. I'm not arguing, and I don't think anyone is arguing, that we should just get rid of AI or that we should go back to sort of 10 blue links on Google. What I am saying, though, is that the fuel that runs all of these AI systems, the reason that Google can tell you when Cloudflare was started or what Martin Luther King's birthday was or something like that, is that somebody is doing the work of that original content creation. That original content is the fuel that fuels Google; it fuels all of the AI companies.
And if we strangle off the business model of those places, if we strangle off the incentives for content creators to create content, then we're actually going to end up strangling the AI systems as well. Because if there's no content to train on, then the AI systems are going to be pretty stupid. And so I think everybody agrees that there have to be incentives that allow content creators to continue to be compensated. The question is, what does that incentive structure look like? And that's, again, what we've been spending a lot of time trying to figure out.
Sam
I just want to ask you one more question about crawls.
Matthew Prince
Yep.
Sam
I think that sometimes, you know, you would crawl to put a website in your search engine, or a page from a website in your search engine. Are these generative AI bots crawling to do something similar, just to surface the information from these pages, or is some of the crawling being done in service of training their models? Because if that's the case, it's actually not as big of a deal, because it's just being fed into training. I think the problem is taking that direct query-to-answer behavior and sort of bringing it into the search engine. So do you know, is it training or is it just surfacing answers?
Matthew Prince
So I think there are two different parts of this. There's definitely training, and then there's what is closer to a search-like experience. If you're familiar with this, it would be something like RAG, where you're actually getting that real-time data in order to augment the foundational model. I think in both cases, though, you're actually costing the content creator something. They are literally paying for that traffic, they're paying for that load that the crawlers are pulling off of it. Also, it is the intellectual property, it's the data, it's the content of these providers that they're using to train the models. And so there's value that the AI companies are getting. If there weren't, they wouldn't be crawling, right? But there's no return of any compensation or any reward. Again, in the old days of Google, the trade-off was: let us copy your content and in exchange we'll give you traffic. What has happened is the frog is boiled in water, and now everyone's saying, let us copy your content and we will give you nothing in return. And so what we're saying is simply that we need a better deal. A deal for a new AI-driven future. And that should say, if you are getting value from the thing that I created, then you should compensate me in some way for it. And it might be tiny amounts of money, but at some scale that actually turns into something that can allow a content creator to continue to have an incentive to create content over time. If we don't do that, if we don't give content creators the incentives to create content, they'll stop creating content.
Sam
So I think you're bringing up a key point here, which is if people are like, well, you know, I'm not necessarily seeing publisher content show up every time I'm on an LLM. What you are seeing sometimes is the product of the publisher that's been used for training. And even if it's under fair use, totally fine because it's being transformed, and something crawled from Big Technology or the New York Times is now being used to, you know, give you an answer about summer camp, basically because they're just trying to figure out what word comes next in the English language, the publishers are actually enabling that. And every time an AI crawler hits a publisher website, they have to pay. And do you work with Wikipedia? Because they've been loud about this, that the server costs that they have to pay have increased exponentially. But those aren't human visitors, they're AI bots crawling Wikipedia.
Matthew Prince
That's exactly right. So there is a real cost to just supporting this crawl. And before we even talk about intellectual property, before we talk about anything else, the content creators, the publishers, are having to bear that cost. And so at just a simple fairness level, why should they be bearing the cost in order to train, you know, these multibillion-dollar AI companies that are out there? There should be some value which is given back. But I think it's even beyond that. I mean, you use legal terms like fair use, and I think that's very much up in the air right now. We literally had two different California cases that came out on both sides of that issue: is training on content fair use or not? And I think it's going to be a coin flip where different courts are going to say different things. And I don't think there's a clear answer there, but I think it's a more fundamental thing, which is: if you're doing something to create value, you should be getting some sort of compensation for that. If somebody else is imposing a cost on you, you should be able to charge them to offset some of that cost. And if someone's not willing to pay that, then they shouldn't be taking your content in the first place. Up until now, you know, the New York Times is suing, and a bunch of people are doing that. Everyone's focused on the legal issue. I actually think that before we even get to the legal issue, the first step is to actually take the technical steps to give content creators back control over the content that they're creating, and let them have the choice: do I want to give access or not, do I want to charge for this or not. And then, done correctly, there should be a marketplace where content creators and AI companies come together and say, hey, I created this piece of content. I think it's super valuable.
And the AI company says, yeah, maybe it is or maybe it's not, but here's what we're willing to pay. And maybe they meet the clearing price, maybe they don't meet the clearing price, but that marketplace needs to exist because otherwise there's no way to convey value, there's no way to derive value from content creation. And again, I just need to hammer this point home. If we don't give content creators an incentive to create content, they'll stop creating content.
Sam
And it sounds like, by the way, you're not a skeptic of the AI technology. You believe that this generative AI thing is going to work?
Matthew Prince
Not only that, I mean, it is already clear that it's going to be the interface of the future of the web. So we're going to move from what has been the dominant interface of the past of the web, which was search, to what the interface of the future of the web is going to be, which is very much going to be AI. So I believe AI is going to get better and better and better. I actually think that, done correctly, content can be created in such a way that will make AI better, and that you can create incentives for doing that. What I worry about is that in order for AI to get better, you have to have original content; people have to be going out and creating that. And right now we're strangling off all of the incentives for that content creation, which not only hurts content creators, it will ultimately hurt the AI companies as well.
Sam
So I was speaking last night with someone who works in data labeling or data creation for large language models, and in anticipation of this conversation, I was like, you know, one day what you're doing might look almost exactly like what web publishers are doing, where you might be hiring PhDs and having them write their information, and you feed that right into an LLM's training set. And there might be, let's say, historians. So if you take a world history website, the historians that are writing the web pages for that world history website, maybe one day they're going to be writing those world history articles and, instead of publishing them to the web, selling them or feeding them right into ChatGPT. Do we lose anything if the web goes away and it's just content creators selling stuff to large language models?
Matthew Prince
Yeah, you know, I think the Black Mirror kind of dystopian future is not that content will stop being created and journalists will stop existing and researchers will stop existing. I think the Black Mirror future is that we actually go back to something like the time of the Medici, where we have maybe five big AI companies and they each employ a set of journalists and a set of researchers and a set of folks, such that they become effectively the institutions of knowledge. And they have salaries for all their academics that are on staff. Maybe one of them is the conservative AI company and one of them is the liberal AI company. Again, you can very much see that that has actually been the natural state of media and the natural state of controlling information for quite some time. And you could imagine that all of that research actually consolidates behind each individual AI company, and every different academic out there is basically an employee of OpenAI or Anthropic or Google or Microsoft. I think that's a pretty bad outcome because, again, the web has been so amazing at distributing and democratizing access to information that I think we want to create that incentive. And so I think what we're trying to do is say, what's the step a few steps before all the academics are employed by one of the AI companies? And I think the answer is you allow the AI companies to pay for the content that is actually valuable to them, that fills in their models and makes their models better. And then you create incentives for independent journalists, independent researchers, to actually be able to create that content to augment those AIs while still, you know, being valuable. This won't happen, but my sort of optimistic version of the future is that humans should get content for free again, because we've kind of paywalled way too much, frankly.
And robots should pay a ton for it because, again, every time a robot ingests something, it's in service of hundreds of thousands, if not millions, of different humans. So robots should pay for that content. We should get back to a place where humans then get that for free. Again, I think it may be hard for us to get there, but that's the future that I think is actually the kind of optimal future.
Sam
So someone hearing you and looking at this through a critical lens might say, look, Matthew, publishers depending on web traffic are barking up the wrong tree. Selling eyeballs for CPM fractions has not been a good business for a long time. In fact, we had a guest on the show recently, he's a journalist, who said, listen, if I thought that traffic was the way to go, I'd have been out of business a long time ago. And what you really need is an audience that will, let's say, subscribe to your newsletter or listen to your podcast, maybe come to your events. And we've already moved past this business model of trading traffic for dollars, in which case this isn't an existential threat. What would you say to that?
Matthew Prince
I think, I mean, even then, you're still trading traffic for dollars. You're just trading it for subscription dollars, not ad dollars. That will go away as well, because what will happen is the AI company will ingest the podcast and then summarize it on their page. And why would they ever buy a subscription to your podcast? Why would they ever sign up for your newsletter if their AI agent can just simply say, tell me everything that was relevant in this particular podcast or newsletter?
Sam
Because there's an experience of listening that's enjoyable, people do that in some part for. For the entertainment and the leisure value, I think that's how they learn.
Matthew Prince
I think the AI companies will do a very good job at creating that experience as well.
Sam
So you think they'll just create like competing?
Matthew Prince
Oh, absolutely.
Sam
I used to think that this was such a pie-in-the-sky, lunatic idea until I listened to NotebookLM.
Matthew Prince
Yeah.
Sam
And we've had multiple people on my YouTube page be like, did you license your voice to NotebookLM? I'm like, no. But the fact that you're saying that is pretty concerning.
Matthew Prince
Totally. And again, I think that's the inevitable future. We're going to want to have hyper-customized podcasts that are in exactly the voice that we find the most reassuring. And AIs are going to create that for us. And again, they're going to be fed by original content creators that are out there that give them the ideas, give them what to talk about, give them the news of the day or whatever. What I think is that we have to move even past the business model of subscriptions. We've got to get to something else where you as a content creator are being compensated for the content. The way I think about it is every one of these LLMs is a little bit like a block of Swiss cheese. They've got, you know, a lot of stuff there, but there are big holes in it, and the content that is valuable for them is the content that actually fills in those holes in the Swiss cheese. And so what I would imagine in the future is that you're able to actually surface the places where there are holes in the Swiss cheese as an AI, and then allow content creators to create content that fills that in. My favorite example of this is, I was in Stockholm a couple of weeks ago meeting with Daniel Ek, because there really is nobody who has done more to compensate creators at scale than Daniel. Daniel's the founder of Spotify, and they've done just an amazing job at doing this. And he told me a story in a long conversation. He said, you know, one of the things that we do at Spotify is we actually take the searches that people run at Spotify, you know, things like "I want a song with a reggae beat about how much it sucks when your sister runs away with your car," or whatever. And it turns out that they don't have good things to fill that in.
And there are content creators out there that are making tens of millions of dollars a year just creating content for those searches that don't have good results right now, because Spotify surfaces that list of things where they don't give those results. I think that's actually beautiful. I think that's actually really amazing, where they are showing where there is something that there is human need for, and then how we can actually create content to fill that human need and then monetize it, you know, through what they're doing. I think the same opportunity exists in the AI space, where these AIs actually are able to say, I can tell you how valuable this new piece of content is for me. And you can rank it. And then that allows you to create a marketplace where they can say, listen, that new piece of information is so valuable that I'm willing to pay you for that. And I think that, done correctly, that then gets us to more original content creation. It gets us to less sort of me-too, copycat-style journalism. Same thing in research. It gets us to maybe a place where we're doing original research and getting rewarded for being more original as opposed to being more salacious.
Sam
Yeah, it's interesting. YouTube has a similar thing where there's an insights or an inspiration tab and they give you like the title, the description and the thumbnail and they're like, people are searching for this. You go out and make it.
Matthew Prince
Yeah, that's exactly right. And I think that actually is an incredibly valuable thing that's making humanity better, as opposed to yet another story that's just chasing the most salacious headline that you can get.
Sam
So you're talking about this idea where publishers might sell the ability to crawl to AIs. That is also assuming that content is scarce. And so I want to run this other idea by you, which is that if we had the same amount of content that we have today, that's a great idea. But what we're seeing now is this explosion of content creation that's made through generative AI. It's kind of funny, these suggestions that we're talking about, YouTube's making these suggestions because clearly there's traffic to be had. I'm sure there are already YouTubers today that are feeding that into ChatGPT, spitting out a script, running that through Google's Veo 3, and then posting the videos and cashing in on traffic. And we're in the middle, I believe, of this explosion of content. Actually, you probably have better data on that than my suppositions. It almost feels like a DDoS of the web, where, you know, if the ability to create content is constrained by a human's ability to create content, then you have something to bring to these AI companies. But if human-plus-bot content starts to become the norm, there's going to be so much that even if you're creating high-quality stuff, it's not going to matter very much to these generative AI companies. What do you think about that?
Matthew Prince
So first of all, I think that there's the pure AI-generated content. There's lots of research that shows that training AI on AI data is sort of like that old Michael Keaton film Multiplicity, where basically every copy of something gets worse and worse and worse. And again, that feels like it's still going to be the case for quite some time. In the future, may robots be able to go out and do interesting reporting from the field? May they be able to do interesting research? For sure. But today I think that that interesting research, that interesting original content, that interesting insight that comes from the work that right now only journalists and researchers and others can do, is still the most important thing for filling in those gaps in the Swiss cheese of AI. As for what is just, again, high-volume, low-value content, my hunch is that, if we score it correctly, that's going to be exactly what it is, which is low-value content. And so it should be rewarded very minimally. I like to ski, so I live part of the year in Park City, Utah.
Sam
You're in the right place.
Matthew Prince
Yeah, I care enormously about the snow forecast. There is a forecaster in Utah named Evan Thayer. He writes these incredibly precise weather forecasts where he will literally tell you it's going to snow this much on this run and this much on this run. And again, I actually pay for his content because that's super valuable to me. I am going to be more willing in the future to pay for an AI that has actually licensed Evan's content from him than I would be to pay for an AI that doesn't have that content, because again, that content is going to be, you know, super useful and unique and valuable to me. And so I think what it will actually do, as we have more AI systems that are out there, is cause you to look for more original, creative content. And that's going to be the thing that the AIs are going to be the most willing to pay for. And that again, I think, is actually a beautiful thing, where instead of creating incentives to create more and more salacious headlines and chase traffic, we're creating incentives to create knowledge that fills in those places in sort of the Swiss cheese where there might be holes. Taken in aggregate, all of the AIs are probably a pretty good representation of what human knowledge looks like. And so if we can score them and say, okay, here are the gaps in human knowledge and here are the places we need to fill in, that actually gives a really rich place for creators to look to create content which advances human knowledge.
Sam
So, you know, DeepMind is working on weather forecasting right now. This example that you gave of Evan Thayer, the forecaster in Utah. Are we that far away from just telling an AI, hey, like you're, you're tapping into the DeepMind model and weather forecast. I want to ski this route today. What's happening?
Matthew Prince
I think we are probably pretty far away from that. But again, I think Evan is always going to be better using the tools of AI plus his local knowledge to make this better. AI just becomes a tool that creative people use in order to tell stories better, get better information, do more research. And again, I am skeptical that, in the short term at least, we're going to have real value that is created by training on purely generated content.
Sam
Okay, so we've talked about your solution. Let's dive into the technological side of it a little bit. We are a tech podcast, so we should do that. So Cloudflare, a security company, helps websites stay up on the web despite all the threats.
Matthew Prince
Yep.
Sam
And let's just at the very beginning kind of talk about the threats that you see to websites. Who's trying to take them down? What's happening on that?
Matthew Prince
Yeah, so, I mean, protecting websites is part of our business; so is protecting employees as they go out across the Internet. So Cloudflare is fundamentally kind of a network that is built with all the performance, reliability, security, availability and privacy guarantees that frankly the Internet should have been built with, had we all known what it was going to become. But obviously back in the 60s, 70s and 80s when we were laying down all these protocols, we didn't think about those things. And so Cloudflare is basically reverse engineering the Internet in order to give it those performance, availability, security, reliability and privacy guarantees. And so today one of the main uses for Cloudflare would be: if you're putting a website or a web application or anything online, you want to make sure that it's safe from different sorts of threats. And so what are the threats that we see? I mean, every day we go to war with the Chinese government, the Russian government, the North Koreans. I mean, everyone is trying to hack into our customers because who are our customers? Some of the largest banks in the world, some of the largest governments in the world, and they are all constantly under threat and constantly under attack from these organizations. The media companies actually were a pretty small part of our business. We had some media companies that used us, but it wasn't a big piece of it. What happened, starting really 18 months ago, is that those companies said, hey, I know we hired you in order to stop the Chinese hackers, but we have this new threat that's there. And frankly, my initial reaction was: publishers, they're always whining about the next new technology, like, what's going on? And over and over they said, just pull the data, pull the data, pull the data. 
And it was only when we actually saw the data and we saw how AI companies were taking content without giving anything of value in return that they were actually adding enormous amounts of load and in some cases taking whole websites down because of the amount of traffic that they were sending to it.
Sam
Right. They, they basically DDoS the websites.
Matthew Prince
You know, not intentionally, but that was the point at which we said, listen, maybe there is something that we can do here. And, you know, at first, I think a lot of the publishers were saying, oh, this is so hard. There's no way we can stop it. You know, there are these nerds, and they live in Palo Alto, and they're so smart. What are we ever going to possibly do about it? And I just kept saying, guys, we go to war with, like, the Chinese hackers. We can stop some nerds with a C corporation. And I think it took a while for that message to really get through, but now that it has, you know, it's been really rewarding to see that the vast majority of the world's major publishers have said: we need to change the model. We need to be compensated for our content. And Cloudflare has the right idea in terms of the technical solution to do that.
Sam
By the way, folks, $60 billion company listed publicly. So it's, it's one of the bigger cybersecurity companies on the New York Stock Exchange. But I want to ask you okay, so we're going to get into this technological solution. But what you said is interesting because what if these AI bots, you think there's a world where these AI bots ingest not just the publishers, but the banking websites as well. Like, are you like a natural enemy to having everything go through that? Because if everything goes through ChatGPT, then these other sites that you secure might not need your services.
Matthew Prince
I think there's still going to be some gatekeeper for how agents and other things access various services online. And I think the challenges in each of those cases are different. In the case of a bank, you might want to say, I want to have guardrails that are in place. I want to make sure this is actually a customer that's accessing the account. I want to make sure that they can only conduct transactions that have been authorized by an actual human being, or something like that. Cloudflare actually provides those guardrails and makes it so that a bank can say, I want to expose my infrastructure to AI, but do it in a way which is safe and secure. I think publishers have a different challenge. And so, you know, in our case, a way of thinking about this is: we have a whole bunch of developer documents which are on our website. We want those to be in AI. We want coding platforms, when someone says, oh, I want to use Cloudflare to build X, Y or Z, to be able to spit that out. What we've done is we've actually tried to identify, with real narrow precision, what are those pages that are on the web that have some indication that they are going to be monetized? And generally that is: is it behind a paywall, or does it have some sort of an ad unit on it, like a banner ad or some sort of ad that's there? If we detect that, then we're blocking it by default. But we're not doing this for everything. Again, there's value for AI and we want to make sure that AI is actually getting the data that people want to have in it. So the About Us page on the New York Times probably should go into the AI system, but a brand new article, you know, with breaking news, probably should be restricted, again, unless the AI company is actually paying for that content.
Sam
I guess the way I want to ask it is: if everything goes into ChatGPT, what's left for you to protect? Thinking outside of the media world.
Matthew Prince
Well, again, I think that 80% of the AI companies are customers of ours. And so, so we, we protect them as well.
Sam
Yeah. Okay, sounds good. Just wanted to ask that. I was curious about it, but let's talk. Okay. Now. So you're going to build a, a technological solution that will block crawling?
Matthew Prince
Yes.
Sam
And so robots.txt, which is this code that you put in, like, the header of your site if you don't want to be crawled, that wasn't working.
Matthew Prince
Yeah, I mean, I think robots.txt has two problems. The first is some people just ignore it. And so if you ignore it, then you can still crawl all you want. And there are even some big legitimate companies that completely ignore robots.txt. And we're really good at basically being able to say, okay, here's what robots.txt says. Are you actually following what sort of the rules of the road are? And if the answer is yes, then robots.txt is a great solution, but in the cases where somebody is ignoring it, then we need to actually put in place additional technical barriers to restrict their access. And so that's exactly what we're doing. The second problem with robots.txt is it's not granular enough. So take the Googlebot, for example. Google's crawler does five different things at least. One is it checks if you have an ad on a page. Make sure that if you're putting an ad for a Procter and Gamble product up, it's not against a pornographic site or something like that. So it does brand safety checks. The second thing that it does is it crawls to index for traditional search, the 10 blue links that are out there. The third is that it crawls to create answers that are in the answer box. The fourth is that it crawls to create answers that are in the AI overview, the newer thing that they've rolled out. And the fifth is that it crawls in order to ingest content, in order to put it into Gemini.
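The compliance check described in this answer (compare what robots.txt says with what a crawler actually requested) can be sketched roughly as follows. This is a minimal illustration using Python's standard `urllib.robotparser`, not Cloudflare's implementation; the robots.txt contents, the bot names `ExampleAIBot` and `SomeSearchBot`, the URLs, and the `violations` helper are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; the bot names are made up.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /articles/

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def violations(parser: RobotFileParser, user_agent: str, requested_urls: list[str]) -> list[str]:
    """Return the requested URLs that robots.txt disallows for this user agent."""
    return [url for url in requested_urls if not parser.can_fetch(user_agent, url)]


parser = build_parser(ROBOTS_TXT)

# Hypothetical access log: URLs a crawler was observed requesting.
observed = [
    "https://example.com/about",
    "https://example.com/articles/breaking-news",
]

print(violations(parser, "ExampleAIBot", observed))
print(violations(parser, "SomeSearchBot", observed))
```

A non-empty result for a given crawler means it requested pages the file told it not to, which is the point at which, in the approach described here, additional technical barriers would come into play.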
Sam
It's a lot of crawling.
Matthew Prince
A lot of crawling, all through one crawler. And for lots of different reasons, they don't want to split that out into various crawlers. But right now they basically make you have a choice. They say you can either block Google entirely, in which case you can't run ads and you don't appear in search, but you also don't appear in the AI overviews or Gemini or other things, or they've recently added a tiny flag which basically just says I'm not going to use this data just for the Gemini piece. But you still appear in AI overviews, you still appear in answer box. We think there needs to be more granularity where there is a difference between taking content and transforming it, and a license should say, you can't do that without my permission, versus just taking that content in order to do brand safety checks, taking that content in order to do traditional search. And so what we've proposed, and we're working with the IETF as well as regulators, is extensions to robots.txt to give it that granularity. And that actually then allows us to further test, to watch, you know, does this robot behave in an appropriate way? And if the answer is yes, then maybe it gets more permissions to do things online. If the answer is no, then we will put more restrictions and blockades in place to stop what are, again, badly behaving robots.
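To make the granularity point concrete, here is a small sketch of a per-purpose permission table, assuming the five crawl purposes listed in the conversation. It does not reproduce the syntax of any actual robots.txt extension or IETF draft; the `Purpose` names, the `PUBLISHER_POLICY` table, and `is_permitted` are all hypothetical.

```python
from enum import Enum


class Purpose(Enum):
    """The five crawl purposes named in the conversation (labels are hypothetical)."""
    BRAND_SAFETY = "brand-safety"
    SEARCH_INDEX = "search-index"
    ANSWER_BOX = "answer-box"
    AI_OVERVIEW = "ai-overview"
    MODEL_TRAINING = "model-training"


# Hypothetical publisher policy: allow indexing-style uses, deny derivative uses.
PUBLISHER_POLICY = {
    Purpose.BRAND_SAFETY: True,
    Purpose.SEARCH_INDEX: True,
    Purpose.ANSWER_BOX: False,
    Purpose.AI_OVERVIEW: False,
    Purpose.MODEL_TRAINING: False,
}


def is_permitted(policy: dict[Purpose, bool], purpose: Purpose) -> bool:
    """Return whether a declared crawl purpose is allowed; unlisted purposes are denied."""
    return policy.get(purpose, False)


for purpose in Purpose:
    print(purpose.value, is_permitted(PUBLISHER_POLICY, purpose))
```

The design choice worth noting is the default: a purpose the publisher has not listed is treated as denied, which matches the idea that transforming content requires explicit permission.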
Sam
So what you're going to do now is in addition to that, put a wall up. Technological wall.
Matthew Prince
That's right.
Sam
No crawling. Sorry, enough of you haven't respected robots.txt. No entrance.
Matthew Prince
That's right. And so, we're all familiar with, like, 404 errors when something is not found. Success on the Internet is a 200 response that comes back to you. One of the original protocols actually set out a 402 response. And that response says Payment Required. And so we're actually tapping into that exact original specification to say, when a robot tries to access a page where there's an intent to monetize it, so it's either behind a subscription or it's got ads on it, that there is an ability for us to say 402 Payment Required. And then there's a negotiation. And in some cases, and at first, that's going to be largely large publishers with large AI companies doing deals like what Reddit has done or what the New York Times has done or what others have done, where they have licensed the content and then certain robots get access to that. But in other cases, and I think over time, that will be a dynamic process where maybe a smaller AI company or a smaller publisher will say, hey, here's what I would charge for this content. Cloudflare will surface, like, how valuable that content would be for that particular AI. And then the AI companies can decide: is that worth it or not? And it might be, you know, a very small transaction, maybe a fraction of a penny or maybe a few cents, or in some cases content that is really valuable might be worth hundreds or thousands or millions of dollars. You could imagine Taylor Swift, you know, is about to release a brand new song and the lyrics get published. How valuable is that for an app for teen girls who are lonely and want to talk about things? Probably pretty valuable, and especially valuable if you could have exclusive access to it for some window of time. And so that's the sort of thing where I think a marketplace over time can develop where original, valuable content will get compensated and there will be a clearing price in the market once we have that scarcity that's created by that wall.
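The rule described here (an AI crawler requesting a page that is behind a subscription or carries ads gets 402 Payment Required unless it has a license) can be sketched as a small decision function. HTTP does define status code 402 Payment Required, and Python exposes it as `http.HTTPStatus.PAYMENT_REQUIRED`. Everything else below is an assumption for illustration: the `Page` and `CrawlRequest` types, their fields, and the `respond` function are hypothetical and say nothing about how Cloudflare detects monetization or verifies licenses.

```python
from dataclasses import dataclass
from http import HTTPStatus


@dataclass
class Page:
    """Hypothetical page metadata: signals of intent to monetize."""
    behind_paywall: bool
    has_ads: bool


@dataclass
class CrawlRequest:
    """Hypothetical request metadata."""
    is_ai_crawler: bool
    has_license: bool


def respond(page: Page, request: CrawlRequest) -> HTTPStatus:
    """Return 402 for unlicensed AI crawlers on monetized pages, 200 otherwise."""
    monetized = page.behind_paywall or page.has_ads
    if request.is_ai_crawler and monetized and not request.has_license:
        return HTTPStatus.PAYMENT_REQUIRED
    return HTTPStatus.OK


article = Page(behind_paywall=True, has_ads=True)
about_us = Page(behind_paywall=False, has_ads=False)

print(respond(article, CrawlRequest(is_ai_crawler=True, has_license=False)).value)
print(respond(article, CrawlRequest(is_ai_crawler=True, has_license=True)).value)
print(respond(about_us, CrawlRequest(is_ai_crawler=True, has_license=False)).value)
print(respond(article, CrawlRequest(is_ai_crawler=False, has_license=False)).value)
```

In this sketch the negotiation and pricing step is left out entirely; the 402 response is only the signal that such a negotiation is required.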
Sam
Okay, so it's not just a blocker, it's also this marketplace where you're going to have publishers that will sell their content. So that's a way where you could have useful, effective chatbots and potentially a flourishing web.
Matthew Prince
Exactly. And that's. And that, I think, is what we're trying to play for. Again, my utopian vision of the future is robots should pay a lot for content and humans should get it for free.
Sam
Right. And so to kick this off, on June 30th, as the day turned to July 1st, you had a party at the top of One World Trade Center where a bunch of publishers pressed a red button to get this thing going. And that includes some very big names. Condé Nast, Time, the Associated Press, The Atlantic, Adweek and Fortune are all going to be part of this.
Matthew Prince
And a lot more, frankly. There hasn't been a publisher that we've talked to who hasn't said that this is a change that needs to happen, you're on the right path for it. And so across the board, not only the kind of 20% plus of the web that sits behind Cloudflare already, but I think another 20 to 30%, these major publishers that are out there, are all on board and doing that. And what I think has been encouraging is at the same time, we've been having conversations with the largest AI companies and all of them agree that content creators need to be compensated for their content. They all agree on that. The devil's in the details. And some of them are pushing back in various ways. But I've been really encouraged that as we have talked to the largest leading AI companies, the largest technology companies in the world, they're actually leaning into this. They all recognize that content creators need to be compensated. And I think over the months to come, that's when the hard work will go down around how do we actually create this marketplace in a way which is fair for all of the different providers in the ecosystem, treats everybody in a way that has a level playing field, still allows new entrants, doesn't just reward the largest companies with the biggest budgets that are out there, makes sure that, you know, legacy providers like Google are treated the same as, you know, newer providers that are there. That's all going to be really tough. But I am incredibly encouraged by the conversations I'm having not just with the publishers who are all on board, but actually with the AI companies who recognize that something needs to change.
Sam
That's interesting that they're recognizing this because the sense that you get is you hear these announcements of deals like OpenAI paying x million to the Wall Street Journal, or Dow Jones, to be able to include their articles, and the sense you get is that they're just kind of payoffs to not get sued. Like, Sam Altman very clearly is not happy with the New York Times pursuing OpenAI, and especially the actions that the Times is taking in their lawsuit, like forcing OpenAI to preserve their chat logs, which I think is wrong. But it is interesting. So what do you think? Are we going to see an evolution from these one-off deals to this marketplace-style world?
Matthew Prince
Well, I think that, I mean, we've seen this story many times before. I mean, Napster came along, it was a wild west. There was a bunch of lawsuits from the music industry targeting Napster and the like. And then along comes iTunes, which starts out as 99 cents a song but eventually evolves into what is much closer to a Spotify model of a subscription and a pool of funds that then get distributed out to all the creators. So I think we've seen this story before and I think that one of the things that's really important is that OpenAI and others are willing to pay for content. They do the deals that are there. And I don't think it's right to just say we'll do a deal to avoid lawsuits. Again, I think that when you talk to leading AI companies, they understand that people are doing the work to create content. They need to get compensated for that content. And if it's not going to be through subscriptions or ads or ego, it's got to be through something else. And so exactly how that happens we'll figure out. But what I know won't work is if OpenAI is paying for your content, but you're giving it away for free to everyone else, it's not going to work. OpenAI eventually is like, listen, we want to support you, we want to help you out, but we can't be the suckers, we can't be the only ones paying where you're giving stuff away for free. And so scarcity is needed in order to actually have value in any kind of market. And so I think that the people who've actually leaned into this the most heavily are the ones that have the existing deals with some but not all of the AI companies because they realize that for those deals to be valuable, for them to renew, for them to renew for more, there has to actually be scarcity where they're getting something of value. 
You can't charge OpenAI but give it away for free to Anthropic. Something needs to actually restrict it and say everyone needs to pay, everyone needs to be on a level playing field, and figure out what that looks like going forward.
Sam
Could there be some collateral damage with a solution like the one you're implementing? For instance, I'm looking at the names of these publications: Condé Nast, Time, the AP, The Atlantic. I imagine they get a lot of traffic from search as it is today. So if you put this blocker up, does that impact their SEO, for instance?
Matthew Prince
Yeah. So we've been very, very careful to say that the traditional search today is not blocked and even AI driven search today isn't blocked. But you're going to see us give publishers the tools to differentiate between search, indexing and derivative content. So the way I would think about this is the Google experience today. It may be that a publisher says, I still want to appear in the 10 blue links, but I don't want to be in the AI overview or the answer box. And the granularity of being able to say, okay, Google, I understand you use one bot, but we need that to be treated similarly. And again, in my conversations with Google, I am increasingly hopeful that they understand the importance of this and giving that granularity. But if for some reason they don't, I am also 100% certain that regulators are paying a ton of attention to this, and that around the world you will see them force Google to split their crawler out into announcing exactly what it is doing. Again, hopefully we get to an agreement with Google way before that has to happen. But I think, inevitably, Google is going to have to say, you know, if you don't want us to use your content for derivatives, you have a way of controlling that while still appearing in search.
Sam
Okay, a couple Big picture questions before we leave.
Matthew Prince
Sure.
Sam
How much bigger is the web getting, and is the web sort of accelerating in the size increases that we see?
Matthew Prince
Yeah, I mean, by all the measures that we can see, it's actually kind of plateaued and flattened out in terms of content. You see fewer domains getting registered, you see fewer new websites going online. I think a lot of that has moved to individual platforms. So more of that on a YouTube, more of that on a Facebook, more of that on a TikTok. And I think part of that is because those tools have provided content creators easy monetization tools to allow them to not have to think about some of those problems. I think that in an ideal future you would want content creators to be able to be free from those platforms, to earn more themselves, but still have abilities to monetize that content in interesting ways. And so again, I think there are lots of people who are working on that problem. I actually think Google has been one of the organizations that has again created what was the business model of the last 30 years of the web. But the business model of the next 30 years of the web is going to be different and we've got to think about it in a different way. It's not going to be banner ads, it's not really probably going to be subscriptions. It's going to be something different. And so this is our attempt at one solution, but I doubt it will be the only one that emerges.
Sam
Now I'm curious what I'm going to do, because I'm just a one-person content operation.
Matthew Prince
Well, you should certainly be charging AI to license your voice.
Sam
Can I sign up to your product?
Matthew Prince
For sure. Absolutely.
Sam
Okay, I'm going to email you after this. And then when it comes to cybersecurity, obviously you talked about how you're dealing with all these governments that would like to hack into sites across the web. Have they been able to use generative AI tools or automated coding to get more, to become more effective at what they do?
Matthew Prince
Yeah, I mean, I think that anytime a new technology comes out, bad guys are going to use it as well as good guys. And so we have seen and we will continue to see some horror stories around, you know, the family that was tricked by some gang into wiring their life savings because someone that sounded like their daughter called and said, I've been arrested in Mexico, you know, I need to pay to get out, or other things. I think we are seeing a real rise, especially out of North Korea, in North Koreans posing as applicants to various jobs, and then that is, you know, allowing them access, which they can then use to do any number of nefarious things. All of that, again, assisted by AI. So I think that's been sort of on the bad guy side. The good news, though, is that the good guys, you know, folks like Cloudflare, we have been using AI as well in order to not only detect these things, but get smarter at detecting attacks earlier in the process.
Sam
That's working for you.
Matthew Prince
At the end of the day, who wins in the AI race is whoever has access to the most data. And I just think that the good guys are always going to have access to a lot more data than the bad guys. And, and so far I feel like we've, we have made the web more secure with AI over the course of the last two and a half years and stayed way ahead of the attackers. Although again, there are going to be horrible stories, there are going to be problems that are there. I think that it is going to be harder and harder to trust that something that you're seeing online is actually real. And we'll have to turn to other ways that are more secure about verifying things like identity and, and authentication.
Sam
Okay, last question for you. We have 60 seconds. You mentioned you're a believer in this technology. What do the next couple of years in AI look like to you? Are we going to hit AGI anytime soon? Like, what's the timeline you're thinking?
Matthew Prince
I mean, I believe today that 99 cents out of every dollar spent on AI is just being lit on fire. But that 1 cent that's out there is going to generate real returns. It's very hard to figure out what's kind of just a total waste of time versus what's not. You know, we see a lot of data about how much, you know, AI systems are really being used, not so much for businesses today. A lot of the business applications have been very tough to take on, but a lot of times just for, like, loneliness and social interactions and things like that. So I would imagine that a lot more of those things are going to develop and those will be sort of the first uses. I think the business applications are actually going to take longer, and in places where it's easier to verify the output as being legitimate, it's going to be easier. So coding: we see that our engineers are significantly more productive using AI tools than they were before. That's not causing us to hire any less engineers. It's just meaning that every engineer we hire is that much more productive. We have a huge backlog of things to do and AI is helping us do that. On the other hand, you know, I'm still quite skeptical of the AI customer support agent, that is a much harder problem, or the AI lawyer, that is a much harder problem, because it's just harder to tell whether something actually worked or didn't. There's no debugger in those spaces in order to figure out if what the AI is creating was actually true. And so I think you're going to see just huge leapfrogs in things like coding. But I think it's going to take longer for us to do things that are a little bit more difficult to verify.
Sam
Very interesting. You're deeply optimistic about the technology, but still think 99 lit on fire.
Matthew Prince
Wasted.
Sam
Yeah, it's going to be a very interesting shakeout. Matthew Prince, great to see you. Thank you for coming on the show.
Matthew Prince
Thanks for having me on.
Sam
All right, everybody, thank you for watching and listening. We'll see you next time on Big Technology Podcast.
Matthew Prince
Sam.
Episode: Can The Web Survive Generative AI? — With Matthew Prince
Host: Alex Kantrowitz
Guest: Matthew Prince, CEO and Co-Founder of Cloudflare
Release Date: August 15, 2025
In this episode of the Big Technology Podcast, host Alex Kantrowitz engages in a profound discussion with Matthew Prince, the CEO and co-founder of Cloudflare. They explore the transformative impact of generative AI on the web, particularly focusing on how AI-driven technologies are reshaping content creation, distribution, and the underlying business models that sustain the internet.
Matthew Prince initiates the conversation by highlighting a fundamental shift in the web's business model due to the rise of generative AI. Traditionally, the internet thrived on search-driven traffic, where users sought information through search engines like Google, which in turn directed traffic to content creators' websites. These creators monetized their content through subscriptions, advertisements, or personal recognition.
Matthew Prince [00:29]:
"What we're really worried about at Cloudflare is if the incentives for creating content go away, why is anyone going to create content in a new AI driven future?"
Prince underscores a significant trend: the decline in search engine usability and traffic generation. Despite a steady increase in global internet users—from 4 billion to 6 billion over the past decade—search engines like Google have maintained a consistent crawling rate. However, the ease of obtaining traffic from Google has diminished drastically.
Matthew Prince [02:07]:
"If Google has gotten 10 times harder to get traffic from over the last 10 years, OpenAI is a whole different beast."
He reveals alarming statistics indicating that AI platforms like OpenAI and Anthropic have made it exponentially harder for content creators to receive traffic compared to a decade ago. This decline threatens the traditional revenue streams for publishers, potentially leading to a consolidation of media and a reduction in diverse content creation.
In response to this challenge, Cloudflare has taken decisive action. As of July 1st, Cloudflare began hard-blocking AI crawlers from accessing content unless they compensate the content creators. This move aims to re-establish a balance where content creators are rewarded for their work, ensuring the sustainability of quality content on the web.
Matthew Prince [04:52]:
"We are hard blocking the AI crawlers unless they will actually compensate content creators for the content that they're creating."
Prince elaborates on the necessity of compensating content creators. He argues that AI systems rely heavily on original content for training and generating accurate responses. Without fair compensation, the incentive to create high-quality content diminishes, which in turn degrades the performance and reliability of AI systems themselves.
Matthew Prince [10:04]:
"If somebody's not willing to pay for that, then they shouldn't be taking your content in the first place."
He emphasizes that content creators bear the costs of increased crawling activities, such as higher server loads, without receiving any tangible benefits, unlike the traditional search engine model where traffic is reciprocally beneficial.
To address these issues, Prince envisions a marketplace where AI companies and content creators can negotiate compensation for the use of original content. This marketplace would allow content creators to set licensing fees, ensuring they receive fair payment for their work, while AI companies can access valuable content necessary for improving their models.
Matthew Prince [39:37]:
"So that's. And that, I think, is what we're trying to play for. Again, my utopian vision of the future is robots should pay a lot for content and humans should get it for free."
This approach aims to create a sustainable ecosystem where original content is rewarded, fostering continuous content creation and maintaining the integrity and diversity of information available on the web.
Prince shares his optimistic yet cautious outlook on the future relationship between AI and content creation. He anticipates that AI will become the primary interface of the web, replacing traditional search engines. However, he warns that without proper compensation models, the richness of web content could decline, adversely affecting both content creators and AI systems.
Matthew Prince [15:11]:
"I believe that AI is going to get better and better and better. I actually think that done correctly, content can be created in such a way that will make AI better and that you can create incentives for, for doing that."
He advocates for a system where AI serves as a tool to enhance human creativity and information dissemination, rather than a replacement that diminishes the value of original content.
Discussing technical measures, Prince explains Cloudflare's initiative to augment the existing Robots.txt protocol. The traditional Robots.txt lacks granularity, making it ineffective against non-compliant AI crawlers. Cloudflare seeks to implement extensions that allow for more precise control over how different types of content are accessed and used by AI systems.
Matthew Prince [37:30]:
"Cloudflare will surface like how valuable that content would be for that particular AI. And then the AI companies can decide if that is worth it or not."
This enhancement aims to provide content creators with greater control over their work, ensuring that only authorized and compensated access is granted to AI entities.
Beyond content access, Prince touches on the broader role of Cloudflare in securing the web. He highlights the increasing threats from state-sponsored actors and cybercriminals leveraging AI technologies to conduct sophisticated attacks. Cloudflare employs AI tools to bolster web security, staying ahead of potential threats by utilizing extensive data and advanced detection mechanisms.
Matthew Prince [48:44]:
"I think that anytime a new technology comes out, bad guys are going to use it as well as the good guys."
He assures that Cloudflare remains committed to enhancing web security, ensuring that both content creators and users are protected in an increasingly AI-driven digital landscape.
In wrapping up, Prince maintains a belief in the transformative potential of AI, provided that the ecosystem evolves to support and compensate content creators adequately. He envisions a future where AI systems and human creativity coexist symbiotically, driving the web towards greater innovation and accessibility.
Matthew Prince [51:19]:
"I believe AI is going to get better and better and better... what I worry about is... we're strangling off all of the incentives for that content creation, which not only hurts content creators, it will ultimately hurt the AI companies as well."
Prince calls for immediate action to restructure the web's business models, ensuring that both AI advancements and human-driven content creation can thrive together sustainably.
This episode offers an insightful examination of the intersection between generative AI and web sustainability, presenting both the challenges and potential solutions to preserve the richness and diversity of online content in an AI-dominated future.